
List of speech recognition software

From Wikipedia, the free encyclopedia




Modern speech recognition software lets a single computer user dictate text and issue
commands by voice, largely, but not entirely, bypassing the keyboard and mouse.

The idea has been portrayed in science fiction for many decades, often with computers
that have no keyboard or mouse at all. Such computers are typically shown keeping up no
matter how fast a person speaks, regardless of who the speaker is, which language is
spoken, or how many people are speaking. In other words, the fiction depicts a computer
that hears much as a multilingual person does.

Attempts to develop usable speech recognition software began in the mid-20th century
and proved far more daunting than anyone had imagined. The task also turned out to
require so much computing power that only the most modern computers can perform the
required functions in real time (i.e., as fast as a person can speak).

The first commercially practical products became available around 1990. One example, the
Voice Navigator, was a standalone computer dedicated entirely to speech recognition:
recognition used all of the machine's available computing power, and its output was sent
to a second computer. These products were not particularly accurate and could understand
only one person at a time; to work for another person, the machine itself (not the
operator) had to be retrained. Despite these limitations, they could type so rapidly that,
even after allowing time for corrections, a person with disabilities could easily
accomplish more work with the machine than without it. For people with physical
disabilities, the ability to simply talk to the computer can be a priceless asset.
Consider, for instance, an author with Parkinson's disease who can barely control his
hands yet is still able to write an article.

There are other scenarios where the deficiencies of the equipment are easily outweighed:

• Consider a facility where corrosive materials or high-voltage equipment are being
handled. The massive gloves required for that type of work typically preclude using a
keyboard.
• Most modern telephones now include voice dialing. Because the requirements of voice
dialing are so simple, it works without training the device for a specific user.

The state of the art in 2008 is that a properly trained computer with an Intel Core Duo
1.5 GHz CPU (or faster), operated by a healthy adult with no speech impediments, can
achieve approximately 99% accuracy while transcribing up to about 150 words per minute
(while using most of the available computing power). Superficially this might sound very
good. Note, however, that a very stable voice is required. A successful operator who
develops a nasty head cold may suddenly find that the machine does not understand him at
all, yet most humans would have no trouble understanding him even in that situation.
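
To put the 99% figure in perspective, a rough back-of-the-envelope calculation helps. The
sketch below assumes that accuracy is measured per word and that the quoted figures (99%
accuracy, 150 words per minute) hold steadily over a whole session. The function name and
the session lengths are illustrative only and do not come from any product.

```python
def expected_errors(words_per_minute: float, accuracy: float, minutes: float) -> float:
    """Expected number of misrecognized words over a dictation session.

    Assumes `accuracy` is a per-word rate (an assumption; the text does not
    say how the 99% figure was measured).
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    total_words = words_per_minute * minutes
    return total_words * (1.0 - accuracy)


if __name__ == "__main__":
    per_minute = expected_errors(150, 0.99, 1)
    per_hour = expected_errors(150, 0.99, 60)
    print(f"about {per_minute:.1f} errors per minute, {per_hour:.0f} per hour")
```

Under these assumptions the operator would still need to correct one or two words every
minute, or roughly 90 words in an hour of continuous dictation.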

Consider, for example, that the machines do not yet have enough intelligence to properly
process a child's voice. Obstacles include the fact that most children do not yet fully
understand how language is used (e.g. how to construct a complete sentence) and that
their voices change continuously as they grow.

There are now both proprietary and open source systems on the market, with
development emphasis being placed upon serving the legal and medical markets.

Contents

• 1 Free software
o 1.1 Free speech corpus and acoustic model repositories
• 2 Proprietary software

• 3 References

Free software


• CMU Sphinx — open source under a BSD license
• Julius — BSD-style license

Free speech corpus and acoustic model repositories

• VoxForge — open source, GPL

Proprietary software


• AT&T WATSON
• HTK - copyrighted by Microsoft, but licensees are allowed to alter the software for
their internal use.
• CSLU Toolkit
• Dragon NaturallySpeaking - From Nuance Communications; the continuous-speech
successor to the older DragonDictate product. It appears to be the focus of all of
Nuance's current development effort in the dictation area. Version 10.1 and later
also run on 64-bit Windows.
• IBM ViaVoice - IBM retains control and development of the versions for embedded
processors. The Linux, Mac OS, and Windows products were licensed to Nuance
Communications (formerly ScanSoft), which has since discontinued the product.
The Nuance website provides a list of the legacy systems that can run the final
versions.
• MacSpeech Dictate - Mac OS X speech recognition using the Dragon
NaturallySpeaking engine. It replaces MacSpeech's former iListen product,
which was based on Philips Speech Technology.
• Microsoft Windows Speech Recognition - Windows Vista and Windows 7
include version 8.0 of the Microsoft speech recognition engine along with a
completely new end-user speech experience, known as Windows Speech
Recognition.
• Microsoft Speech API - Speech recognition functionality included as part of
Microsoft Office and on Tablet PCs running Microsoft Windows XP Tablet PC
Edition. It may also be downloaded as part of the Speech SDK 5.1 for Windows
applications, but since that is aimed at developers building speech applications,
the pure SDK form lacks any user interface, and thus is unsuitable for end users.
• Philips SpeechMagic - Market leader within the medical industry according to
Frost & Sullivan, Philips SpeechMagic is a recognition engine that may be run
either as a stand-alone product or integrated into other applications.[1][2]
• Proteus Conversational Interface
• Simmortel Voice
• Quack.com (acquired by AOL)
• SpeechWorks
• Tellme Networks (acquired by Microsoft)
• Loquendo ASR

References
1. [1]
2. Philips SpeechMagic named European Technology Leader by Frost & Sullivan

Retrieved from "http://en.wikipedia.org/wiki/List_of_speech_recognition_software"


