Dragon FAQs
Q: How does speech recognition work?
Speech recognition software products like Dragon NaturallySpeaking use the human voice as the main interface between the user and the computer. While relatively simple to use, speech recognition software is a highly sophisticated technology that leverages “language modeling” to recognise and differentiate among the millions of human utterances that make up any language. Using statistical models, speech recognition programs analyse an incoming stream of sound and interpret those sounds as commands and dictation. This process of interpretation is called speech recognition, and its success is measured by the percentage of correct interpretations.
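As a rough illustration of measuring success as "the percentage of correct interpretations", the sketch below compares what was said with what was recognised, word by word. The function name `recognition_accuracy` and the sample sentences are hypothetical, and this is not how Dragon itself reports accuracy; it is only a minimal way to see the idea.

```python
import difflib

def recognition_accuracy(reference: str, recognised: str) -> float:
    """Return the percentage of reference words that were recognised correctly.

    Hypothetical helper for illustration only: words are aligned with
    difflib.SequenceMatcher, and every aligned, identical word counts as correct.
    """
    ref_words = reference.lower().split()
    hyp_words = recognised.lower().split()
    if not ref_words:
        raise ValueError("reference text must contain at least one word")
    matcher = difflib.SequenceMatcher(None, ref_words, hyp_words, autojunk=False)
    # Each matching block is a run of identical words found in both sequences.
    correct = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * correct / len(ref_words)

spoken = "please send the report to the finance team today"
heard = "please send the report to the finance tea today"
print(f"{recognition_accuracy(spoken, heard):.1f}%")  # prints 88.9%
```

Here one word out of nine was misrecognised ("tea" for "team"), so eight of nine words count as correct.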
Dragon NaturallySpeaking is an example of a speaker-dependent speech recognition system. Dragon creates a voice profile for each user of the system. The profile contains information about the unique characteristics of that person's voice, a customised set of words known as a vocabulary, and user-specific information including software settings and personalised macros. When Dragon users create and train their user profile, they start with a standard set of models and then customise them for the way they speak (acoustic model) and the way they use words (vocabulary and associated language model). This approach accommodates users with varying accents and speech patterns. The software uses the customised user profile to guess the words spoken. Every time a user corrects recognition errors, the software updates that user's profile to improve recognition accuracy over time.
In most cases, speech recognition is used in conjunction with other input devices including keyboards and mice. However, users can leverage speech recognition to control 100 percent of their computing environment, making this technology ideal for employees with physical challenges, repetitive strain injuries or other reasons to operate information systems completely hands-free.
Q: Do I need to talk in a monotone voice at a slow pace for Dragon NaturallySpeaking to understand me?
No. Speak at your normal pace without slowing down. Your accuracy will actually be best if you speak in long, well-enunciated phrases or sentences. Speaking slowly and deliberately, in short phrases or single words, can actually result in more recognition errors. Longer phrases provide more context, which helps Dragon NaturallySpeaking recognise individual words. To understand what it means to speak both clearly and naturally, listen to the way newscasters read the news. If you copy this style when you dictate, the program should successfully recognise what you say.
In general conversations, many people may mumble, slur their words, or leave words out altogether. They assume, usually correctly, that their listeners will be able to fill in the gaps. Unfortunately, computers won't understand mumbled speech or missing words. They only understand what was actually spoken and don't know enough to fill in the gaps by guessing what was meant.
Q: Is talking to a computer the same as talking to a person?
No. What the computer does when it listens to speech is different from what a person does. Understanding spoken language is something that people often take for granted. Most of us develop the ability to recognise speech when we're very young. We're already experts at speech recognition by the age of three or so.
The first challenge in speech recognition is to identify what is speech and what is just noise. People can filter out noise fairly easily, which lets us talk to each other almost anywhere. We have conversations in busy train stations, across the dance floor, and in crowded restaurants. It would be very dull if we had to sit in a quiet room every time we wanted to talk to each other! Unlike people, computers need help separating speech sounds from other sounds. When you speak to a computer, you should be in a place without too much noise. Then, you must speak clearly into a microphone that has been placed in the right position. If you do this, the computer will hear you just fine, and not get confused by the other noises around you.
Another challenge is how to distinguish between two or more phrases that sound alike. People use common sense and context — knowledge of the topic being talked about — to decide whether a speaker said "ice cream" or "I scream." Speech-recognition programs don't understand what words mean, so they can't use common sense the way people do. Instead, they keep track of how frequently words occur by themselves and in the context of other words. This information helps the computer choose the most likely word or phrase from among several possibilities.
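The idea of tracking how often words occur alone and next to other words can be sketched with a tiny word-pair (bigram) model. This is a simplified illustration, not Dragon's actual language model: the three-sentence corpus, the function names, and the add-one smoothing are all assumptions made for the example.

```python
from collections import Counter

def train_counts(sentences):
    """Count single words and adjacent word pairs in the training sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams

def phrase_score(phrase, unigrams, bigrams):
    """Multiply the estimated probability of each word given the word before it."""
    words = phrase.lower().split()
    vocab_size = len(unigrams)
    score = 1.0
    for prev, word in zip(words, words[1:]):
        # Add-one smoothing (an assumption here) keeps unseen pairs above zero.
        score *= (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
    return score

def most_likely(candidates, unigrams, bigrams):
    """Pick the candidate phrase with the highest score."""
    return max(candidates, key=lambda p: phrase_score(p, unigrams, bigrams))

# Hypothetical training text standing in for a real language model's data.
corpus = [
    "we ate ice cream after dinner",
    "she bought ice cream for dessert",
    "i scream when i am scared",
]
unigrams, bigrams = train_counts(corpus)

# Two candidate interpretations of the same sounds.
candidates = ["ate ice cream after dinner", "ate i scream after dinner"]
print(most_likely(candidates, unigrams, bigrams))  # prints: ate ice cream after dinner
```

The program never learns what ice cream is. It prefers the first phrase only because "ate ice" and "cream after" appear in its training text, while "ate i" and "scream after" do not.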
Q: Can I use Dragon NaturallySpeaking to transcribe interviews or meetings?
No. Dragon NaturallySpeaking is a speaker-dependent system, meaning that it is trained to recognise the voice of a single user and cannot distinguish speech from more than one speaker. People have no problem understanding both Aunt Grace, who has a high, thin voice, and Cousin Paul, who has a voice like a foghorn, because people can easily adjust to the unique characteristics of every voice. Speech-recognition software, on the other hand, works best when the computer has a chance to adjust to each new speaker. The process of teaching the computer to recognise your voice is called "training."
Q: Does Dragon work on Macintosh computers?
No. Dragon is designed for Windows operating systems.
Q: Does Dragon support a 64-bit OS?
No. Nuance plans to add 64-bit support in a future release of Dragon NaturallySpeaking.
Q: What wireless headsets are compatible with Dragon for dictation?
Certified Bluetooth devices are listed under the "Wireless microphones" category on the Nuance hardware compatibility web site: http://support.nuance.com/compatibility/default.asp. Each listing states the wireless technology used, RF or Bluetooth. Listings for certified Bluetooth devices also give the drivers and adapters/dongles from the product bundle that were tested. The BlueParrott from VXI is plug and play and comes with a factory-paired base station and driver software. The XCommunicator5 consists of the Sony Ericsson Akono headset HBH-300 microphone and XCommunicator 5 drivers and Bluetooth USB adapter/dongle.
Q: What does “no training required” mean? If I don’t need product training, why is there a tutorial?
"No Training Required," a feature introduced in Dragon NaturallySpeaking 9, refers to voice training, not product training. It means that Dragon NaturallySpeaking is the first product to eliminate voice training and still deliver high accuracy levels. This message is intended for those who previously spent a long time in the enrollment process and still received poor results. "Training" is the word most often used to describe "enrollment" or "script reading." Statements that pair "training" with "high accuracy," or that say Dragon "eliminates the need to train the software to a user's voice," refer only to voice training. They do not eliminate the need for product training, which is why a tutorial is provided.