The crew of the starship Enterprise thought
nothing of talking to the computer that ran the ship and its systems. Windows
Speech Recognition, a feature introduced in Windows Vista and available
in all editions of Windows 7, comes closest to fulfilling that
futuristic vision of computing. You won't be able to blast Klingon
warships into space dust with voice commands, but if you set slightly
more realistic expectations, we predict you'll be extremely impressed
with Windows Speech Recognition.
Before you can get started,
you'll need to have the right gear. The most important piece of
equipment, naturally, is a high-quality microphone. Microsoft recommends
a USB headset model for best performance. The headset ensures a
consistent distance between your mouth and the microphone, and a USB
connection has an all-digital signal path, unlike direct connections to
an onboard sound card. Both factors increase your chances of success in
accurate speech recognition.
1. Tuning and Tweaking Windows Speech Recognition
After installing your
hardware and any required drivers, you're ready to begin using Windows Speech
Recognition. You need to run through a quick setup routine, which in
turn strongly encourages you to complete the Windows Speech Recognition
tutorial. Even if you normally prefer to dive right in to a new feature,
we recommend that you make an exception for this tutorial. In small
part, that's because the tutorial does an excellent job of introducing
the Speech Recognition feature. Much more important, though, is the fact
that the speech recognition engine uses your responses during the
tutorial to train itself to recognize your voice
and phrasing. (And it's really not that long, honest.)
With the tutorial out of
the way, you can start Windows Speech Recognition using its shortcut on
the Start menu. If you need to adjust any setup options, you can do so
from Speech Recognition in Control Panel.
When Speech Recognition
is running, you see the capsule-shaped microphone interface pinned to
the top of the screen. When the microphone icon is blue and the word
Listening appears, the speech recognition engine is hanging on your
every word—or for that matter, on stray sounds, which it will try to
convert into text or commands. If you're not actively dictating, click
the microphone button (or say "Stop listening"). The microphone icon
turns gray. If you chose manual activation in the initial setup, you'll
need to click the microphone icon again (or press Ctrl+Windows logo key)
to resume; if you chose voice activation mode, the word "Sleeping"
appears to indicate that it is listening only for the magic phrase
"Start listening" to begin again.
If you find that
the speech recognition interface covers up important information when
docked at the top of the screen, you have several choices. You can move
it so that it floats on the screen. Or you can hide it, by speaking the
command "Hide speech recognition." To make the interface visible once
again, just say "Show speech recognition."
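The microphone states described above (Listening, Sleeping, and the gray "off" state used with manual activation) behave like a small state machine. The sketch below is only an illustration of that behavior, not how Windows implements it. The names MicState and next_state are hypothetical, and the event string "click" stands in for either clicking the microphone button or pressing the keyboard shortcut.

```python
from enum import Enum


class MicState(Enum):
    LISTENING = "Listening"
    SLEEPING = "Sleeping"  # voice activation: waits for "Start listening"
    OFF = "Off"            # manual activation: waits for a click or the hotkey


def next_state(state, event, voice_activation):
    """Return the microphone state after a spoken phrase or a "click" event."""
    phrase = event.strip().lower()
    if state is MicState.LISTENING:
        if phrase in ("stop listening", "click"):
            return MicState.SLEEPING if voice_activation else MicState.OFF
        return state
    if state is MicState.SLEEPING:
        # Only the wake phrase (or a click) resumes listening.
        if phrase in ("start listening", "click"):
            return MicState.LISTENING
        return state
    # OFF: speech is ignored; only a click (or the hotkey) resumes listening.
    if phrase == "click":
        return MicState.LISTENING
    return state
```

In this model, saying "Start listening" has no effect in the OFF state, which mirrors the difference between manual and voice activation.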
To see a list of all
options that you can adjust for Windows Speech Recognition, say "Show speech
options" or right-click the microphone interface.
Without question,
speech recognition has a learning curve. A modest amount of time
and effort expended up front pays substantial dividends in the long run.
One technique that can improve your skills and simultaneously improve
the accuracy of the speech recognition engine is to run through the
Speech Recognition Voice Training sessions. Each module includes tips,
suggestions, and background information that you read out loud. The more
modules you complete, the more information the computer has to work
with when you begin speaking next time.
When you speak to the
computer, it parses the sounds and tries to determine whether they
represent commands (which control movement of on-screen objects and the
behavior of programs) or dictation (which represents text you want to
insert in an editing window or a text box). Windows Speech Recognition
has an extensive vocabulary, and it's smart enough to limit the commands
it listens for to those that are applicable to the activity you're
currently engaged in. By learning the words and phrases it is most
likely to respond to, you increase the odds of having it carry out your
commands properly. At any time, you can say "What can I say?" This
all-purpose command opens the Windows Speech Recognition Quick Reference
Card, a Help And Support dialog box that breaks most commands down into
related groups.
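To make the command-versus-dictation decision concrete, here is a minimal sketch that assumes a tiny, made-up command table keyed by context. The table COMMANDS_BY_CONTEXT, the context names, and the function interpret are all hypothetical; the real engine's vocabulary and matching are far richer.

```python
# Hypothetical command sets. "any" holds commands that apply everywhere.
COMMANDS_BY_CONTEXT = {
    "any": {"what can i say", "stop listening", "show numbers"},
    "editor": {"undo that", "new paragraph", "copy that"},
}


def interpret(utterance, context):
    """Classify an utterance as a command or as dictation for a context."""
    phrase = utterance.strip().rstrip(".?!").lower()
    # Only commands that apply to the current activity are considered.
    active = COMMANDS_BY_CONTEXT["any"] | COMMANDS_BY_CONTEXT.get(context, set())
    if phrase in active:
        return ("command", phrase)
    # Anything that is not an active command is treated as text to insert.
    return ("dictation", utterance.strip())
```

Under these assumptions, interpret("Undo that", "editor") is classified as a command, while the same phrase in a context that has no such command falls through to dictation.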
2. Controlling a PC with Voice Commands
The guiding principle
for working with windows, dialog boxes, menus, and other on-screen
objects is simple: "Say what you see." So, for example, you can say
"Start," and Windows Speech Recognition will display the Start menu. You
can then say "All Programs" to open that menu, and continue working
your way to the program you want by saying the names of objects and menu
items you see on the screen. If you know the name of the program you
want to open, you can skip that navigation and just say "Open program,"
substituting the program's name for the word program.
You can also "click
what you see" (or double-click or right-click). If a window has menus
available, you can speak the names of those menus ("File," "Open") just
as if you were clicking them.
If you can't figure out
what to say to get Windows Speech Recognition to click an object on the screen, make a note
of where the object you want to click is located, and then say "Show
numbers." This command enumerates every clickable object on the screen
and overlays a number on each one, as shown in Figure 1, which depicts what happens to
Control Panel when you choose this option.
Show Numbers works
equally well with webpages, identifying clickable regions and objects on
the page. It also works with the Start menu and the taskbar, offering
an easy way to open and switch programs. If you prefer, you can use the
"Switch to program" command, replacing the word program with the
title-bar text of the program you want to switch to. To work with
individual windows, you can use the "Minimize," "Maximize," and "Close"
commands, followed by the name of the program. For
the currently selected window, use the shortcut "that," as in "Minimize
that." To minimize all open windows, say "Show desktop."
To scroll through text in a
window, say "Scroll up" or "Scroll down." For more control over
scrolling, add a number from 1 through 20 after the command (the larger
the number, the greater the scrolling).
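The shape of the scroll command (a direction plus an optional number from 1 through 20) can be sketched as a small parser. This is an illustration only: the function name parse_scroll is hypothetical, the text is assumed to arrive with the number already written as digits, and the default amount of 1 when no number is spoken is our assumption, not documented behavior.

```python
import re

# "scroll up" or "scroll down", optionally followed by a number.
SCROLL_RE = re.compile(r"^scroll (up|down)(?: (\d+))?$")


def parse_scroll(utterance):
    """Return (direction, amount), or None if this is not a valid scroll command."""
    match = SCROLL_RE.match(utterance.strip().rstrip(".").lower())
    if match is None:
        return None
    direction = match.group(1)
    # Assumption: a bare "Scroll up" or "Scroll down" scrolls by one unit.
    amount = int(match.group(2)) if match.group(2) else 1
    if not 1 <= amount <= 20:
        return None  # outside the 1 through 20 range described above
    return direction, amount
```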
3. Using Speech to Enter and Edit Data
If it can't interpret what
you say as a command, Windows Speech Recognition assumes that you're trying to
dictate. It then inserts its best guess at what you meant to say at the
current insertion point. The accuracy of speech recognition is
reasonably good after a short period of training, and it gets much
better with time and practice. But it's not perfect, nor are you likely
to dictate smooth sentences with perfect syntax. As a result, you'll
want to master the basics of text editing using the voice commands in this section.
To delete the most
recent word or phrase you dictated, say "Undo" or "Undo that."
If you want to change a
word, phrase, or sentence, start by saying "Select word" or, for a
phrase, "Select word through word," substituting the actual text for
each occurrence of word. "Select next [or previous] sentence" works, as does
"Select previous five words" or "Select next two sentences." After you
make a selection, you can delete it or copy it to the Clipboard ("Copy
that").
The "Go to" command is
powerful. If you follow it with a unique word that appears in the text,
the word you spoke will be selected immediately. If the word appears
multiple times, each occurrence is highlighted with a number. Say the number
and then say "OK." You can say "Go to before" or "Go to after" a
particular word, and Windows Speech Recognition will obey your commands. To go to
the top or bottom of the current editing window, say "Go to the start
of the document" or "Go to the end of the document."
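The way "Go to" handles unique and repeated words can be sketched as follows. This is a simplified model, not the actual implementation: the function go_to and its return values are hypothetical, and matching is assumed to be case-insensitive on whole words.

```python
import re


def go_to(text, word):
    """Locate a word: select it if unique, otherwise number each occurrence."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    spans = [match.span() for match in pattern.finditer(text)]
    if not spans:
        return ("not found", None)
    if len(spans) == 1:
        # A unique word is selected immediately.
        return ("selected", spans[0])
    # Repeated words are numbered from 1 so the user can say a number, then "OK."
    return ("choose", dict(enumerate(spans, start=1)))
```

Each span is a (start, end) pair of character offsets, so a caller could use it to highlight the text or to move the insertion point.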
If you need to correct a
word that was misrecognized, say "Correct word," substituting the
misrecognized text for the word word. When you do, Windows Speech
Recognition reexamines what you said and displays a list of words or
phrases that might be a better match, as shown in Figure 2. If the word you spoke is on the list,
say its number, followed by "OK." If the word isn't on the list, try
saying it again. Or say "Spell it" and then recite each letter, with or
without phonetic helpers ("A as in apple").
Punctuation is easy: to
insert a period, comma, colon, semicolon, or apostrophe, just say the
word. Literally. The Quick Reference Card has a long list of punctuation
marks the speech recognition engine will translate.
To enter a carriage return, say "New paragraph"
or "New line."
You can simulate the action of pressing any key by saying "Press key," substituting the
name of the key for the word key. To repeat a key, say "Press key nn times," substituting a number for nn. A handful of
special keys are recognized without the magic introductory word "press":
Home, End, Space, Tab, Enter, and Backspace all fall into this
category.
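As a rough illustration of the key-press grammar just described, the sketch below parses "Press key," "Press key nn times," and the six keys that need no introductory word. The names BARE_KEYS and parse_key_command are hypothetical, and the repeat count is assumed to arrive as digits.

```python
import re

# Keys the text says are recognized without saying "press" first.
BARE_KEYS = {"home", "end", "space", "tab", "enter", "backspace"}

# "press <key>" with an optional "<nn> times" suffix.
PRESS_RE = re.compile(r"^press (.+?)(?: (\d+) times)?$")


def parse_key_command(utterance):
    """Return (key_name, repeat_count), or None if no key command is found."""
    phrase = utterance.strip().rstrip(".").lower()
    if phrase in BARE_KEYS:
        return phrase, 1
    match = PRESS_RE.match(phrase)
    if match is None:
        return None
    count = int(match.group(2)) if match.group(2) else 1
    return match.group(1), count
```

Anything that returns None here would, in the model used throughout this section, be treated as dictation instead.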