Common sense boosts speech software
By Eric Smalley, Technology Research News
There's nothing like losing an ability you take for granted to inspire
creative solutions.
A researcher at the Massachusetts Institute of Technology's Media
Lab found that out when a bicycle accident broke both of his hands, leaving
him unable to type for a few months.
"I decided it was a good time to learn about speech recognition,"
said Henry Lieberman. "I realized that the work we were doing in common
sense reasoning could help. We had already done an interface for predictive
typing and I realized the same principles would apply," he said.
Speech recognition software matches strings of phonemes -- the sounds
that make up words -- to words in a vocabulary database. The software finds
close matches and presents the best one. The software does not understand
word meaning, however. This makes it difficult to distinguish among words
that sound the same or similar.
The Open Mind Common Sense Project database contains more than 700,000
facts that MIT Media Lab researchers have been collecting from the public
since the fall of 2000. These facts reflect everyday common sense, such as
the knowledge that a dog is a type of pet, rather than the knowledge that
a dog is a type of mammal.
The researchers used the common sense database to reorder the close matches
returned by speech recognition software. In the example "My bike has a squeaky
brake", ordinary speech recognition software might have trouble distinguishing
between "brake" and "break", but the researchers' system knows that bicycles
have brakes, and so makes the correct choice, said Lieberman.
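The reordering idea can be illustrated with a minimal sketch. It is not the researchers' implementation: the article does not describe their scoring method, so the tiny RELATED set, the related() helper and the rerank() function below are hypothetical stand-ins for the Open Mind data and the real system. The sketch moves a candidate word up the list when it is associated with words the speaker has already said.

```python
# Hypothetical toy stand-in for common sense associations. The real
# Open Mind Common Sense data and the researchers' scoring method are
# not described in the article; this only illustrates the idea.
RELATED = {
    ("bike", "brake"),
    ("bicycle", "brake"),
    ("bone", "break"),
    ("tennis", "player"),
}


def related(a, b):
    """Return True if the two words are associated, in either order."""
    return (a, b) in RELATED or (b, a) in RELATED


def rerank(candidates, context_words):
    """Reorder recognizer candidates by association with the context.

    Candidates linked to more context words come first. The sort is
    stable, so ties keep the recognizer's original order.
    """
    def score(word):
        return sum(related(word, c) for c in context_words)

    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    context = ["my", "bike", "has", "a", "squeaky"]
    # The recognizer is assumed to have ranked "break" above "brake".
    print(rerank(["break", "brake"], context))
```

With the toy data above, "brake" moves ahead of "break" because only "brake" is associated with "bike". When no candidate is related to the context, the recognizer's original order is left unchanged.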
The researchers evaluated their common sense speech recognition
technique by logging the errors and dictation times of users who dictated
text that contained topics covered by the Open Mind database. The technique
prevented 17 percent of the errors and reduced dictation time by 7.5 percent,
said Lieberman.
In addition to reducing errors, the approach improves error correction.
When a user indicates an error while dictating using speech recognition
software, the software presents a menu of alternatives and the user selects
one. The researchers found that users often gave up searching the menu before
reaching the end, and so dictated phrases over again even though the correct
word or phrase was available at the end of the menu. The common sense filtering
makes it more likely that the correct word appears at the top of the
list, he said.
Researchers have used other ways of improving the choices speech
recognition software makes, including methods that put more emphasis on
the most common English words, words that commonly occur together, and the
speaker's most recent words, said Lieberman. However, none of these can
tell if a word makes sense in a given context, he said.
"One surprising thing about testing interfaces like this is that
sometimes, even if they don't get the absolutely correct answer, users like
them a lot better," said Lieberman. "This is because they make plausible
mistakes, for example 'tennis clay court' for 'tennis player', rather than
completely arbitrary mistakes that a statistical recognizer might make,
for example 'tennis slayer'," he said.
"This suggests that there ought to be more research into how to
get computers to make better mistakes," said Lieberman.
The researchers are working on an improved interface for correcting
speech recognition mistakes, said Lieberman. Menu correction takes 10 times
as long as saying a single word, so directly inserting one of a few likely
correction alternatives chosen using the common sense technique would improve
throughput, he said.
The software could be used with today's commercial speech recognition
technology, according to Lieberman. "Certainly with a year or so of development
work, people could see substantial improvements," he said.
Lieberman's research colleagues were Alexander Faaborg, Waseem Daher
and José Espinosa. They presented the research at the Intelligent User Interfaces
Conference (IUI 2005), held in San Diego, January 9 to 12, 2005. The research
was funded by the MIT Media Lab's corporate and government sponsors.
Timeline: Now
Funding: Corporate, Government
TRN Categories: Human-Computer Interaction
Story Type: News
Related Elements: Technical paper, "How to Wreck a Nice Beach
You Sing Calm Incense," Intelligent User Interfaces Conference (IUI 2005),
San Diego, January 9-12, 2005
March 23/30, 2005