Conversational engagement tracked

By Kimberly Patch, Technology Research News

It would be useful if a computer could sense the ebbs and flows of a conversation and automatically adjust remote communications systems to match. A system could, for instance, automatically switch from a walkie-talkie-type push-to-talk connection to a telephone-like full-duplex audio connection when the participants become highly engaged in a conversation.

Language is often fairly cryptic, however. The phrase "I am interested in this conversation", for instance, can signal enjoyment or polite boredom.

Researchers from the University of Rochester and Palo Alto Research Center are aiming to allow computers to automatically assess people's engagement in a conversation by analyzing the way they speak rather than what they say.

The researchers' system analyzes tone of voice and prosodic style, which includes changes in strength, pitch and rhythm. "We do not look at what users say, but how involved they are in the conversation when they say it -- how into the conversation they are," said Chen Yu, a University of Rochester researcher who is now an assistant professor of psychology and cognitive science at Indiana University.

As voice communication shifts from traditional telephone networks to the more flexible Internet, it is becoming easier to shift seamlessly between different communication channels, said Paul Aoki, a research scientist at the Palo Alto Research Center. The system could be used to automatically adapt voice channels on the fly.

The system could also make it possible for computers to adjust to users in other ways, said Aoki. "If your computer can detect that you are deeply engaged in conversation with another person, whether on the telephone or the same room... it might defer a loud announcement that you have new email, or it might set your instant messaging status to busy," he said.
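As a rough illustration of the kind of adaptation Aoki describes, the following Python sketch maps a detected engagement level to a channel mode and notification behavior. The 1-to-5 numbering of the levels, the threshold of 4, and every name in the code are assumptions made for illustration; the article does not describe how the researchers' software makes these decisions.

```python
from dataclasses import dataclass

# Assumption: the five engagement levels are numbered 1 (lowest) to 5 (highest).
# Assumption: level 4 and above counts as "highly engaged".
HIGH_ENGAGEMENT = 4


@dataclass
class Adaptation:
    audio_mode: str            # "push_to_talk" or "full_duplex"
    defer_notifications: bool  # hold back loud alerts such as new-email sounds
    im_status: str             # "available" or "busy"


def adapt(engagement_level: int) -> Adaptation:
    """Pick channel and notification settings for a detected engagement level."""
    if not 1 <= engagement_level <= 5:
        raise ValueError("engagement level must be between 1 and 5")
    engaged = engagement_level >= HIGH_ENGAGEMENT
    return Adaptation(
        audio_mode="full_duplex" if engaged else "push_to_talk",
        defer_notifications=engaged,
        im_status="busy" if engaged else "available",
    )


if __name__ == "__main__":
    print(adapt(5))
    print(adapt(2))
```

A real system would probably smooth the detected level over time before switching, so that a single misclassified moment does not flip the audio channel back and forth.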

Although humans are social animals, machine understanding of users' social states has received relatively little attention, said Yu.

Detecting how engaged people are from the sound of their voices is not straightforward, said Aoki. Previous research has tried to glean information about engagement by detecting emotion. But engagement is not the same as emotion. "You can be highly engaged in sad... or angry conversations as well as happy ones," he said.

The researchers' system adds the ability to sense characteristics of conversational engagement to previous methods of recognizing speech emotion, taking into consideration changes in emotion over time and the influence of participants on each other.

The system measures the prosodic aspects for individual users and feeds the results into a first-level module that has been trained to recognize patterns in these measurements, associating certain patterns with particular emotional states, said Aoki. The system measures the strength of emotion, whether the emotion is positive or negative, and emotion type -- anger, panic, sadness, happiness, interest, boredom, and the absence of emotion. This first-level measurement only reflects an individual's state at a moment in time.
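To make "prosodic measurements" more concrete, here is a minimal Python sketch of two such measurements: strength as root-mean-square amplitude per frame, and pitch as a crude autocorrelation estimate. These are common textbook techniques chosen for illustration, not necessarily the features or methods the researchers used, and the function names, frame sizes and the 75 to 400 Hz search range are all assumptions. Rhythm and the trained pattern recognizer are left out.

```python
import math


def frame_rms(samples, frame_len):
    """Strength per frame: root-mean-square amplitude of non-overlapping frames."""
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    return [math.sqrt(sum(x * x for x in f) / frame_len) for f in frames]


def estimate_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Crude pitch estimate in Hz: the lag with the highest autocorrelation.

    Only lags corresponding to fmin..fmax (an assumed speech-like range)
    are searched. Returns None if no lag has a positive correlation.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


if __name__ == "__main__":
    # Synthetic stand-in for a voiced sound: a 200 Hz tone sampled at 8 kHz.
    rate = 8000
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(400)]
    print(frame_rms(tone, 400), estimate_pitch(tone, rate))
```

In a pipeline like the one described, per-frame values such as these would be the input to the first-level module that associates measurement patterns with emotional states.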

To decide how engaged the user is in the conversation, the second level looks at patterns in the stream of emotion states over time, and at the emotion states of the other person in the conversation, said Aoki. "We added this consideration of both time and other people because we wanted to model the fact that conversation is a social interaction," he said. "Whether or not you are engaged in a particular conversation at a given moment is part of a social process that changes over time and involves all of the participants in the conversation."
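The idea of looking at both time and the other participant can be sketched as a simple data-preparation step. The Python below pairs a short history of one speaker's first-level emotion labels with the partner's labels over the same span, producing the kind of context a second-level classifier could be trained on. The window length, the function name and the label strings are assumptions; the article does not say how the researchers' second level represents or models this information.

```python
def context_windows(own_states, partner_states, window=3):
    """Pair each speaker's recent emotion labels with the partner's.

    own_states and partner_states are equal-length sequences of first-level
    emotion labels, one per time step. For every time step that has a full
    history, return (own_history, partner_history) as tuples of `window` labels.
    """
    if len(own_states) != len(partner_states):
        raise ValueError("both speakers need one label per time step")
    if window < 1:
        raise ValueError("window must be at least 1")
    contexts = []
    for t in range(window - 1, len(own_states)):
        own = tuple(own_states[t - window + 1:t + 1])
        partner = tuple(partner_states[t - window + 1:t + 1])
        contexts.append((own, partner))
    return contexts


if __name__ == "__main__":
    own = ["boredom", "interest", "happiness", "happiness"]
    partner = ["interest", "interest", "happiness", "interest"]
    for ctx in context_windows(own, partner):
        print(ctx)
```

Each resulting pair could then be labeled with one of the five engagement levels and fed to whatever second-level model is chosen.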

The system measures five levels of engagement. The researchers used recorded phone conversations to test the system. Using just the first-level emotion detector, they were able to rank the levels of engagement with a 47 percent accuracy rate, which is more than double the 20 percent accuracy that would result from random choices. Tracking emotion over time boosted the accuracy rate to 61 percent. Adding emotion tracking of the person the subject was talking to boosted the accuracy rate to 63 percent.
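The arithmetic behind these figures is easy to check. The short Python sketch below computes the chance baseline for five equally likely levels and compares each reported accuracy against it; the accuracy figures come from the article, while the function and variable names are illustrative.

```python
def chance_accuracy_pct(n_levels):
    """Accuracy, in percent, of guessing uniformly among n_levels classes."""
    return 100 / n_levels


def lift_over_chance(accuracy_pct, n_levels):
    """How many times better than uniform random guessing an accuracy is."""
    return accuracy_pct / chance_accuracy_pct(n_levels)


# Accuracy figures reported in the article, in percent.
reported = {
    "first-level emotion detector only": 47,
    "plus emotion tracked over time": 61,
    "plus the other speaker's emotion": 63,
}

if __name__ == "__main__":
    print("chance:", chance_accuracy_pct(5), "percent")
    previous = None
    for name, acc in reported.items():
        gain = "" if previous is None else f", +{acc - previous} points"
        print(f"{name}: {acc} percent, {lift_over_chance(acc, 5):.2f}x chance{gain}")
        previous = acc
```

Most of the improvement comes from tracking emotion over time (14 percentage points); adding the other speaker's emotion contributes a further 2 points.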

One technical challenge in building the system was finding methods that categorize emotional states accurately and work well across different speakers, said Yu. "People's emotional responses and the ways in which they convey emotion using speech vary widely across individuals," he said.

The Palo Alto Research Center scientists are working to add the software to their existing voice communication system in order to do real-world tests.

The overall goal of the research is to build voice communication systems that respond to the way people talk, said Aoki. Now that lots of people have mobile phones, talk within tight social groups like teenage or young adult friends can be very frequent. At the same time, frequent phone calls can be annoying. "We're trying to build systems that let people ease in and out of remote conversations, just as you can when people are physically together," said Aoki. "Determining how engaged users are in the conversation is one part of that research."

The method could be used in practical applications in three to six years, said Yu.

Yu and Aoki's research colleague was Allison Woodruff, who is now at Intel Research. The work appeared in the proceedings of the 8th International Conference on Spoken Language Processing (ICSLP) held October 4 to 8, 2004 on Jeju Island in Korea. The research was funded by the Palo Alto Research Center.

Timeline:   3-6 years
Funding:   Corporate
TRN Categories:  Human-Computer Interaction; Pattern Recognition
Story Type:   News
Related Elements:  Technical paper, "Detecting User Engagement in Everyday Conversations," proceedings of the 8th International Conference on Spoken Language Processing (ICSLP) on Jeju Island in Korea, October 4-8, 2004, and posted on the Computing Research Repository (CoRR) at arxiv.org/PS_cache/cs/pdf/0410/0410027.pdf.




December 1/8, 2004

© Copyright Technology Research News, LLC 2000-2006. All rights reserved.