Alexander Rudnicky

Research Professor

Research Interests:

Multimodal Computing and Interaction

Speech Processing

Spoken Interfaces and Dialogue Processing

My research centers on interactive systems that use speech. I am currently interested in the following problems:

Implicit learning: Human-computer interaction generates information that the system could use to modify its behavior. For example, a speech recognition error that is repaired leaves information about a misclassification that could be used to improve subsequent accuracy. Exploiting such situations (and in fact contriving to create them) can provide a rich set of experiences that drive self-improving systems.

Automatic detection of and recovery from error: Humans easily detect and recover from breakdowns in communication. Automatic systems are less successful at this. However, we can use features of recognition, understanding and dialog to predict the likelihood of misunderstanding at a given point in an interaction, and then apply heuristic strategies to guide the conversation back on track.

A theory of language design for speech-based interactive systems: Speech-mode communication predisposes the user to make certain word choices and to exhibit certain grammatical preferences. An understanding of the principles that underlie these preferences (and how these can be influenced by the system's language) leads to better language design for interactive systems.

The role of speech in the computer interface: Speech is an effective means of communication, but it is not suitable for every type of interaction. Ideally, we can analyze an interface in terms of the task(s) it will be used for, the costs of specific interactions and the value perceived by the user. To date we've studied models based on time, system error and task structure. These models turn out to be useful for simple systems and appear to be extensible to more complex systems.

Many of these issues are being explored in the context of working systems, for example a language interface for a team of humans and robots working together (Treasure Hunt), an information access system for conferences (ConQuest), and a system for scheduling court time (Let’s Play).