Current Projects in Human Language Technology Group
Learning and Optimization for Multimodal Interpretation in Conversation Systems (PI: J. Chai, NSF CAREER Award)

Multimodal systems allow users to interact with computers through multiple modalities such as speech, gesture, and gaze. These systems are designed to support transparent, efficient, and natural means of human-computer interaction. Understanding what the user intends to communicate is one of the most significant challenges for multimodal systems. Despite recent progress in multimodal interpretation, these systems tend to fail when they encounter unexpected inputs (e.g., inputs outside of the system's knowledge) or unreliable inputs (e.g., inputs that are not correctly recognized). Variations in vocabulary and multimodal synchronization patterns, disfluencies in speech utterances, and ambiguities in gestures can seriously impair interpretation performance. This project seeks to improve the robustness of multimodal interpretation by adapting the system's interpretation capability over time through automated knowledge acquisition and by optimizing interpretation through probabilistic reasoning (a toy sketch of this kind of probabilistic fusion follows the paper list below).

(Picture: Ph.D. student Zahar Prasov interacts with a system using speech and gesture)

Recent Papers:
* Salience Modeling based on Non-verbal Modalities for Spoken Language Understanding. S. Qu and J. Chai. ACM 8th International Conference on Multimodal Interfaces (ICMI), pp. 193-200, Banff, Canada, November 2-4, 2006.
* Cognitive Principles in Robust Multimodal Interpretation. J. Chai, Z. Prasov, and S. Qu. Journal of Artificial Intelligence Research, Vol. 27, pp. 55-83, 2006.
* Linguistic Theories in Efficient Multimodal Reference Resolution: an Empirical Investigation. J. Chai, Z. Prasov, J. Blaim, and R. Jin. The 10th International Conference on Intelligent User Interfaces (IUI-05), pp. 43-50, ACM, San Diego, CA, January 9-12, 2005.
* Optimization in Multimodal Interpretation. J. Chai, P. Hong, M. Zhou, and Z. Prasov. The 42nd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 1-8, Barcelona, Spain, July 22-24, 2004.
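The toy sketch below illustrates the general idea of probabilistically fusing n-best speech and gesture hypotheses into a single interpretation. It is a minimal sketch, not the project's actual algorithm; the hypothesis lists, confidence scores, and compatibility function are all invented for illustration.

```python
# Illustrative sketch of probabilistic fusion of speech and gesture hypotheses.
# All names, scores, and the compatibility function are assumptions made up
# for illustration; this is not the project's actual algorithm.

from itertools import product

# N-best speech hypotheses: (referring expression, semantic type, confidence)
speech_nbest = [
    ("this red house", "house", 0.6),
    ("this red horse", "horse", 0.3),
]

# Gesture hypotheses: (candidate object id, object type, confidence)
gesture_nbest = [
    ("obj_12", "house", 0.7),
    ("obj_07", "tree", 0.2),
]

def compatibility(speech_type, object_type):
    """Crude semantic compatibility between what was said and what was pointed at."""
    return 1.0 if speech_type == object_type else 0.1

def best_interpretation(speech_nbest, gesture_nbest):
    """Pick the speech/gesture pair that maximizes a simple joint score."""
    best, best_score = None, float("-inf")
    for (expr, stype, p_s), (obj, otype, p_g) in product(speech_nbest, gesture_nbest):
        score = p_s * p_g * compatibility(stype, otype)
        if score > best_score:
            best, best_score = (expr, obj), score
    return best, best_score

if __name__ == "__main__":
    (expr, obj), score = best_interpretation(speech_nbest, gesture_nbest)
    print(f"'{expr}' resolved to {obj} (score {score:.3f})")
```

Even this crude scoring shows how a lower-confidence recognition result ("this red horse") can be overridden when the gesture evidence is semantically incompatible with it.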
=============================================================================================
(PI: J. Chai, Co-PI: F. Ferreira, Funded by NSF)

Previous psycholinguistic work has shown that eye gaze is tightly linked to human language processing. Almost immediately after hearing a word, the eyes move to the corresponding real-world referent; right before speaking a word, the eyes likewise move to the mentioned object. Not only is eye gaze highly reliable, it is also an implicit, subconscious reflex accompanying speech: the user does not need to make a conscious decision; the eyes automatically move toward the relevant object, often without the user even being aware. Motivated by these psycholinguistic findings, our hypothesis is that during human-machine conversation, user eye gaze coupled with conversation context can signal the part of the physical world (related to the domain and the graphical interface) that is most salient at each point of the communication. This salience in the physical world in turn primes what users communicate to the system, and thus can be used to tailor the interpretation of speech input. Based on this hypothesis, this project examines the role of eye gaze in human language production during human-machine conversation and develops algorithms and systems that incorporate gaze-based salience modeling into robust spoken language understanding (a small illustrative sketch of gaze-based salience follows the paper list below).

(Picture: Smoothed eye gaze fixations on the graphical display recorded while a user talks to the system)

Recent Papers:
* Automated Vocabulary Acquisition and Interpretation in Multimodal Conversational Systems. Y. Liu, J. Chai, and R. Jin. The 45th Annual Meeting of the Association for Computational Linguistics (ACL), Prague, Czech Republic, June 23-30, 2007.
* An Exploration of Eye Gaze in Spoken Language Processing for Multimodal Conversational Interfaces. S. Qu and J. Chai. The 2007 Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-07), 2007.
* Eye Gaze for Attention Prediction in Multimodal Human Machine Conversation. Z. Prasov, J. Chai, and H. Jeong. The AAAI 2007 Spring Symposium on Interaction Challenges for Artificial Assistants, 2007.
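The sketch below illustrates the general idea of gaze-based salience: fixation durations on on-screen objects are aggregated with a recency bias, and the resulting salience scores help re-rank candidate referents for a spoken referring expression. It is a minimal sketch under invented assumptions, not the project's actual model; object identifiers, timings, and weights are all illustrative.

```python
# Illustrative sketch of gaze-based salience for spoken language understanding.
# Object ids, fixation timings, and weights are made-up assumptions; this is
# not the project's actual model.

import math

# Gaze fixations: (object id, fixation start time in s, duration in s)
fixations = [
    ("lamp_3", 0.2, 0.15),
    ("couch_1", 0.6, 0.40),
    ("lamp_3", 1.1, 0.30),
]

def gaze_salience(fixations, utterance_time, decay=1.0):
    """Sum fixation durations per object, weighting recent fixations more heavily."""
    salience = {}
    for obj, start, duration in fixations:
        age = max(utterance_time - (start + duration), 0.0)
        salience[obj] = salience.get(obj, 0.0) + duration * math.exp(-decay * age)
    return salience

def rerank_referents(candidates, salience, gaze_weight=0.5):
    """Combine a language-based match score with gaze salience to order candidates."""
    return sorted(
        candidates,
        key=lambda c: (1 - gaze_weight) * c[1] + gaze_weight * salience.get(c[0], 0.0),
        reverse=True,
    )

if __name__ == "__main__":
    # Candidate referents for an ambiguous expression like "that one":
    # (object id, language match score)
    candidates = [("couch_1", 0.5), ("lamp_3", 0.5), ("table_2", 0.5)]
    salience = gaze_salience(fixations, utterance_time=1.6)
    print(rerank_referents(candidates, salience))
```

In this toy example the language scores alone cannot distinguish the candidates, but the fixation history (most recently and most often on "lamp_3") breaks the tie, which is the intuition behind using gaze-derived salience to prime interpretation.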
========================================================================================================
Discourse Processing in Conversational QA (PI: J. Chai, Co-PI: R. Jin, Funded by DTO)

Recent Papers:
* Discourse Processing for Context Question Answering based on Linguistic Knowledge. M. Sun and J. Chai. Knowledge-Based Systems, Special Issue on Intelligent User Interfaces, Volume 20, Issue 6, pp. 511-526, August 2007.
* Towards Conversational QA: Automated Identification of Problematic Situations and User Intent. J. Chai, C. Zhang, and T. Baldwin. Proceedings of the International Conference on Computational Linguistics/Association for Computational Linguistics (COLING/ACL-06) Poster Session, pp. 57-64, Sydney, Australia, July 2006.
* Automated Performance Assessment in Interactive Question Answering. J. Chai, T. Baldwin, and C. Zhang. Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2006), pp. 631-632, Seattle, USA, August 6-11, 2006.