Learning to Read, Ground, and Reason in Multimodal Text

Web data, news and textbooks offer informative but unstructured multimodal text. Translating such text into a semantic representation that is amenable to further reasoning is a fundamental problem in modern AI. In this project, we design systems that can understand and use multimodal text through multiple interconnected components: semantic interpretation, multimodal alignment, knowledge acquisition and reasoning.

Researchers

Hannaneh Hajishirzi

Research Areas

Data Science