Humans are good at engaging in conversation and understanding written text, so much so that these sorts of natural language processing (NLP) tasks are widely considered essential requirements for any truly intelligent artificial agent. The NLP community has made a great deal of progress over the last few years on certain challenges (for example, the SuperGLUE benchmarks and Winograd Schemas), thanks to the power of neural networks and pre-trained models like BERT and GPT-3. Nevertheless, existing systems still don’t approach human performance on useful tasks such as answering general, open-ended questions about stories or participating in dialogue.
In this talk, I analyze why NLP’s recent achievements do not generalize to true language understanding. I suggest a broad remedy – automated large-scale extraction of knowledge from text and reasoning with that knowledge, together with existing background information, to achieve deeper understanding – and propose methods to achieve this. I focus on texts that have exploitable structure, including bulleted lists and tables, and discuss the techniques I’ve used for automating understanding of regulatory and molecular biology texts. Finally, I discuss how comprehending the goal-action-effect structure implicit in narrative texts, including news articles and business case studies, can help extract useful general knowledge about actions and plans, and consider how these techniques could contribute to AI succeeding at the difficult tasks of story understanding and analysis.
Leora Morgenstern is Principal Scientist at PARC, specializing in synthesizing automated techniques in knowledge representation and natural language understanding for deep understanding of text. Leora received her PhD in computer science/artificial intelligence from the Courant Institute of Mathematical Sciences at NYU, and her BA from CCNY in mathematics and philosophy. She has previously worked at Brown University, IBM Watson Research, SAIC/Leidos, Nuance Communications, and STR. Dr. Morgenstern's research has focused on extending state-of-the-art AI and formal KR techniques for commercial and real-world applications. She has served as technical lead for developing and deploying applications for Fortune 500 companies in insurance, banking, telephony, software sales, and business continuity. She was principal investigator of DARPA and IARPA programs and seedlings, including Big Mechanism (automated reading and understanding of molecular biology articles), TAILCM (automated legal reasoning), and Machine Reading (automated deep understanding of news stories). A founder and leader of the formal commonsense reasoning community, she has authored or edited over 80 technical publications.