Reasoning in the Wild
Abstract
Teaching machines to reason has been a longstanding goal in artificial intelligence. Recent rapid progress in language modeling has brought this goal closer, opening up new possibilities for automated reasoning. Although language models (LMs) perform strongly on existing reasoning benchmarks, it remains unclear how effectively they reason in real-world scenarios, where queries differ significantly in complexity and style from standard evaluation datasets. This dissertation identifies two main obstacles that prevent LMs from reasoning effectively in realistic settings: (1) distributional mismatches between standard training data and the user queries encountered in the wild, and (2) the difficulty and cost of collecting expert-annotated training data for complex reasoning tasks. To better assess reasoning performance under realistic conditions, we first introduce data sources and evaluation benchmarks collected directly from real-world use cases, establishing representative real-world reasoning challenges. Analysis of these benchmarks reveals significant limitations in contemporary language models and highlights areas that require progress. We then explore training approaches that use alternative supervision to enable reasoning without relying on manually annotated data. We investigate structural supervision, an approach that incorporates prior knowledge about the underlying structure of reasoning tasks into latent variable models, enabling them to better handle different reasoning scenarios, such as multi-hop inference and abductive reasoning. Additionally, we explore language agents for complex reasoning tasks. These agents learn from environmental feedback, improving iteratively through interaction with an external environment rather than from explicit annotations.
Committee Member
Ellis, Kevin
Kleinberg, Robert