eCommons

 

Reasoning in the Wild

Abstract

Teaching machines to reason has been a longstanding goal in artificial intelligence. Recent rapid progress in language modeling has brought this vision closer, opening up new possibilities for automated reasoning. Although existing benchmarks have demonstrated strong reasoning performance by language models (LMs), it remains unclear how effectively such models reason in real-world scenarios, where queries differ significantly in complexity and style from those in standard evaluation datasets. This dissertation identifies two main obstacles that prevent LMs from reasoning effectively in realistic settings: (1) distributional mismatches between standard training data and the user queries encountered in the wild, and (2) the difficulty and cost of collecting expert-annotated training data for complex reasoning tasks. To better assess reasoning performance under realistic conditions, we first introduce data sources and evaluation benchmarks collected directly from real-world use cases, establishing representative real-world reasoning challenges. Analysis of these benchmarks reveals significant limitations in contemporary language models, highlighting areas that require progress. Subsequently, we explore training approaches that use alternative supervision to enable reasoning without reliance on manually annotated data. We investigate structural supervision, an approach that incorporates prior knowledge about the underlying structure of reasoning tasks into latent variable models, enabling them to better handle different reasoning scenarios, such as multi-hop inference and abductive reasoning. Additionally, we explore using language agents for complex reasoning tasks. Language agents rely on environmental feedback: they learn iteratively by interacting with an external environment rather than from explicit annotations.

Description

201 pages

Date Issued

2025-05

Committee Chair

Cardie, Claire

Committee Member

Rush, Alexander
Ellis, Kevin
Kleinberg, Robert

Degree Discipline

Computer Science

Degree Name

Ph. D., Computer Science

Degree Level

Doctor of Philosophy

Rights

Attribution 4.0 International

Types

dissertation or thesis

Link(s) to Catalog Record

https://newcatalog.library.cornell.edu/catalog/16938219