Cornell University
Library

eCommons


LANGUAGE MODELS FOR ROBOT TASK PLANNING WITH HUMAN DEMONSTRATIONS

File(s)
Sharma_cornell_0058O_12081.pdf (35.77 MB)
Permanent Link(s)
https://doi.org/10.7298/aywm-eb05
https://hdl.handle.net/1813/115862
Collections
Cornell Theses and Dissertations
Author
Sharma, Yash
Abstract

Two major challenges arise in high-level robot task planning when the goal is underspecified: humans hold implicit assumptions and preferences that they may not articulate when specifying a goal or reward, and visual demonstrations are hard to ground in a form robots can understand. This thesis addresses these challenges by leveraging Large Language Models (LLMs) and Vision-Language Models (VLMs) to convert task demonstrations to robot code offline, and then adapt to changes by planning online. The work is divided into three connected components. First, we develop DEMO2CODE, which generates robot code from demonstrations assumed to be grounded in text form; here we focus on how well it captures preferences from real-world demonstrations. Second, we ground visual input in text form with VIDEO2DEMO, focusing on open-vocabulary predicate and action recognition in kitchen tasks. Last, we build and deploy an AI task planner that enables collaborative cooking in our MOSAIC framework; here we focus on evaluating the task planner's safety violations while it interacts with participants in our user study. In summary, this thesis addresses robot task planning and human-robot interaction by using demonstrations for effective code generation, grounding visual information in text, and capturing and adhering to human preferences, all using generative language models.

Description
78 pages
Date Issued
2024-05
Keywords
ai in decision making; deep learning; large language models; machine learning
Committee Chair
Choudhury, Sanjiban
Committee Member
Edelman, Shimon
Degree Discipline
Computer Science
Degree Name
M.S., Computer Science
Degree Level
Master of Science
Rights
Attribution-ShareAlike 4.0 International
Rights URI
https://creativecommons.org/licenses/by-sa/4.0/
Type
dissertation or thesis
Link(s) to Catalog Record
https://newcatalog.library.cornell.edu/catalog/16575485
