Cornell University Library

eCommons

Pushing the Boundaries of 3D Spatial Understanding

File(s)
Cai_cornellgrad_0058F_15215.pdf (63.84 MB)
Permanent Link(s)
https://doi.org/10.7298/848s-9e12
https://hdl.handle.net/1813/120906
Collections
Cornell Theses and Dissertations
Author
Cai, Ruojin
Abstract

Understanding the 3D world from data collected by sensors such as cameras and LiDAR is a fundamental problem in computer vision, with applications in robotics, augmented and virtual reality, and autonomous systems. Although current algorithms can produce high-quality 3D reconstructions, they often struggle under real-world conditions, such as when data is sparse, visual overlap is limited, or scenes contain strong ambiguities like repetition and symmetry. This thesis explores how learned priors can help overcome these challenges and improve 3D spatial understanding. I begin by addressing the problem of shape generation and reconstruction from sparse point clouds, proposing a method that learns shape priors through modeling gradient fields over 3D point cloud distributions. Next, I tackle the challenge of extreme camera pose estimation between image pairs with little or no overlap, using dense correlation volumes to extract semantic and geometric cues. Building on this, I further improve extreme pose estimation by leveraging visual world priors from generative video models, which hallucinate plausible intermediate frames to provide useful context. Finally, I address visual ambiguity in structure-from-motion by using 3D priors from feature-matching models to disambiguate visually similar but incorrect matches, what we call “doppelgangers”, in symmetric scenes like the Arc de Triomphe.

Description
209 pages
Date Issued
2025-08
Keywords
3D Computer Vision
3D Reconstruction
Camera Pose Estimation
Generative Model
Committee Chair
Snavely, Keith
Committee Member
Hariharan, Bharath
Marschner, Stephen
Zhang, Cheng
Degree Discipline
Computer Science
Degree Name
Ph. D., Computer Science
Degree Level
Doctor of Philosophy
Rights
Attribution 4.0 International
Rights URI
https://creativecommons.org/licenses/by/4.0/
Type
dissertation or thesis