eCommons

 

Towards Expressive and Robust Learning with Hyperbolic Geometry

dc.contributor.author: Yu, Tao
dc.contributor.chair: De Sa, Christopher
dc.contributor.committeeMember: Weinberger, Kilian
dc.contributor.committeeMember: Stephens-Davidowitz, Noah
dc.date.accessioned: 2025-01-14T20:01:22Z
dc.date.available: 2025-01-14T20:01:22Z
dc.date.issued: 2024-08
dc.description: 293 pages
dc.description.abstract: Machine learning models traditionally operate within the confines of Euclidean space, assuming the Euclidean nature of data. However, there is growing interest in learning within non-Euclidean hyperbolic space, particularly where data exhibits explicit or implicit hierarchies, such as in natural languages (with taxonomies and lexical entailment) or in tree-like and graphical data (as seen in biological and social networks). Embracing the geometry of the data not only leads to more expressive models but also offers deeper insights into the underlying mechanisms governing complex datasets. An important foundation of machine learning lies in representing data as continuous values, a process known as embedding. Recent studies have demonstrated, both theoretically and empirically, that hyperbolic space can embed hierarchical data with lower dimensionality than Euclidean space. This insight has spurred the development of various hyperbolic networks, despite the challenge that hyperbolic space is not a vector space. To address this, we propose an end-to-end approach that adopts hyperbolic geometry from a manifold perspective. This approach includes an embedding framework that directly encodes data hierarchies, a method for hyperbolic-isometries-aware learning, and a demonstration of how our framework can enhance the performance of attention models, such as transformers, by capturing implicit hierarchies. While hyperbolic geometry offers theoretical advantages, its practical implementation faces challenges due to numerical errors stemming from floating-point computations, further exacerbated by the ill-conditioned hyperbolic metrics. This issue, often referred to as the "NaN" problem, arises when practitioners encounter Not-a-Number values while running hyperbolic models.
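The "NaN" problem described in the abstract is easy to reproduce. The sketch below (a minimal NumPy illustration, not code from the dissertation) computes the standard Poincaré-ball distance; casting a point near the boundary to float32 rounds its norm to exactly 1, turning the distance from a point to itself into 0/0:

```python
import numpy as np

def poincare_dist(x, y):
    """Poincare-ball distance: acosh(1 + 2|x-y|^2 / ((1-|x|^2)(1-|y|^2)))."""
    num = 2.0 * np.sum((x - y) ** 2)
    den = (1.0 - np.sum(x ** 2)) * (1.0 - np.sum(y ** 2))
    return np.arccosh(1.0 + num / den)

# A point close to the boundary of the unit ball.
p64 = np.array([1.0 - 1e-8, 0.0])   # float64: norm < 1, still inside the ball
p32 = p64.astype(np.float32)        # float32 rounds the coordinate to exactly 1.0

print(poincare_dist(p64, p64))      # 0.0 -- correct self-distance in float64
with np.errstate(invalid="ignore"):
    print(poincare_dist(p32, p32))  # nan -- 0/0 in float32: the "NaN" problem
```

The failure is purely representational: 1 - 1e-8 is not representable in float32 (the gap between adjacent float32 values near 1 is about 6e-8), so the denominator collapses to zero before any geometry is computed.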
To address this, we introduce several robust and accurate representations using integer-based tilings and multi-component floating-point methods, which offer provably bounded numerical errors for the first time. Additionally, we present MCTensor, a PyTorch library that enables general-purpose, high-precision training of machine learning models. We demonstrate the effectiveness of our approach by applying multi-component floating-point to train large language models at low precision, mitigating the loss of numerical accuracy and producing models with better performance. In conclusion, our work aims to empower individuals and organizations to leverage the potential of hyperbolic geometry in machine learning, drawing a broad audience towards this promising and evolving research direction.
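The core idea behind multi-component floating-point is to represent a value as an unevaluated sum of ordinary floats, with the rounding error of each operation captured exactly in a lower-order component. The sketch below (a generic illustration of Knuth's error-free two-sum, not the MCTensor implementation) shows a float32 pair recovering a term that a single float32 addition would discard:

```python
import numpy as np

def two_sum(a, b):
    """Error-free transformation: returns (s, e) with s + e == a + b exactly,
    where s is the rounded float sum and e is the rounding error."""
    s = a + b
    t = s - a
    e = (a - (s - t)) + (b - t)
    return s, e

# 1 + 2^-30 is not representable in float32 (its ulp near 1 is 2^-23),
# so a plain float32 addition silently drops the small term.
a = np.float32(1.0)
b = np.float32(2.0 ** -30)

s, e = two_sum(a, b)
print(s)  # 1.0 -- the small term is lost in the rounded sum
print(e)  # 2^-30, recovered exactly in the error component
print(float(s) + float(e) == 1.0 + 2.0 ** -30)  # True
```

Chaining such error-free transformations is what lets multi-component representations carry provably bounded error through longer computations, as the abstract describes.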
dc.identifier.doi: https://doi.org/10.7298/77zx-a395
dc.identifier.other: Yu_cornellgrad_0058F_14417
dc.identifier.other: http://dissertations.umi.com/cornellgrad:14417
dc.identifier.uri: https://hdl.handle.net/1813/116636
dc.language.iso: en
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Hierarchical Data Modeling
dc.subject: Hyperbolic Geometry
dc.subject: Machine Learning
dc.subject: Numerical Precision
dc.subject: Representation Learning
dc.title: Towards Expressive and Robust Learning with Hyperbolic Geometry
dc.type: dissertation or thesis
dcterms.license: https://hdl.handle.net/1813/59810.2
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Cornell University
thesis.degree.level: Doctor of Philosophy
thesis.degree.name: Ph.D., Computer Science

Files

Original bundle
Name: Yu_cornellgrad_0058F_14417.pdf
Size: 8.23 MB
Format: Adobe Portable Document Format