Towards Expressive and Robust Learning with Hyperbolic Geometry
Abstract
Machine learning models traditionally operate within the confines of Euclidean space, implicitly assuming that the data itself is Euclidean. However, there is growing interest in learning within non-Euclidean hyperbolic space, particularly when data exhibits explicit or implicit hierarchies, as in natural language (taxonomies and lexical entailment) or in tree-like and graph-structured data (biological and social networks). Embracing the geometry of the data not only leads to more expressive models but also offers deeper insight into the mechanisms governing complex datasets. An important foundation of machine learning lies in representing data as continuous values, a process known as embedding. Recent studies have demonstrated, both theoretically and empirically, that hyperbolic space can embed hierarchical data with lower dimensionality than Euclidean space. This insight has spurred the development of various hyperbolic networks, despite the challenge that hyperbolic space is not a vector space. To address this, we propose an end-to-end approach that adopts hyperbolic geometry from a manifold perspective. This approach includes an embedding framework that directly encodes data hierarchies, a method for hyperbolic-isometries-aware learning, and a demonstration of how our framework can enhance the performance of attention models, such as transformers, by capturing implicit hierarchies. While hyperbolic geometry offers theoretical advantages, its practical implementation faces challenges due to numerical errors stemming from floating-point computation, exacerbated by the ill-conditioned hyperbolic metric. This issue, often referred to as the "NaN" problem, arises when practitioners encounter Not-a-Number (NaN) values while running hyperbolic models.
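To illustrate why hyperbolic computations are numerically delicate, consider the standard Poincaré-ball distance (a well-known formula; this sketch is illustrative and is not the implementation developed in the thesis). The denominator contains factors of the form 1 - ||x||^2, which underflow as embedded points approach the boundary of the ball, a common source of the floating-point blow-ups described above:

```python
import numpy as np

def poincare_distance(u, v, eps=1e-15):
    """Geodesic distance on the Poincare ball:
    d(u, v) = arccosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2))).
    The denominator vanishes near the boundary, which is one source of
    the floating-point "NaN" problem discussed above; `eps` is a crude
    guard, not a principled fix.
    """
    sq_dist = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_dist / max(denom, eps)))

# Points near the origin behave almost like Euclidean points ...
d_near = poincare_distance(np.array([0.1, 0.0]), np.array([-0.1, 0.0]))
# ... while points near the boundary are exponentially far apart,
# which is what lets hyperbolic space embed trees in low dimension.
d_far = poincare_distance(np.array([0.99, 0.0]), np.array([-0.99, 0.0]))
```

The exponential growth of distances toward the boundary is exactly what makes hyperbolic space "roomy" enough for trees, and also what makes its metric ill-conditioned in floating point.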
To address this, we introduce several robust and accurate representations based on integer tilings and multi-component floating-point methods which, for the first time, come with provably bounded numerical error. Additionally, we present MCTensor, a PyTorch library that enables general-purpose, high-precision training of machine learning models. We demonstrate the effectiveness of our approach by using multi-component floating-point arithmetic to train large language models at low precision, mitigating the loss of numerical accuracy and producing models with better performance. In conclusion, our work aims to empower individuals and organizations to leverage the potential of hyperbolic geometry in machine learning, drawing a broad audience towards this promising and evolving research direction.
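The multi-component idea can be sketched with the classic error-free 2Sum transformation (due to Knuth): a value is stored as an unevaluated sum of ordinary floats, so rounding error that a single float would discard is captured in a second component. This is a generic illustration of the technique, not MCTensor's actual API:

```python
def two_sum(a, b):
    """Knuth's error-free transformation: returns (s, e) such that
    s = fl(a + b) (the rounded floating-point sum) and
    a + b = s + e exactly, with no precondition on |a| vs |b|."""
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    e = (a - a_virtual) + (b - b_virtual)
    return s, e

# Ordinary double arithmetic silently drops a tiny addend ...
assert 1.0 + 1e-20 == 1.0
# ... but a two-component representation retains it in the error term,
# so the pair (hi, lo) represents 1.0 + 1e-20 exactly.
hi, lo = two_sum(1.0, 1e-20)
```

Chaining such transformations over a list of components is the basis of multi-component ("floating-point expansion") arithmetic, which trades a constant-factor slowdown for substantially higher effective precision on unmodified hardware.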
Committee Member
Stephens-Davidowitz, Noah