eCommons

Towards Expressive and Robust Learning with Hyperbolic Geometry

Abstract

Machine learning models traditionally operate within the confines of Euclidean space, assuming the data itself is Euclidean. However, there is growing interest in learning within non-Euclidean hyperbolic space, particularly when data exhibits explicit or implicit hierarchies, as in natural language (taxonomies and lexical entailment) or in tree-like and graph-structured data (biological and social networks). Embracing the geometry of the data not only leads to more expressive models but also offers deeper insights into the underlying mechanisms governing complex datasets.

An important foundation of machine learning lies in representing data as continuous values, a process known as embedding. Recent studies have demonstrated, both theoretically and empirically, that hyperbolic space can embed hierarchical data with lower dimensionality than Euclidean space. This insight has spurred the development of various hyperbolic networks, despite the challenge that hyperbolic space is not a vector space. To address this, we propose an end-to-end approach that adopts hyperbolic geometry from a manifold perspective. The approach includes an embedding framework that directly encodes data hierarchies, a method for hyperbolic-isometries-aware learning, and a demonstration of how our framework can enhance the performance of attention models, such as transformers, by capturing implicit hierarchies.

While hyperbolic geometry offers theoretical advantages, its practical implementation faces challenges from numerical errors in floating-point computation, further exacerbated by the ill-conditioned hyperbolic metrics. This issue, often referred to as the "NaN" problem, arises when practitioners encounter Not-a-Number values while running hyperbolic models. To address it, we introduce several robust and accurate representations based on integer-based tilings and multi-component floating-point methods, which offer provably bounded numerical errors for the first time. Additionally, we present MCTensor, a PyTorch library that enables general-purpose, high-precision training of machine learning models. We demonstrate the effectiveness of our approach by applying multi-component floating-point to train large language models at low precision, mitigating the loss of numerical accuracy and producing better-performing models. In conclusion, our work aims to empower individuals and organizations to leverage the potential of hyperbolic geometry in machine learning, drawing a broad audience towards this promising and evolving research direction.
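To make the "NaN" problem concrete, here is a minimal sketch of the distance on the Poincaré ball, a standard model of hyperbolic space, written in PyTorch. It is illustrative only and is not taken from the dissertation or from MCTensor; the function name and the test values are assumptions. The term 1 - ||x||^2 in the metric vanishes as an embedding approaches the boundary of the ball, so a single rounding error that leaves a point just outside the ball turns the distance into NaN:

```python
# Illustrative sketch of the Poincare-ball distance and its float32 failure mode;
# not the dissertation's code and not MCTensor's API.
import torch

def poincare_distance(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """d(x, y) = acosh(1 + 2*||x - y||^2 / ((1 - ||x||^2) * (1 - ||y||^2)))."""
    sq_dist = ((x - y) ** 2).sum(dim=-1)
    denom = (1.0 - (x * x).sum(dim=-1)) * (1.0 - (y * y).sum(dim=-1))
    return torch.acosh(1.0 + 2.0 * sq_dist / denom)

y = torch.tensor([0.5, 0.0], dtype=torch.float32)

# One float32 ulp inside the unit ball: the distance is still finite, but
# 1 - ||x||^2 is already a tiny difference of nearly equal numbers.
x = torch.tensor([1.0 - 2.0 ** -24, 0.0], dtype=torch.float32)
print(poincare_distance(x, y))       # large, but finite

# Rounding during optimization can leave the point just outside the ball;
# then 1 - ||x||^2 < 0 and acosh receives an argument < 1, yielding NaN.
x_out = torch.tensor([1.0000002, 0.0], dtype=torch.float32)
print(poincare_distance(x_out, y))   # tensor(nan)
```

The multi-component floating-point representations mentioned in the abstract build on error-free transformations: a value is stored as an unevaluated sum of ordinary floats, so rounding error that a single float would discard is retained in an extra component. Below is a hedged sketch of the classic TwoSum building block, a standard numerical technique shown here for intuition rather than as MCTensor's interface:

```python
# TwoSum error-free transformation (standard technique, illustrative only).
import numpy as np

def two_sum(a: np.float32, b: np.float32):
    """Return (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    b_virtual = s - a
    e = (a - (s - b_virtual)) + (b - b_virtual)
    return s, e

a, b = np.float32(1.0), np.float32(-1e-10)
s, e = two_sum(a, b)
print(s, e)  # 1.0 -1e-10: the pair (s, e) represents a + b exactly,
             # even though a single float32 addition rounds b away.
```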

Description

293 pages

Date Issued

2024-08

Keywords

Hierarchical Data Modeling; Hyperbolic Geometry; Machine Learning; Numerical Precision; Representation Learning

Committee Chair

De Sa, Christopher

Committee Member

Weinberger, Kilian
Stephens-Davidowitz, Noah

Degree Discipline

Computer Science

Degree Name

Ph.D., Computer Science

Degree Level

Doctor of Philosophy

Rights

Attribution 4.0 International

Types

dissertation or thesis
