Beyond the Black Box: Optimization within Latent Spaces
Over the past decade, neural networks have evolved into extraordinarily powerful tools with applications across many domains. These models harness immense computational power to learn both low-level features and high-level abstract concepts from vast datasets. Neural networks embed data of different modalities (text, images, audio, etc.) into high-dimensional latent spaces that encode salient features of the data and capture complex relationships between data points.

This thesis probes model parameters and latent spaces to understand not only how information is stored and processed in networks but also how the encoded knowledge can be extracted and harnessed. We leverage these insights to develop novel methods that optimize specific parameters or representations of trained models for various downstream tasks.

We present three methods in this thesis. First, we introduce BERTScore, an algorithm that uses representations from pre-trained language models to measure the similarity between two pieces of text; BERTScore approximates a form of transport distance to match tokens across the texts. Next, we turn to an information retrieval setting in which transformers are trained end-to-end to map search queries to corresponding documents. In this setting, we introduce IncDSI, a method for adding new documents to a trained retrieval system by solving a constrained convex optimization problem to obtain the new document representations. Finally, we present Fixed Neural Network Steganography (FNNS), an image steganography technique that hides information by exploiting a neural network's sensitivity to imperceptible perturbations.
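To make the token-matching idea behind BERTScore concrete, the following is a minimal sketch of greedy matching over contextual token embeddings: each token in one text is paired with its most similar token in the other by cosine similarity, and the averaged similarities yield precision, recall, and F1. Random vectors stand in for real BERT embeddings here, and the function name is illustrative, not the thesis's actual implementation.

```python
import numpy as np

def greedy_match_score(cand_emb, ref_emb):
    """BERTScore-style greedy matching (toy sketch).

    cand_emb: (n, d) candidate token embeddings
    ref_emb:  (m, d) reference token embeddings
    Returns precision, recall, and F1 over pairwise cosine similarities.
    """
    # L2-normalize rows so that dot products are cosine similarities.
    c = cand_emb / np.linalg.norm(cand_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    sim = c @ r.T                       # (n, m) similarity matrix
    precision = sim.max(axis=1).mean()  # each candidate token -> best reference token
    recall = sim.max(axis=0).mean()     # each reference token -> best candidate token
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy example: random "contextual" embeddings for 5 vs. 7 tokens.
rng = np.random.default_rng(0)
p, r, f1 = greedy_match_score(rng.normal(size=(5, 8)), rng.normal(size=(7, 8)))
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
```

In the full method, the embeddings come from a pre-trained language model, so the cosine similarities reflect contextual meaning rather than surface overlap; the greedy max-matching is what approximates a transport distance between the two token sets.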