Embedding Methods For Generative Modeling And Visual Data Analysis
In this work, we develop probabilistic embedding models for generative modeling and visual data analysis tasks. Embedding models are a class of models that assume the existence of a latent vector space representation for each of the modeled objects (where an object could be a word, a user, or a song or movie, depending on the task). In our models, we assume that this vector space defines a probability distribution over user choices or recommendations given a context of recent behavior. Furthermore, the scoring function which links the space to the goodness of fit between a choice and a context is chosen to focus on Euclidean distances in the embedding space. This allows us to build modular, interpretable models for a number of tasks. Although these models are generally applicable to many kinds of data, we mainly explore our applications of the models to tasks in the domain of Music Information Retrieval (MIR). First, we consider a playlist generation task which uses historical playlist logs to produce a model of good playlists, as well as a semantic genre space of the modeled songs. We then demonstrate the modularity of these models by adding side information and incorporating social tags to generalize to songs not seen in training, or out-of-vocabulary songs. We also demonstrate the power the interpretability of the semantic spaces of objects that result from our models. In the first case, we add temporal dynamics to our model in order to analyze trends and events in long-term music listening behavior logs. Finally, we apply our model to a global data set of music plays in order to detect geographical, cultural, and linguistic patterns in music listening behavior in cities around the world.
Embedding Methods; Music Information Retrieval; Machine Learning
Harbert,Wayne Eugene; Kleinberg,Robert David; Turnbull,Douglas R
Ph. D., Computer Science
Doctor of Philosophy
dissertation or thesis