Improved Learning Of Structural Support Vector Machines: Training With Latent Variables And Nonlinear Kernels
Yu, Chun Nam
Structured output prediction in machine learning is the study of learning to predict complex objects consisting of many correlated parts, such as sequences, trees, or matchings. The Structural Support Vector Machine (Structural SVM) algorithm is a discriminative method for structured output learning that allows flexible feature construction with robust control for overfitting. It provides state-of-the-art prediction accuracies for many structured output prediction tasks in natural language processing, computational biology, and information retrieval. This thesis explores improving the learning of structured prediction rules with structural SVMs in two main areas: incorporating latent variables to extend their scope of application, and speeding up the training of structural SVMs with nonlinear kernels. In particular, we propose a new formulation of structural SVM, called Latent Structural SVM, that allows the use of latent variables, and an algorithm to solve the associated non-convex optimization problem. We demonstrate the generality of our new algorithm through several structured output prediction problems, showing improved prediction accuracies with new alternative problem formulations using latent variables. In addition to latent variables, the use of nonlinear kernels in structural SVMs can also improve their expressiveness and prediction accuracies. However, their high computational cost during training limits their wider application. We explore the use of approximate cutting plane models to speed up the training of structural SVMs with nonlinear kernels. We provide a theoretical analysis of their iteration complexity and their approximation quality. Experimental results show an improved accuracy-sparsity tradeoff when compared against several state-of-the-art approximate algorithms for training kernel SVMs, with our algorithm having the advantage that it is readily applicable to structured output prediction problems.
Structured Output Learning; Support Vector Machines; Kernels
Todd, Michael Jeremy; Siepel, Adam Charles
Ph. D., Computer Science
Doctor of Philosophy
dissertation or thesis