Supervised k-Means Clustering
dc.contributor.author | Finley, Thomas | |
dc.contributor.author | Joachims, Thorsten | |
dc.date.accessioned | 2008-11-04T05:29:19Z | |
dc.date.available | 2008-11-04T05:29:19Z | |
dc.date.issued | 2008-11-04T05:29:19Z | |
dc.description.abstract | The k-means clustering algorithm is one of the most widely used, effective, and best understood clustering methods. However, successful use of k-means requires a carefully chosen distance measure that reflects the properties of the clustering task. Since designing this distance measure by hand is often difficult, we provide methods for training k-means using supervised data. Given training data in the form of sets of items with their desired partitioning, we provide a structural SVM method that learns a distance measure so that k-means produces the desired clusterings. We propose two variants of the methods -- one based on a spectral relaxation and one based on the traditional k-means algorithm -- that are both computationally efficient. For each variant, we provide a theoretical characterization of its accuracy in solving the training problem. We also provide an empirical clustering quality and runtime analysis of these learning methods on varied high-dimensional datasets. | en_US |
dc.description.sponsorship | NSF Award IIS-0713483 "Learning Structure to Structure Mapping," and a gift from Yahoo! Inc. | en_US |
dc.identifier.uri | https://hdl.handle.net/1813/11584 | |
dc.language.iso | en_US | en_US |
dc.subject | machine learning | en_US |
dc.subject | clustering | en_US |
dc.subject | k-means | en_US |
dc.subject | computer science | en_US |
dc.title | Supervised k-Means Clustering | en_US |
dc.type | technical report | en_US |