dc.contributor.author: Finley, Thomas
dc.contributor.author: Joachims, Thorsten
dc.date.accessioned: 2008-11-18T05:14:17Z
dc.date.available: 2008-11-18T05:14:17Z
dc.date.issued: 2008-11-18T05:14:17Z
dc.identifier.uri: http://hdl.handle.net/1813/11621
dc.description.abstract: The k-means clustering algorithm is one of the most widely used, effective, and best understood clustering methods. However, successful use of k-means requires a carefully chosen distance measure that reflects the properties of the clustering task. Since designing this distance measure by hand is often difficult, we provide methods for training k-means using supervised data. Given training data in the form of sets of items with their desired partitioning, we provide a structural SVM method that learns a distance measure so that k-means produces the desired clusterings. We propose two variants of the methods -- one based on a spectral relaxation and one based on the traditional k-means algorithm -- that are both computationally efficient. For each variant, we provide a theoretical characterization of its accuracy in solving the training problem. We also provide an empirical clustering quality and runtime analysis of these learning methods on varied high-dimensional datasets. [en_US]
dc.description.sponsorship: This work was supported under NSF Award IIS-0713483 "Learning Structure to Structure Mapping," and through a gift from Yahoo! Inc. [en_US]
dc.language.iso: en_US [en_US]
dc.subject: machine learning [en_US]
dc.subject: k-means [en_US]
dc.subject: clustering [en_US]
dc.subject: computer science [en_US]
dc.title: Supervised k-Means Clustering [en_US]
dc.type: paper or project [en_US]

