Efficient Modeling Of Higher Order And Longer Range Geometry Statistics
Local-feature-based approaches have become popular in most vision applications. A local feature captures the local appearance of objects or scenes and is more robust to environment and viewpoint changes than features extracted from the entire image. Shape and context information is further captured by the spatial relationships among the local features. Modeling more spatial information usually leads to an exponential or polynomial increase in computational cost. Consequently, spatial modeling in prior work is either limited to neighboring or weak geometric relationships among local features, or is not viewpoint invariant. In this thesis, we propose algorithms that model rich geometry information with little increase in computational cost. We focus on two main vision problems: whole-image representation and pixel-level image labeling. For each, we present an algorithm that incorporates spatial information into its most popular and basic technique: the Bag-of-Words (BoW) representation and the Conditional Random Field (CRF) model, respectively. The proposed algorithms are general enough to be applied to, or combined with, any more advanced technique that uses BoW or CRF as a component, further improving its performance with only a small increase in computational cost. We show example uses of the proposed algorithms in several applications, including object recognition, object localization, image retrieval, activity recognition in videos, and object-based image segmentation. Experimental results show that our approaches improve on the state of the art for these applications with only a small increase in computational cost.
Computer Vision; Machine Learning; Object Recognition; Event Detection; Image Retrieval; Image Segmentation
Joachims, Thorsten; Reeves, Anthony P; Liu, Xiaoming
Ph.D. of Electrical Engineering
Doctor of Philosophy
dissertation or thesis