Graph-Based Learning From Large Image Collections
With the explosion of online images, computer vision researchers have become increasingly interested in modeling large collections of images using machine learning techniques. Images, or observations by cameras in the real world, are natural units of information that are correlated with each other. For example, one way to establish such a correlation is to check whether two images observe the same part of the world (i.e., whether they are geometrically consistent). Hence, it is attractive to model images, as well as their relationships, with graphs.
To achieve this goal, we first need to answer a few questions. First, how do we define such graphs, and how do we acquire them? Second, how should we use such graphs to formulate learning so that the results are useful for computer vision tasks? Third, for large image sets, can we find ways to model the information in the set with a much smaller graph? This thesis attempts to answer these questions in three corresponding chapters.
In Chapter 2, we define the image graph as a set of images (nodes) in which two images are connected by an edge if and only if they are geometrically consistent, i.e., they have significant overlap. Chapter 2 focuses on how to acquire such graphs efficiently given a raw set of images. In short, our approach processes a set of images iteratively, alternating between pairwise feature matching and learning an improved similarity measure.
In Chapter 3, we formulate a learning problem that makes use of such an image graph on a set of Internet tourist images for the task of location recognition. In particular, starting from a graph based on visual connectivity, we propose a method for selecting a set of overlapping subgraphs and learning a local distance function for each subgraph using discriminative techniques. We demonstrate that our methods improve performance over standard bag-of-words methods on several existing location recognition datasets.
Finally, in Chapter 4, we propose a method for reducing the size of a Structure from Motion model using an image-point visibility graph, and we show that this method produces small models that yield better recognition performance than previous model reduction techniques.
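As a rough illustration of the image graph defined for Chapter 2, the following minimal sketch connects two images whenever they are judged geometrically consistent. It is not the thesis's acquisition method: the abstract does not say how consistency is tested, so the sketch assumes it is approximated by a count of geometrically verified feature matches per image pair. The function name `build_image_graph`, the `min_inliers` threshold, and the example counts are all hypothetical.

```python
def build_image_graph(num_images, inlier_counts, min_inliers=16):
    """Build an undirected image graph as a dict of adjacency sets.

    inlier_counts maps an image pair (i, j) to the number of feature
    matches that survived geometric verification (assumed to be
    computed elsewhere). min_inliers is an illustrative threshold,
    not a value taken from the thesis.
    """
    graph = {i: set() for i in range(num_images)}
    for (i, j), count in inlier_counts.items():
        # An edge exists only if the pair is deemed geometrically consistent.
        if i != j and count >= min_inliers:
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Hypothetical verified-match counts for four images.
inlier_counts = {(0, 1): 120, (1, 2): 45, (0, 2): 3, (2, 3): 8}
graph = build_image_graph(4, inlier_counts)
```

With these made-up counts, images 0, 1, and 2 form a chain and image 3 stays isolated. In practice the expensive part is deciding which pairs to match at all, which is the problem Chapter 2 addresses.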
Ph. D., Computer Science
Doctor of Philosophy
dissertation or thesis