Analyzing Life-logging Image Sequences
Moghimi Najafabadi, Mohammad
With the abundance of ubiquitous cameras, it has become easier for people to take pictures of everything, everywhere. People photograph their possessions, interesting subjects and the places they visit. A class of passive cameras lets people “be present in the moment” while recording the situation; this practice is called visual life-logging. Cheap cameras and storage devices, together with recent advances in computer vision, have made this a unique experience. Beyond its distinctive life-recording perspective, life-logging has many applications, one of which is health monitoring. A camera can augment other health monitoring systems, such as those that track motion, blood pressure and blood sugar levels. We design algorithms to analyze life-logging image sequences to facilitate public health research. Our approach to the analysis is threefold: unsupervised, supervised and human-in-the-loop. For the unsupervised part, we designed an algorithm to extract regions of interest from image sequences based on their occurrences in different scenes. We used histogram of oriented gradients (HOG) features and applied an iterative discriminative classification approach to find patches that appear in one scene but not in others. Using our method, we can discover objects such as a monitor in an office setting or bike handles in a biking scene in an unsupervised manner. The next step is to analyze the data in a supervised fashion. After carefully designing a set of labels appropriate for public health research, which includes posture, activities, scenes and social settings, our team manually annotated the data with these labels, and we implemented visual classification algorithms to classify images using these tags. Our methods include state-of-the-art pre-deep-learning models as well as deep convolutional neural networks (CNNs). We extend the CNN with spatial, temporal and model-level bagging and model-level boosting.
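As a rough illustration of what model-level bagging can look like, the sketch below averages the class-probability vectors produced by several independently trained classifiers and picks the highest-scoring label. It is not taken from the thesis: the function names `bag_predictions` and `predict_label`, the label names and the probability values are all hypothetical, and plain averaging is only one simple way to combine models.

```python
from typing import List, Sequence


def bag_predictions(per_model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average class-probability vectors from several models (hypothetical helper)."""
    if not per_model_probs:
        raise ValueError("need the output of at least one model")
    n_classes = len(per_model_probs[0])
    if any(len(probs) != n_classes for probs in per_model_probs):
        raise ValueError("all models must score the same number of classes")
    n_models = len(per_model_probs)
    # Unweighted mean per class: every model gets an equal vote.
    return [
        sum(probs[c] for probs in per_model_probs) / n_models
        for c in range(n_classes)
    ]


def predict_label(per_model_probs: Sequence[Sequence[float]],
                  labels: Sequence[str]) -> str:
    """Return the label whose averaged probability is highest."""
    averaged = bag_predictions(per_model_probs)
    best_index = max(range(len(averaged)), key=averaged.__getitem__)
    return labels[best_index]


if __name__ == "__main__":
    # Made-up outputs of three models for one image over two posture labels.
    outputs = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
    print(predict_label(outputs, ["sitting", "standing"]))
```

Averaging probabilities rather than hard votes lets a confident model outweigh two uncertain ones; whether that is desirable depends on how well calibrated the individual models are.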
The unique characteristics of life-logging image sequences, such as temporal coherence and the correlation among images from the same person, call for a custom model that leverages them. Annotating a dataset consisting of millions of images is a cumbersome task that requires extensive time, money and resources. In this thesis, we present the foundational tools to annotate image sequences efficiently by leveraging previously labeled data to minimize annotation time and increase accuracy. Our experiments show a significant decrease in annotation time.
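One simple way to exploit temporal coherence is to smooth per-image predictions over neighboring frames, since consecutive life-logging images rarely change label abruptly. The sketch below uses a sliding-window majority vote, which is a stand-in chosen for illustration and not the custom model described in the thesis; the function name `smooth_labels`, the `radius` parameter and the example labels are assumptions.

```python
from collections import Counter
from typing import List, Sequence


def smooth_labels(labels: Sequence[str], radius: int = 1) -> List[str]:
    """Replace each label by the majority label in a window of +/- radius frames."""
    if radius < 0:
        raise ValueError("radius must be non-negative")
    smoothed = []
    for i, current in enumerate(labels):
        window = labels[max(0, i - radius): i + radius + 1]
        counts = Counter(window)
        top_count = max(counts.values())
        if counts[current] == top_count:
            # On a tie, keep the frame's own label instead of flipping it.
            smoothed.append(current)
        else:
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed


if __name__ == "__main__":
    # Made-up per-frame predictions with one isolated outlier.
    frames = ["sitting", "sitting", "standing", "sitting", "sitting"]
    print(smooth_labels(frames, radius=1))
```

A larger `radius` removes longer runs of spurious labels but also blurs genuine short events, so the window size would need to be chosen against the capture interval of the camera.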
Boosting; Classification; Deep Learning; Life Logging; computer vision; Computer science; machine learning
Belongie, Serge J.
Chen, Tsuhan; Snavely, Keith Noah
PHD of Computer Science
Doctor of Philosophy
Attribution-ShareAlike 4.0 International
dissertation or thesis