Representation Learning for Web Intelligence
The excitement of web-driven machine learning research comes at two levels of granularity: on one hand, it lets us model, in the abstract, the complex interactions among web users and web components; on the other hand, it provides an automated means of comprehending concrete web content such as text, images, and audio. Through the lens of machine learning and machine vision, this thesis focuses on approaches to learning representations of web-scale data from the perspectives of web user interaction and web content comprehension. The work contributes to the field of machine learning in both algorithmic foundations and practical advances. The first part focuses on classification and prediction problems on the web in which the interactions of individuals are modeled by graphs. We use graphs and machine learning models as tools to represent how users' interactions relate to one another and to identify interesting patterns in the behavior of web users. The second part focuses on methods for understanding and comprehending visual content on the web. In particular, we turn to deep learning, a subfield of machine learning that has recently been placed at the core of many web-driven tasks such as image classification and understanding. Despite this success, the representations learned by deep neural networks remain poorly understood. This thesis sheds light on how we can better interpret the inner representations of deep neural networks, and it also leads to new techniques for learning better representations.