ENHANCING EXPRESSIVITY OF DOCUMENT-CENTERED COLLABORATION WITH MULTIMODAL ANNOTATIONS
As knowledge work moves online, digital documents have become a staple of human collaboration. To communicate beyond the constraints of time and space, remote and asynchronous collaborators create digital annotations over documents, substituting face-to-face meetings with online conversations. However, existing document annotation interfaces depend primarily on text commenting, which is not as expressive or nuanced as in-person communication where interlocutors can speak and gesture over physical documents. To expand the communicative capacity of digital documents, we need to enrich annotation interfaces with face-to-face-like multimodal expressions (e.g., talking and pointing over texts). This thesis makes three major contributions toward multimodal annotation interfaces for enriching collaboration around digital documents. The first contribution is a set of design requirements for multimodal annotations drawn from our user studies and explorative literature surveys. We found that the major challenges were to support lightweight access to recorded voice, to control visual occlusions of graphically rich audio interfaces, and to reduce speech anxiety in voice comment production. Second, to address these challenges, we present RichReview, a novel multimodal annotation system. RichReview is designed to capture natural communicative expressions in face-to-face document descriptions as the combination of multimodal user inputs (e.g., speech, pen-writing, and deictic pen-hovering). To balance the consumption and production of speech comments, the system employs (1) cross-modal indexing interfaces for faster audio navigation, (2) fluid document-annotation layout for reduced visual clutter, and (3) voice synthesis-based speech editing for reduced speech anxiety. The third contribution is a series of evaluations that examines the effectiveness of our design solutions. Results of our lab studies show that RichReview can successfully address the above mentioned interface problems of multimodal annotations. A subsequent series of field deployment studies test the real-world efficacy of RichReview by deploying the system for document-centered conversation activities in classrooms, such as instructor feedback for student assignments and peer discussions about course material. The results suggest that using rich annotation helps students better understand the instructor’s comments, and makes them feel more valued as a person. From the results of the peer-discussion study, we learned that retaining the richness of original speech is the key to the success of speech commenting. What follows is the discussion on the benefits, challenges, and future of multimodal annotation interfaces, and technical innovations required to realize the vision.
Educational technology; Gesture user interfaces; Multimodal interaction; Online collaboration; RichReview; Speech user interfaces; Information science
Guimbretiere, Francois V.
Andersen, Erik; Fussell, Susan R.
PHD of Information Science
Doctor of Philosophy
dissertation or thesis