Quantifying Similarity in Conversation Dynamics using Computational Methods
Unlike traditional text data, conversations are structured by turn-taking, and the dynamics that emerge from this structure shape the nature, effectiveness, and trajectory of a conversation. Traditional textual similarity measures overlook this structure, often treating a conversation as a sequence of utterances rather than as an evolving interactive process. In this work, we present a new conversation-level similarity measure that captures the dynamics of a conversation. To validate the measure, we propose two methods that reliably generate conversation similarity labels using simulated conversations. We demonstrate the utility of our measure through multiple applications. First, we calculate between-group similarity and show how the conversational dynamics leading to toxicity have changed over time on a platform. Second, we use the measure as a distance metric for clustering in two settings, including identifying the different ways a conversation's dynamics can progress toward derailment into toxic behavior.