Making Data Work: The Human and Organizational Lifeworlds of Data Science Practices
It takes a lot of human work to do data science, and this thesis explains what that work is, how it is done, why it is needed, and what its implications are. Data science is the computational practice of analyzing data using methods from fields such as machine learning and artificial intelligence. Data science in practice is a difficult undertaking, requiring a deep understanding of algorithmic techniques, computational tools, and statistical methods. Such technical knowledge is often seen as the core, if not the sole, ingredient for doing data science. Unsurprisingly, technical work persists at the center of current data science discourse, practice, and imagination. But doing data science is not the simple task of creating models by applying algorithms to data. Data science is a craft: it has procedures, methods, and tools, but also requires creativity, discretion, and improvisation. Data scientists are often aware of this fact, but not all forms of their everyday work are equally visible in scholarly and public discussions and representations of data science. Official narratives tell stories about data attributes, algorithmic workings, model findings, and quantitative metrics. The disproportionate emphasis on technical work sidelines other work, particularly ongoing and messy forms of human and organizational work involving collaboration, negotiation, and experimentation. In this thesis, I focus on the less understood and often unaccounted-for forms of human and organizational work involved in the everyday practice of data science through ethnographic and qualitative research. Moving away from an individualistic and homogeneous understanding of data science work, I highlight the collaborative and heterogeneous nature of real-world data science work, showing how human and organizational work shape not only the design and working of data science systems but also their wider social implications.
This thesis reveals the everyday origins of the social, ethical, and normative issues of data science, and the many ways in which practitioners struggle to define and deal with them in practice, reminding us that not all data science problems are technical in nature: some are deeply human, while others are innately organizational.
critical data studies; data science; fairness, accountability, and transparency; invisible work; organizational work; responsible AI
Sengers, Phoebe J.
Barocas, Solon Isaac; Jackson, Steven J.; Mimno, David
Ph.D., Information Science
Doctor of Philosophy
dissertation or thesis