Contributions to Fairness and Transparency
Baer, Benjamin R.
This dissertation presents three varied topics. The first topic concerns text mining. The GloVe and Skip-gram word embedding methods learn word vectors by decomposing a denoised matrix of word co-occurrences into a low-rank matrix. In this work, we propose an iterative algorithm for computing word vectors based on modeling word co-occurrence matrices with Symmetric Generalized Low Rank Models. Our algorithm generalizes both Skip-gram and GloVe and gives rise to other embedding methods, depending on the specified co-occurrence matrix, the distribution of co-occurrences, and the number of iterations in the iterative algorithm. For example, using a Tweedie distribution with one iteration results in GloVe, and using a Multinomial distribution with full-convergence mode results in Skip-gram.
The second topic concerns algorithmic fairness. A substantial portion of the literature on fairness in algorithms proposes, analyzes, and operationalizes simple formulaic criteria for assessing fairness. Two of these criteria, Equalized Odds and Calibration by Group, have gained significant attention for their simplicity and intuitive appeal, but also for their incompatibility. This chapter provides a perspective on the meaning and consequences of these and other fairness criteria using graphical models, which reveal Equalized Odds and related criteria to be ultimately misleading. An assessment of various graphical models suggests that fairness criteria should ultimately be case-specific and sensitive to the nature of the information the algorithm processes.
The third topic concerns the fragility index. In recent years there has been a renewed conversation concerning interpretable and proper techniques for statistical hypothesis testing. In the medical literature on clinical trials, the count of patients who must have a different outcome to reverse statistical significance in a 2 by 2 contingency table (the fragility index) has been proposed as a more interpretable supplement to classical p-value-based testing. We studied the sampling distribution of the fragility index and created a sample size calculation strategy which simultaneously designs for p-values and fragility indices. Then, we extended the fragility index to only incorporate sufficiently likely outcome modifications. Next, we redefined what it means for an outcome modification to be sufficiently likely and studied a variant of the fragility index tailored for patients who are lost to follow-up. Finally, we generalized the fragility index to any data type and any statistical test.
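The basic fragility index described above can be illustrated with a minimal sketch. It is not taken from the dissertation. It assumes a two-sided Fisher exact test, a significance level of 0.05, and the common convention of flipping non-events to events one patient at a time in the arm with fewer events. The names `fisher_exact_p` and `fragility_index` are hypothetical.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 by 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-7))


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Number of non-event -> event flips needed to lose significance.

    Assumption: flips are made only in the arm that starts with fewer
    events. Returns 0 if the table is not significant to begin with, and
    None if that arm runs out of patients while still significant.
    """
    flip_first = events_1 <= events_2
    e1, e2, flips = events_1, events_2, 0
    while fisher_exact_p(e1, n_1 - e1, e2, n_2 - e2) < alpha:
        if flip_first:
            if e1 == n_1:
                return None
            e1 += 1
        else:
            if e2 == n_2:
                return None
            e2 += 1
        flips += 1
    return flips


if __name__ == "__main__":
    # Hypothetical trial: 0/20 events in one arm, 10/20 in the other.
    print(fragility_index(0, 20, 10, 20))
```

The extensions mentioned in the abstract (restricting to sufficiently likely modifications, handling patients lost to follow-up, and other data types and tests) are not covered by this sketch.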
algorithmic fairness; fragility index; hypothesis testing; interpretability; natural language processing; word embeddings
Wells, Martin Timothy
Basu, Sumanta; Booth, James
Ph. D., Statistics
Doctor of Philosophy
Attribution 4.0 International
dissertation or thesis