Improving Inclusion in AI-Based Candidate Screening: Disparate Impact and Counterfactual Testing
Abstract
Automated recruitment platforms such as Workday and iCIMS increasingly rely on machine learning (ML) models to streamline candidate selection. However, these systems may inadvertently reinforce existing biases in hiring processes. This paper proposes a quantitative, empirical approach to evaluating and improving fairness in AI-based hiring pipelines. We focus on Disparate Impact (DI) measurement and Counterfactual Fairness testing to audit a representative applicant tracking system (ATS), Workday's AI screening engine. We compute DI across demographic groups and show that current candidate scoring mechanisms fail the four-fifths rule (DI < 0.8) in simulated recruitment scenarios. To mitigate these inequities, we introduce a bias correction model that rebalances feature weights in post-processing. Among the variables examined, we identify the "education prestige score" as a key contributor to disparate treatment. Experimental results on semi-synthetic datasets show that reweighting or masking this feature substantially improves DI while preserving predictive performance (AUC drop < 1%). This work provides a replicable methodology for fairness auditing in enterprise-grade ATS, offering a pathway to more equitable recruitment practices.
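
To make the two audit checks named in the abstract concrete, below is a minimal, self-contained Python sketch: Disparate Impact under the four-fifths rule, and a counterfactual flip test against a toy screener. The ScreeningModel class, the synthetic features (skill, prestige), and all weights are illustrative assumptions for exposition only; they are not Workday's scoring mechanism, the paper's bias correction model, or its semi-synthetic dataset.

import numpy as np

def disparate_impact(selected, protected):
    """DI = selection rate of the protected group divided by the
    reference group's rate; DI < 0.8 fails the four-fifths rule."""
    selected = np.asarray(selected, dtype=bool)
    protected = np.asarray(protected, dtype=bool)
    return selected[protected].mean() / selected[~protected].mean()

def counterfactual_flip_rate(model, X, protected_col):
    """Fraction of candidates whose screening decision changes when only
    the protected attribute is flipped (0 under counterfactual fairness)."""
    X_cf = X.copy()
    X_cf[:, protected_col] = 1 - X_cf[:, protected_col]
    return float(np.mean(model.predict(X) != model.predict(X_cf)))

class ScreeningModel:
    """Toy linear screener (an assumption, not the audited ATS):
    shortlist a candidate if the weighted score exceeds a threshold."""
    def __init__(self, weights, threshold):
        self.w = np.asarray(weights, dtype=float)
        self.t = threshold
    def predict(self, X):
        return (X @ self.w > self.t).astype(int)

rng = np.random.default_rng(0)
n = 10_000
protected = rng.integers(0, 2, n)          # hypothetical binary group label
skill = rng.normal(size=n)                 # job-relevant signal
# Biased proxy feature, loosely analogous to an "education prestige score":
prestige = 0.3 * skill + rng.normal(size=n) - 0.5 * protected
X = np.column_stack([skill, prestige, protected]).astype(float)

# A biased screener that leans on the proxy and on the group label itself.
w = np.array([1.0, 0.8, -0.3])
model = ScreeningModel(w, np.quantile(X @ w, 0.7))   # shortlist the top ~30%
selected = model.predict(X).astype(bool)
print(f"DI = {disparate_impact(selected, protected == 1):.2f} (fails if < 0.8)")
print(f"counterfactual flip rate = {counterfactual_flip_rate(model, X, 2):.3f}")

# Post-processing mitigation in the spirit of the abstract: zero out the
# proxy and group weights, then recalibrate the shortlist threshold.
w_masked = np.array([1.0, 0.0, 0.0])
masked = ScreeningModel(w_masked, np.quantile(X @ w_masked, 0.7))
selected_masked = masked.predict(X).astype(bool)
print(f"DI after masking = {disparate_impact(selected_masked, protected == 1):.2f}")
print(f"flip rate after masking = {counterfactual_flip_rate(masked, X, 2):.3f}")

On this synthetic data the biased screener fails the four-fifths rule and shows a nonzero flip rate, while the masked screener restores DI to roughly 1.0 and a flip rate of 0; the actual magnitudes in the paper's experiments will of course differ.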