Six years of data from WCM beneficiaries as a predictor of cost and hospitalization
The dataset comprised all Weill Cornell health insurance beneficiaries with at least one submitted claim from 2016 through 2021, including employees and their dependents (spouses/partners and children). Individuals were linked across years using unique de-identified IDs. Beneficiary-level data included year of birth, gender, and beneficiary status (employee, spouse/partner, or child). The claims data encompassed all outpatient and inpatient medical, maternity, behavioral health, and other covered services; pharmacy claims were not included. Claims with service dates more than 12 months before the claim year, which made up fewer than 0.01% of claims in any given year, were excluded from the analysis. Each claim contained the Major Diagnostic Category, the servicing provider’s specialty code, and the line-level procedure code (primarily HCPCS Level I CPT-4). For hospitalizations, admission and discharge dates and the corresponding DRG discharge classifications were recorded. Claims data were used to calculate total payments for services in each year.
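The exclusion rule and the yearly payment totals described above can be illustrated with a minimal Python sketch. It is not the study's actual pipeline. The `Claim` record, its field names, and the example values are hypothetical. Two further assumptions are made: "more than 12 months prior to the claim year" is read as a service date earlier than January 1 of the preceding calendar year, and payments are summed per beneficiary per claim year.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim line; field names are illustrative, not from the source data."""
    beneficiary_id: str  # de-identified ID used to link a person across years
    claim_year: int      # year the claim is attributed to
    service_date: date   # date the service was delivered
    paid_amount: float   # payment for this claim line


def is_stale(claim: Claim) -> bool:
    """Return True if the service date is more than 12 months before the claim year.

    Assumption: the cutoff is January 1 of the year preceding the claim year.
    """
    cutoff = date(claim.claim_year - 1, 1, 1)
    return claim.service_date < cutoff


def yearly_payments(claims):
    """Sum paid amounts per (beneficiary_id, claim_year), skipping stale claims."""
    totals = defaultdict(float)
    for claim in claims:
        if is_stale(claim):
            continue  # excluded from the analysis
        totals[(claim.beneficiary_id, claim.claim_year)] += claim.paid_amount
    return dict(totals)


# Made-up example records, for illustration only.
example_claims = [
    Claim("A001", 2017, date(2017, 3, 14), 120.0),
    Claim("A001", 2017, date(2016, 11, 2), 80.0),   # prior year, still kept
    Claim("A001", 2017, date(2015, 6, 30), 500.0),  # stale, excluded
    Claim("B002", 2018, date(2018, 1, 9), 45.5),
]
totals = yearly_payments(example_claims)
```

Keying the totals on the pair of beneficiary ID and claim year keeps each person's yearly cost separate while still allowing records to be linked across years through the shared ID.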