Paper Summary

Variable Selection and Binary Prediction With Incomplete Data: Balance Between Fairness and Precision

Thu, April 11, 9:00 to 10:30am, Pennsylvania Convention Center, Floor: Level 200, Exhibit Hall B

Abstract

Variable selection is crucial to binary predictive models, and missing data complicates it. This study examines two approaches, Bootstrap Imputation-Stability Selection (BI-SS) and Stacked Elastic Net (SENET), that handle variable selection with incomplete data by combining regularized regression with multiple imputation. Both methods perform well in single-group samples, but their efficacy remains uncertain in samples with heterogeneous subgroups that differ in sample size, base rate, and strength of predictor-outcome association, which raises concerns about algorithmic bias. This study uses empirical and simulation studies to evaluate both methods under these conditions, with an emphasis on algorithmic fairness.

Authors