This study addresses the class imbalance problem that arises when applying machine learning to large-scale datasets by evaluating the performance of several resampling methods — the synthetic minority oversampling technique (SMOTE), adaptive synthetic sampling (ADASYN), SMOTE combined with edited nearest neighbors (SMOTE+ENN), and SMOTE combined with Tomek links (SMOTE+Tomek) — in combination with ensemble models such as Extra Trees, Random Forest, XGBoost, and CatBoost. The combination of SMOTE and CatBoost was found to perform best. In addition, explainable AI (XAI) techniques, specifically Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP), were applied to identify the key predictive variables, providing both local and global insights.
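To make the resampling step concrete, the sketch below implements the core SMOTE idea in plain NumPy: each synthetic minority sample is an interpolation between an existing minority sample and one of its k nearest minority-class neighbors. This is a simplified illustration under stated assumptions, not the implementation used in the study (which would typically rely on a library such as imbalanced-learn); the function name `smote_oversample` and its parameters are illustrative.

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, rng=None):
    """Simplified SMOTE sketch: synthesize n_new minority-class samples
    by interpolating between each chosen sample and one of its k
    nearest minority-class neighbours (illustrative, not a library API)."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # Pairwise Euclidean distances within the minority class.
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbour
    # Indices of the k nearest minority-class neighbours of each sample.
    nn = np.argsort(d, axis=1)[:, :k]
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(n)                      # pick a minority sample
        j = nn[i, rng.integers(min(k, n - 1))]   # pick one of its neighbours
        gap = rng.random()                       # interpolation factor in [0, 1)
        # New point lies on the segment between the sample and its neighbour.
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.vstack(synthetic)
```

The oversampled minority set would then be concatenated with the majority class before fitting an ensemble model such as CatBoost; because each synthetic point is a convex combination of two real minority samples, it stays within the region spanned by the minority class rather than duplicating existing rows.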