The objective structured clinical examination (OSCE) is a pivotal assessment tool in medical education, evaluating a wide range of clinical skills through standardized scenarios. Ensuring the fairness and validity of OSCEs is critical, particularly when the assessments rely on AI models to identify actions such as hand-washing and glove-wearing and are conducted with small samples. This study proposes an innovative approach to examining and enhancing measurement invariance in OSCEs that use AI models, focusing on hand-washing and glove-wearing action detection.
This project integrates AI-based action detection to evaluate hand-washing and glove-wearing procedures in medical students. The AI model utilizes deep learning techniques to accurately detect and assess these critical actions, ensuring compliance with clinical standards. Given the small sample size and the multidimensional nature of OSCE items, traditional psychometric methods may struggle to ensure measurement invariance across diverse demographic groups. Therefore, we employ a combination of advanced statistical and AI techniques tailored for small datasets to ensure the reliability and fairness of our assessments.
Methodology
Measurement Invariance and Fairness Analysis in AI Models
1. Differential Item Functioning (DIF) Analysis: Use logistic regression to test whether the probability of correctly identifying hand-washing and glove-wearing actions differs significantly between demographic groups (e.g., gender, ethnicity), controlling for overall skill level. This helps identify potential biases in how the AI model assesses these items (see the first sketch after this list).
2. Fairness Metrics: (1) Demographic Parity: Evaluate whether different demographic groups have similar probabilities of being correctly identified as performing the hand-washing and glove-wearing actions, assessed by comparing the proportion of correct identifications across groups. (2) Equalized Odds: Assess whether the true positive and false positive rates are similar across demographic groups for both actions (second sketch below).
3. Regularization and Resampling Techniques: (1) Regularization: Implement L1 (Lasso) and L2 (Ridge) regularization during training of the AI model to prevent overfitting and ensure more stable, generalizable predictions, especially with a small sample size. (2) Bootstrap Aggregating (Bagging): Use bootstrapping to create multiple resampled datasets and train several models; this ensemble approach improves the robustness and fairness of the final AI model (third sketch below).
4. Adversarial Training: Train the AI model jointly with an adversarial network to minimize the dependence of model predictions on demographic attributes. The adversary network attempts to predict each student's demographic group, while the main model is trained to defeat that prediction, encouraging invariant feature learning (fourth sketch below).
5. Invariant Risk Minimization (IRM): Apply IRM to train the AI model to find representations that are invariant across demographic groups, so that performance is consistent and fair regardless of group membership (fifth sketch below).
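The five sketches below are illustrative only: the abstract does not specify the study's code, so all variable names, architectures, and data are hypothetical stand-ins. First, a minimal sketch of the logistic-regression DIF test from step 1: nested models (skill only, plus a group term for uniform DIF, plus a skill-by-group interaction for non-uniform DIF) are compared with likelihood-ratio tests on simulated data.

```python
# Minimal DIF sketch for one binary OSCE item (e.g., "hand-washing performed
# correctly" as scored by the AI model). All names and data are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(0)
n = 120  # small sample, as in the study setting
df = pd.DataFrame({
    "total_score": rng.normal(0, 1, n),   # overall skill estimate
    "group": rng.integers(0, 2, n),       # demographic group (0/1)
})
logit = 0.8 * df["total_score"] - 0.1     # toy data simulated with no true DIF
df["item_correct"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Nested logistic models: skill only; + group (uniform DIF); + interaction
m0 = smf.logit("item_correct ~ total_score", df).fit(disp=0)
m1 = smf.logit("item_correct ~ total_score + group", df).fit(disp=0)
m2 = smf.logit("item_correct ~ total_score * group", df).fit(disp=0)

# Likelihood-ratio tests: m0 vs m1 flags uniform DIF, m1 vs m2 non-uniform DIF
lr_uniform = 2 * (m1.llf - m0.llf)
lr_nonuniform = 2 * (m2.llf - m1.llf)
print("uniform DIF p =", stats.chi2.sf(lr_uniform, df=1))
print("non-uniform DIF p =", stats.chi2.sf(lr_nonuniform, df=1))
```

Because the toy data contain no group effect, both p-values should usually be non-significant; on real OSCE data a significant group or interaction term would flag the item for review.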
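Second, a sketch of the two fairness metrics from step 2, computed directly from the AI model's decisions. The gap functions (maximum between-group difference in rates) are one common operationalization, assumed here.

```python
# Fairness-metric sketch. y_true: expert label (action performed correctly),
# y_pred: AI decision, group: demographic attribute. All arrays are toy data.
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Largest between-group difference in positive-identification rates."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def equalized_odds_gaps(y_true, y_pred, group):
    """Largest between-group gaps in TPR and FPR."""
    tprs, fprs = [], []
    for g in np.unique(group):
        yt, yp = y_true[group == g], y_pred[group == g]
        tprs.append(yp[yt == 1].mean())  # true positive rate
        fprs.append(yp[yt == 0].mean())  # false positive rate
    return max(tprs) - min(tprs), max(fprs) - min(fprs)

rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, 200)
y_pred = (y_true ^ (rng.random(200) < 0.1)).astype(int)  # noisy AI decisions
group = rng.integers(0, 2, 200)
print("demographic parity gap:", demographic_parity_gap(y_pred, group))
print("equalized odds gaps (TPR, FPR):", equalized_odds_gaps(y_true, y_pred, group))
```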
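Third, a sketch of step 3's L1/L2 regularization and bagging on a stand-in classifier head that maps extracted video features to action labels. scikit-learn is used for brevity (the estimator keyword assumes scikit-learn ≥ 1.2); in the deep model itself the same idea appears as penalty terms on the network weights.

```python
# Regularization + bagging sketch on illustrative feature data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 20))   # stand-in for extracted video features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=100) > 0).astype(int)

# L1 (Lasso) and L2 (Ridge) penalties; C is the inverse regularization strength
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
ridge = LogisticRegression(penalty="l2", C=0.5)

# Bagging: train many classifiers on bootstrap resamples and average their votes
bagged = BaggingClassifier(estimator=ridge, n_estimators=50, random_state=0)

for name, model in [("L1", lasso), ("L2", ridge), ("bagged L2", bagged)]:
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```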
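Fourth, a sketch of the adversarial setup in step 4 using a gradient-reversal layer, one common way to train a feature extractor against an adversary. The abstract does not name a specific architecture, so the layer sizes and data are toy assumptions.

```python
# Adversarial debiasing sketch: the adversary tries to predict the demographic
# group from the shared features, while reversed gradients push the encoder
# to remove that information.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)          # identity in the forward pass
    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None   # reversed gradient to the encoder

encoder = nn.Sequential(nn.Linear(20, 32), nn.ReLU())  # feature extractor
task_head = nn.Linear(32, 2)   # hand-washing / glove-wearing correctness
adv_head = nn.Linear(32, 2)    # adversary: predicts demographic group

opt = torch.optim.Adam([*encoder.parameters(), *task_head.parameters(),
                        *adv_head.parameters()], lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(100, 20)          # toy stand-in for video features
y = torch.randint(0, 2, (100,))   # action correctness labels
g = torch.randint(0, 2, (100,))   # demographic group labels

for epoch in range(50):
    z = encoder(X)
    task_loss = loss_fn(task_head(z), y)
    # adversary improves normally; the encoder receives reversed gradients
    # and so unlearns group information
    adv_loss = loss_fn(adv_head(GradReverse.apply(z, 1.0)), g)
    loss = task_loss + adv_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```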
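Fifth, a sketch of step 5 using the IRMv1 penalty of Arjovsky et al. (2019), treating each demographic group as a training environment; the penalty weight of 10.0 is an arbitrary illustrative choice.

```python
# IRMv1 sketch: penalize the squared gradient of each environment's risk
# with respect to a fixed "dummy" scaling of the logits. Toy data throughout.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def irm_penalty(logits, y):
    # gradient of the risk w.r.t. a dummy multiplier of the logits
    scale = torch.ones(1, requires_grad=True)
    loss = bce(logits * scale, y)
    grad = torch.autograd.grad(loss, scale, create_graph=True)[0]
    return (grad ** 2).sum()

# One toy environment per demographic group
envs = [(torch.randn(60, 20), torch.randint(0, 2, (60, 1)).float())
        for _ in range(2)]

for step in range(100):
    risk, penalty = 0.0, 0.0
    for X, y in envs:
        logits = model(X)
        risk = risk + bce(logits, y)
        penalty = penalty + irm_penalty(logits, y)
    loss = risk + 10.0 * penalty   # penalty weight is a tunable assumption
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Driving the penalty toward zero encourages a representation whose optimal classifier is the same in every group, which is the invariance property step 5 targets.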
This study demonstrates a comprehensive framework for ensuring measurement invariance in OSCEs using AI models tailored for small sample sizes. By integrating advanced psychometric techniques and fairness-aware AI methodologies, we aim to enhance the fairness and validity of medical education assessments, ensuring that all students are evaluated equitably.