Objective Structured Clinical Examinations (OSCEs) using standardized patients are a common method for assessing clinical reasoning skills. A student's encounter with a standardized patient can produce rich data, but human scoring is resource-intensive. Large language models (LLMs) can aid scoring by identifying essential elements in OSCE encounter transcripts. Transcripts annotated by subject-matter experts (SMEs) are a common source of training data for an LLM, but manual annotation is expensive. Prompting an LLM to pre-annotate transcripts before manual annotation could reduce resource demands and improve the reliability of the resulting annotations. This case study explores methods and procedures for evaluating pre-annotations produced by a prompt-based generative AI, using transcripts from an assessment of medical student clinical reasoning skills.
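As a rough illustration only (not the study's actual prompt, model, or scoring rubric), a pre-annotation step of this kind might resemble the sketch below. It assumes the OpenAI Python SDK; the model name, the list of essential elements, and the output format are hypothetical placeholders.

# Sketch of LLM pre-annotation for OSCE transcripts (illustrative only).
# Assumes the OpenAI Python SDK; the model name, checklist items, and
# output format are hypothetical, not taken from the study itself.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical "essential elements" a rater might look for in an encounter.
ESSENTIAL_ELEMENTS = [
    "Asks about onset and duration of symptoms",
    "Elicits relevant medication history",
    "Explains the working differential diagnosis to the patient",
]

def pre_annotate(transcript: str) -> str:
    """Ask the model to flag which essential elements appear in a transcript."""
    prompt = (
        "You are assisting human raters who score OSCE encounters.\n"
        "For each essential element below, quote the transcript line(s) that "
        "demonstrate it, or write 'NOT FOUND'.\n\n"
        "Essential elements:\n"
        + "\n".join(f"- {e}" for e in ESSENTIAL_ELEMENTS)
        + "\n\nTranscript:\n" + transcript
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; the study does not specify a model here
        messages=[{"role": "user", "content": prompt}],
        temperature=0,   # deterministic output aids comparison with SME annotation
    )
    return response.choices[0].message.content

# Example usage: the pre-annotations would then be reviewed and corrected by SMEs
# rather than used directly as training labels.
# print(pre_annotate(open("encounter_001.txt").read()))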