Paper Summary
A Comprehensive Review and Meta-Analysis of Letter of Recommendation Bias and Criterion Related Validity

Sun, April 14, 9:35 to 11:05am, Philadelphia Marriott Downtown, Floor: Level 4, Franklin 5

Abstract

If you want to predict a person’s behavior, one of the best methods is to ask a close co-worker, friend, or relative to rate that person (Connelly & Ones, 2010). Observer ratings obtained in a research context are powerful and are even more predictive than self-ratings obtained in a low-stakes setting. That is, people who know us well actually describe us more accurately than we describe ourselves. Letters of Recommendation (LOR) are a long-standing method that attempts to capture this powerful information about a person’s typical patterns of behavior.
However, in practice, letters fall short of this ideal in multiple ways. Faculty letter writers don’t always know us as well as our mothers do. This lack of knowledge may lead letter writers to rely on stereotypes to fill in the gaps, with the result that the final letter is fundamentally biased. Letters are also written in a high-stakes setting, where letter writers may be motivated to distort the content to make the applicant appear more desirable. The lack of standardization in most letters means that narrative letters may or may not contain the information that is most useful to decision makers. Finally, decision makers interpret the information and subjectively integrate it with other information, an approach that is well known to degrade the available information (Kuncel et al., 2013).
A modest literature has evaluated both the predictive accuracy of letters of recommendation and potential sources of bias in letters. In this talk, we present our review and analysis of this literature with a focus on two questions. The first asks simply whether letters of recommendation actually predict much of anything about academic success. This is an update of the criterion-related validity meta-analysis of letters of recommendation conducted by Kuncel et al. (2014). The second investigates whether there are systematic differences in LOR language when letters are written for majority versus underrepresented minority (URM) applicants.
Overall results for criterion-related validity suggest that LOR are a weak to moderate predictor of both academic supervisor ratings and earned grades. These results are consistent with the Kuncel et al. (2014) meta-analysis and further support the potential value of LOR as a tool in admissions. One surprising result was that structured and unstructured letters showed little difference in criterion-related validity, a finding that warrants future research because it runs counter to the broader literature on psychological assessment.
The results from the review of bias were surprising. Looking across multiple studies, the average LOR length and the use of key language categories (including intelligence, grindstone, standout, agentic, and excellence/outstanding terms) showed little to no difference between URM and majority students. This contrasts with other narrative reviews that have focused on the occasional study reporting a difference (e.g., Hessen et al., 2022). These results and future directions will be discussed.
