There are several challenges in evaluating the impact of community-based STEM education and training programs, namely balancing external and internal validity, choosing appropriate outcomes, selecting measures that accurately assess those outcomes, sampling, and defining impact in these contexts. Precisely what makes these programs successful and distinct from traditional efforts is also what makes rigorous evaluation difficult. For example, experimental designs may not be feasible because program designs evolve, suitable control or comparison groups are lacking, free choice in participation is a central feature of the program, and program developers may have ethical concerns about withholding a program from some students. Compounding these issues, such programs often enroll small numbers of participants, which leads to low statistical power and a reduced chance of detecting true effects. Dosage may also vary substantially across participants, because these programs allow participants to choose how much they engage.
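To make the power problem concrete, the following is a minimal sketch (not part of the session materials), assuming Python with the statsmodels package and purely hypothetical group sizes: with 20 participants per group, a standard two-group comparison has only about a one-in-three chance of detecting a medium-sized effect.

    # Hypothetical power calculation for a two-group comparison
    # (assumes statsmodels is installed; numbers are illustrative only).
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()

    # With 20 participants per group and a medium effect (Cohen's d = 0.5),
    # a two-sided t-test at alpha = 0.05 has power of roughly 0.34.
    power = analysis.solve_power(effect_size=0.5, nobs1=20, alpha=0.05)
    print(f"Power with n = 20 per group: {power:.2f}")

    # Participants needed per group to reach the conventional 80% power:
    n_needed = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
    print(f"n per group for 80% power: {n_needed:.0f}")  # roughly 64

Unequal dosage across participants erodes this already limited power further, since it adds variability that a simple two-group design does not account for.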
Perhaps the most crucial issue in evaluating the impact of these programs is determining appropriate outcomes and how to measure them. The outcomes typically examined in formal STEM education contexts are narrow and content-specific (e.g., STEM content knowledge). However, community-based STEM programs, particularly those serving underrepresented individuals, are rarely focused exclusively on increasing STEM content knowledge. These programs may instead aim to increase curiosity about STEM, a sense of belonging to the science community, or networking skills, all concepts that are harder to define and measure (NRC, 2009). Extant measures of attitudes and beliefs are often not developed with diverse populations, compounding the measurement challenge and potentially limiting our capacity to determine which programs are effective and for whom. Finally, many programs like BULB and FirstHand aim for long-term effects (e.g., employment in the STEM workforce), which cannot be measured in the short term. This raises the question of which metrics researchers should use to gauge the promise of these types of programs.
Despite these challenges, rigorous evaluation of such programs is crucial for determining effectiveness and for deciding whether to scale and/or adapt a program (Fu et al., 2016). Therefore, researchers need to identify solutions that can increase the rigor of research and evaluation in this context. In this session, researchers from AnLar will lead a discussion with the audience on approaches that can address these challenges and yield meaningful inferences for participants, stakeholders, and other researchers.