Educators do not know the locations of true benchmarks on an assessment score scale. These benchmarks are typically revealed at a policy linking workshop through panelists' high-quality item performance ratings, the consistency of each panelist's ratings (intra-rater consistency), and the consistency of ratings across panelists (inter-rater consistency). Based on Round 1 ratings, facilitators must identify panelists with poor item performance ratings and provide them with additional information to improve their ratings in Round 2; Round 2 ratings are often used to set the final, or recommended, global benchmarks.
Facilitators must also identify items with large variation in ratings across panelists, to ensure that panelists have a complete and consistent understanding of the knowledge and skills required to answer each item correctly. Intra- and inter-rater consistency indices should therefore be accurately estimated, reported judiciously, and used for formative rather than summative purposes. When calculating inter-rater consistency for oral reading fluency tests, handling missing data is critical: poorly managed missing data can yield inaccurate estimates of both intra- and inter-rater consistency. We should either handle the missing data carefully or use a form of weighted intra- and inter-rater consistency index.
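To make the missing-data point concrete, here is a minimal sketch (in Python, with hypothetical rating data and function names) contrasting two common ways of estimating inter-rater exact agreement when some panelists' ratings are missing: pairwise deletion, which uses every item a given pair of panelists both rated, versus listwise deletion, which drops any item with a missing rating and can therefore discard most of the data and distort the estimate.

```python
import numpy as np

# Hypothetical Round 1 ratings: rows = items, columns = panelists.
# np.nan marks a missing rating (e.g., a word not reached in an oral
# reading fluency passage).
ratings = np.array([
    [1, 1, 1, np.nan],
    [2, 2, np.nan, 2],
    [1, 2, 1, 1],
    [3, 3, 3, np.nan],
    [np.nan, 1, 2, 1],
])

def pairwise_exact_agreement(r):
    """Mean exact-agreement rate over all panelist pairs, computed only
    on items that both panelists in a pair actually rated
    (pairwise deletion)."""
    n_raters = r.shape[1]
    rates = []
    for i in range(n_raters):
        for j in range(i + 1, n_raters):
            both = ~np.isnan(r[:, i]) & ~np.isnan(r[:, j])
            if both.any():
                rates.append(np.mean(r[both, i] == r[both, j]))
    return float(np.mean(rates))

def listwise_exact_agreement(r):
    """Same statistic after first dropping every item with any missing
    rating (listwise deletion); with sparse data this can leave very
    few items and give a misleading estimate."""
    complete = ~np.isnan(r).any(axis=1)
    return pairwise_exact_agreement(r[complete])

print(pairwise_exact_agreement(ratings))
print(listwise_exact_agreement(ratings))
```

In this toy example the two strategies disagree noticeably, because listwise deletion keeps only the single fully rated item; a weighted consistency index, weighting each pair's contribution by the number of items both panelists rated, is another way to address the same problem.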