Goals/Purposes.
In this session, we will present assessments developed to empirically measure the depth of semantic knowledge and examine variation in vocabulary depth across three domains: general academic vocabulary, history/geography, and biology/ecology. We will: (i) describe the design of four item types intended to capture different levels of depth of semantic knowledge; (ii) discuss our hypotheses; (iii) present psychometric results from a series of large-scale studies; and (iv) analyze how the depth of semantic knowledge varies by ability level and grade across the three domains.
Perspectives/Theoretical Framework.
Perfetti & Hart (2001) describe word knowledge as varying in orthographic, phonemic, syntactic, and semantic quality (the Lexical Quality Hypothesis). We expect that word knowledge accumulates gradually (Durso & Shore, 1991). This paper draws on earlier approaches to assessing depth (Brown, Frishkoff, & Eskenazi, 2005; Scott, Hoover, Flinspach, & Vevea, 2008), but focuses on one critical dimension: the richness of semantic knowledge.
Methods/Techniques.
(i) Mode of Testing: paper-and-pencil test administrations
(ii) Item Analysis: proportion correct (P+), point-biserial correlations, and option-choice frequencies by item; three-parameter item response theory (3PL IRT) analyses; one-way ANOVA.
(iii) Vocabulary Patterns (depth measures across words): Correlation analysis, ANOVA, trend analysis, exploratory factor analysis.
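The classical item statistics listed above can be sketched in a few lines of code. The following is an illustrative example only (not the authors' analysis code, and the response data are hypothetical): it computes each item's proportion correct (P+) and its point-biserial correlation with the total score from a 0/1 scored response matrix.

```python
# Illustrative sketch of classical item analysis: proportion correct (P+)
# and the point-biserial correlation of each item with the total score.
# The point-biserial is computed here as the Pearson correlation between
# a dichotomous item score and the (continuous) total score.

def item_statistics(responses):
    """responses: list of per-student lists of 0/1 item scores.
    Returns a list of (p_plus, r_pb) tuples, one per item."""
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]  # total score per student

    stats = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        p_plus = sum(item) / n_students  # proportion correct (P+)

        # Pearson correlation between item score and total score.
        mean_item = p_plus
        mean_total = sum(totals) / n_students
        cov = sum((x - mean_item) * (t - mean_total)
                  for x, t in zip(item, totals)) / n_students
        var_item = sum((x - mean_item) ** 2 for x in item) / n_students
        var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
        r_pb = (cov / (var_item * var_total) ** 0.5
                if var_item and var_total else 0.0)
        stats.append((p_plus, r_pb))
    return stats

# Hypothetical data: 4 students x 3 items, scored 0/1.
data = [[1, 1, 0],
        [1, 0, 0],
        [1, 1, 1],
        [0, 0, 0]]
for j, (p, r) in enumerate(item_statistics(data), 1):
    print(f"Item {j}: P+ = {p:.2f}, r_pb = {r:.2f}")
```

In practice such statistics would be computed with established psychometric software; the sketch is included only to make the quantities concrete.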
Data Sources/Evidence.
Data are drawn primarily from two studies: one focused on general academic vocabulary, and a second focused on two academic subjects, history/geography and biology/ecology.
Study 1. The target population comprised 1,449 7th- and 1,622 8th-grade students from urban, suburban, and rural schools across the US. A within-subjects design was employed to measure performance on the same general academic words across three depth measures. Items were created for two sets of ten words. Each student saw one set of words, with three items per word (one per item type), plus an anchor set of 20 synonym items piloted in a previous study and selected to cover a range of difficulty while maintaining acceptable discrimination. Items were grouped by type to reduce cognitive load, but the order of the blocks was varied, yielding 12 forms that were spiraled throughout each school, grade, and classroom.
Study 2. The target population comprised 7th-, 9th-, and 11th-grade students (4,543 in science and 4,164 in social studies) from urban, suburban, and rural schools across the US, evenly split across grades. All students received a common anchor set of 30 items measuring general academic vocabulary. In addition, each student received 20 topic-specific science or social studies vocabulary items drawn from two of the four depth item types, with forms spiraled throughout each school, grade, and classroom.
Results. In Study 1, mean item difficulty (P+) fell in the expected order (confirmed by one-way ANOVAs): item types requiring only superficial vocabulary knowledge were easier than those requiring deeper semantic knowledge. Reliability was high (> .8). Data for Study 2 were collected in May-June 2011. We will compare the results of the two studies and discuss the implications.
Rene R. Lawless, Educational Testing Service
John P. Sabatini, Educational Testing Service
Paul Deane, Educational Testing Service
Isaac I. Bejar, Educational Testing Service
Chen Li, Educational Testing Service