Paper Summary

Multivariate Count Data Analysis Using a Bayesian Hierarchical Multinomial-t Compound Regression: A Demonstration With Collocations

Mon, April 12, 4:30 to 6:00pm EDT, SIG Sessions, SIG-Multilevel Modeling Paper and Symposium Sessions


A collocation is a system of words that tend to be found together, and collocations are essential for oral fluency. We analyzed the use of seven different collocation types by sixty Spanish speakers with varying levels of proficiency. Substantively, we were interested in differences in collocation type use between non-native and native speakers. We implemented a hierarchical multinomial regression with a t-distributed observation-level random effect. The hierarchical component was implemented to efficiently estimate a 7-by-3 interaction, and the t-distribution was included to capture over-dispersion while accounting for outliers. Approximate cross-validation suggested the model performed favourably relative to more commonplace alternatives. Additionally, we identified certain differences in collocation preference by speaker proficiency.
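
The summary does not give the model specification, so the following is only a minimal sketch of the data-generating idea behind a multinomial regression with a t-distributed observation-level random effect, not the authors' implementation. It assumes the 7-by-3 interaction means seven collocation types crossed with three speaker groups; the function name `simulate_counts`, the per-speaker total of 100 collocations, the degrees of freedom `nu`, and the scale values are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(eta):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = eta - eta.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def simulate_counts(rng, n_speakers=60, n_types=7, n_groups=3,
                    total=100, nu=4.0, scale=0.5, tau=0.7):
    """Simulate collocation counts from a multinomial-t compound (illustrative values)."""
    # Assumed: each speaker belongs to one of three groups.
    group = rng.integers(0, n_groups, size=n_speakers)

    # Hierarchical part: all group-by-type effects share one normal
    # distribution with standard deviation tau (partial pooling).
    beta = rng.normal(0.0, tau, size=(n_groups, n_types))
    beta[:, 0] = 0.0  # first type is the reference category

    # Observation-level random effect: heavy-tailed t noise on the logit
    # scale, which adds over-dispersion and tolerates outlying speakers.
    eps = scale * rng.standard_t(nu, size=(n_speakers, n_types))
    eps[:, 0] = 0.0

    # Each speaker's type probabilities, then multinomial counts.
    probs = softmax(beta[group] + eps)
    counts = np.stack([rng.multinomial(total, p) for p in probs])
    return group, beta, counts

rng = np.random.default_rng(0)
group, beta, counts = simulate_counts(rng)
print(counts[:5])  # counts of the 7 types for the first 5 simulated speakers
```

Setting `scale=0.0` in this sketch removes the random effect and reduces it to a plain multinomial regression, which is one way to see how much extra spread the t-distributed term contributes.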