Paper Summary

Application of PAMS (Profile Analysis via Multivariate Statistics) to Interpreting Category Dimension Profiles in Correspondence Analysis

Tue, April 17, 10:35am to 12:05pm, Vancouver Convention Centre, Floor: Second Level, West Room 206

Abstract

Objective: The current study introduces a profile pattern approach to interpreting dimensions in correspondence analysis.

Theoretical framework: The ordinary chi-squared statistic, used primarily for categorical data analysis, tests only whether rows and columns are significantly associated; it provides no dimensional information about the variable categories. Consequently, the chi-squared test cannot show how much each row or column category contributes to the variation explained by the dimensions. Moreover, some categorical data are inherently not independent. For example, when participants are measured repeatedly on a subject (e.g., Mathematics) across different categories (e.g., time points), the observations are not independent, and a chi-squared test cannot validly be used to assess row-column association. With the development of Correspondence Analysis (CA) (Greenacre, 2007; Nishisato, 1980), dimensional information is now available for both repeated and independent categorical data, and the contributions of row and column categories to each dimension are also accessible.

Method: Rather than using the CA macro available for the R language (http://cran.r-project.org), this study demonstrates the estimation procedure step by step, relying mainly on singular value decomposition (SVD), to avoid the complications of interpreting the macro's CA output. The first step is to standardize the discretized frequency data and apply SVD to the standardized data to obtain dimension coordinates. The dimension coordinates are then converted into correlation coefficients to aid interpretation of the dimensions.

Data sources: The sample comprised students who completed Mathematics achievement tests at six time points (N = 2708; 51% female) in the Early Childhood Longitudinal Study, Kindergarten (ECLS-K; U.S. Department of Education, National Center for Education Statistics, 2006). For CA, the Math achievement scores were discretized into seven performance categories (M1–M7), which form the rows, following Green and Rao’s (1970) recommendation. The resulting seven (performance levels) by six (time points) frequency table was analyzed.
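
The first two steps described under Method (standardizing the frequency table, then applying SVD to obtain dimension coordinates) can be sketched in Python with NumPy. This is a minimal illustration of textbook correspondence analysis, not the authors' own procedure: the function name `correspondence_analysis` is hypothetical, the 7 x 6 table `counts` is made up for illustration and is not the ECLS-K data, and the later conversion of coordinates into correlations is not attempted because the abstract does not specify it.

```python
import numpy as np


def correspondence_analysis(table):
    """Textbook CA of a two-way frequency table via SVD (illustrative sketch)."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()          # correspondence matrix (proportions)
    r = P.sum(axis=1)        # row masses
    c = P.sum(axis=0)        # column masses

    # Standardized residuals: departure from independence, scaled by expected mass.
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)

    # SVD of the standardized residuals; keep at most min(rows, cols) - 1 dimensions.
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    k = min(N.shape) - 1
    U, sv, Vt = U[:, :k], sv[:k], Vt[:k]

    inertia = sv ** 2                                  # principal inertia per dimension
    row_coords = (U / np.sqrt(r)[:, None]) * sv        # row principal coordinates
    col_coords = (Vt.T / np.sqrt(c)[:, None]) * sv     # column principal coordinates
    return row_coords, col_coords, inertia


# Made-up 7 (performance levels M1..M7) x 6 (time points) frequency table.
counts = np.array([
    [60, 40, 25, 15, 10,  5],
    [50, 45, 30, 20, 15, 10],
    [40, 40, 35, 30, 20, 15],
    [30, 35, 40, 35, 30, 25],
    [20, 25, 35, 40, 40, 35],
    [10, 15, 25, 35, 45, 50],
    [ 5, 10, 15, 25, 40, 60],
])

row_coords, col_coords, inertia = correspondence_analysis(counts)
print("share of total inertia per dimension:", inertia / inertia.sum())
```

In this formulation the principal inertias sum to the total inertia, which equals the Pearson chi-squared statistic of the table divided by the grand total, so the decomposition partitions the same quantity the chi-squared test summarizes in a single number.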

Results: The maximum possible number of dimensions is five, min(7, 6) - 1, and the total inertia was 0.065092. The average inertia per dimension is 0.065092/5 = 0.013018 (20% of the total). By this criterion, Dimension 1 (inertia = 0.055937, 86%) is the only dimension above average and was retained for interpretation (labeled “Increasing Profile”). More detailed results will be included in the full paper.
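
The average-inertia criterion can be checked directly from the reported figures. The short sketch below uses only the numbers stated in the Results (the variable names are illustrative). It also shows why no other dimension can qualify: the inertia left over after Dimension 1 is smaller than the average, so none of the remaining four dimensions can exceed it.

```python
# Figures reported in the Results paragraph.
total_inertia = 0.065092
dim1_inertia = 0.055937
max_dims = min(7, 6) - 1                 # 7 performance levels, 6 time points

average_inertia = total_inertia / max_dims       # retention threshold
dim1_share = dim1_inertia / total_inertia        # proportion explained by Dimension 1
remaining = total_inertia - dim1_inertia         # inertia shared by Dimensions 2-5

print(f"maximum dimensions: {max_dims}")
print(f"average inertia:    {average_inertia:.6f}")
print(f"Dimension 1 share:  {dim1_share:.0%}")
print(f"Dimension 1 above average: {dim1_inertia > average_inertia}")
print(f"remaining inertia below average: {remaining < average_inertia}")
```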

Scholarly significance: Two aspects are of practical importance. First, a categorical dimension is interpreted as a profile pattern by relating it to the raw frequency profiles. Second, the dimension coordinates are converted into correlations to make them easier to interpret. CA dimension coordinates are not usually interpreted as cardinal values; rather, they are used to compute distances between row and column dimension coordinates in a plane.

Author