Background
According to the National Academies of Sciences, Engineering, and Medicine (2018), employers across disciplines are demanding that employees have skills in working with and extracting knowledge from data. College administrators have reacted, with institutions rushing to add undergraduate data science degree programs in different market sectors and segments (Swanstrom, 2020). Undergraduate data science degrees are an important part of a larger trend of the exponential creation of new degree programs at higher education institutions (National Center for Education Statistics [NCES], n.d.).
There is considerable variation in what data science education actually entails, and, by extension, variation in how undergraduate programs prepare students for data-intensive careers. Experts question whether it is possible to adequately train students at the undergraduate level in all areas of data science, which includes computer science, statistics, and a domain, plus ethics and other “soft skills” (Irizarry, 2020). Early adopters of undergraduate data science degree programs dedicate considerable coursework to statistics and computer science, with little consideration for domain-specific education, ethics, areas of communication, workflows, and reproducibility practices; moreover, the academic unit administering the degree program significantly influences the courseload distribution of computer science and statistics/mathematics courses (Anonymous, 2021).
Conceptual Framework
The National Center for Postsecondary Improvement (NCPI) market taxonomy helps elucidate the social and market forces in higher education that influence curricular decision-making. The taxonomy indicates that institutional characteristics, such as completion rates, rankings, market prices, and demographic profiles, may be used to predict adoption of new curricula (Zemsky & Shaman, 2017). According to the taxonomy, institutions with less market power are under pressure to react swiftly to changes in the preferences of consumers, i.e., students (Zemsky et al., 2005).
Data and Methods
I investigate the institutional characteristics of adopters of undergraduate data science degree programs using traditional regression methods and machine learning. I draw on the Carnegie Classification of Institutions of Higher Education to select R1 and R2 doctoral universities and liberal arts colleges. I construct a data set from the NCES Integrated Postsecondary Education Data System (IPEDS), a rich source of institutional variables, and include variables that measure market segment. Guided by a priori theory, I use logistic regression to test hypotheses about which institutional variables are associated with adoption of undergraduate data science degree programs. I then apply machine learning methods, including logistic regression with cross-validation, random forest analysis, and k-means clustering.
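As an illustration only, the following minimal sketch shows one way logistic regression with k-fold cross-validation could be set up for a binary "adopter" outcome. It uses only the Python standard library and purely synthetic data. The two feature names, the coefficient values and their signs, the sample size, and the helper functions (`fit_logistic`, `cross_val_accuracy`, `make_synthetic`) are hypothetical assumptions for demonstration; they are not the study's variables, data, or implementation.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights (intercept first) by batch gradient descent on mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, xi):
    # Classify as adopter (1) when the predicted probability is at least 0.5.
    p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
    return 1 if p >= 0.5 else 0


def cross_val_accuracy(X, y, k=5, seed=0):
    """Return held-out accuracy for each of k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(w, X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return scores


def make_synthetic(n=200, seed=1):
    """Generate synthetic institutions; NOT real IPEDS data."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        completion = rng.gauss(0, 1)  # hypothetical standardized completion rate
        price = rng.gauss(0, 1)       # hypothetical standardized market price
        logit = 1.5 * completion - 1.0 * price  # arbitrary illustrative effects
        y.append(1 if rng.random() < sigmoid(logit) else 0)
        X.append([completion, price])
    return X, y


X, y = make_synthetic()
scores = cross_val_accuracy(X, y)
mean_accuracy = sum(scores) / len(scores)
weights = fit_logistic(X, y)  # full-sample fit: [intercept, completion, price]
print("fold accuracies:", [round(s, 2) for s in scores])
print("mean accuracy:", round(mean_accuracy, 2))
```

In practice, a statistics or machine learning library would replace the hand-written fitting loop; the sketch only makes the train/held-out logic of cross-validation explicit.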
Forthcoming Findings
Traditional logistic regression provides a description of trends and an analysis of conditions. I then use machine learning methods to project future developments and to better understand the types of institutions that will adopt undergraduate data science degree programs. The analyses will provide insight for evaluating and selecting among alternatives for institutional offerings in higher education. This work also supports understanding of similar trends in other academic fields of study and ties into discussions on the value of a liberal arts education.