As machine-intelligent systems move toward ubiquity, the secondary data used to train them is becoming more important. Whether or not our research is directly connected to machine intelligence, international pushes for open science will make the data we produce as researchers available as secondary data. This paper makes a methodological intervention: “retrofitting” design to secondary data. In outline, the procedure is: 1) determine the provenance of the data and evaluate its suitability; 2) identify the vectors of generalizability represented in the data; 3) determine which sets of data are quantity-dependent and which are quantity-agnostic; 4) identify potential forms of drift; 5) produce a relational map of how the various data subsets intersect and tessellate.