Paper Summary

Integration, Explainability, and Secondary Data in the Age of Machine Intelligence: Toward Retrofitted Research Design

Sat, April 26, 3:20 to 4:50pm MDT, The Colorado Convention Center, Floor: Ballroom Level, Four Seasons Ballroom 2-3

Abstract

As machine-intelligent systems move toward ubiquity, the secondary data used to train those systems is increasing in importance. Whether or not our research is directly connected to machine intelligence, international pushes for open science will make the data we produce as researchers available as secondary data. This paper proposes a methodological intervention that “retrofits” research design to secondary data. In outline, the procedure is: 1) determine the provenance of the data and evaluate its suitability; 2) identify the vectors of generalizability represented in the data; 3) determine which sets of data are quantity-dependent and which are quantity-agnostic; 4) identify potential forms of drift; and 5) produce a relational map of how the various data subsets intersect and tessellate.

Authors