ESHS/HSS Annual Meeting

Individual Submission Summary

Probabilistic Record Linkage: From Radiation Epidemiology to Big Data

Tue, July 14, 11:00am to 12:30pm, Edinburgh International Conference Centre, Floor: Level 1, Harris Suite 1

English Abstract

Record linkage is the process of matching the elements of one database to the elements of another, for example, matching the patients in a medical registry with persons recorded by a national census. Today, it is foundational to many fields of activity. To take one example, Palantir Technologies' core products link data sources for governments (particularly military and surveillance agencies), businesses (for industry and employee surveillance), and scientific research (particularly epidemiology). The idea of record linkage, as a distributed "book of life" written by each modern person, is often credited to the Vital Statistics division of the US Census, just after WWII. Modern record linkage is computer-assisted and probabilistic. Its origins lie in the radiation epidemiology studies at Atomic Energy of Canada, Ltd in the late 1950s, in research projects on the dangers of uranium mining and the genetic effects of radiation exposure, conducted under the direction of the biologist Howard Newcombe. The mathematical theory of probabilistic record linkage was established in 1967 at the neighbouring Dominion Bureau of Statistics. Despite the many changes in computing since the 1960s, the Fellegi-Sunter mathematical treatment continues to structure the field. This research is part of my investigation into the regulatory science of Canadian uranium mining. Here I will trace the movement of probabilistic record linkage out of radiation epidemiology, as its logic spread to administrative statistics and then to its modern ubiquity. I consider questions such as: which populations are more likely to be counted out by linkage? (I consider in particular Indigenous populations.) How have social groups, including scientists and labour unions, reacted to these methods? And to what degree, and how, has contemporary record linkage been conditioned by its epidemiological origins?

Author