Decentralizing Climate Change Data while Preserving Scientific Value

Fri, September 1, 2:00 to 3:30pm, Sheraton Boston, 3, Fairfax B


The “Data Refuge” initiative drew many data archivists and volunteers from multiple North American universities, including, to mention just a few, UPenn, University of Toronto, UCLA, UCSD, MIT, and Harvard. Between December 2016 and March 2017, data archivists selected, downloaded, and stored datasets that were considered at risk of disappearing or being deleted under the current administration, and eventually made the datasets available to the scientific community, as well as to the general public, for reuse. These highly valuable climate change data now exist in multiple locations, and possibly in multiple copies. The datasets have also been organized in novel meta-structures, curated with ad hoc metadata and tags, and purposefully made redundant.

In this paper, I explore how scientific value travels, along with the datasets themselves, in the Data Refuge initiative. I ask the following questions:
When climate change datasets are decentralized and made redundant,
1) How are the evidentiary power and contextual nature of the data preserved?
2) How are chain of custody and provenance information established and maintained?

This study is empirical. Data collection includes interviews with Data Refuge participants from different teams about the ways in which the datasets were selected, nominated to the Internet Archive, downloaded, stored on different platforms, curated, tagged and organized, and made accessible for reuse. The author's previous empirical work on practices of data management, sharing, curation, and reuse in science also informs this analysis.


