Datasets: Building the Foundation of Digital Humanities and Jewish Studies

Sun, December 17, 4:30 to 6:00pm PST, San Francisco Marriott Marquis, B2 (02) Golden Gate B (AV)

Session Submission Type: Roundtable

Session Sponsor: Posen Digital Library of Jewish Culture & Civilization


Increasingly, datasets are used in the creation and shaping of scholarship both within and outside the field of Jewish studies. This roundtable brings together scholars and creators of these public and private datasets. In this panel, we will explore how datasets can be used, curated, and built upon for scholarship, but also the challenges in the humanities of sharing data with others. We explore best practices and the ethical and moral obligations that come with sharing datasets. What does giving credit to those who have created and curated datasets look like? Do those who post a dataset have an obligation to keep it up to date? Where is the line between the creation and curation of a public dataset and a scholar curating the data for their scholarship? The lack of best practices has stymied scholars, many of whom wish to share their data with others while also receiving credit for their work. Public datasets pose challenges of their own, such as updating and maintaining them with finite resources. As datasets become increasingly available, these challenges, questions, and best practices are more relevant and crucial than ever before.

Framing Questions:
1. How can datasets be made available to researchers? How can organizations with lots of data create portals for digital humanities projects?
2. What are the ethical and moral obligations to open data, both for the individual researcher and the data itself?
3. Does the public nature of the data affect possible insights and analysis?
4. How can we share the conceptual thinking behind metadata schemas designed for individual projects’ discoverability so that it can help others?

Laura Eckstein (University of Pennsylvania) will discuss the need for collaboration among repositories, both to keep the individual repositories up to date and to increase access.
Rachel Deblinger (UCLA) will offer insight from the Modern Endangered Archives Program and the work of supporting data creation for open access digital archives. She will also discuss the pedagogical value of working with undergraduate students to clean and analyze open access datasets for cultural heritage materials.
Dikla Yogev (University of Toronto) will discuss some of the proprietary issues that arise with sharing your data: wanting to be a good digital citizen vs. protecting your work/IP. Yogev will also discuss collaborations between non-public projects and the technical challenges of dataset integration.
Michelle Margolis (Columbia University) will discuss data integrity, dealing with shared data (including importing data from other projects into yours), and data models and standards.

