Paper Summary

Digital Humanities Approaches in Archival Researches: Study on Mixed Cultural Identity of Lithuanian Jews in Early XX Century

Sat, November 23, 4:00 to 5:45pm EST, Boston Marriott Copley Place, Floor: 3rd Floor, Northeastern

Abstract

The paper discusses the problems of researching the identity of members of a national minority and ways of addressing them with Digital Humanities approaches. A significant number of personal documents of Lithuanian Jews are currently stored in the YIVO archive in New York, the National Library of Lithuania, and the National Archives of Lithuania. They are written mostly in Yiddish, but also in Hebrew, Lithuanian, and other languages, and almost always by hand. This makes the texts very difficult to recognize and their contents difficult to analyze.

The proposed approach uses neural networks and pre-trained language models for Yiddish and Hebrew text recognition. It includes training several models, each focused on a different type of handwriting style, on the Transkribus platform (since, unlike widely spoken languages for which handwriting recognition software has already been developed, there is nothing comparable for Yiddish and Hebrew). After recognition, the text is marked up and then processed with specially developed applications that group documents according to different criteria: document subject, author's gender (in the case of letters and diary entries), geography of residence, professional affiliation, and others. The analysis of documents using this methodology provided a new perspective on the influence of education level, gender, and preferred language on the formation of the national identity of Lithuanian Jews in the early twentieth century.
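The grouping step described above can be illustrated with a minimal sketch. It assumes that each recognized and marked-up document has already been reduced to a small metadata record. The `Document` class, its field names, the `group_by` helper, and the sample records are all hypothetical and invented for illustration; they are not the authors' applications or data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """One transcribed and marked-up document (all fields are hypothetical)."""
    doc_id: str
    subject: str
    author_gender: str
    residence: str
    language: str


def group_by(documents, criterion):
    """Group document IDs by the value of one metadata field."""
    groups = defaultdict(list)
    for doc in documents:
        groups[getattr(doc, criterion)].append(doc.doc_id)
    return dict(groups)


# Invented sample records, for illustration only.
SAMPLE = [
    Document("d1", "letter", "female", "Vilnius", "Yiddish"),
    Document("d2", "diary", "male", "Kaunas", "Hebrew"),
    Document("d3", "letter", "male", "Vilnius", "Yiddish"),
]

by_language = group_by(SAMPLE, "language")
by_residence = group_by(SAMPLE, "residence")
```

Under these assumptions, the same helper can be reused for any criterion named in the abstract (subject, gender, residence) simply by passing a different field name.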

Author