This paper discusses the problems of researching the identity of members of a national minority and ways of solving them with Digital Humanities approaches. A significant number of personal documents of Lithuanian Jews are currently held in the YIVO archive in New York, the National Library of Lithuania, and the National Archives of Lithuania. They are written mostly in Yiddish, but also in Hebrew, Lithuanian, and other languages, and almost always by hand, which makes it very difficult to recognize the texts and analyze their contents.
The proposed approach uses neural networks and pre-trained language models for Yiddish and Hebrew text recognition. It involves training several models on the Transkribus platform, each focused on a different type of handwriting, because, unlike widely spoken languages for which handwriting recognition software has already been developed, no such software exists for Yiddish and Hebrew. After recognition, the text should be marked up and then processed with specially developed applications that group the documents according to different criteria: document subject, author's gender (in the case of letters and diary entries), geography of residence, professional affiliation, and others. Analysis of the documents using this methodology provided a new perspective on the influence of education level, gender, and preferred language on the formation of the national identity of Lithuanian Jews in the early twentieth century.
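The grouping step described above can be illustrated with a minimal sketch. It is not the applications developed for the project: the `Document` record, its field names, the `group_by` helper, and the sample records are all hypothetical placeholders, and the sketch assumes the markup stage has already produced one metadata record per recognized document.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical metadata record produced by the markup stage."""
    doc_id: str
    subject: str
    author_gender: str
    residence: str
    profession: str
    language: str


def group_by(documents, criterion):
    """Group document IDs by the value of one metadata field."""
    groups = defaultdict(list)
    for doc in documents:
        groups[getattr(doc, criterion)].append(doc.doc_id)
    return dict(groups)


# Invented placeholder records, for illustration only (not archival data).
SAMPLE = [
    Document("d1", "family", "female", "Vilnius", "teacher", "Yiddish"),
    Document("d2", "trade", "male", "Kaunas", "merchant", "Hebrew"),
    Document("d3", "family", "male", "Vilnius", "merchant", "Yiddish"),
]

by_language = group_by(SAMPLE, "language")
by_subject = group_by(SAMPLE, "subject")
```

Grouping on one field at a time keeps each criterion independent, so the same records can be regrouped by subject, gender, residence, or profession without reprocessing the recognized text.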