Individual Submission Summary

Criminal Justice Data and GDPR: how to reconcile personal data protection with scientific research purposes

Fri, September 13, 8:00 to 9:15am, Faculty of Law, University of Bucharest, Floor: Ground floor, Room 1.17

Abstract

The development of semantic and speech technologies depends to a large extent on the amount of high-quality data on which machine learning models are trained. The criminal justice system captures massive corpora of recorded audio data and transcripts of judicial proceedings. However, language corpora, as a rule, contain personal data, and speech itself falls into the category of personal data, which is subject to special protection under the EU personal data protection framework. The development of semantic and speech technologies for the Slovene language is therefore hampered, on the one hand, by the lower number of available text and speech resources compared to larger languages and, on the other hand, by the constraints imposed by the strict data protection regime based on the General Data Protection Regulation (GDPR) and national legislation (ZVOP-2). In principle, however, the GDPR does allow data to be processed for scientific research purposes even if they were originally collected for another purpose (a derogation from the purpose limitation principle). In addition, the exercise of certain rights of data subjects may be limited when data are processed for research purposes, if these rights unduly burden the research process and if the national legislation of the Member State permits it. In Slovenia, however, the exceptions applicable to scientific research purposes are not sufficiently clear, which hinders access to criminal justice data and, consequently, specific research projects aimed at developing semantic and speech technologies. The paper will present the process of preparing the groundwork for data acquisition, data extraction, and data anonymisation, and of establishing the documentation, procedures, and rules needed for data processing.

Author