Periodic Reporting for period 3 - READ (Recognition and Enrichment of Archival Documents)
Reporting period: 2018-01-01 to 2019-06-30
The Virtual Research Environment “Transkribus” aims to benefit all user groups involved in the “eco-system” of historical documents. Archives and libraries, as content holders, get the chance to enrich their documents on a large scale with full-text transcription and search. (Digital) humanities scholars can work intensively with historical documents in a sheltered and highly specialized environment. Computer scientists are supported with large-scale datasets and reference data. Finally, the public benefits from access to digital archives. More than 25,000 users are currently registered on the Transkribus platform, contributing their documents, knowledge and engagement to its further development. On 1 July 2019 the READ project turned into a European Cooperative Society (SCE) with limited liability. READ-COOP SCE will be the legal entity that maintains and further develops the Transkribus platform.
First of all, we launched a series of activities to make the project and the technology known to our four target groups. This started with a three-day conference combining the public kick-off meeting of the project with a convention meeting of the co:op project. More than 150 people from over 20 countries took part in the conference. Videos of the presentations are online and are an important resource for dissemination activities. Reactions to the conference were highly positive and opened the door to many archives and research groups. Dissemination activities continued on several channels: for example, more than 20 workshops were organized by several groups in the project and held in a number of countries (Austria, France, Germany, Netherlands, Finland, Denmark, Norway, Italy, Switzerland, United Kingdom, Spain). Hundreds of people took part in these workshops and became familiar with the expert tool of the Transkribus platform.
Based on the overwhelming interest of archives and research groups in the project, we were able to conclude more than 70 Memoranda of Understanding (MoUs) with institutions. These MoUs provide an excellent framework for cooperation. Among the partners are the Hessian State Archive (Germany), the Archivio Storico Ricordi (Italy), the Huygens Institute for the History of the Netherlands (Netherlands), the Alfred Escher Foundation (Switzerland) and The Linnean Society (United Kingdom), to mention just a few. The success of this measure can be clearly seen in the number of users registered on the platform: after Y1 about 5,000 users were registered in Transkribus, after Y2 nearly 9,000, and now, at the end of the project, more than 25,000 users are registered, representing archivists, librarians, researchers, scholars and public users (family historians) from all over Europe and abroad.
Our second focus was the implementation of the Transkribus platform, integrating a number of tools developed by the research groups in the project. Special attention was given to defining interfaces and data exchange formats, to setting up application servers for easy deployment of the individual tools (which run on different operating systems and are written in different programming languages), and to tackling the challenge of storing and processing millions of image files. As a highlight of Y1, the award-winning Handwritten Text Recognition engine from the CITLab team of the University of Rostock was implemented in the Transkribus platform. In Y2 major progress was made, so that today Transkribus offers the complete workflow for a text recognition project, including the training of neural networks as well as keyword spotting. In Y3 and Y4 of the project a breakthrough in Handwritten Text Recognition and Layout Analysis was achieved by the teams from the Technical University of Valencia and Rostock. Error rates clearly below 5% can now be achieved for historical handwritten documents with reasonable effort. Transkribus users have trained more than 2,000 neural networks for their own specific documents. The data used for this training represents a monetary value of EUR 2 to 3 million.
Y3 and Y4 were also used to prepare the founding of a legal entity to run and further develop the Transkribus service platform. The decision was taken to form a European Cooperative Society, since this governance model best enables us to provide services for our target groups in a collaborative manner.
Significant progress was made in terms of innovation. A specific device (ScanTent) and an app (DocScan) were created as prototype applications. Interest from users is extremely high, so we will be marketing the device after the end of the project.