Periodic Reporting for period 2 - Content4All (Personalised Content Creation for the Deaf Community in a Connected Digital Single Market)
Reporting period: 2019-03-01 to 2020-11-30
CONTENT4ALL aims to make more content accessible to the Deaf community by developing the technologies and algorithms needed to capture a sign interpreter in a broadcaster's remote studio, process the capture and render it using a state-of-the-art photorealistic 3D virtual human that looks like a real sign interpreter. It envisages a cost-effective mechanism that will encourage broadcasters to provide sign-interpreted content for many programs, regardless of the time of day and without any detrimental impact on the hearing users’ quality of experience.
The main goal of CONTENT4ALL is to propose a solution to the most immediate need of Deaf users and broadcasters: a low-cost way to create sign-interpreted versions of content produced for the hearing. The second major goal of the project is to create datasets and algorithms to enable automated sign-interpreted content creation in the longer term. An exploration of automatic sign-language production is the last objective of the project. The most important benefit of the project is its social impact: a larger pool of content will be made available to Deaf users at low cost to broadcasters, satisfying the upcoming legislative requirements of EU governments. It will also be an enabling technology that allows the Deaf community to become more involved in its own content creation. The benefits will not only be to society; financial benefits will also accrue to SMEs in the media production and sign-language translation businesses via the expanded marketplace enabled by the technologies developed in CONTENT4ALL. Finally, other indirect business opportunities that can leverage those technologies can also be foreseen, e.g. sign-language teaching, personalized Virtual Reality content creation for the hearing, etc.
This work resulted in a series of publications in top-tier journals and conferences; among the outstanding publications are the two that received best paper awards at the Conference on Computer Vision and Pattern Recognition in 2019 and 2020. Moreover, for the technologies employed in the Demonstrator, CONTENT4ALL was awarded the NAB Technology Innovation Award 2020, which is reserved for innovative projects that are not yet commercialized but show a high potential for impact on the broadcasting market.
The other two achievements of CONTENT4ALL are 1) the release of a collection of 200 hours of broadcast content together with 20 hours of annotated videos, i.e. sign language aligned with subtitles, and 2) the creation of a laboratory proof-of-concept to demonstrate the advances in automatic sign language production.
Thanks to the strong relationship built with the deaf communities in Switzerland, Germany and Flanders, 4 focus groups were held and several online questionnaires in different sign languages (e.g. DSGS, VGT, DGS, …) were distributed to collect user feedback on the different technological components and to continuously and incrementally improve the technical development.
Preliminary contacts with possible customers (broadcasters) were made, and a detailed business plan was derived accordingly.
To deploy such an innovative system, CONTENT4ALL makes use of hardware and software tools to capture a human signer in real time in a studio that can be deployed in many locations, even at home, and is therefore called the “Remote Studio”. This is cost-effective compared to a regular broadcast studio and is based on state-of-the-art technologies developed to photorealistically reproduce the posture, gestures and facial expressions of the human sign interpreter via a 3D photorealistic virtual human. The model of a human is first recorded in a dedicated volumetric studio and then associated with a set of mathematical algorithms that allow the virtual human to reproduce the real one exactly. The creation and animation of such a virtual human constitutes an advance in the state of the art in two respects: first, the incorporation into a single human model of different elements at a high level of detail, e.g. hands at the precision of individual fingers, or facial expressions including cheeks and forehead; second, the challenge of animating such a model in real time as precisely as possible and without markers or obstructive elements. The generated stream of the virtual human is combined with the broadcaster's original stream and can be transmitted to viewers as a separate stream, to be watched on demand on HbbTV 1.5 and 2.x compliant devices.
Another goal reached by the project is the creation of a repository of annotated data based on broadcast-quality video. The repository is intended to be open to other research groups, and within CONTENT4ALL it will be used to train Artificial Intelligence models for experiments in automatic translation into sign language, with the output rendered by the improved 3D virtual human. The design of these algorithms, their implementation and their testing constitute an advance in the state of the art, as they are among the first published in the scientific literature.
Finally, in the medium to long term, automated sign interpretation technologies are explored and tested. If this goal can be fulfilled, the sign-language-using community in Europe will overcome its current state of media poverty, closing that gap and opening avenues towards a better society with equal opportunities.