Periodic Reporting for period 1 - ReCAP (Real-time Content Analysis and Processing (ReCAP) for Agile Media Production)
Reporting period: 2016-12-01 to 2018-05-31
This leads to the problem that companies involved in the creation and distribution of media do not know enough about their content. For example, large archives of media content exist where the content (and therefore its value) is simply unknown; content is distributed to inappropriate audiences; or content is used without sufficient usage rights.
The ReCAP project sought input from content creators from multiple industry sectors about their particular challenges around these issues and identified these four key areas of concern:
● Content Discovery – to speed up fast-turnaround workflows by providing automatic metadata enrichment.
● Compliance – to identify inappropriate content prior to distribution and transmission.
● Rights Management – to track licensed content and avoid accidental usage and penalties.
● Archive Enrichment – to identify and find historical content.
Fundamentally, ReCAP aims to reduce the amount of labour-intensive, mundane, manual work typically assigned to people whose skills could be used more productively in the creative production process. A machine learning environment can process multiple files simultaneously, faster than real time, and produce accurate and meaningful information for people, or other systems, to use.
ReCAP also builds on existing technology developed in the European Union to deliver an innovative and affordable solution that can be deployed immediately into real-world businesses. Many commercial solutions are aimed at large enterprises and require significant effort to make them functional. ReCAP is aimed at small and medium-sized creative businesses that need a solution quickly and do not usually have large technical teams to manage complex ICT projects. ReCAP helps these businesses by providing tools that make the process of media content production more efficient and cost-effective.
There were four main objectives at the outset of the ReCAP project:
1. To create a technology platform for automatic metadata extraction;
2. To adapt existing European visual analysis algorithms, codecs and multiplexers to fit the framework of the ReCAP platform;
3. To address real world workflow challenges;
4. To provide affordable, scalable, state-of-the-art technology to small and medium-sized businesses.
To meet these objectives, the platform integrates the following automatic content analysis capabilities:
● Automatic Speech Recognition (ASR)
● Optical Character Recognition (OCR)
● Face detection & recognition
● Logo detection & recognition
Crucially, businesses were attracted to the benefits of using machine learning to address their workflow challenges, but were hesitant to use cloud services. Processing large volumes of media content, which is typically stored on-premises, risks creating additional workflow complexity and costs when large amounts of data have to be moved to the public cloud, along with the associated confidentiality and security risks. In addition, public cloud services are available to everyone and are therefore difficult to train for specific scenarios.
To address these challenges, the ReCAP solution needed to be an easy-to-train software application with built-in processing capability that could be deployed wherever needed, while also being able to consume public cloud machine learning services for specific tasks.
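The sketch below illustrates, purely as an assumption, how such a hybrid set-up might be described in a job specification, where each analysis task is routed either to the on-premises processing engine or to a public cloud service. All field names, engine identifiers and values are invented for illustration and are not taken from the ReCAP software.

```python
# Illustrative only: a hypothetical job description for a hybrid deployment,
# where each analysis task runs either in the on-premises engine or via a
# public cloud service. Field names and identifiers are invented for this sketch.

ANALYSIS_JOB = {
    "media_uri": "file:///mnt/storage/clip0001.mxf",   # content stays on-premises
    "tasks": [
        {"type": "asr",            "backend": "local"},   # built-in processing
        {"type": "ocr",            "backend": "local"},
        {"type": "face_detection", "backend": "cloud"},   # optional cloud service
    ],
}

def route_task(task: dict) -> str:
    """Return a human-readable description of where a task would run."""
    if task["backend"] == "local":
        return f"{task['type']}: run in the on-premises processing engine"
    return f"{task['type']}: submit to a public cloud machine learning service"

if __name__ == "__main__":
    for task in ANALYSIS_JOB["tasks"]:
        print(route_task(task))
```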
ToolsOnAir developed the Media Processing Service (MPS) from an existing workflow software application. The MPS provides a business-logic layer that abstracts the underlying algorithm technologies and, by leveraging the GStreamer open source multimedia framework, allows media analysis pipelines to be assembled and run via an application programming interface (API), as sketched below.
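For illustration, the following sketch uses the standard GStreamer Python bindings to assemble and run a pipeline from a textual description, which is the general pattern described above. The analysis element name is a placeholder, since the ReCAP plugins themselves are not detailed here; the remaining elements are standard GStreamer components.

```python
# A minimal sketch of assembling and running a GStreamer pipeline from Python.
# "recapanalysis" is a placeholder for an analysis plugin of the kind developed
# in the project; only elements installed on the system will actually run.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

pipeline_description = (
    "uridecodebin uri=file:///mnt/storage/clip0001.mxf "
    "! videoconvert "
    # "! recapanalysis "          # hypothetical analysis element, not public
    "! fakesink"
)
pipeline = Gst.parse_launch(pipeline_description)
pipeline.set_state(Gst.State.PLAYING)

# Block until the pipeline finishes or reports an error.
bus = pipeline.get_bus()
msg = bus.timed_pop_filtered(
    Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR
)
if msg and msg.type == Gst.MessageType.ERROR:
    err, debug = msg.parse_error()
    print(f"Pipeline error: {err.message}")

pipeline.set_state(Gst.State.NULL)
```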
The underlying algorithm technologies were developed and wrapped as GStreamer plugins by JOANNEUM RESEARCH and nablet. This work included the assessment and selection of different core machine learning open source projects, with ongoing improvements to the quality of the analysis. The algorithms were also benchmarked against comparable machine learning services to ensure performance and accuracy.
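As an indication of how such accuracy benchmarking is typically scored, the sketch below computes the standard precision, recall and F1 measures from detection counts; the counts shown are placeholders rather than project results.

```python
# Standard detection-accuracy metrics of the kind used when benchmarking an
# analysis algorithm against a reference annotation or a comparable service.
# The counts below are placeholders, not project results.

def precision_recall_f1(true_positives: int, false_positives: int,
                        false_negatives: int) -> tuple[float, float, float]:
    """Compute precision, recall and F1 from detection counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

if __name__ == "__main__":
    p, r, f1 = precision_recall_f1(true_positives=90, false_positives=10,
                                   false_negatives=20)
    print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```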
In order to demonstrate the software functionality, NMR designed an intuitive, web-browser-based Graphical User Interface (GUI) to control the MPS and to provide a frontend for users to quickly ingest and analyse media content and view the results. The metadata generated by the analysis is stored in industry-standard formats which, together with the API, makes ReCAP easy to integrate into other systems and existing workflows.
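To give a flavour of the kind of integration the API enables, the sketch below submits a clip for analysis and retrieves the resulting metadata over HTTP. The host, endpoint paths and response fields are hypothetical and do not document the actual ReCAP API.

```python
# Hypothetical example of controlling a media processing service over HTTP.
# The host, port, endpoint paths and JSON fields are invented for illustration
# and do not document the actual ReCAP API.
import time
import requests

MPS_URL = "http://localhost:8080"  # assumed on-premises deployment

# Submit a clip for analysis (hypothetical endpoint and payload).
job = requests.post(
    f"{MPS_URL}/jobs",
    json={"media_uri": "file:///mnt/storage/clip0001.mxf",
          "tasks": ["asr", "ocr", "face_detection", "logo_detection"]},
).json()

# Poll until the job completes, then fetch the generated metadata.
while True:
    status = requests.get(f"{MPS_URL}/jobs/{job['id']}").json()
    if status["state"] in ("finished", "failed"):
        break
    time.sleep(5)

if status["state"] == "finished":
    metadata = requests.get(f"{MPS_URL}/jobs/{job['id']}/metadata").json()
    print(metadata)
```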
ReCAP was demonstrated at various stages of development at industry trade shows to gather user feedback and generate interest in the project and its results. The first public demonstration was at NAB 2017 (National Association of Broadcasters, April 2017, Las Vegas), followed by the EBU Metadata Developer Workshop (June 2017, Geneva), IBC 2017 (International Broadcast Convention, September 2017, Amsterdam), and presentations at BVE 2018 (Broadcast Video Expo, February 2018, London) and NAB 2018 (April 2018, Las Vegas).
NMR also engaged with a European Commission Investment Expert Group initiative to assess the investment potential of products emerging from the Horizon 2020 project. The group assessed the leadership team's capabilities, product readiness, market readiness and the financial strategy of the project, and concluded that ReCAP was “ready for investment.”
Positioning ReCAP as offering “unlimited machine learning” out of the box differentiates it significantly from other machine learning providers on the market today: it is a software application that puts the customer in control of deployment, performance and security, for a fixed, perpetual licence cost.
ReCAP has been designed specifically for media and entertainment use cases, with the ability to be easily trained and integrated with other systems, and an intuitive user interface; this stands in contrast to generic machine learning services. These factors lower the barrier to entry to state-of-the-art machine learning technology.
ReCAP’s “unlimited machine learning” gives users the opportunity to focus on being creative, freed from the arduous and mundane tasks associated with manually tagging and logging thousands of hours of media content, and enables businesses to make informed decisions about their rich media content.