CORDIS - EU research results

F.A.I.R. information cube

Periodic Reporting for period 1 - FAIRiCUBE (F.A.I.R. information cube)

Reporting period: 2022-07-01 to 2023-12-31

The core objective of FAIRiCUBE is to enable players from beyond classic Earth Observation (EO) domains to provide, access, process, and share gridded data and algorithms in a FAIR and TRUSTable manner.

To reach this objective, we are creating the FAIRiCUBE HUB, a crosscutting platform and framework for data ingestion, provision, analysis, processing, and dissemination, to unleash the potential of environmental, biodiversity and climate data through dedicated European data spaces. Within this project, TRL 7 will be attained, together with the necessary governance aspects to assure continued maintenance of the FAIRiCUBE HUB beyond the project lifespan.

This project’s goal is to leverage the power of Machine Learning (ML) operating on multi-thematic datacubes for a broader range of governance and research institutions from diverse fields, which at present cannot easily access and utilize these potent resources. Selected use cases illustrate how data-driven projects can benefit from cube formats, infrastructure, and computational capabilities. They guide us in creating a user-friendly FAIRiCUBE HUB, which is tightly integrated with the common European data spaces and gives relevant stakeholders an overview of both the data and the processing modules readily available to be applied to these data sources. Tools enabling users not intimately familiar with the worlds of EO and ML to scope the requirements and costs of their desired analyses will be implemented, easing uptake of these resources by a broader community. The FAIR sharing of results with the community is fostered by providing easy-to-use tools and workflows directly in the FAIRiCUBE HUB.
We have achieved significant technical and scientific advancements through the establishment of the FAIRiCUBE Hub and the data science work of our five use cases (UCs) addressing the European Green Deal priority actions. In the following, we highlight the key activities and achievements in our pursuit of technical excellence:

FAIRiCUBE Hub
We have established the architecture of the FAIRiCUBE Hub, which serves as the cornerstone of our project. The FAIRiCUBE Hub is a crosscutting platform and framework for data ingestion, provision, analysis, processing, and dissemination, to unleash the potential of environmental, biodiversity and climate data through dedicated European data spaces. We have made substantial progress in outlining its structure and functionality. Central to this effort has been the identification of services falling under the FAIRiCUBE Hub umbrella, paving the way for a comprehensive ecosystem of interconnected functionalities. Furthermore, initial steps have been taken towards harmonizing and integrating these services to ensure user-friendliness and accessibility.

Holistic Meta-Data Management Approach
An important objective of our FAIRiCUBE project has been to streamline the collection, ingestion, alignment, and availability of metadata alongside Earth observation and socio-economic data. We see this as a major milestone in addressing the F.A.I.R. aspect of FAIRiCUBE! To this end, we have devised user-friendly routines for data and metadata ingestion. Additionally, we have extended our efforts to include analysis and processing metadata, enabling comprehensive data management capabilities. A semi-automatic monitoring system for computing resources has been implemented to ensure optimal performance, further enhancing the efficiency of our operations. Our Knowledge Base (KB) stands as a unified access point to all metadata, fostering accessibility and facilitating data-driven insights.

F.A.I.R. and Open Documentation of Data Science Work
The FAIRiCUBE Hub serves as the main infrastructure for the execution and documentation of the data science work of the use cases (UCs). The UCs serve both as demonstrators of our capabilities and as a source of valuable project execution and management information. Through these UCs, we have identified and addressed, for example, data availability and ingestion bottlenecks, and we have monitored their resource requirements as well as service interoperability issues, thereby enhancing the robustness of our platform for upcoming developments. The technical and scientific UC work is published and updated through dedicated GitHub repositories and FAIRiCUBE communication tools. Especially for machine learning applications, a full and transparent description bundled with the scientific documentation of the process and results will provide a leap forward in open science applications. Our vision is to establish standards in that field and demonstrate typical data science work executed on the open-science-ready FAIRiCUBE Hub.

Enabling Data Science Resources for Non-Specialists
We are committed to making data science resources accessible to everyone, not just specialists. This involves providing guidance on integrating data science into existing workflows and offering access to essential resources such as data catalogues, storage, compute, and machine learning tools. Comprehensive documentation of the UC work has been partially made available to facilitate data-driven endeavors and future projects executed on the FAIRiCUBE Hub. Further, we have identified aspects of the data science work that still require technical competence. Realistically, not all subtasks of creating a data cube that is ready for further processing and ML applications can be automated and hidden from users. We will however provide documentation and guidelines to assist future users of FAIRiCUBE Hub services.
Much of the Earth observation data that is available in the data catalogues provided by the technical partners, and that was requested for the UC work, comes with incomplete metadata descriptions. Usually, the data file names encode some metadata, which can lead to misinterpretation. Having developed a harmonized metadata ingestion pipeline, we have enriched some of the datasets with additional metadata that will be stored, maintained and made available to public users through the FAIRiCUBE knowledge base.

Our metadata ingestion pipeline can thereby demonstrate its benefit to a wider audience of data scientists. As this pipeline has been jointly developed and defined by data ingestion and metadata specialists, we have formulated a base set of metadata fields that can serve as a standard for other data ingestion pipelines.
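As an illustration only, the idea of checking a dataset record against a base set of metadata fields could be sketched as follows. The field names, the `REQUIRED_FIELDS` tuple and the `missing_fields` helper are hypothetical and chosen for this example; they are not the field set or code actually used in the FAIRiCUBE pipeline.

```python
from typing import List

# Hypothetical base set of metadata fields (an assumption for illustration;
# the actual FAIRiCUBE field set is not listed in this report).
REQUIRED_FIELDS = (
    "title",
    "crs",
    "spatial_resolution",
    "temporal_extent",
    "units",
    "license",
)


def missing_fields(record: dict) -> List[str]:
    """Return the required fields that are absent or empty in a metadata record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if record.get(field) in (None, "", [], {})
    ]


# Example record with some gaps, as might result from metadata that was
# only partly recoverable from a file name.
record = {
    "title": "Example land cover grid",
    "crs": "EPSG:3035",
    "spatial_resolution": "100 m",
    "temporal_extent": "",
    "units": None,
}

print(missing_fields(record))
```

A check of this kind makes gaps explicit at ingestion time, so that missing information is requested from the data provider instead of being guessed from the file name.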

FAIRiCUBE has successfully begun integrating and implementing AI ethics assessment in machine learning applications. Through walkthroughs with UC owners and partners, we have laid the groundwork for embedding socio-technical scenarios into our methodologies, ensuring the ethical deployment, assessment and validity of AI technologies used in the FAIRiCUBE project work.

Further, FAIRiCUBE and its partner Constructor University (CU) have been involved in ISO and OGC standardization processes. Additionally, the rasdaman subcontractor team under CU recently participated in OGC Testbed-19 on the topics GeoDataCube and Analysis-Ready Data, where the FAIRiCUBE experiences proved valuable.