Integrated human data repositories for infectious disease-related international cohorts to foster personalized medicine approaches to infectious disease research

Periodic Reporting for period 3 - RECODID (Integrated human data repositories for infectious disease-related international cohorts to foster personalized medicine approaches to infectious disease research)

Reporting period: 2022-01-01 to 2023-06-30

ReCoDID includes a multidisciplinary team of researchers from the global infectious disease arena, leveraging existing infrastructures and partnerships to develop a sustainable model for the storage, curation and analysis of the complex data sets collected from infectious disease-related cohorts. These cohorts collect both (i) data from participant interviews, clinical assessment, and related geospatial, social, and environmental exposures (hereafter named clinical-epidemiological data, CE data), and (ii) high-dimensional data from advanced laboratory analysis of clinical samples (HDL data or OMICS data). Because these data are typically stored in separate repositories, a system is needed to facilitate their synthesis and analysis within and across cohorts. Combining data repositories across cohorts for infectious disease research is rare. However, significant investments have been made in sharing CE and HDL data from population-based registries in high-income countries to improve personalised medicine, especially in chronic and rare diseases.

The overarching goal of this project is to combine data repositories across cohorts for CE and HDL data by developing an integrated, sustainable platform for the sharing, synthesis, and analysis of data from infectious disease cohorts, in keeping with the principles of FAIR (Findable, Accessible, Interoperable, and Reusable) data-driven research. The project will also develop the biostatistical methods needed to analyze pooled cohort data. The repository platform will have a tiered permission system for data access. Cohort-specific hubs will facilitate cohorts’ analyses of their own data and cross-cohort analyses within a clearly elaborated legal, ethical, and equitable framework for cross-study data sharing.
In the third reporting period, the DPA between Heidelberg University Hospital and ERASMUS was finalized, the DPA with the French INSERM/ANRS was initiated, and discussions with the EC projects ‘END-VOC’, ‘ORCHESTRA’, and ‘ECRAID’ about adopting the ReCoDID data curation/sharing pipeline for their respective projects continued. The cognitive interview study was published, and the core task for RP3, the Data Governance Framework (DGF), was produced. A landscape analysis of different data standards was carried out, leading to a (virtual) expert meeting at the beginning of the third reporting period in March 2022. ReCoDID researchers were able to continue this discussion with the stakeholders during their in-person stakeholder meeting at the EMBL in Hinxton, UK, in October 2022. Following these meetings, ReCoDID is on its way to becoming a convenor or broker that brings together groups invested in data standards, data sharing, and data harmonisation. The project has made great progress in its methodological tasks, with several manuscripts published on measurement error and causal inference in pooled cohorts. We have further refined and advanced the data integration on the EMBL platform after pilot datasets became available. The legal prerequisites for sharing were challenging but could finally be cleared for a cohort pilot data set through the partner EMC. This pilot shares CE datasets with linked serological datasets and study metadata, and therefore combines data sharing at the European Genome-phenome Archive (EGA), BioStudies, ArrayExpress and the Cohort Browser at EMBL-EBI.
We have continued our efforts around sharing biological specimens and developing governance models for a federated biorepository. This research area has become more competitive since ReCoDID started, which shows that the ReCoDID researchers had the right intuitions about future areas of high importance. ReCoDID has gained substantial convening power over the course of the project and has been involved in many stakeholder interactions on data sharing and harmonisation in recent years, both in the ethical/legal space and in the technical area.
Our work successfully integrated COVID-19 cohort metadata into the EMBL COVID-19 Platform, harmonising diverse clinical-epidemiological data from H2020-funded COVID-19 research projects. The team established connected Data Hubs for hosting both clinical-epidemiological and SARS-CoV-2 OMICS data in a federated cloud. Additionally, we were able to build a comprehensive framework for harmonising anti-SARS-CoV-2 antibody serosurvey data.
The work in ReCoDID was affected and delayed by the COVID-19 pandemic, but at the same time the project proved to be more relevant than ever, which is reflected in the fact that ReCoDID was selected for additional funding, adding a whole work package on ‘COVID-19 research response’ (WP8). Subsequently, ReCoDID was featured in high-level consultations and has been included as a project ‘to collaborate with’ in several new EC calls for proposals.

ReCoDID was also very active within the EU-CAN cross-consortial activities, where several cross-consortial working groups were created, among them one on ‘data harmonization’. The project works toward a broader vision of integrated data sharing platforms and (virtual) federated repositories for biological samples. Both topics have gained substantial traction since the time the project was evaluated and selected for funding. ReCoDID was able to partner with one of the NIH-funded Centers for Research on Emerging Infectious Diseases (CREID) networks. At a conference organized by ‘The Global Health Network’ (TGHN) in Cape Town in November 2022, ReCoDID and CREID had a joint stand. Within the ethical/legal work package (WP2), the group decided to embark on additional work beyond the initial aims of the project and carry out cognitive interviews about the participants’ perception of broad informed consent in research. This deepened the understanding of how the GDPR legislation is interpreted across institutions and across EU member states as well as non-member countries.

The project connects across the legacy of several EC projects and open-source infrastructures. More specifically, the ReCoDID project, with its focus on the harmonisation and linking of data across domains, approaches the development of infrastructure, software, tools, and services as a process in which established pre-existing and openly available components, such as databases, data standards and software packages, are brought together to build an ecosystem of connected parts able to support the complex integrative workflows of the project.

The work on the standardization of data, including a model CRF for acute febrile illness data sets, has led to the development of a web-based tool, the “eCRF Builder”, whose pilot stage is to be completed within the next 3 months. The eCRF Builder has already found its new ‘home’ in the newly funded project “CONTAGIO”, where it will be further developed by a joint team of ReCoDID and the ‘Infectious Disease Data Observatory’. This will increase the visibility as well as the acceptance of the eCRF Builder, improving the chances that it receives critical input from, and is adopted by, the scientific community.

Other exploitable results of ReCoDID include a) the Data Governance Framework and repository of legal templates, b) the methodological tools for pooled cohort data, and c) the governance of federated biorepositories.
ReCoDID Work Packages, including new 'WP 8 on COVID-19 Research Response'