Periodic Reporting for period 3 - EUCAN-Connect (A federated FAIR platform enabling large-scale analysis of high-value cohort data connecting Europe and Canada in personalized health)
Reporting period: 2022-01-01 to 2023-06-30
Make FINDABLE: enable cohort data discovery down to data item and subject levels. We bridged existing metadata catalogue efforts from BBMRI, Maelstrom, and birthcohorts.net. We collaboratively defined a reference data model and implemented it in the Maelstrom/Mica and MOLGENIS catalogue software. We developed the Federated Catalogue (D2.2, https://catalogue.eucanconnect.eu/#/) with data from Maelstrom, Birthcohorts.net RECAP and shared LifeCycle/ATHLETE/LongiTools source catalogues. We identified key areas of focus for the community curation tools (T2.4).
Make ACCESSIBLE: deliver a low-maintenance open data access and process architecture. We expanded the DataSHIELD platform for federated analysis; developed Coral, a Docker-based system that eases installation and data loading by the cohorts; created a separate data API that allows diverse software to be connected and implemented it in MOLGENIS ‘Armadillo’; and expanded DataSHIELD to allow access to large ‘resources’ (e.g. omics files). We created “Profiles” to allow different DataSHIELD configurations and effected a wider roll-out of nodes running Opal, Coral and Armadillo. Armadillo has been deployed in over 30 nodes, and system monitoring is now available in Coral.
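The principle behind federated analysis can be illustrated with a minimal sketch: each node releases only aggregate statistics, and the analyst pools them without seeing individual-level records. This is not DataSHIELD code; the names `NodeSummary`, `summarise`, `pool` and the threshold `MIN_COUNT` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from math import sqrt

# Hypothetical disclosure threshold: a node refuses to release
# summaries computed on fewer records than this.
MIN_COUNT = 5


@dataclass
class NodeSummary:
    """Aggregates released by one node; no individual-level values."""
    n: int
    total: float
    total_sq: float


def summarise(values, min_count=MIN_COUNT):
    """Runs at a node: reduce local records to count, sum and sum of squares."""
    if len(values) < min_count:
        raise ValueError("too few records to release a summary")
    return NodeSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def pool(summaries):
    """Runs at the analyst side: combine node aggregates into a pooled
    mean and sample standard deviation."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean ** 2) / (n - 1)
    return mean, sqrt(variance)


# Two illustrative nodes holding made-up measurements.
node_a = summarise([1.0, 2.0, 3.0, 4.0, 5.0])
node_b = summarise([6.0, 7.0, 8.0, 9.0, 10.0])
pooled_mean, pooled_sd = pool([node_a, node_b])
```

The pooled values match what would be obtained from the combined records, although only three numbers per node cross the network.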
Make INTEROPERABLE: accelerate data harmonisation, retrospectively mapping cohort data to standard variables to enable pooled analysis. We focussed on best practices to 1) explore the study-specific data and samples; 2) evaluate harmonisation potential across studies; 3) process study-specific data under a common (i.e. harmonised) format; 4) estimate the quality of the harmonised data generated; and 5) generate the information required to carry out data analysis. We submitted a publication on the methodological framework for data harmonisation; developed an R package to support the harmonisation process and quality control; developed guidelines and templates to guide the documentation of harmonisation initiatives; and documented six research projects from WP6 and 24 Canadian studies that are part of ReACH.
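Steps 3 and 4 above can be sketched as follows: study-specific variables are converted to a common target variable, and a simple completeness figure is computed on the result. This is an illustrative sketch only and is not the project's R package; the study names, variable names, conversion rules and the functions `harmonise` and `completeness` are all hypothetical.

```python
# Hypothetical harmonisation rules: for each study, map a target
# variable to (source variable, conversion function).
RULES = {
    "study_a": {"height_cm": ("ht_m", lambda v: v * 100)},
    "study_b": {"height_cm": ("height_in", lambda v: v * 2.54)},
}


def harmonise(study, records):
    """Convert study-specific records to the common format.

    Missing source values stay missing (None) in the output.
    """
    rules = RULES[study]
    harmonised = []
    for record in records:
        row = {}
        for target, (source, convert) in rules.items():
            raw = record.get(source)
            row[target] = None if raw is None else convert(raw)
        harmonised.append(row)
    return harmonised


def completeness(rows, variable):
    """Fraction of rows with a non-missing value for the variable."""
    if not rows:
        return 0.0
    return sum(row[variable] is not None for row in rows) / len(rows)


# Made-up records from two studies that measured height differently.
rows_a = harmonise("study_a", [{"ht_m": 1.5}, {"ht_m": None}])
rows_b = harmonise("study_b", [{"height_in": 60.0}])
share_a = completeness(rows_a, "height_cm")
```

Keeping the rules as data rather than code paths makes each mapping decision easy to document alongside the harmonised dataset.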
Make REUSABLE: develop DataSHIELD bioinformatics toolboxes and federated analysis methodologies. This included the successful release of DataSHIELD v5.0 and v6.0; implementation of continuous testing for all functions; systems for interacting with, training and supporting DataSHIELD users (website and forum); and community meetings and multiple workshops. We delivered D5.1, "Bioinformatics Toolbox catalogue of tools and methods for federated and/or privacy protected analysis of cohort studies and biobanks", and D5.2, "Training material DataSHIELD users", required for “Complete extension and customisation of DataSHIELD, Opal, and MOLGENIS for the specific analytic needs of EUCAN-Connect.”
Make COLLABORATIONS: promote uptake by the research community at large. We created demonstrator projects and engaged existing analysis projects in LifeCycle, RECAP, ReACH and InterConnect on 1) longitudinal life course analyses from early life onwards; 2) (epi)genomic origins, microbiome and virome adaptations; 3) early-life exposome-related risk factors; and 4) personalized prevention strategies, related to cardio-metabolic, respiratory, musculoskeletal and developmental health and disease. Additionally, a EUCAN-Connect workshop on health outcomes was organized, with participants from LifeCycle, ATHLETE and LongiTools.
Make SUSTAINABLE: ethical and legal governance and extending capabilities beyond the reach of this project. We started developing an appropriate ethical and legal governance framework; made substantive progress in evidencing expectations in EUCAN-Connect’s stakeholder community; and established a governance advisory group and an ELSI expertise forum. Substantial progress has been made with regard to the analysis of qualitative interviews on the long-term sustainability of the components delivered, participant observations and further ECOUTER sessions with consortium members.