
The evolutionary dynamics of pathogen emergence and establishment: from Reservoir Detection to Outbreak Control

Periodic Reporting for period 4 - ReservoirDOCs (The evolutionary dynamics of pathogen emergence and establishment: from Reservoir Detection to Outbreak Control)

Reporting period: 2022-02-01 to 2023-07-31

Evolutionary and epidemiological information extracted from pathogen genomes has become an important instrument across infectious disease research. By harnessing such information, molecular epidemiologists aim to shed light on the origin and epidemic history of pathogens, from reservoir dynamics to emergence and adaptation to new hosts, and on their spatiotemporal spread. However, despite the revolution in genome sequencing technologies and advances in statistical methodology, key questions about pathogen emergence and establishment in human populations remain unresolved for major viral epidemics. When confronted with new viral outbreaks, such as the devastating Ebola virus epidemic in West Africa, we also struggle to deploy these technologies in a systematic and concerted way, despite a critical need to support public health interventions.
In this project, we aimed to unravel crucial steps in the emergence and establishment of key viral pathogens. We scrutinised the reservoir dynamics of hepatitis C virus (HCV) by sequencing complete hepacivirus genomes from infected samples identified in a large-scale screening of African rodents, and analyzed the cross-species transmission history using novel evolutionary methods. To test hypotheses about the early establishment of HIV-1, we opened a genomic window onto the past epidemic history of the virus by integrating molecular work on archival samples from Central Africa and on samples representative of the current HIV-1 diversity, using population dynamic models that incorporate epidemiological information. Finally, we took the Ebola epidemic in West Africa as a model to develop high-performance statistical approaches for extracting practical and timely epidemiological information from virus genome sequences during epidemics as they unfold. We further complemented these approaches with state-of-the-art visualisation tools.
To screen for and sequence hepaciviruses in animal populations, we implemented efficient procedures to extract and purify RNA from tissue and dried blood spot samples. We screened these extracts for hepaciviruses using a highly sensitive hemi-nested PCR assay targeting a fragment of the conserved NS3 protease-helicase gene. Positive samples were sequenced using a metagenomic protocol. We applied this approach to a comprehensive set of small-mammal samples, primarily rodents, from Africa and Asia, and generated novel hepacivirus genomes.

We have developed and implemented sensitive protocols for the successful recovery of HIV-1 sequences from archival samples, including both formalin-fixed paraffin-embedded samples and serum samples. We have screened such samples and obtained genomic data from serum samples from the Democratic Republic of Congo (DRC). Analysis of these data allows us to characterize the extensive HIV-1 group M diversity.

In terms of computational developments, we have developed an epoch model for codon substitution processes that allows time-dependent variation in both the overall substitution rate and the non-synonymous/synonymous substitution rate ratio (https://doi.org/10.1093/molbev/msz094). We have developed a novel mixed-effects molecular clock model to tackle the problem of heterotachy, which has been shown to lead to inconsistencies in estimating the age of HIV-1 subtypes (https://doi.org/10.1093/ve/vez036). We have extended our earlier work on integrating covariates into non-parametric coalescent models and applied this approach in a study of a yellow fever virus outbreak in Brazil (https://doi.org/10.1126/science.aat7115). All these developments are being made available in the popular BEAST framework. We have worked on general improvements of the BEAST architecture to support these methods (https://doi.org/10.1093/ve/vey016) and released a new version of the BEAGLE library (https://doi.org/10.1093/sysbio/syz020) to make computations more efficient. Finally, we have developed a real-time version of our BEAST framework to support evolutionary analyses in unfolding epidemics (https://doi.org/10.1093/molbev/msaa047), and we have designed new visualization tools for the estimates produced by such an approach. These tools have been widely applied during the COVID-19 pandemic to track the spread of SARS-CoV-2.
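
The epoch idea mentioned above can be illustrated with a toy sketch. It is not the BEAST implementation: the function names, epoch boundaries, and rate values are hypothetical, and only the overall substitution rate is treated. The sketch splits a branch at the epoch boundaries it crosses and sums rate times duration per epoch; a full codon model would presumably use an epoch-specific rate matrix (including its own non-synonymous/synonymous ratio) for each segment instead of a single scalar rate.

```python
from bisect import bisect_right


def epoch_durations(start, end, boundaries):
    """Split the interval [start, end] at epoch boundaries.

    boundaries: sorted transition times between epochs (hypothetical units).
    Returns a list of (epoch_index, duration) pairs covering the interval.
    """
    if end < start:
        raise ValueError("end must not be earlier than start")
    # Keep only the boundaries that fall strictly inside the branch.
    edges = [start] + [b for b in boundaries if start < b < end] + [end]
    segments = []
    for lo, hi in zip(edges, edges[1:]):
        # Epoch i covers times from boundaries[i-1] up to boundaries[i].
        epoch = bisect_right(boundaries, lo)
        segments.append((epoch, hi - lo))
    return segments


def expected_substitutions(start, end, boundaries, rates):
    """Expected substitutions per site under piecewise-constant rates.

    rates: one rate per epoch, so len(rates) == len(boundaries) + 1.
    """
    if len(rates) != len(boundaries) + 1:
        raise ValueError("need exactly one rate per epoch")
    return sum(rates[epoch] * duration
               for epoch, duration in epoch_durations(start, end, boundaries))


if __name__ == "__main__":
    # Hypothetical values: one transition at time 10, with the rate
    # halving in the second epoch.
    print(expected_substitutions(5.0, 20.0, [10.0], [0.002, 0.001]))
```

A branch that lies entirely within one epoch reduces to the usual rate times branch length, so the constant-rate case is recovered when all epoch rates are equal.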
Having sequenced 90 new complete hepacivirus genomes, we have considerably increased the genomic data currently available for this viral genus. In particular, these data allow a better assessment of the role of rodents as a potential hepacivirus reservoir. Our analysis points to a particular rodent genus that has a high hepacivirus prevalence and appears to facilitate co-infections. This represents the first observation of hepacivirus co-infections. The finding of co-infections is also indirectly supported by our assessment of hepacivirus recombination in various animal hosts. Our data also provide new insights into the time scale of hepacivirus evolution.

Our work on archival samples has proven useful not only for HIV-1 evolutionary research but also for investigations into other viruses. Exploring metagenomic-style techniques on our samples with our collaborators has, for example, led to the recovery of a yellow fever virus genome from an old sample. This prompts us to expand this work and to examine pathogens more broadly in pathology sample collections. For HIV-1, the genomic data we have generated provide better insight into the early establishment of the virus in humans.

With our ‘online’ BEAST approach, we have implemented the first real-time method in a widely used Bayesian phylogenetic software package. This approach allows newly available sequence data to be integrated into an ongoing Bayesian inference analysis as an epidemic unfolds. We believe this is growing into a useful approach for analyzing genetic data in both a timely and statistically robust fashion. Making these tools and the associated visualizations accessible to a broad audience is an important step towards managing epidemics efficiently.
Figure: Conceptual representation of viral phylogeography