Periodic Reporting for period 4 - ReservoirDOCs (The evolutionary dynamics of pathogen emergence and establishment: from Reservoir Detection to Outbreak Control)
Reporting period: 2022-02-01 to 2023-07-31
In this project, we aimed to unravel crucial steps in the emergence and establishment of key viral pathogens. We scrutinised the reservoir dynamics of hepatitis C virus (HCV) by sequencing complete hepacivirus genomes from infected samples emerging from a large-scale screening of African rodents, and analyzed the cross-species transmission history using novel evolutionary methods. To test hypotheses about the early establishment of HIV-1, we carved a genomic window into the past epidemic history of the virus by integrating molecular work on archival samples from Central Africa and on samples representative of the current HIV-1 diversity, using population dynamic models that incorporate epidemiological information. Finally, we took the Ebola epidemic in West Africa as a model to develop high-performance statistical approaches for extracting practical and timely epidemiological information from virus genome sequences during epidemics as they unfold. We further complemented these approaches with state-of-the-art visualisation tools.
We have developed and implemented sensitive protocols for the successful recovery of HIV-1 sequences from archival samples, including both formalin-fixed paraffin-embedded samples and serum samples. We have screened such samples and obtained genomic data from serum samples from the Democratic Republic of Congo (DRC). Analysis of these data allows us to characterize the extensive HIV-1 group M diversity.
In terms of computational developments, we have developed an epoch model for codon substitution processes that allows time-dependent variation in both the overall substitution rate and the non-synonymous/synonymous substitution rate ratio (https://doi.org/10.1093/molbev/msz094). To tackle the problem of heterotachy, which has been shown to lead to inconsistencies in estimating the age of HIV-1 subtypes, we have developed a novel mixed-effects molecular clock model (https://doi.org/10.1093/ve/vez036). We have extended our earlier developments in integrating covariates in non-parametric coalescent models by implementing such an approach in a study of a yellow fever virus outbreak in Brazil (https://doi.org/10.1126/science.aat7115). All these developments are being made available in the popular BEAST framework. We have worked on general improvements of the BEAST architecture to support these methods (https://doi.org/10.1093/ve/vey016) and released a new version of the BEAGLE library (https://doi.org/10.1093/sysbio/syz020) to make computations more efficient. Finally, we have developed a real-time version of our BEAST framework to support evolutionary analyses in unfolding epidemics (https://doi.org/10.1093/molbev/msaa047), and we have designed new visualization tools for the estimates produced by such an approach. These tools have been widely applied during the COVID-19 pandemic to track the spread of SARS-CoV-2.
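To illustrate the idea behind an epoch model, the following minimal sketch computes the expected number of substitutions along a branch when the overall substitution rate is piecewise constant in time. It is not the BEAST implementation: the function name, the time units, and the example rates are assumptions made purely for illustration. In the codon setting described above, the non-synonymous/synonymous rate ratio could be varied across epochs in the same piecewise fashion.

```python
import math


def expected_substitutions(branch_start, branch_end, boundaries, rates):
    """Expected substitutions per site along a branch under a piecewise-constant rate.

    branch_start, branch_end: times delimiting the branch (branch_start < branch_end).
    boundaries: sorted epoch transition times (hypothetical values).
    rates: one substitution rate per epoch, so len(rates) == len(boundaries) + 1.
    """
    if len(rates) != len(boundaries) + 1:
        raise ValueError("need exactly one rate per epoch")
    if branch_end < branch_start:
        raise ValueError("branch_end must not precede branch_start")

    # Pad the boundaries so that every epoch has a lower and an upper edge.
    edges = [-math.inf] + list(boundaries) + [math.inf]

    total = 0.0
    for k, rate in enumerate(rates):
        # Portion of the branch that falls inside epoch k.
        lower = max(branch_start, edges[k])
        upper = min(branch_end, edges[k + 1])
        if upper > lower:
            total += rate * (upper - lower)
    return total


if __name__ == "__main__":
    # A branch spanning times 0 to 10, with a rate shift at time 4 (illustrative numbers).
    print(expected_substitutions(0.0, 10.0, [4.0], [0.001, 0.002]))
```

A branch that lies entirely within one epoch reduces to the usual rate multiplied by duration; only branches crossing a transition time accumulate contributions from several epochs.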
Our work on archival samples has proven useful not only for HIV-1 evolutionary research but also for investigations into other viruses. For example, exploring metagenomic-style techniques on our samples with our collaborators has led to the recovery of a yellow fever virus genome from an old sample. This prompts us to expand this work and examine pathogens more broadly in pathology sample collections. For HIV-1, the genomic data we have generated provide better insight into the early establishment of the virus in humans.
With our ‘online’ BEAST approach, we have implemented the first real-time method in a widely used Bayesian phylogenetic software package. This approach allows newly available sequence data to be integrated into a Bayesian inference analysis as they become available during an unfolding epidemic. We believe this is growing into a useful approach for analyzing genetic data in a fashion that is both timely and statistically robust. Making these tools and the associated visualizations accessible to a broad audience is an important step towards managing epidemics efficiently.