Periodic Reporting for period 2 - BioExcel-2 (BioExcel Centre of Excellence for Computational Biomolecular Research)
Reporting period: 2020-07-01 to 2022-06-30
BioExcel CoE was established to:
• Push the performance, efficiency, scalability, and usability of the selected software packages towards the Exascale in a co-design manner;
• Support convergence of HPC, High Throughput Computing (HTC), and High Performance Data Analytics (HPDA) with workflows combining HPC simulations with data management and analytics;
• Support and enlarge the user community (both academic and industrial) by providing workforce development, continued training, guidance, and best practices;
• Develop a sustainable and open community centre with user-driven governance and close global collaborations with US and Asian initiatives.
The need for extreme-scale computing in Life Sciences became apparent as the COVID-19 pandemic struck. Fast modelling and screening of potential drug candidates, together with antibody design, are needed to dramatically reduce the time to successful discovery of a medicine or vaccine, and thus dampen the devastating effects of pandemics not only on individual lives but on society as a whole. Through its prompt response, BioExcel showed the need for large pan-European initiatives which, through concerted efforts, can positively affect the fight against diseases.
Earlier workflow prototypes have been extended and optimized, and some are already mature enough for reliable production runs. They have been deployed on demand for several massively parallel use cases, including COVID-19. The first release of the BioBB library of application building blocks comes with a feature-rich set of modules, with particular attention given to interoperability between the components. The release has generated considerable interest in the communities, including industry (AstraZeneca), and several collaborations have been started. It was also highlighted in the EU Innovation Radar. Portability and usability were further improved by containerization (Docker/Singularity) and packaging (Conda), which, along with integration with Jupyter notebooks, allow for smooth deployment and direct access on all major cloud infrastructures. Through collaboration with ELIXIR, we adopted FAIR principles for data management in all of our solutions. Combinations of the PyCOMPSs workflow manager, BioBB building blocks, and the core applications GROMACS, HADDOCK and PMX were used to scale up key techniques applied to solving our main scientific use cases.
Despite the impact of COVID-19, we organised a number of successful events that attracted significant attention from the community. We hosted 13 webinars attended live by 736 community members, recordings of which were viewed over 3500 times on our YouTube channel. We have made available 23 tutorials, an often-requested training/support format that we identified as extremely valuable to users. We provided direct in-depth support to around 750 users of the core applications via various mechanisms, including the AskBioExcel forums, with over 450 user queries resolved. Our forum accumulated almost 140,000 pageviews, of which 50,000 came from registered and logged-in users and 90,000 from anonymous visitors (excluding web crawlers). The BioExcel Twitter account now has over 2000 followers, the unique views on our project are consistently between 2000 and 3000 per month, and the BioExcel mailing list has over 1800 subscribers. Our successful competency-based training programme was further extended with the integration of remote training. We assisted other organisations in making the switch to virtual training mandated by the pandemic.
We continued existing industrial collaborations and engaged in new ones through specific pilot projects with pharma companies. A BioExcel quality mark was developed in collaboration with SSI to increase trust in our offerings. A service catalogue has been developed and deployed. IPR issues were addressed. The form of the legal entity to support commercial operations of the centre in the long term has now been decided: an Economic Association is to be incorporated in Sweden, and initial legal discussions to deliver this are currently underway.
Quality assurance and KPIs were established; all targets were met, and many reached their stretch targets. Many activities, such as consortium meetings, were adapted in light of the increasingly remote/virtual style of events in the last two quarters.
COVID-19 efforts: In the early days of the pandemic, BioExcel restructured its efforts to help address the crisis. In addition to working on specific COVID-19 related research, we focused on facilitating collaborations, extending community support, and providing access to HPC resources at partner centres. Initiatives include the establishment of the Covid-19 Molecular Structure and Therapeutics Hub (http://covid.bioexcel.eu); partnership in the Exscalate4Cov Consortium (https://www.exscalate4cov.eu); the launch of a dedicated web-server interface (https://bioexcel-cv19.bsc.es); doubling the number of concurrent jobs on the HADDOCK server to meet demand; signing a community letter in support of initiatives to share biomolecular modelling and simulation data; and participation in numerous webinar series and presentations (including CECAM ones) covering methodologies, experience and results from our work.