Better resource management for metagenomic information processing
Problems in bioinformatics and computational biology often involve intricate, time-consuming and computationally intensive processes. Researchers believe many of these operational difficulties can be resolved by using a computational grid, a network of computers in various locations.
Metagenomics is the field of bioinformatics that studies the genomes of a community of microbes. The problems here concern the annotation (identifying the key features) of these genomes as well as their assembly in various environments. Annotation gives biological meaning to raw datasets. The 'Algorithmics for metagenomics on grids' (Metagenogrids) project aimed to discover algorithmic solutions to these metagenomics problems that could be run on grid computing platforms.
However, researchers decided to refocus the study when efforts to form a solid collaboration on metagenomics assembly failed. The new purpose was to devise more efficient batch scheduling so as to improve the allocation and management of resources across computational clusters. An added challenge was ensuring quality of service for users of these platforms.
The EU-funded Metagenogrids team first tackled the issue of allocating resources for applications with constant resource consumption rates and either unbounded or unknown execution times. Several job-scheduling algorithms were developed, and results for high-performance computing (HPC) workloads were presented. Compared with standard batch scheduling algorithms, the newly proposed algorithms delivered vastly improved performance in the majority of experimental scenarios. This represented a step forward for the potential of combining virtualisation technology with lightweight scheduling strategies.
The project's results advance the future design of resource management systems that make better use of platforms and deliver improved outcomes and quality of service. The foundations laid in this study appear strong enough to support further work extending the results to applications whose resource consumption rates change during execution.