Periodic Reporting for period 2 - AENEAS (Advanced European Network of E-infrastructures for Astronomy with the SKA)
Reporting period: 2018-07-01 to 2019-12-31
Once fully operational, the SKA is expected to produce an archive of science data products that grows at a rate on the order of 700 petabytes per year. Storage and computing resources associated with the SKA Observatory itself are expected to be limited to what is needed to support operations and the creation of the basic data products. Further processing and subsequent science extraction by users will require a global research infrastructure providing additional capacity in networking, storage, computing, and expertise. This research infrastructure will take the form of a federated, global network of SKA Regional Centres (SRCs). These SRCs will be the primary interface for researchers extracting scientific results from SKA data and, as such, are essential to the ultimate success of the telescope.
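To give a sense of scale, the archive growth rate quoted above can be converted into an average sustained data rate. The following is only a back-of-the-envelope sketch: it assumes decimal petabytes (10^15 bytes), a year of 365.25 days, and perfectly even growth throughout the year, none of which is specified in the report. The function name is purely illustrative.

```python
# Rough conversion of an archive growth rate into an average data rate.
# Assumptions (not from the report): decimal petabytes, a 365.25-day year,
# and growth spread evenly over the year.

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def sustained_rate_gbit_per_s(petabytes_per_year: float) -> float:
    """Return the average rate in Gbit/s implied by a yearly data volume."""
    bytes_per_year = petabytes_per_year * 1e15
    bits_per_second = bytes_per_year * 8 / SECONDS_PER_YEAR
    return bits_per_second / 1e9


rate = sustained_rate_gbit_per_s(700)
print(f"{rate:.0f} Gbit/s")  # prints "177 Gbit/s"
```

Under these assumptions, 700 petabytes per year corresponds to an average of roughly 177 Gbit/s, before accounting for any replication between centres or for peaks in data production.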
The AENEAS (Advanced European Network of E-infrastructures for Astronomy with the SKA) project has investigated many aspects of the design of the network of SKA Regional Centres, including data storage and management, networking, computing, and user support, with particular emphasis on the European component. These studies have been captured in the various AENEAS deliverables. Together they start to fill in the design of the network of European SKA Regional Centres (ESRC), some 5-7 years ahead of the official start of SKA observations. The design document gives a top-level overview of the preliminary ESRC design, which is largely consistent with the ESRC requirements that AENEAS has collected. Along with the technical design, another goal of the AENEAS project was to identify the resources that the European SKA partners could provide to the SRCs and how these resources may be integrated into a unified analysis platform for SKA science.
For the more technical work packages, an assessment has been made of the variety of data products the ESRC will need to ingest in order to support SKA science. Using inputs from the SKAO, the computing and storage work package has characterized the type, number, and expected growth rate of the storage resources necessary to host the SKA science archive. In addition, it has characterized the workflows and associated computing resources needed to perform further post-processing and analysis on that data once it is hosted in the ESRC. This characterization includes a census of current relevant computing tools and middleware, which are included in the design recommendations for the ESRC computing platform. Cost information on computing and storage provided as part of this work will prove extremely useful for the implementation phase.
The networking work package has conducted an extensive set of data transfer tests to characterize the expected network performance for data movement from the telescope sites in South Africa and Australia to the ESRC, and to identify best practices for optimizing this performance. These results have been used to investigate different data distribution topologies within Europe, as well as the impact on network performance of different types of user access to, and analysis of, the distributed SKA data. These investigations resulted in design recommendations.
The work package team on user access and knowledge creation has analyzed existing user interaction models for radio astronomy data in order to identify where modifications or enhancements will be necessary for the ESRC to support SKA science extraction. As part of this analysis, a series of surveys has been conducted of both the user community and several representative, operational radio astronomy facilities. These surveys considered, among other things, types of data, methods of access, level of additional processing required, models for user support, and common analysis tools used. The results of these surveys have been published in deliverable reports and have led to recommendations for the ESRC user interfaces necessary to support both data discovery and analysis, including possible improvements to the underlying Virtual Observatory (VO) software stack.
Services are essential for a distributed global network that is intended to work in a seamless fashion, yet must also implement data access policies inherited from the telescope itself. These important services, including federated AAAI, were investigated in the final AENEAS work package.