Final Report Summary - GENIUS (Gaia European Network for Improved data User Services)
The GENIUS project was conceived to boost the impact of one of the largest European scientific endeavours, the Gaia astrometric mission. Gaia is an ESA Cornerstone mission launched in December 2013 and is producing the most accurate and complete 3D map of the Milky Way to date. A pan-European consortium named DPAC is in charge of the Gaia data processing, the final result of which is a catalogue and data archive containing more than one billion objects. The archive system containing the data products is located at the European Space Astronomy Centre (ESAC) and is already serving the community with the first product of the mission, Gaia Data Release 1 (Gaia DR1).
The design, implementation, and operation of this archive is a task that ESA opened up to participation from the European scientific community, and GENIUS has contributed significantly to this development in several areas: it has ensured that the archive design is driven by the needs of the user community; it has provided users with several advanced exploitation tools to maximize the scientific return; it has ensured the quality of the archive contents and its interoperability with existing and future astronomical archives; it has consolidated the cooperation with the only other two astrometric missions in the world, the Japanese NanoJASMINE and JASMINE; and, last but not least, it has carried out outreach and academic activities to foster public interest in science in general and astronomy in particular.
During the three and a half years of its execution GENIUS has fit seamlessly into the existing Gaia activities, exploiting the synergies with ongoing developments. Its members have actively participated in these tasks and have provided an in-depth knowledge of the mission as well as expertise in key development areas. Furthermore, GENIUS has had the support of DPAC and of several Gaia national communities in the EU member states, and has fostered the cooperation with the Japanese institutions involved in JASMINE.
Project Context and Objectives:
Context: the Gaia mission
This proposal is devoted to the development of the archive of ESA’s astrometric mission Gaia, which aims to study the origin, formation, and evolution of the Milky Way and its components. The Milky Way is composed of a disk of about 50,000 light years in radius, containing stars of many types and ages as well as interstellar gas and dust; a spheroidal halo of some 100,000 light years in radius, containing very old stars; a bar and a bulge in the centre; and the ubiquitous dark matter. The disk contains spiral structure, with the spiral arms being the preferred location for star forming regions.
Although these general features of our Galaxy are rather well known, much remains to be elucidated, including the detailed number and structure of the spiral arms, the disk warp, the detailed shape and rotation of the bulge, disk, and halo, the dynamics and the kinematics of the Galaxy, and the distribution of dark matter. In addition, the process by which the Galaxy was assembled, presumably from small building blocks, and the history of star formation are not well understood.
To address these and many other questions one needs a deep all sky survey covering a significant volume of the Galaxy and providing 3D positions and velocities at high accuracy, as well as the physical properties of the objects observed. The goal of the Gaia mission is to create the largest and most accurate three-dimensional survey of our Galaxy and beyond by providing unprecedented positional, proper motion, radial velocity, and spectroscopic data for about one billion stars in our Galaxy and throughout the Local Group. In addition to the in-depth study of the Milky Way, the Gaia survey will also contribute to many other areas of astronomy and astrophysics. To mention only the most relevant ones: star formation history, stellar structure and evolution, stellar variability, stellar ages and the age of the universe, the distance scale, binaries and multiple systems, planetary systems, Solar System objects and even some fundamental physics applications such as tests of General Relativity. Furthermore, since the Gaia survey will be uncensored it will also contain some millions of extragalactic objects (galaxies and QSOs) that will contribute to extragalactic astrophysics and to the redefinition of the optical astrometric reference frame.
Gaia was launched on 19 December 2013 and has been in operation for more than two years. The nominal mission was planned to last five years, but an extension of at least two years has been proposed to ESA. The Gaia satellite and mission operations are fully funded by ESA, and the processing of the acquired data and production of the Gaia catalogue is the responsibility of the Data Processing and Analysis Consortium (DPAC), comprising more than 450 members in over 20 countries who are joining their efforts to overcome the challenging problem of managing the 1 Petabyte of Gaia data. DPAC activities are supported by national funding agencies that have signed a multilateral agreement with ESA to ensure the stability of the scientific teams for the necessary interval of time. Processing will be done on the premises of six Data Processing Centres, also made up of scientists and engineers funded by the agencies that have signed the multilateral agreement. Neither ESA nor DPAC has direct responsibility for the scientific exploitation of the Gaia data, which instead is to be conducted by the scientific community at large. It is up to this community to organize the best use of Gaia’s products. Since the DPAC community has deep knowledge of the Gaia mission and extensive expertise in astrometry, synergy with the community at large is obvious and necessary for making use of the enormous legacy that Gaia will provide. To that end, several networks are currently in place, such as the GREAT (Gaia Research for European Astronomy Training) Research Networking Program (RNP), funded by the European Science Foundation, the GREAT Initial Training Network (ITN), funded by the European Commission, and the REG (Red Española de Gaia, Spanish Gaia Network), to mention just some examples.
In this context, one of the key tasks needed to make the data available to the scientific community is the definition, implementation and operation of the Gaia archive. The way ESA is handling this task is two-fold:
o Firstly, the Gaia archive will be hosted at ESA’s European Space Astronomy Centre (ESAC), where the agency will locate an engineering team to work on the design, deployment and operation of the archive.
o Secondly, a unit inside DPAC (named CU9) is in charge of the development of the Gaia archive.
Main objectives
The objective of the GENIUS proposal was to contribute to the design and implementation of the Gaia archive in close cooperation with ESA and DPAC. The Gaia archive is the key to the scientific exploitation of the Gaia data and has been in operation since September 2016. The current archive and its future versions will benefit from the GENIUS developments in four main areas, which were the main objectives of the proposal:
1. Tailoring to user needs
A first objective of GENIUS has been to ensure that the requirements driving the design of the Gaia archive and the tools provided for its use were fully in line with the foreseen scientific usage of the Gaia data. To achieve this, the user community was involved in all stages of the development of the project, ensuring that the community’s needs were translated into requirements, design features, user interfaces and tools. We wanted to avoid the kind of situation where a data archive is an elegant exercise of engineering skills but does not fulfil the needs of its users.
2. Optimum archive system
Deriving from the above goal, a second objective of GENIUS was that the design of the archive itself and its interfaces were tailored and optimised for the needs defined by and with the users. The European Space Agency has assumed responsibility for developing and hosting the Gaia archive at ESAC, where a team is in charge of the hardware infrastructure and database design for the purposes of serving all current and future ESA missions. In coordination with this team, GENIUS has made contributions in the areas of server–side infrastructure to support richly functioned interactive and Virtual Observatory compliant user interfaces.
3. Tools for exploitation
The next logical step once the archive system is available is the development of tools allowing its effective exploitation. The GENIUS third objective has been the definition of such tools, based on user needs, and their implementation in the Gaia archive. It is important to remark here that we do not claim to have made all the possible tools or the basic query interfaces but that we have provided tools that can significantly enhance the scientific exploitation of the catalogue. The developments carried out cover four areas: visualization tools, data mining tools, Virtual Observatory tools and outreach tools.
4. Validation
A key task for the release of the Gaia catalogues is the validation of their contents. Thus, the fourth GENIUS objective has been to contribute to the validation of the catalogue in close cooperation with DPAC. Ensuring a high quality for a one-billion+ object catalogue containing a wide variety of data (astrometric, photometric, spectrophotometric, spectroscopic, ...) is a major scientific and statistical challenge, and carrying it out requires a combination of existing exploration tools and specifically developed tools. In particular, the validation has required intensive cross-matching and interoperation of Gaia data with other astronomical archives as well as with specific ground-based observations.
In addition to the technical objectives described above, GENIUS had set the additional goal of establishing and consolidating a collaboration with the only other astrometric missions being developed in the world: the Japanese JASMINE suite of missions (nano-Jasmine, small Jasmine and Jasmine). Building on the already existing ESA and DPAC collaboration with the Nano-JASMINE team, GENIUS has fostered the consolidation and extension of these ties through personnel exchanges, joint meetings and collaboration in several technical tasks, ranging from the astrometric data reduction to the mirroring of the Gaia archive. This collaboration will also benefit the European astronomical community through the maintenance and extension of the space astrometry expertise.
Project Results:
GENIUS was designed to boost the impact of a European breakthrough in astrophysics, the Gaia astrometric mission. GENIUS aimed at significantly contributing to the development of the Gaia archive based on the principles of:
1. archive design driven by the needs of the user community that will scientifically exploit the Gaia results
2. provision of exploitation tools to maximize the scientific return
3. ensuring the quality of the archive contents and the interoperability with existing and future astronomical archives
4. cooperation with the only other astrometric mission in the world, the JASMINE suite (Japan)
5. and last but not least, enabling the archive to facilitate outreach activities.
To achieve these goals the work was structured in several work packages, whose results are described here.
WP1 management
This work package included the administrative tasks to fulfil the EC requirements and rules as well as the global administrative and coordination tasks inside the consortium, including financial management, intellectual property management and project documentation. It also included the coordination with institutions and bodies relevant for the development of the Gaia archive like ESA, DPAC and GREAT, as well as the representation of GENIUS in meetings or committees related to this coordination.
Work-life balance & gender
One of the commitments of GENIUS was to provide a good work-life balance in the execution of the project. The primary reason for this was to help in reducing the gender gap, especially significant in technology-related projects like GENIUS. Many women, for whatever reason, end up siding with their families and are less mobile, both nationally and internationally. In order to counter this, we have tried to allow for more but shorter trips and virtual international contacts through teleconferencing. A teleconferencing system, WebEx, was acquired and made available to the consortium, and has been widely used. More than 120 teleconferences, an average of 3 per month, have been held using this system.
The consortium has also made an effort towards gender balance in the hiring of personnel, with moderate success. IAU statistics show that, in computer science projects, female staff average 21%. In the GENIUS hiring we have managed to improve this percentage to 28% of the newly hired personnel (calculated by number of contracts, not FTEs).
However, gender balance was less successful regarding career continuity for the personnel on GENIUS contracts. Once GENIUS finished, 72% of the contracted personnel remained at the same institution on different funds (soft money), but broken down by gender:
• 76% of men stayed, while only
• 62% of women remained
The picture is even less favourable for the personnel who left their institution:
• 100% of the women were still looking for a job at the end of the project
• 100% of the men are now working in big companies or big projects; 60% moved to another country while 40% returned to or remained in their country of origin, so mobility does not seem to be a factor.
These are small-number statistics, so no firm conclusions can be reached, but they are nonetheless interesting as a case study and show the difficulties of achieving gender equality.
General meetings
The following general meetings were organized during the project:
• Kick Off Meeting (KOM) organized in Barcelona (Dec 2013)
• Joint Gaia CU9-GENIUS meeting in order to fully integrate the GENIUS activities in the overall Gaia DPAC effort, (Vienna Jul 2014)
• GENIUS First year review (Brussels Dec 2014)
• Gaia Archive Review #1: joint meeting with ESAC Science Archives (Madrid Jan 2015)
• Coordination meeting (ESA-GENIUS-CU9) for cross-match activities (Rome Feb 2015)
• Joint GENIUS-CU9 Plenary Meeting (Barcelona Sep 2015)
• GENIUS Second year review (Barcelona Oct 2015)
• Mid-term external review (Leiden Nov 2015)
• GENIUS Third year review (Sitges Jan 2017)
• External Advisory Board review (27 Mar 2017)
• Final Review Meeting (Brussels 19 Apr 2017)
External advisory board
GENIUS was reviewed twice (mid-term and final review) by an external advisory board composed of:
• Francoise Genova (CDS)
• William O’Mullane (ESA/LSST)
• Tadafumi Takata (NAOJ)
• Mark Wilkinson (U. Leicester)
WP2 Tailoring to the end user community
Unlocking the full potential of the Gaia catalogue and archive is not straightforward and requires an ambitious and innovative approach to data publication and access. A key aim of GENIUS was to ensure that the corresponding technical developments were driven by and focused on the scientific needs of the astronomical community that will use the Gaia catalogue. That is, the Gaia catalogue and data archive should be tailored to the needs of the scientific end user, but also the interested amateur or curious member of the general public. Tailoring was done by capturing the end user’s scientific requirements and turning those into specifications on the basis of which the Gaia data archive, catalogue and data access methods were built.
Requirement gathering
An analysis of the pre-existing CU9 requirements documents was carried out and these were matched to the use cases identified for the archive. The goal was to find out if there were any gaps in the CU9 requirements with respect to the use cases. The results were documented and circulated among the CU9 work package managers so that they could respond to the findings. In addition to the review of the general requirements, the list of requirements and feasible use cases to be covered by archive visualization has been compiled in the CU9 Visualization Software Requirement Specification document. All the requirements were incorporated into the design of the Gaia archive and have been mostly covered by the current version for Gaia Data Release 1. The update for the Gaia Data Release 2 will continue taking these requirements as a reference.
Archive beta testing
The implementation of the archive for Gaia Data Release 1 was submitted to beta testing under the supervision of GENIUS members. The feedback was collected based on the experience of using the pre-release version of the archive.
Confronting complex models with complex catalogues
GENIUS has worked to provide an advanced access facility for the Gaia archive, enabling the comparison of Galaxy models with the full catalogue contents. The aim was to develop requirements for such a facility. An API for such access has been defined, and an example implementation (named BEANS) has been developed. However, the original aim was not fully met, even though a proposal for an API was written.
Seamless data retrieval across archives and wavelength domains
This task aimed to provide multi-wavelength cross-matching. It started with the infrared domain and was followed by the submillimetre and high-energy (X- and gamma-ray) domains. The task has defined cross-match algorithms for the different wavelengths and has analysed the available catalogues to be matched with the Gaia one. The algorithms defined were implemented to provide pre-computed cross-matches with several sky surveys that are currently included in the Gaia DR1 archive, and the exercise will be repeated for the DR2 archive. The pre-computed cross-matches have proved popular among archive users. Furthermore, the work developed in GENIUS will be the basis for a web service for multi-wavelength on-demand cross-match, to be implemented in the Gaia archive or other data centres.
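To make the idea of a positional cross-match concrete, the following is a minimal sketch that pairs each Gaia source with its nearest counterpart in another catalogue within a fixed search radius. It is an illustration only and not the algorithm defined in GENIUS, which is not detailed in this report; the function names, the tuple layout and the brute-force search are assumptions made for the example, and a production cross-match would additionally account for positional errors, epochs and proper motions.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))

def cross_match(gaia_sources, external_sources, radius_arcsec):
    """Nearest-neighbour match within a radius.

    Both inputs are lists of (identifier, ra_deg, dec_deg) tuples
    (a hypothetical layout chosen for this sketch).
    Returns {gaia_id: (external_id, separation_arcsec)}.
    Brute force, O(N*M): fine for a demo, not for a billion sources.
    """
    radius_deg = radius_arcsec / 3600.0
    matches = {}
    for gid, gra, gdec in gaia_sources:
        best = None
        for eid, era, edec in external_sources:
            sep = angular_separation_deg(gra, gdec, era, edec)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (eid, sep)
        if best is not None:
            matches[gid] = (best[0], best[1] * 3600.0)
    return matches
```

At archive scale the inner loop would be replaced by a spatial index, which is one reason pre-computed cross-match tables are attractive to users.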
The living archive
The requirements for implementing the concept of a ‘living archive’ (the idea that it should be possible to incorporate new information into the archive from its users or future surveys) for the Gaia mission were defined during the project. The findings resulted in a recommendation (by the DPACE chair) not to pursue the living archive approach in the context of the Gaia mission due to the practical difficulties of its implementation and maintenance. However, some living archive aspects are already implemented in the Gaia archive, such as the sharing of user-provided/constructed tables and the availability of the VOSpace.
Re-processing of archived (raw) data
GENIUS gathered the requirements for the long term archiving of the Gaia data products and the processing software. The goal was to facilitate the (partial) re-processing of raw or intermediate level Gaia data, which would be aimed at improving the standard DPAC data products. The results were recorded in a technical note which will serve as the basis for further preparations by the Gaia Mission Manager and DPACE chair for the long term archiving of the Gaia data and the processing software.
Collaboration with Jasmine
Besides the activities described above, GENIUS has also contributed to strengthening the coordination between Gaia and the JASMINE suite of missions. The missions in the JASMINE suite (nano-Jasmine, small-Jasmine and Jasmine) are, besides Gaia, the only astrometric missions planned in the world. Through the exchange of visits between the teams and the organization of joint meetings, the GENIUS project has allowed for extended proposals of collaboration in the areas of astrometric reduction, archive mirroring and cooperation on future missions. Furthermore, a mirror of the Gaia archive has been set up in Japan as part of this extended collaboration.
WP3 Aspects of archive system design
The objective of this work package has been to design, prototype and develop aspects of the archive infrastructure needed for the scientific exploitation of Gaia data. The design and technology choices made were motivated by the real user requirements identified by WP 2 and by other initiatives, such as the GREAT project, and were made with full recognition of the constraints imposed by the ESAC archive system, with which it had to interface effectively. Prototypes were prepared and tested in cooperation with the end user community and with the ESAC Science Archives Team (SAT) through the DPAC CU9. A core principle was the adoption of Virtual Observatory standards and the development of VO infrastructure to enable ready interoperation with other external datasets needed to release the full scientific potential of Gaia.
Aspects of archive interface design
The GENIUS work on archive interface design was tackled in three ways:
• Definition of archive system requirements specification
• Definition of subsystem Interface Control Documents for coordination with the ESAC SAT
• Enhancement of the DPAC Main Database Dictionary Tool for use more generally amongst CU9 in partnership with GENIUS.
Besides playing a key role in the specification of the archive requirements and data model, GENIUS has designed and implemented several features of the archive, including: a TAP/ADQL autocomplete library; the change of the end–point metadata resource from VO Support Interface (VOSI) to TAP_SCHEMA; and bug fixes and enhancements to the ADQL parser.
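As an illustration of what an ADQL autocomplete helper does at its simplest, the sketch below suggests table or column names matching a typed prefix. It is not the GENIUS library, whose design is not described here; the `SCHEMA` dictionary and the `suggest` function are hypothetical, and the table and column lists are a small hand-written sample standing in for metadata that a real client would read from the service's TAP_SCHEMA tables.

```python
# Hand-written sample metadata (an assumption for this sketch); a real client
# would populate this from TAP_SCHEMA.tables and TAP_SCHEMA.columns.
SCHEMA = {
    "gaiadr1.gaia_source": [
        "source_id", "ra", "dec", "parallax", "pmra", "pmdec", "phot_g_mean_mag",
    ],
    "gaiadr1.tgas_source": [
        "source_id", "ra", "dec", "parallax", "pmra", "pmdec",
    ],
}

def suggest(prefix, table=None, schema=SCHEMA):
    """Return sorted completions for a prefix.

    With table=None, complete table names; otherwise complete the
    column names of the given table. Matching is case-insensitive.
    """
    prefix = prefix.lower()
    candidates = schema.keys() if table is None else schema.get(table, [])
    return sorted(name for name in candidates if name.lower().startswith(prefix))
```

For example, `suggest("pm", table="gaiadr1.gaia_source")` would offer the two proper-motion columns in the sample metadata.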
VO infrastructure
The GENIUS contributions in this area have focused on working with the International Virtual Observatory Alliance, at both face–to–face meetings and via the Standards committee, for the adoption of several new features of the ADQL standard that are of great importance to users of Gaia data. GENIUS has also prepared the path for the implementation of the next version of the ADQL standard which will help to improve the interoperability of the Gaia data published by ESAC SAT with data published by other archives within the IVOA and with data access clients developed using the IVOA standards. GENIUS has also contributed to the ADQL parser implementation that is common to ESAC SAT, providing enhancements and bug fixes.
Data Centre Collaboration
GENIUS has designed, documented and prototyped a complete subsystem that implements Distributed Query Processing. This has long been an ambition in the VO world as it is seen as fundamental to the ubiquitous usage scenario of cross–querying multi–terabyte, multi–wavelength survey datasets of which the Gaia catalogue is a prime example. The software has been containerised for ease of deployment in any Data Centre, not least ESDC for Gaia data. A GENIUS prototype demonstrator is available and has been extensively tested for a wide range of queries harvested from real–world usage scenarios and ADQL queries gathered during data centre operations.
Cloud–based research and data mining environments
At the time of writing of the GENIUS proposal (5 years ago) the state–of–the–art cloud–based research and data mining environment was the CANFAR system operated by the Canadian Astronomy Data Centre. GENIUS envisaged the somewhat ambitious plan of developing something similar for European astronomy, specifically aimed at ‘grand challenges’ for Gaia data. During the development programme it has become clear from investigations that bespoke systems are better designed around operating system level virtualisation as opposed to the more traditional hardware virtualisation, not least because OS–level virtualisation, a.k.a. containerisation, is more light–weight and hence performant. In the IT world ‘Docker’ is emerging as the leading third–party system for containerisation. In GENIUS we have extensively tested Docker for the deployment of astronomical software and demonstrated its use in this way by containerising two of our main deliverables, namely the IA2TAP client–side VO publishing software, and the Distributed Query Processing infrastructural components. This experience has been compiled in a detailed report for publication in Astronomy & Computing.
WP4 Tools for data exploitation
Using the Gaia archive only through simple queries (e.g. sky region queries) would exploit just a fraction of its potential. To fully exploit a billion-object data set containing a wide variety of data (astrometric, photometric, spectrophotometric, spectroscopic, ...), more advanced and powerful data exploration tools were needed. GENIUS has contributed several such tools tailored to the actual needs of the scientific user community.
Visualization tools
GENIUS has designed, implemented and tested the interactive visual exploration web service running at the Gaia archive that is available for DR1. The service has been adapted to the large size and complexity of the Gaia archive. The visualization services for the Gaia archive are based on a client-server architecture. The Server (aka Object/Visualization Server) runs close to the data. The Clients provide visual display and user interface, usually at a location away from the archive. The architecture is designed to support plugins that can be used for extending the server-side capabilities in several ways (such as data transformations, data simplifications, volume calculation, indexing, etc.).
In addition to the technological developments, the service produces visual contents, including images in specialized formats that are intelligible representations of the huge information content of the archive. Furthermore, the visualization work developed in GENIUS produced the Sky Panorama, which has become the iconic image for Gaia DR1.
Data mining
The Gaia Data Analytics Framework (GDAF) is the final result of the GENIUS work on data mining and constitutes an important part of the GENIUS legacy. It provides an infrastructure for Data Mining with the Gaia data which is both powerful and easy to use. The system is based on a Cloudera Hadoop distribution alongside the Spark framework for Big Data analytics. We have defined an easy deployment and setup procedure in order to replicate it in any data centre. Furthermore, a mini-distribution encapsulated in a virtual machine is available for installation on laptops or desktop computers; this mini-distribution provides the user with a simple but fully-featured development environment from which the applications can be directly exported to the full environment. Some preliminary scientific use cases and a pool of scientists have been identified to act as a ‘super user’ community to continue with the system definition.
GDAF has been presented to ESA and its integration in the Gaia archive is being evaluated, as well as its installation in other data centres. The installation at CSUC is in use and the Gaia UB team plans to build on this infrastructure for its exploitation of the Gaia data.
VO tools and services
GENIUS has included the development/enhancement of several tools working with the archive using the VO standards:
• VOSA (VO Sed Analyzer) is a web-based tool designed to combine private photometric measurements with data available in VO services distributed worldwide to build the observational spectral energy distributions (SEDs) of hundreds of objects. GENIUS has upgraded VOSA to provide access to Gaia photometry and give a reliable estimation of the physical parameters of thousands of objects at a time. This upgrade has required the implementation of a new computation paradigm (including a distributed environment, the capability of submitting and processing jobs in an asynchronous way, the use of parallelized computing to speed up processes and a new design of the web interface).
• Clusterix: GENIUS has built this new Virtual Observatory compliant tool. Clusterix is a web-based application to calculate the membership probability of a list of objects using proper motions. The tool also takes advantage of the Virtual Observatory to gather parallaxes, radial velocities and proper motions from VO services and to estimate temperatures, gravities and metallicities using VO tools like VOSA.
• TopCat is an interactive graphical viewer and editor for tabular data. Its aim is to provide most of the facilities that astronomers need for analysis and manipulation of source catalogues and other tables, though it can be used for non-astronomical data as well. It understands a number of different astronomically important formats (including FITS, VOTable and CDF) and more formats can be added. TopCat has been enhanced by GENIUS in several ways, including: direct access to the Gaia archive TAP interface, access to the Gaia catalogue from the CDS x-match service, and incorporation of the native Gaia data format gbin. The TopCat developments for GENIUS, and especially the ability to directly read gbin data, have made it the tool of choice for the DPAC consortium.
WP5 Tools for data validation and analysis
The preparation of the Gaia archive before its publication requires a careful, detailed and in-depth validation of its contents. The scientific and statistical challenge of this task on a one-billion-object data set containing a wide variety of data (astrometric, photometric, spectrophotometric, spectroscopic, ...) is daunting, and would be impossible without tools adapted to work on such a massive and data-diverse archive. GENIUS has produced such tools, based on the actual validation needs and on the characteristics of the archive system, thus making them as efficient as possible. Furthermore, the validation process relies on methods and tools that can also be used, with little or no adaptation, for the scientific analysis of the catalogue.
GENIUS has also provided part of the human resources needed for the validation of the first Gaia data release using these and other tools, making a significant contribution to its success. The areas of this validation work are described below.
Looking for trouble: definition of problem cases, validation scenarios and tools
GENIUS defined validation scenarios, and implemented the corresponding tests, to carry out basic verifications of the Gaia DR1 Catalogue content. These tests ensured that the field contents were as expected, that all fields were within valid ranges, and that fields were present as indicated, including:
• Verification that the data provided in the Gaia DR1 release conformed to their specifications, either as explicitly stated in the CU9 Data Model or as implicitly understood by the validation group.
• Verification that the data from different fields provided in the DR1-TGAS release was mutually consistent, given expectations about relationships between fields.
• Verification that the data in different versions of the Gaia DR1 catalogue was consistent. The tests could not be completed due to technical difficulties.
• Check for duplicate sources in the data provided in the DR1-TGAS release.
• Verification of the consistency of the parameter distribution in the Gaia DR1 catalogue, over the whole sky. This includes the verification of the distribution of the astrometric parameters and their median error distribution consistency, standard error distribution continuity as a function of magnitude at gate transitions, high proper motions and negative parallaxes ratio, and the dependence of the position error ratios with magnitudes, etc.
• Verification of the accuracy of the errors reported in the DR1-TGAS data.
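The most basic of the checks listed above (field presence, valid ranges, duplicate sources) can be sketched as follows. This is a minimal illustration and not the GENIUS validation software: the record layout (a list of dictionaries), the function names and the specific ranges in `VALID_RANGES` are assumptions chosen for the example.

```python
import math

# Illustrative valid ranges (assumed for this sketch): right ascension and
# declination in degrees, and a non-negative parallax uncertainty.
VALID_RANGES = {
    "ra": (0.0, 360.0),
    "dec": (-90.0, 90.0),
    "parallax_error": (0.0, math.inf),
}

def check_ranges(rows, valid_ranges=VALID_RANGES):
    """Return (row_index, field, problem) for missing or out-of-range fields.

    NaN values fail the range comparison and are reported as out of range.
    """
    problems = []
    for i, row in enumerate(rows):
        for field, (lo, hi) in valid_ranges.items():
            value = row.get(field)
            if value is None:
                problems.append((i, field, "missing"))
            elif not (lo <= value <= hi):
                problems.append((i, field, "out of range"))
    return problems

def find_duplicates(rows, key="source_id"):
    """Return the sorted identifiers that occur more than once."""
    seen, duplicated = set(), set()
    for row in rows:
        identifier = row[key]
        if identifier in seen:
            duplicated.add(identifier)
        seen.add(identifier)
    return sorted(duplicated)
```

Consistency checks between fields, or between catalogue versions, would follow the same pattern with a rule per relationship rather than a range per field.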
Simulation versus reality: from models to observables
GENIUS implemented validation tests to verify the Gaia DR1 data through comparison with a data set generated from a realistic model of the Milky Way. The latter is provided by a population synthesis model of the Milky Way based on hypotheses from a probable scenario of formation and evolution of the Milky Way and on stellar models, combined with empirical constraints and dynamical considerations. The tests were based on comparisons of different moments of the distributions over magnitude, colours, proper motions and parallaxes, and their variations as a function of latitude. Comparisons are made between model expectations and Gaia data.
Two different versions of the Besançon Galaxy Model were used for the comparisons: BGMBTG_2_08 and the new BGMBTG_4.08.
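A comparison of distribution moments as a function of latitude can be sketched as follows. This is a hedged illustration only: the bin width, the choice of mean and standard deviation as the moments, the function names and the toy numbers are all assumptions, and no actual Besançon Galaxy Model output is used.

```python
import statistics
from collections import defaultdict

def moments_by_latitude(sample, bin_width=30.0):
    """Group (latitude_deg, value) pairs into latitude bins.

    Returns {bin_index: (mean, standard deviation)} for each occupied bin.
    """
    last_bin = int(180.0 // bin_width) - 1
    bins = defaultdict(list)
    for lat, value in sample:
        # Latitude +90 would fall just outside the grid, so clamp it
        # into the last bin.
        index = min(int((lat + 90.0) // bin_width), last_bin)
        bins[index].append(value)
    return {i: (statistics.fmean(v), statistics.pstdev(v)) for i, v in bins.items()}

def compare_moments(observed, model, bin_width=30.0):
    """Return {bin_index: (mean difference, std difference)}, observed minus model."""
    obs = moments_by_latitude(observed, bin_width)
    mod = moments_by_latitude(model, bin_width)
    return {
        i: (obs[i][0] - mod[i][0], obs[i][1] - mod[i][1])
        for i in sorted(obs.keys() & mod.keys())
    }

# Toy (latitude, parallax) pairs; the numbers are made up.
observed = [(-10.0, 1.0), (-20.0, 3.0), (40.0, 5.0), (50.0, 5.0)]
model = [(-10.0, 2.0), (-20.0, 2.0), (40.0, 4.0), (50.0, 8.0)]

for index, (d_mean, d_std) in compare_moments(observed, model).items():
    print(index, d_mean, d_std)
```

In a real test the differences would be judged against the expected statistical scatter before a bin is flagged; that step is omitted here.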
Confronting Gaia to external archives
GENIUS implemented tests to check that the Gaia DR1 data is coherent with other known catalogues, in order to decide whether to accept the data for publication. The tests concern several types of objects (simple stars, variables, binaries, multiple systems, quasars...) as well as different kinds of parameters (parallaxes, proper motions, radial velocities...).
To ensure that the tests are correctly working and return correct results, they were tested on simulated and real data. For that purpose the simulation of the Tycho-Gaia Astrometric Solution, the Hipparcos catalogue, and the Initial Gaia Source List (IGSL) were used.
Validation of Jasmine data
The experience gained in the validation of Gaia-DR1 data is applicable to the future validation of Jasmine data. Work has been done in GENIUS to compare validation of Jasmine satellites data to Gaia data with the goal of helping both
Data demining: outlier analysis
GENIUS has developed tools and tests to allow identification of outliers, or at least substructures which could then prove to actually be due to artefacts, not real structures. To understand whether the statistical properties of the Gaia DR1 data set are consistent with expectations, distribution of the data (and in particular their degree of clustering) was compared to suitable simulations for all 2D subspaces.
Furthermore, the visualization tools developed in GENIUS were used to detect artefacts in the sky distribution. Source density maps at several resolutions and colour scales were tested in order to detect structures in the data that might require further inspection.
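The idea of inspecting source density maps at several resolutions can be sketched as below. This is an assumption-laden illustration, not the GENIUS tooling: it uses a plain grid in right ascension and declination (cells are not of equal area, whereas a production tool would more likely use an equal-area pixelisation), and the threshold factor and positions are invented for the example.

```python
import statistics
from collections import Counter

def density_map(positions, cell_deg):
    """Count sources per (ra, dec) cell of side cell_deg degrees."""
    counts = Counter()
    for ra, dec in positions:
        counts[(int(ra // cell_deg), int((dec + 90.0) // cell_deg))] += 1
    return counts

def suspicious_cells(counts, factor=3.0):
    """Return cells whose count exceeds factor times the median occupied-cell count."""
    median = statistics.median(counts.values())
    return sorted(cell for cell, n in counts.items() if n > factor * median)

# Made-up positions: a tight clump of ten sources plus four isolated ones.
positions = [(1.0 + 0.1 * i, 5.0) for i in range(10)]
positions += [(15.0, 5.0), (25.0, -35.0), (35.0, 45.0), (45.0, 5.0)]

for cell_deg in (10.0, 90.0):
    flagged = suspicious_cells(density_map(positions, cell_deg))
    print(cell_deg, flagged)
```

With these toy positions the clump stands out in the 10-degree map but is absorbed into its surroundings in the 90-degree map, which is one reason for testing more than one resolution.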
Transversal tools for special objects
Finally, GENIUS has developed specific tools for the validation of data for several special types of objects, including:
• Solar System objects
• Stellar clusters
• Multiple stars
• Variable stars
WP6 support activities
GENIUS included two specific activities to provide the support needed for its developments.
Provision of simulated data
The availability of simulations of the Gaia catalogue was crucial for the development of the systems and tools of the Gaia archive, and therefore for the GENIUS tasks: the real Gaia data was not available until later in the schedule and the simulations were needed to test the systems. Furthermore, the simulations were also needed for some of the validation tasks described above.
For this purpose, the GOG simulator (developed for Gaia-DPAC since 2006) was deployed at CSUC and at the MareNostrum supercomputer. The CSUC shared memory systems were used for the simulations of very dense regions of the sky, while the MareNostrum supercomputer, with a large number of processors but a limited amount of memory per processor, generated the simulations for the lower density regions of the sky. This combination has allowed the generation of three GOG simulations of the Gaia catalogue, used for the preparation and validation of the Gaia DR1.
Besides the execution of the simulations themselves, the GENIUS contribution has also included the maintenance of GOG and its optimization, which has greatly improved its performance and has kept it up to date with the evolution of the mission.
Science alerts testbed
The Gaia flux-based science alert stream is issued to the community through the science alert processing carried out at the Cambridge Photometric Data Processing Centre (DPCI). The science alerts processing issues basic information for each flux alert via the VOEvent system to the community in a timely fashion (with alerts being produced 1-2 days after observation by Gaia). The alert packet contains basic characterisation information for each event, including parameters such as estimated alert object type, and more advanced classification for certain objects such as supernovae (SNe). For these, inherent Gaia photometric data is used to provide additional information concerning SNe alerts including class, epoch, redshift, reddening.
The testbed work carried out in GENIUS developed the interfaces required to connect the real time science alerts classification processing to the main Gaia data products. Thus, as the mission evolves, and more knowledge is accumulated about objects measured by Gaia as it successively scans the sky, there will be opportunity to cross reference new alerts against previous knowledge of that sky point as well as previous alerts against new information. For instance, irregular outburst events may show multiple times during the Gaia mission. Identification will be improved through correlation with earlier Gaia knowledge.
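Cross-referencing a new alert against earlier alerts at the same sky position can be sketched as a simple positional match. This is only an illustration under stated assumptions: the alert names, the dictionary layout and the 1 arcsecond match radius are hypothetical, and the actual testbed interfaces are not described in this report.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Guard against rounding pushing sqrt(a) marginally above 1.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def previous_matches(new_alert, history, radius_arcsec=1.0):
    """Return names of earlier alerts within radius_arcsec of the new alert."""
    radius_deg = radius_arcsec / 3600.0
    return [
        old["name"]
        for old in history
        if angular_separation_deg(new_alert["ra"], new_alert["dec"],
                                  old["ra"], old["dec"]) <= radius_deg
    ]

# Hypothetical alert records (positions in degrees).
history = [
    {"name": "alert-A", "ra": 150.0, "dec": 2.0},
    {"name": "alert-B", "ra": 150.5, "dec": 2.0},
]
new_alert = {"name": "alert-C", "ra": 150.0001, "dec": 2.0}

print(previous_matches(new_alert, history))
```

A linear scan is adequate for a sketch; matching against a large alert history or a full catalogue would normally rely on a spatial index.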
The testbed provides linkages to external data resources provided through GENIUS, in particular via interfaces to the archive development. The GENIUS alerts prototype has been releasing photometric alerts to the community from early 2016 and improved publishing of alerts has been enabled by the publisher; significant improvements have also been undertaken in terms of functionality and usability of both the Alerts pages and the Alerts Marshall.
WP7 dissemination
By its very nature the GENIUS project is closely tied to the dissemination of the Gaia data to the astronomical community with the availability of the archive and publication of its Catalogue through DPAC CU9 in coordination with ESA and DPAC. The Gaia archive is located at ESAC, the European Space Astronomy Centre, where all ESA astronomy and Solar System missions’ archives are kept and well disseminated, being regularly retrieved by more than 3,000 registered users. But in this context, the publication of Gaia DR1 has broken all records, with more than 40,000 queries and 72 Terabytes of data transferred just in the first 24 hours of operations. Clearly, Gaia data and its archive have had, with the help of GENIUS, a big impact on the scientific community.
Besides the professional community, the educational community and the general public are obvious targets for the dissemination of the Gaia results, and GENIUS has significantly contributed to the visibility of the mission among these audiences. The project contribution has complemented the activities of ESA and has been closely coordinated with the activities of the national communities working in Gaia.
Specifically, GENIUS has developed and implemented a community portal released in July 2015:
http://www.gaiaverse.eu
This portal includes information about Gaia in thirteen languages, with contents ranging from general news to tools and content for outreach. The portal is managed by an editorial board and maintained by a group of 20 collaborators, and at the end of GENIUS it was handed over to the Gaia community as a tool for multi-language outreach. In addition to the portal, and linked to it, GENIUS has also maintained a Twitter account for the dissemination of Gaia/GENIUS news. Social network accounts had an especially high impact around the publication of the Gaia Data Release 1; for instance, the Twitter account had 10.4K impressions during the first 24 hours (14-15 Sept.). This account has also been handed over to the community at the end of the project.
In addition to these transversal activities, GENIUS has also carried out other specific outreach-related activities during its duration:
• GENIUS has funded the making of an outreach video during the GENIUS/DPAC plenary meeting 2015 in Leiden, aiming to explain to the general public how this type of meeting contributes to the strengthening of the collaboration between scientific teams all around Europe.
• GENIUS has funded courses on scientific communication for its members and some guests of the Gaia community. These courses were aimed at improving the communication skills of the participants, to help them better communicate the results of Gaia to the general public.
Finally, the GENIUS members have participated on several occasions in the regular national and ESA outreach activities, including general public presentations, talks for students and press releases.
Conclusions
GENIUS was designed to boost the impact of the Gaia astrometric mission, a European breakthrough in astrophysics. Gaia, an ESA Cornerstone mission launched in December 2013, is producing the most accurate and complete 3D map of the Milky Way to date. A pan-European consortium named DPAC is in charge of the Gaia data processing, of which the final result is a catalogue and data archive containing more than one billion objects. The archive system containing the data products is located at the European Space Astronomy Centre (ESAC) and has been serving the community the first product of the mission, the Gaia Data Release 1 (Gaia DR1), since September 2016.
The design, implementation, and operation of this archive are tasks that ESA has opened up to participation from the European scientific community. GENIUS has significantly contributed to this development based on the following principles: an archive design driven by the needs of the user community; provision of exploitation tools to maximize the scientific return; ensuring the quality of the archive contents and the interoperability with existing and future astronomical archives (ESAC, ESO, ...); cooperation with the only other two astrometric missions in the world, Nano-JASMINE and JASMINE (Japan); and last but not least, the facilitation of outreach and academic activities to foster public interest in science in general and astronomy in particular.
GENIUS has fit seamlessly into existing Gaia activities, exploiting the synergies with ongoing developments. Its members have actively participated in these ongoing tasks and have provided an in-depth knowledge of the mission as well as expertise in key development areas. Furthermore, GENIUS has had the support of DPAC and of several Gaia national communities in the EU member states, and has fostered the cooperation with the Japanese astrometric missions mentioned above.
Potential Impact:
The GENIUS project has had an impact on four main areas:
1. Societal impact
The GENIUS activities have significantly contributed to the Gaia outreach, bringing to the public the results of a key European mission. Without a coordinated outreach programme between ESA, DPAC and GENIUS, Gaia would be just another specialised star catalogue (albeit an extremely precise one). The full potential of the 3D (6D) information can be realised only from the exploration and visualization tools which have been developed within GENIUS, not from the Catalogue alone. We can specifically highlight here the Gaia Sky Panorama, a visualization of the contents of the first Gaia catalogue that has become the iconic image of the Data Release 1 and that was created by the GENIUS Portuguese team as part of their visualization activities.
Furthermore, the impact on society goes beyond outreach, through the development of the software to confirm and automate the Gaia alerts and to combine ground-based with space-based Gaia data for detected Solar System objects, including the potentially hazardous Near-Earth Objects.
And, needless to say, GENIUS is a pan-European project that has enhanced working relationships and collaboration between European research and higher education establishments, and as such impacts society in a fundamental and positive way.
2. Economic impact
Regarding economic impact, innovation in the use of Information Technology for research and development programmes is a proven way to enhance the know-how base of an economy. The GENIUS developments in IT have resulted in the training of developers who subsequently have either remained in the scientific field or have moved to private companies, ensuring the preservation of the expertise gained.
3. Educational impact
The availability of the Gaia data will allow the teaching of astronomy in innovative and enticing ways. The Gaia archive follows the paradigm of “Open data”, with the data available to everybody from the first moment, and has been designed with the needs of all types of users in mind. The GENIUS work on archive requirements has included provisions for amateur astronomers and the teaching of astronomy, and the archive can now support these activities. Gaia allows us to realize a 3D journey through our Galaxy, introducing astrophysics to a new generation of students and inspiring the next generation of researchers to enter the physical sciences.
4. Scientific impact
At the end of the nineteenth century, the first large international astronomical collaboration, the “Carte du Ciel”, was conceived with the goal of providing “a legacy of the exact status of the sky”. This massive project, which contributed to the origins of the International Astronomical Union, was the realization for sky maps of the potential power of photography, the new technology at that time. One century later, Hipparcos, the European Gaia precursor, was the first experiment to use space technology for pinpointing the positions of a (very limited) number of stars. Hipparcos had a significant impact on astrophysics, as assessed by the number of refereed publications derived from it, in the range of 150 to 200 per year in the first years after the publication of its catalogue. The Gaia impact will be much higher, given the larger number of objects and the additional types of data. Gaia represents an extraordinary means by which to convert time into space through its Catalogue of more than one billion stars. Moreover, Gaia will measure the velocity and the physical properties of the observed sources, increasing the dimensionality of the observables to more than 6. Only time will truly tell, but it is already clear that Gaia represents the European legacy mission at the beginning of the twenty-first century, being not simply an ESA cornerstone, but also a cornerstone in the historical quest to measure the size of the local universe, and the astrophysical record of its observable content.
Overall, GENIUS represents an essential part of the Gaia project, namely the dissemination of the results of the biggest astronomical survey to date (as a matter of fact, several surveys in one: astrometric, photometric and spectroscopic) to the scientific community and the general public. GENIUS has represented a concrete and visible part of the huge work being undertaken by the 430+ European DPAC scientists and engineers, not to mention the work done by European industry. Indeed, the work undertaken within the GENIUS project has helped to unlock the full scientific potential of the Gaia catalogue and data archive. Hence GENIUS has been a clear and timely added value to the Gaia mission and data processing by:
• gathering the different fields of expertise in the community to provide advanced requirements
• going much beyond usual queries to data archives
• distributing the data to the whole astronomical community and enhancing the visibility and impact of Gaia
• developing visualization and data mining tools to allow the most effective archive analysis
• combining Gaia with ground-based data, thus extending the interpretation capabilities across archives and wavelength domains.
Although the GENIUS proposal has been focused on the Gaia data archive, the research and development within this project will also benefit other data archives, be they from space or ground based experiments. Part of this benefit arises naturally through the push for interoperability with other archives, while the public dissemination of the GENIUS results can be used to enhance other existing archives or to prepare future data archives.
Furthermore, beyond Europe, GENIUS has helped cement the Gaia collaboration with the only other astrometric mission in the world: the Japanese JASMINE suite. The previously existing cooperation has been enhanced through exchanges, joint meetings and collaborations, and has led to extended proposals for collaboration in areas such as astrometric reduction, archive mirroring and cooperation in future missions.
List of Websites:
http://www.gaiaverse.eu/
Primary Coordinator Contact (GENIUS coordinator):
* Xavier Luri xluri@fqa.ub.edu
Other coordinator contacts:
* Lola Balaguer lbalaguer@fqa.ub.edu
* recerca.europea@ub.edu