Content archived on 2024-05-29

Enabling and Supporting Provenance in Grids for Complex Problems

Exploitable results

The Provenance project aimed to conceive, design and implement an industrial-strength open provenance architecture for grid systems, and to deploy and evaluate it in complex grid applications, namely aerospace engineering and organ transplant management. More specifically:

1. To specify the contents of provenance in relation to workflow enactment. The project proposed a novel definition of provenance for process-oriented computational environments and derived a technology-independent data model for representing provenance. While still addressing workflow enactment, the project expanded its conception of provenance to cover a variety of programming and distributed-systems styles. Concretely, this has allowed us and others to capture provenance in multiple workflow-based systems, such as Triana, Active BPEL, the Grid virtual data toolkit and Tent, as well as in Java-based applications and other distributed technologies such as RSS. The generic nature of our approach, and its suitability for a variety of distributed styles, is also highlighted by our visualisation tools, which can render enacted workflows graphically in multiple forms. Importantly, to help application designers extract the process documentation relevant to their provenance queries, the project specified a methodology that guides them step by step in making their applications provenance-aware. Such a methodology is the first of its kind.

2. To design and implement a scalable and secure distributed co-operation protocol to generate provenance data in workflow enactment. The project specified a recording protocol as the set of messages that application components cooperatively exchange in order to document their execution, whether workflow-based or not.
Considerations of scalability and security influenced the design of the data model and protocol: the data model allows for autonomous creation of process documentation, whereas the protocol supports its asynchronous recording, both of which promote scalability; cryptographic techniques such as signatures and digests in documentation style allow the authenticity of assertions to be preserved and verified.

3. To conceive and implement tools to navigate, harvest and reason over provenance data, also in a scalable and secure manner. Several tools were designed and implemented that use the provenance store query interfaces and analyse process documentation to provide added value to end users, such as displaying past executions, checking whether past executions satisfy given constraints, finding the inputs to an execution, identifying the doctors involved in a case, or producing a textual narrative for an execution. As part of the design and implementation of the tool suite, scalability experiments were undertaken on the analysis engine and on user access performance for the portal. The Client Side Library gives tools secure access to the provenance store, in accordance with the overall security architecture. Tools themselves can help specify security configurations for the system.

4. To design and engineer a scalable and secure software architecture to support provenance generation and reasoning. In its technology-independent form, the architecture addresses both scalability and security: it specifies design patterns for alternative deployments of the architecture, it supports linking of multiple provenance stores, and it explains how multi-institutional deployments of the architecture can be achieved securely.
Our open specification effort addresses general security considerations in its various documents, in a similar manner to other standardisation proposals, but focuses specifically on a secure profile for the p-structure and on securing messages and the store. In terms of software implementation, the provenance store deployed in the Globus Toolkit GT4 can make use of grid security specifications (such as WS-SecureConversation), and allows for multiple store deployments for scalability or in the presence of multiple security domains. The Client Side Library, also built using the GT4 toolkit, allows for secure communications with the provenance store.

5. To deploy and evaluate the provenance system in two different grid applications, namely aerospace engineering and organ transplant management. The reference implementation of the architecture was successfully deployed and integrated with the Aerospace and OTM applications. The evaluations involving users demonstrated that the architecture offered capabilities that did not exist before. Furthermore, through the provenance challenge, another deployment of the architecture was successfully undertaken in the context of an fMRI (Functional Magnetic Resonance Imaging) workflow.

6. To propose a draft provenance specification for input to an open standardisation process, thereby contributing to the standardisation efforts in this area within the grid and web services architecture domains. The project announced its open specification philosophy in a white paper and provided an extensive open provenance specification. It further contributed to the standardisation process by means of provenance scenarios in the OGSA Data Scenario document. The reference implementation is a concrete realisation of this open specification, and was publicly released to the community under the Common Public License, an open source license.
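To make the ideas in objectives 2 and 3 more concrete, the following is a minimal illustrative sketch, not the project's reference implementation or its API. It shows actors creating assertions autonomously, signing a digest of each one, a store that only trusts assertions whose signature verifies, and a query that finds the inputs to an execution. All names here (`Assertion`, `ProvenanceStore`, `sign`, `inputs_of`) are hypothetical, and HMAC-SHA256 over a shared key is used as a simple stand-in for the signature schemes a real deployment would use.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One piece of process documentation made by one actor (hypothetical shape)."""
    asserter: str
    process: str
    inputs: tuple
    outputs: tuple

    def canonical_bytes(self) -> bytes:
        # Stable serialisation, so the same assertion always yields the same digest.
        body = {
            "asserter": self.asserter,
            "process": self.process,
            "inputs": list(self.inputs),
            "outputs": list(self.outputs),
        }
        return json.dumps(body, sort_keys=True).encode("utf-8")


def sign(assertion: Assertion, key: bytes) -> str:
    # Assumption: a keyed SHA-256 digest stands in for a real digital signature.
    return hmac.new(key, assertion.canonical_bytes(), hashlib.sha256).hexdigest()


class ProvenanceStore:
    """In-memory stand-in for a provenance store (hypothetical)."""

    def __init__(self, keys):
        self._keys = dict(keys)   # asserter name -> verification key
        self._records = []        # (assertion, signature) pairs as submitted

    def record(self, assertion: Assertion, signature: str) -> None:
        # Recording only appends; verification is deferred to query time.
        self._records.append((assertion, signature))

    def verified(self):
        # Keep only assertions whose signature matches their content.
        result = []
        for assertion, signature in self._records:
            key = self._keys.get(assertion.asserter)
            if key is not None and hmac.compare_digest(signature, sign(assertion, key)):
                result.append(assertion)
        return result

    def inputs_of(self, item: str) -> set:
        # Walk backwards from a data item through the processes that produced it.
        produced_by = {out: a for a in self.verified() for out in a.outputs}
        seen, stack = set(), [item]
        while stack:
            producer = produced_by.get(stack.pop())
            if producer is None:
                continue
            for inp in producer.inputs:
                if inp not in seen:
                    seen.add(inp)
                    stack.append(inp)
        return seen


keys = {"mesher": b"mesher-key", "solver": b"solver-key"}
store = ProvenanceStore(keys)

meshing = Assertion("mesher", "mesh", ("geometry",), ("mesh",))
solving = Assertion("solver", "simulate", ("mesh", "params"), ("result",))
for a in (meshing, solving):
    store.record(a, sign(a, keys[a.asserter]))

# An assertion with a bad signature is stored but ignored by queries.
forged = Assertion("solver", "simulate", ("fake",), ("forged_result",))
store.record(forged, "00")

lineage = sorted(store.inputs_of("result"))
print(lineage)  # ['geometry', 'mesh', 'params']
```

In this sketch each actor builds and signs its assertions without contacting the store, which mirrors the autonomous creation and deferred recording described above; a distributed deployment would additionally need transport security and links between stores, which are outside the scope of this example.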
All our designs were based on a rigorous software engineering approach: we captured requirements from a dozen different projects; we formulated these as user and technical requirements; we designed an architecture, precisely and systematically identifying design decisions and the requirements they satisfied; we contacted the requirements providers, discussed our design with them, and iteratively improved it; finally, deployments in concrete applications and the write-up of the open specification led us to specify a provenance FAQ and clarify some architectural aspects. This allowed us to meet our contractual obligations. These substantial results also allowed us to disseminate the project outcomes and put an exploitation strategy in place, as discussed in the next section.
