Project description
Big data for all: ditching the server requirement and moving easily to the cloud
Developing software for cloud-based applications is challenging and deploying it to the cloud can be even more so. ‘Serverless’ computing addresses this issue, with code execution managed by the cloud provider so that programmers are freed from the hassle of managing and maintaining servers. The EU-funded CloudButton project has created Lithops, an open source multi-cloud serverless data analytics platform that will make big data more accessible to programmers of all levels. The project also facilitates migration of existing data-intensive applications from the high-performance computing, data analytics and machine learning domains to the cloud.
Objective
This project is inspired by the following question from a professor of computer graphics at UC Berkeley: “Why is there no cloud button?” He outlined how his students simply wish they could easily “push a button” and have their code – existing, optimized, single-machine code – running on the cloud.
The main goal is to create CloudButton: a Serverless Data Analytics Platform. CloudButton will “democratize big data” by greatly simplifying the overall life cycle and programming model thanks to serverless technologies. To demonstrate the impact of the project, we target two settings with large data volumes: bioinformatics (genomics, metabolomics) and geospatial data (LiDAR, satellite).
To achieve these ambitious objectives, CloudButton defines the following goals:
1. High Performance Serverless Run-time: We will create the first Function-as-a-Service (FaaS) compute run-time for Big Data analytics leveraging Apache OpenWhisk. The proposed serverless big data analytics platform will be based on a mature open source code base (Apache OpenWhisk) augmented with Apache OpenWhisk Composer.
2. Mutable Shared Data Middleware for Serverless Computing: We will create Distributed Mutable Data Structures leveraging Red Hat Infinispan In-Memory Data Grid. Our middleware will provide language-level constructs for data persistence, dependability and concurrency control to serverless functions.
3. CloudButton Toolkit: Serverless Data Analytics Platform: The toolkit will implement, on top of the run-time (goal 1) and the middleware (goal 2), the Serverless Cloud Programming Abstractions that can express a wide range of existing data-intensive applications with minimal changes. We will develop new tools and methodologies to port existing data-intensive applications from the HPC, data analytics and machine learning domains to the CloudButton toolkit.
Fields of science
CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques.
- natural sciences > computer and information sciences > internet
- natural sciences > computer and information sciences > data science > big data
- natural sciences > computer and information sciences > software > software development
- natural sciences > computer and information sciences > artificial intelligence > machine learning
- natural sciences > computer and information sciences > software > software applications
Programme(s)
Funding Scheme
RIA - Research and Innovation action
Coordinator
43003 Tarragona
Spain