Project description
Scalable data analytics
LeanBigData aims to address three open challenges in big data analytics: 1) the cost, in terms of resources, of scaling big data analytics for streaming and static data sources; 2) the lack of integration of existing big data management technologies and their high response time; 3) the insufficient end-user support, which leads to extremely lengthy big data analysis cycles. LeanBigData will address these challenges by:
- Architecting and developing three resource-efficient big data management systems typically involved in big data processing: a novel transactional NoSQL key-value data store, a distributed complex event processing (CEP) system, and a distributed SQL query engine. We will achieve an efficiency improvement of at least one order of magnitude by removing overheads at all levels of the big data analytics stack, and we will take into account technology trends in multicore technologies and non-volatile memories.
- Providing an integrated big data platform that combines the three main technologies used for big data (NoSQL, SQL, and streaming/CEP) and improves response time for unified analytics over multiple sources and large amounts of data, avoiding the inefficiencies and delays introduced by existing extract-transform-load (ETL) approaches. To achieve this, we will use fine-grained intra-query and intra-operator parallelism that will lead to sub-second response times.
- Supporting an end-to-end big data analytics solution that removes the four main sources of delay in data analysis cycles by using: 1) automated discovery of anomalies and root cause analysis; 2) incremental visualization of long analytical queries; 3) drag-and-drop declarative composition of visualizations; and 4) efficient manipulation of visualizations through hand gestures over 3D/holographic views.
Finally, LeanBigData will demonstrate these results on a cluster with 1,000 cores in four real industrial use cases with real data, paving the way for deployment in the context of realistic business processes.
Field of science
- natural sciences > computer and information sciences > data science > big data
- natural sciences > computer and information sciences > databases > non-relational databases
- natural sciences > computer and information sciences > data science > data processing
- natural sciences > computer and information sciences > databases > relational databases
Call for proposal
FP7-ICT-2013-11
Funding scheme
CP - Collaborative project (generic)
Coordinator
28040 Madrid
Spain