Putting quality into Big Data applications
The key objective of the EU-funded DICE (Developing Data-Intensive Cloud Applications with Iterative Quality Enhancements) project is to tackle skill shortages and learning curves in Big Data application development, and to shorten the time to market for applications that meet quality requirements. By developing innovative new methods and tools, the project aims to strengthen the competitiveness of European independent software vendors in the field of business-critical Big Data applications. An open source version and two commercial products have since been released, with the aim of ensuring that the benefits of DICE continue long after project completion.

Data-intensive demands

The global spread of smartphones, the increased use of sensors in sectors such as automotive and security, and the avalanche of social media content uploaded every day mean that the world is swimming in data. Extracting usable information from this swirling pool can help companies better understand their target audiences and identify new trends, but doing so is still no easy task.

“Organisations that wish to benefit from Big Data must first carefully design computer systems capable of processing and analysing the information they need,” explains DICE project coordinator Dr Giuliano Casale from Imperial College London in the UK. “Though new ways of designing, organising and operating Big Data applications are emerging on the market, many tech start-ups don’t have the tools at hand to properly develop bespoke Big Data software systems, or to fully integrate Big Data analytical technologies within existing products.”

This shortcoming is significant, says Dr Casale, because in the rush to tap into the lucrative Big Data market, some tech companies have not paid enough attention to software quality. And because quality engineering of data-intensive software systems is still in its infancy, predicting and guaranteeing quality of service in Big Data software systems remains extremely difficult.

Methodical data analysis

The DICE project addressed this challenge by creating a set of 14 tools that support, with a high level of automation, core activities in Big Data application development, including software quality assessment, software architecture enhancement and delivery to the cloud. All the tools are organised into a coherent methodology inspired by the principles of an emerging software delivery paradigm known as DevOps.

“To date there has been a shortage of methods to express quality requirements,” explains Dr Casale. “What we did was define an integrated methodology, from design to operation, that tackles these shortcomings. This is effectively the first quality-driven development environment for Big Data applications.”

Assessments of the DICE methodology, as well as application of the tools, were then carried out in three data processing pilot schemes: social media data analysis, batch processing for tax fraud detection, and cloud-based management of real-time port operations.

“Preliminary results indicate substantial productivity gains thanks to DICE, particularly in terms of reducing the delivery and configuration time for new Big Data applications,” says Dr Casale. “The DICE framework was also able to identify several violations and anti-patterns in the application designs, as well as consistently reduce the manual time needed for testing and evaluation.”

An open source version of the DICE framework for developers has been released through the project website.
In addition, the DICE framework has been repackaged and customised into two bespoke products: DICE Velocity and DICE BatchPro. DICE Velocity is tailored to the needs of companies developing applications based on stream-processing technology, while DICE BatchPro helps companies easily configure and deploy cost-effective batch data processing systems.
Keywords
DICE, ICT, Big Data, data processing, cloud, DevOps, social media, stream-processing