CORDIS - EU research results
Content archived on 2024-06-18

BLUE-Enterprise and Trade Statistics

Final Report Summary - BLUE-ETS (BLUE-Enterprise and Trade Statistics)


Executive Summary:

Designing efficient and relevant systems for collecting and presenting statistical data and information for evidence-based policy making in the Third Millennium has been at the top of the European Union's agenda for some time. Supplying relevant information to policy makers and stakeholders requires action along essentially four different paths: (i) managing the burden on enterprises of statistical reporting while (ii) ensuring a high quality of the statistics, (iii) formulating new concepts and methods to correctly follow and measure the structural changes in the economy and (iv) opening new paths for exploiting the potential of technological innovation. An additional challenge is that, unavoidably, within the EU and also within most Member States, the four major tasks are handled by different institutions and bodies and at different levels of competence. This creates additional needs for coordination and communication, both vertically and horizontally, and for enhanced attention to the consistency of the activities in the different areas, in particular to ensure a high degree of harmonisation of concepts and methods and that the overall quality of statistical data is not lost in the process.

Whereas the establishment of annual accounts constitutes a significant share of the administrative and regulatory burden on enterprises, available data show that pure statistical reporting is only a tiny part (about 1%) of this burden. In fact, regulation related to social issues, health and safety in the workplace, protection of consumers, the environment, chemical substances etc. remains the essential part of the regulatory burden and is often framed by EU directives.

The endeavours to reduce the burden of statistical reporting, notably by reducing the obligation for SMEs to prepare annual accounts, should be weighed against the risk of reducing the quality and quantity of available data and the potential increase in the burden on, and obligation for, the NSIs to undertake additional sample surveys to allow compilation of essential and high-quality information on different aspects of the economy.

Innovation of concepts and methods is also a necessary condition for modernising statistics through increased use of administrative and accounting data, which requires added efforts to integrate data sources and set up “data warehouses”. As demonstrated by the experience of certain Member States with advanced application of ICT in the production of derived statistical data, efficient collaboration between the various branches of public administration is essential and may justify the creation of a single “window” for collection of data from enterprises and households.

In the perspective of the creation of a single window for collection of data from enterprises the need for harmonisation of concepts (including definitions and breakdown of series) is even more apparent as a main feature of modernisation.

However, reconsideration of modernisation through harmonisation of concepts and methods will also need to take into account the need for renewal of paradigms and for elaborating new statistics for new activities or branches which, hitherto, have been the source of serious mis-measurement. This is the case for the measurement of immaterial investments and assets, which has been the focus of analysis and debate for decades. More recently, the need for modernisation has also been reinforced by the recognised pressure for better measurement of activities which have hitherto been neglected as being outside the conventional measurement of output (GDP).

The overall conclusion and recommendation is therefore that the various efforts to modernise European statistics would benefit from additional measures to ensure consistency between the various activities and concepts and, possibly, from enhanced concern for quality and for relevance to the decision-making process in an increasingly complex socio-economic environment.

Project Context and Objectives:

BLUE-ETS is an R&D project on official business statistics and, specifically, on one of the key challenges of the EU National Statistical Institutes (NSIs), that is, to provide high quality and robust statistical information for better policy and socio-economic research, and to support the renewed Lisbon Strategy and the Europe 2020 strategy. It intended to support and contribute to the success of the EU Parliament and Council Programme on the Modernization of European Enterprises and Trade Statistics (MEETS). BLUE-ETS aimed at providing new perspectives, opening new vistas in frontier FP7 socio-economic research areas and, in particular, at promoting the development of new tools and knowledge on key selected statistical issues and methodologies.

BLUE-ETS activities were, as indicated in Fig. 1, structured around 4 main thematic activity blocks and 11 Work Packages, each with the focus, purpose and function of achieving and, indeed, establishing levels of excellence in its field of research.

[Figure 1 - BLUE-ETS Subject Areas and WPs]

In the BLUE-ETS specific instance, this “activity focused perspective” entailed three complementary, consecutive phases, i.e.:

o Phase 1, on current business statistics challenges, tools and means, envisaging complementary new R&D investments in selected fields of official business statistics.
o Phase 2, on fostering dialogue and cooperation, e.g. exchanging views, sharing experience, gauging, selecting and pilot testing possible “champions” and distilling “shared guidelines”, with a view to developing an arena in which NSIs, data suppliers and users at large (and, specifically, EU policy makers) can exchange views, discuss and cooperate towards achieving specific shared objectives and allied “frontier” solutions.
o Phase 3, on promoting synergies and convergence towards common practices, through forms of structured cooperation and dissemination of best practice, in collaboration with Eurostat and international organizations (e.g. OECD and UN) with a view to

- establishing an arena and complementing the new BLUE-ETS knowledge being developed to fulfill the EU MEETS Decision with, e.g. allied ESSnet plans and kindred initiatives, in particular cooperation across the board among NSIs, universities and business organizations;
- spreading the new frontier of best practice, across the EU NSIs community; and
- contributing to the fulfillment of the EU Parliament MEETS commitment; with a view to eventually promoting convergence towards common EU strategies and solutions across EU NSIs and, hence, “bridging” EU MEETS and new BLUE-ETS R&D knowledge.

This involved charting common roads to the EU Parliament MEETS and complementing efforts with the aim of creating the best business statistics for the EU and, to this end, fostering cooperation and dialogue with the active participation and involvement of all EU NSIs, EUROSTAT and business.

In this perspective and, in particular, to effectively deliver and bridge the BLUE-ETS project with the EU Parliament MEETS, the Consortium aimed at efficiently advancing on each of its three BLUE-ETS core challenges, i.e.:

(1) burden and motivation;
(2) quality, availability and access; and
(3) bridging the supply with the demand for business statistics.

We have thus aimed at connecting producers and users of business statistics in three interconnected research areas:

• DATA PRODUCTION, ACCESS and QUALITY - which we have approached as management of statistical knowledge and which is increasingly demanded in custom-tailored and integrated forms. In this perspective, we have undertaken research on (i) new, more efficient ways to collect, access and disseminate statistics; (ii) cost, quality and robustness of the data; (iii) tailoring to purpose, e.g. microsimulation, policy evaluation, decomposable indicators and requirements; and, last but not least, (iv) new vistas for dissemination and access opened up by, e.g. new technologies.
• DATA SUPPLY and USERS DEMANDS FOR STATISTICS - which implies that statistics should no longer be seen as falling from “up there” and in “customized formats”, but as information that can be tailored to suit “DEMANDS”. At a time of unprecedented structural change, this requires that the supply side, i.e. NSIs, constantly meets new and increasingly diverse demands. Thus we have reviewed schemes of consultation and cooperation to best shape and tailor EU official business statistical knowledge to use, purpose and circumstances, which are increasingly diversified across groups, countries and over time.
• BEST PRACTICE, GUIDELINES and STRATEGIES for the SUCCESSFUL IMPLEMENTATION of the MEETS project - focusing on new tools and opportunities and on the monitoring of dynamic and fast changing environments, and assisting NSIs in their efforts to adopt best practice as far as feasible.

Project Results:

The work performed and main results achieved during the lifecycle of the project can be summarised as follows:

1) “NSIs’ practices concerning business burden and motivation” (Work package 2) This work package ended its activities in the first reporting period, producing the two scheduled deliverables. Research has shown that most NSIs measure actual response burden (defined as the time or money spent by businesses to respond to survey requests), but only a minority also measure perceived response burden (defined as the respondents’ assessment of how burdensome they find it to comply with data requests).

The methods used for response burden measurement vary largely between NSIs and also within NSIs. This makes it impossible to compare burden levels between and within NSIs or to track progress over time, which urgently calls for a standardized methodology for response burden measurement. Research found that many NSIs are actively working on response burden reduction. They usually combine various types of actions to a) reduce the number of data items collected (for example by using administrative data), b) make it easier for businesses to provide data (for example by improving questionnaire design) and c) promote the benefits of responding (for example by providing concrete examples of how the data are used). Most NSIs do not evaluate the effects of their burden reduction measures. This means, among other things, that we have little information on how reduction measures impact data quality, the costs of making statistics and the burden experienced. Effects of actions intended to reduce response burden should be monitored, reviewed, documented and published. Where feasible, experiments should be conducted to examine effects of burden reduction actions. Standardized methodologies for the measurement of the response burden are within easy reach. NSIs need to commit to monitoring and documenting actions and progress in response burden reduction, as well as to sharing lessons and learning cooperatively from different experiences.
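The distinction between actual and perceived response burden can be illustrated with a minimal sketch. It is not the methodology of any NSI or of the project: the record layout, the 1 to 5 perceived-burden scale and the hourly cost rate are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SurveyResponse:
    """One business's answer to a hypothetical burden questionnaire."""
    business_id: str
    minutes_spent: float   # actual burden: time spent responding
    perceived_burden: int  # perceived burden: assumed scale, 1 (low) to 5 (high)


def actual_burden_hours(responses):
    """Total time spent by all businesses, in hours."""
    return sum(r.minutes_spent for r in responses) / 60.0


def actual_burden_cost(responses, hourly_rate):
    """Total time converted to money using an assumed flat hourly rate."""
    return actual_burden_hours(responses) * hourly_rate


def mean_perceived_burden(responses):
    """Average self-assessed burden on the assumed 1-5 scale."""
    return mean(r.perceived_burden for r in responses)


# Made-up example data, not project results.
responses = [
    SurveyResponse("A", 30, 2),
    SurveyResponse("B", 90, 4),
    SurveyResponse("C", 60, 3),
]

print(actual_burden_hours(responses))       # hours
print(actual_burden_cost(responses, 40.0))  # money, at an assumed rate of 40 per hour
print(mean_perceived_burden(responses))     # perceived score
```

Comparisons between or within NSIs are only meaningful if the same definitions (what counts as time spent, which cost rate, which perception scale) are applied everywhere, which is the point of the call for a standardized methodology.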

2) “Business perspectives related to NSIs’ statistics” (Work package 3) This work package ended its activities in the first reporting period. Research activities have produced the two scheduled deliverables and have shown that (i) businesses as users of NSI statistics are mainly served the same content as the general public, the consequence being that businesses’ needs are not catered for appropriately by the studied NSIs; (ii) knowledge about businesses' use of NSI statistics is, in the studied NSIs, fragmented, tacit, insufficiently systematised and partial; (iii) business demand for and use of NSI statistics varies by size and industry; (iv) the top categories of NSI statistics relevant to businesses in participating countries concern prices, labour market statistics, general economic indicators and indicators by industry; (v) there are considerable obstacles to business use of NSI statistics (e.g. unintuitive location of statistics on the website, insufficient statistical literacy etc.); (vi) the connection between users of NSI statistics and respondents to NSI surveys within a business is in general weak and in part varies by business size, making it nontrivial to establish a motivating link between data provision and data usage.

3) “Improve the use of administrative sources” (Work package 4) Activities have focused on ways and means to “gauge and appraise the potential” of administrative sources, developing an approach that can be generally applied to determine the quality of administrative data sources for statistics production and developing a Quality Report Card and scripts to assist NSIs. The main messages have been: 1) Measurement of the input quality of administrative data differs greatly from that of survey data. Hence the establishment of a new framework with five general dimensions for the input quality of administrative data, viz. Technical checks, Accuracy, Completeness, Integrability, and Time-related dimension, and the need for new indicators and measurement methods. 2) Determination of the input quality of administrative data is not a task on its own. It should always be preceded by evaluation of the quality of the metadata and of the delivery-related quality components. 3) Evaluation of the input quality of administrative data can be looked at from several points of view. It can focus on the quality of the data source itself and on the expected quality of the estimates based on the data in the source. Since this can affect the assessment of the data, the framework developed needs to be able to cope with both. 4) It is very difficult to limit the number of indicators (and measurement methods) for the input quality of administrative data. It was found that each user or country representative had a need for indicators (or measurement methods) with a specific focus. To serve each user, the set of indicators (or measurement methods) was set up as a general ‘toolbox’ covering every general quality component of the data in each of the five dimensions established. From this ‘box’ each user can pick the tools needed. 
5) Visualization methods and Tableplots in particular (Tennekes, de Jonge and Daas 2011, Journal of Data Science 11, 43-58) are a highly effective, easy, and informative way to deliver information on the general content of administrative data (and other large data) sources to users. 6) WP4-findings are not limited to administrative data sources used for business or economic statistics. They can be generally applied to any administrative or other secondary data source used for statistics.
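To give a feel for what a ‘toolbox’ indicator might look like, the following sketch computes one simple indicator for each of two of the five dimensions named above (Completeness and Integrability). These are illustrative stand-ins, not the indicators defined in the WP4 deliverables; the function names, the record layout and the `business_id` key are assumptions.

```python
def completeness(records, variable):
    """Share of records with a non-missing value for `variable`.

    Hypothetical indicator for the Completeness dimension.
    """
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(variable) not in (None, ""))
    return filled / len(records)


def integrability(records, register_ids, key="business_id"):
    """Share of records whose identifier can be linked to a reference register.

    Hypothetical indicator for the Integrability dimension.
    """
    if not records:
        return 0.0
    matched = sum(1 for r in records if r.get(key) in register_ids)
    return matched / len(records)


# Made-up administrative records and register, for illustration only.
admin_records = [
    {"business_id": "B1", "turnover": 1200},
    {"business_id": "B2", "turnover": None},
    {"business_id": "B3", "turnover": 560},
    {"business_id": "X9", "turnover": 75},
]
business_register = {"B1", "B2", "B5"}

print(completeness(admin_records, "turnover"))
print(integrability(admin_records, business_register))
```

Each user would pick only the indicators relevant to their own use of the source, which is the design idea behind a toolbox rather than a fixed list.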

4) “New ways of collecting and analyzing information” (Work package 5) The main objective was to examine the applicability of soft computing approaches and text mining techniques to data sets as modern means to improve the collection and the quality of data for business and trade statistics. Research activity has produced two deliverables: research on soft computing and text mining, and a review of existing practices. The main findings from the first deliverable are: (1) administrative and secondary textual data sources contain valuable information which can be mined by the proposed statistical tools, (2) statisticians possess knowledge of how to deal with their tasks, but this knowledge cannot be expressed by precise and crisp rules, (3) large administrative and statistical datasets may exhibit features relevant for imputation that cannot be detected by traditional tools. The work focused on the development of mathematical equations and data mining methodologies, the design of experimental tools and testing on real data.

The research developed text mining tools for extracting useful information on businesses quickly and cheaply; a fuzzy logic tool for tailored reminders, classification of respondents and revealing relational dependencies between collected and estimated data; a tool based on genetic programming for flash estimates and imputation; and a neural network approach for improving imputation.

In addition, the researchers concluded that their findings are not limited to business and enterprise statistics. They can be generally applied to other areas of statistics where large amounts of data have to be examined. Examples include flexible queries by words in data dissemination on data portals; dealing with multilingual contexts by performing statistical analyses on language-independent “higher order data” (“concepts”) with the possibility of coming back to a lower level of granularity (words); and hybridization of neural networks and fuzzy logic for improvement of imputation and classification of respondents. Research also revealed the potential for improving text mining by means of fuzzy logic. As a result, this was included as part of the FP7 project proposal called EUPHORIA.

The second activity was focused on the review of existing practices in data collection, more precisely on the use of administrative data, secondary (alternative) sources, data and metadata standards with a discussion of future promising perspectives and issues.

5) “Enhancing quality of business statistics” (Work package 6) In view of the main objective of introducing statistical methods that deal with the problems allied to the production of disaggregated estimates for business statistics, the research work has led to the following results: 1) An integrated business database has been defined using Italian business data coming from the small and medium enterprises (PMI) survey sample data as well as the large enterprises (SCI) census and the statistical business register (ASIA). This integrated business database was then used to generate a synthetic business data set of high quality that can provide a basis for further comparative research. With regard to the case studies, a literature review on variance replication methods in the presence of single imputation and an exploratory analysis of the business database have been conducted. 2) A spatial robust empirical best linear unbiased predictor (SREBLUP) has been derived, which should be very suitable in the context of business statistics, since it combines the advantages of robust estimators with the ability to incorporate spatial information. Moreover, methodological developments in terms of composite Information Theoretic (IT) estimators have been produced. More specifically, IT estimators of variables and indicators at sub-area level were developed by combining information at area (aggregate) and sub-area level. 3) The usefulness of small area estimators based on non-linear transformation for skewed data has been underlined from both a Bayesian and a Frequentist perspective. The results indicate that these estimators may be superior to methods based on linear predictive models in the presence of skewness. 4) Furthermore, a new benchmarking method for the nested error unit level regression model was introduced. This can be used after a back-transformation with bias-correction (necessary in the case of skewed variables as encountered in business surveys). 
5) The potential advantages of appropriate imputation methods for variance estimators and small area estimators have been underlined in simulation studies. 6) The problems that arise when complex survey designs are ignored in model-based small area estimation procedures have been studied by means of design-based simulation studies. The results shed further light on the importance of choosing the appropriate approach to incorporate design weights into the statistical modelling. 7) The potential benefits of exploiting the correlation over time in rotational panel designs have been underlined. Estimators using the correlation structure yield large reductions of the standard deviations of changes over time.
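The gain from correlation over time mentioned under point 7) can be seen from a standard textbook identity; it is given here as a sketch only and is not the specific estimator developed in WP6. The symbols are assumptions for illustration: $\hat{Y}_1$ and $\hat{Y}_2$ are the estimates for two periods, with standard deviations $\sigma_1$ and $\sigma_2$ and correlation $\rho$ induced by the overlapping part of the panel.

```latex
\operatorname{Var}\bigl(\hat{Y}_2 - \hat{Y}_1\bigr)
  = \sigma_1^2 + \sigma_2^2 - 2\,\rho\,\sigma_1\sigma_2 ,
\qquad\text{so for } \sigma_1 = \sigma_2 = \sigma:\quad
\operatorname{Var}\bigl(\hat{Y}_2 - \hat{Y}_1\bigr) = 2\sigma^2\,(1-\rho).
```

With independent samples ($\rho = 0$) the variance of the change is $2\sigma^2$; with a strongly positive $\rho$, as is plausible when the same businesses are observed in both periods, it shrinks towards zero.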

The synthetic data set TRItalia is intended to provide a basis for further comparative research under an open data policy. Once it has been validated as complying with the existing legislation on confidentiality, the data set will be provided to interested users on the BLUE-ETS website and on the Trier dataserver.

6) “Business data integration, systematization and access” (Work package 7) The work focused on the potential of administrative registers, on methodologies to deal with disclosure risks and on the study of methods for managing quality in the production of economic statistics. One of the main findings has been the construction of an Italian LEED (linked employer employee database), resulting in the development of a first prototype database based on administrative data (treated for statistical purposes), currently accessible for economic study and research on the labour market. Advances have been made on ways to increase the accessibility at international level of business, administrative and integrated microdata with a view to harmonising dissemination strategies. New methodologies have been developed to release complex microdata stemming from business microdata, also in the form of public use files. Moreover, analysis has investigated how to improve the quality of a chain of statistical processes. The emphasis was on improvement of the complete chain and not on the improvement of a single process or single data storage of the chain.

7) “Methodological case studies” (Work package 8) – this work package had the main objective to empirically test methodologies developed in work packages 2, 3, 4 and 5 dealing with the reduction of response burden and motivation enhancement, measurement of quality of administrative sources and innovative tools and procedures for data collection and analysis. It produced 21 case studies in respective fields declaring which approaches are ready for implementation and which need further research or development; a compendious overview of all case studies; and ideas for further research in respective fields or by interdisciplinary research among the fields.

Research highlighted the need to establish international standardization of the collection of paradata related to response burden measurement, which would assure high-quality and comparable measures. It also showed that the quality dimensions and indicators developed to evaluate administrative data quality could be applied to other data sources that potentially provide valuable information for official statistics. New opportunities were also recognized in synergies between independent work packages, such as improving respondents’ motivation by means of soft computing. This research led to the proposal for the ERC Synergy Grant (acronym FuzzyMotiv).

8) “New types of indicators-Applying results” (Work package 9) is about linking data to indicators for better grasping and handling problems, assessing how official statistics need to and can adjust to reflect and catch structural jumps and change. As to the work related to the measurement of intangibles in national accounts statistics and associated general measurement issues, the review of measurement problems and approaches has led to the unambiguous conclusion that there are many methodological questions that need further examination. In particular, there is a most pressing need to promote alternative indicators for basic economic performance and, in particular, productivity.

Analysis of new indicators has developed and implemented a decomposable indicator for public institutions’ contribution to competitiveness. The new indicator has been applied in a case study for Italy. As complementary research, the multidimensionality of the concept of competitiveness has been investigated, developing a tool able to analyse multidimensional competitiveness at different levels of aggregation with a micro-level foundation. Further analysis investigated initiatives to develop new statistical indicators on business collaboration and Collaborative Working Environment (CWE) in official statistics, as well as to improve data availability not only on the use of collaboration tools but also on the skills needed for this practice. Finally, analysis dealt with the definition of rural areas, investigating the key statistical categories considered at National Accounts level. A territorial disaggregation of indicators of rural areas was considered, providing graphical representations based on data from the Italian National Institute of Statistics.

9) “Improve the Dialogue across EU NSIs and Stakeholders” (Work package 10) has focused on improving the dialogue across the EU with the aim of involving the main stakeholders in the project (policy makers, public administrations, businesses, academics and researchers and users of statistics in general). As part of WP10 two workshops were conducted where main results from WP2, WP3, WP5 and WP8 were presented. Stakeholders from all over Europe were invited, including NSIs, OECD, Eurostat, Business Organisations etc. Part of the outcome of the workshops is documented in proceedings publicly available on the BLUE-ETS website. Further, a BLUE-ETS Conference on Burden and Motivation in Official Business Surveys was organized, which achieved its goals of sharing knowledge on motivation and burden in business surveys, presenting and discussing BLUE-ETS findings and current practices at NSIs, and exchanging experiences and opinions with experts and stakeholders. Other meetings and events provided the occasion to discuss ways to ensure optimal contributions from BLUE-ETS to the implementation of the MEETS programme. The BLUE-ETS project was presented at the ESSnet workshop in Rome in December 2012 and in a special session at the NTTS conferences in February 2011 and March 2013. The BLUE-ETS project ended its activities with the final Conference held in Brussels the day after the end of the NTTS conference. Because the findings from the BLUE-ETS project had been presented on various occasions (including two days earlier at the NTTS conference), the final conference focused largely on the question of ‘where to go from here’. This framed the BLUE-ETS results in a wider perspective on the ongoing issue of modernizing Business Statistics. The conference concluded with a panel debate with prominent statisticians outlining their core priorities for pushing ahead with a modernization agenda. 
The outcome of the final conference is carefully documented in the conference proceedings published as part of the BLUE-ETS project. It was the occasion to present an overview of the project’s findings from the perspective of modernization and quality of business statistics, with a contribution to the debate on future research priorities.

10) “Scientific coordination, Dissemination and External Evaluation of the Project” (Work package 11) has dealt with the release of important communication tools, that is, the project website and project prospectus, and with workshops to share and improve results from the research activities. All deliverables have passed an external review and, once they had passed the EC evaluation, were made available on the project’s website. The website and the numerous dissemination activities, both those included in the project’s work plan and those external to the project, have contributed to the creation of a fertile ground for stimulating research and debate, with the final aim of contributing to improving the quality of statistical information on businesses and of supporting policymaking and research in general.

---------------------Work performed and achievements for each work package-----------------------------

Work package 2 “NSIs’ practices concerning business burden and motivation”

Coordinated by Deirdre Giesen (CBS)

other partners involved: SSB, UL, SCB, SORS

The focus and main objectives of WP2 were on the ways and means NSIs use to:

• measure the (actual and perceived) response burden on businesses;
• reduce the (actual and perceived) response burden on businesses;
• motivate businesses to respond;
• motivate businesses to report accurately.

Work performed - The first step under this WP has involved a detailed review of the literature on the measurement and reduction of the response burden imposed by NSIs on enterprises (see Deliverable 2.1). This analysis has then been complemented by a survey among 45 NSIs (in 39 European and non-European countries). 41 NSIs have responded. The results of this study, together with a summary and update of the literature, are reported in Deliverable 2.2.

Deliverable 2.1 (M5), by D. Giesen and V. Raymond-Blaess (eds.) (2011) Response burden measurement and reduction in official business statistics. A review of the literature on NSIs practices and experiences - First draft circulated in October 2010. Comments were received from various consortium partners, CBS colleagues, participants in the SIMPLY2010 conference and, finally, an external reviewer. The final version was completed in February 2011.

Deliverable 2.2 (M15), by D. Giesen (ed.) (2011) Response Burden in Official Business Surveys: Measurement and Reduction Practices of National Statistical Institutes - First draft circulated in June 2011. It has been reviewed by two consortium partners and two external reviewers. Parts of this deliverable were presented at the BLUE-ETS conference on Burden and Motivation in Official Business Statistics.

In addition, and although not formally part of this work package, WP2 partners have cooperated in the successful organization and implementation of 4 events/deliverables: (a) Workshop on pilot data collection at NSIs and businesses, to discuss preliminary research results, challenges and directions, in Ljubljana, Slovenia, 7-8 October 2010 (within D11.3); (b) BLUE-ETS Conference on Burden and Motivation in Official Business Surveys in Heerlen, the Netherlands, 22-23 March 2011 (within D10.2); (c) Workshop to select ideas gathered on promising approaches to burden reduction and motivation enhancement, to discuss phase 1 advances, and to present tools and research results, in Heerlen, 24 March 2011 (D11.4); and (d) Workshop Serving Businesses Better? Communicating with businesses: lowering response burden and increasing motivation; Brussels, November 27, 2012 (WP 10).

Main conclusions and recommendations under WP2 - The main messages from WP2 research are:

• Most NSIs do not appear to have a “centralized library”, where knowledge on response burden measurement is “stored” and allied reduction methods/actions are coordinated.
• In the period 2006-2010 most NSIs (34 out of 41) measured actual response burdens (defined as the time or money spent by businesses to respond to survey requests). Only 12 out of 41 institutes actually gauged perceived response burdens (defined as the respondents’ assessment of how burdensome they find it to comply with NSIs’ data requests), while only 17 out of 41 institutes conducted studies on how businesses perceive their organization (either in their capacity as data providers, or users, or both).
• The methods used to measure response burdens vary largely between and within NSIs. In these circumstances, it is not possible to compare levels and trends in burdens.
• Though differently committed, many EU NSIs are actively engaged in cutting response burdens, by usually combining various types of actions. Differences among EU NSIs are, however, significant. Still, there is a lack of a common strategy.
• There is a large variation in the extent to which NSIs have implemented actions aimed at reducing the response burden associated to their business surveys.
• There is too little quantitative research on the effects and effectiveness of burden reduction actions on the response burden, data quality and the costs of producing statistics.
• There is too little quantitative research on how actions aimed at checking response burdens may impact on different businesses, depending on, e.g. size, industry etc.
• Empirical knowledge and sounder foundations are needed to better understand how crucial concepts may be related. On the whole, relationships among notions such as actual and perceived response burdens, response behaviors, data quality and costs of statistics do not appear to be sufficiently understood or researched.

The main recommendations ensuing from WP2 research include:

• Eurostat should engage as soon as possible in the development and implementation of one or more standardized methodologies for the measurement of the response burden.
• The focus of research on business data collection methodologies should be moved from qualitative and exploratory research aspects to quantitative and, possibly, experimental research designs (please note that a first step in this direction is scheduled under WP8).
• The impact/effect of new efforts to check response burdens should be appropriately monitored, reviewed and duly documented and published.
• Burden reduction measurement and allied actions should be appropriately monitored and coordinated within NSIs.

----------------------

Work package 3 “Business perspectives related to NSIs’ statistics”

Coordinated by Mojca Bavdaz (UL)

other partners involved: CBS, SSB, UNIBG, SCB, SORS

The focus and main objectives of WP3 were on the:

1) examination of business practices in using NSIs’ statistics;
2) exploration of business motivation for participating and accurately reporting in NSIs’ surveys;
3) identification of the links and relationships between people participating in NSIs’ business surveys and users of NSIs’ statistics.

Work performed - WP3 investigated business perspectives related to statistics produced by NSIs. Its first step (summarised in Deliverable 3.1) was an examination of business practices in using NSI statistics based on a multitude of sources external to businesses (sources at the participating NSIs; experts on businesses in institutions/organisations outside NSIs; and academic publications and textbooks). The second step was a cross-country field study among businesses. In five participating countries, on-site visits were conducted to 41 businesses; at each site, between one and three people (data users and/or respondents to business surveys) were interviewed. Findings from both steps were summarised and reported in Deliverable 3.2.

Deliverable 3.1 (M5), by M. Bavdaž (ed.) (2011), Report on the business use of NSIs' statistics based on external sources (NSIs, publications, expert opinions). The report has been duly submitted to both internal and external peer reviewing. The final version was submitted to the coordinator (ISTAT) in January 2011.

Deliverable 3.2 (M15), by M. Bavdaž (ed.) (2011), Final report integrating findings on business perspectives related to NSIs’ statistics. It has been positively reviewed by three consortium members and an external reviewer. Parts of the deliverable were presented at the NTTS Conference in Brussels, 22-24 February 2011; the BLUE-ETS Conference on Burden and Motivation in Official Business Surveys in Heerlen, the Netherlands, 22-23 March 2011; and the 4th International Conference on Establishment Surveys (ICES-IV), Montreal, Canada, 11-14 June 2012.

The results of Deliverable 3.1 were organised by data source (NSIs, publications, expert opinions), which allowed discerning common patterns across data sources and helped in the subsequent research work. The findings were used as an input into Deliverable 3.2 but, to add value, results in that deliverable were organised by country. In this way, a comprehensive snapshot of the situation was provided for each participating country, which allowed discerning common patterns across countries and helped in drawing conclusions and recommendations for participating and other NSIs.

In addition, and although not formally part of this work package, WP3 partners have cooperated in the successful organization and implementation of 4 events/deliverables: (a) Workshop on pilot data collection at NSIs and businesses, to discuss preliminary research results, challenges and directions in Ljubljana, Slovenia, 7-8 October 2010 (within D11.3); (b) BLUE-ETS Conference on Burden and Motivation in Official Business Surveys in Heerlen, the Netherlands, 22-23 March 2011 (within D10.2); (c) Workshop to select ideas gathered on promising approaches to burden reduction and motivation enhancement to discuss phase 1 advances, and present tools and research results in Heerlen, 24 March 2011 (D11.4); and (d) Workshop Serving Businesses Better? Communicating with businesses: lowering response burden and increasing motivation; Brussels, November 27, 2012 (WP 10).

Main conclusions and recommendations under WP3 - The main messages from WP3 research are:

• The participating NSIs use the same or similar channels of dissemination to the general public, with the website as the main channel of dissemination; services specifically dedicated to businesses are less frequent. New statistics are continuously published on the homepage (often with lengthy and complex metadata documentation). This reflects the publicity principle (Erlien, 1997), implying that information is made available by the source, but must be accessed by the user.
• Knowledge on the use of NSI statistics among businesses is scattered around the participating NSIs (from customer support service to subject-matter departments, field staff and staff of specialised units), tacit, fragmented, unsystematised and partial (mainly based on contacts initiated by business). Business requests for NSI statistics often remain unrecorded or in inadequate format for further analyses. Many potentially useful data sources such as website usage reports and satisfaction surveys do not distinguish businesses from other users.
• Complexity and use of internal and external data seem to increase with the business size and depend on industry as well as other business characteristics (e.g. international orientation, presence and maturity of evidence-based or data-driven decision making etc.). Smaller businesses seem to rather look for information (instead of data) and prefer simple presentation and short descriptions while larger businesses seem to favour raw data to analyse them on their own. Internal data seem to be more important to businesses than external data (unless they still lack them, e.g. at start-up). In decision making, businesses seem to often complement data with intuition and experience but also networking.
• When judging external data quality, businesses seem to agree on giving priority to relevance; only if data are relevant, timeliness and accuracy become important. Good estimates are often regarded as sufficient. NSI statistics were mainly described as trustworthy and accurate in this research but this seems more likely to originate from the NSI’s image (the perception of the source) than from the quality assessment of statistics used.
• Top categories of NSI statistics relevant to businesses in participating countries concern: prices and price indices; labour market statistics (wages and employment); general economic indicators; and indicators by industry (e.g. trade, construction, professional services etc.). Businesses use NSI statistics for different purposes, among them the most common ones are: benchmarking; market analysis; reporting; tenders, official applications; and contracts and agreements.
• When businesses use or try to use NSI statistics, various problems might arise: (a) they might not be aware of NSI statistics availability or origin, which seems to hold more for smaller but seems not to be completely absent in larger businesses, and might also be related to scarce attention to official statistics in higher business education; (b) it might not be easy and straightforward to find relevant data in the statistical databases or elsewhere on the NSIs’ websites and use them adequately, which might also reflect the lack of search skills and (sophisticated) knowledge about how to interpret and apply NSI statistics to their situation; (c) timeliness and level of detail might not satisfy business needs; (d) NSI statistics might lack comparability with business internal data because business classifications sometimes differ from those used in official statistics.
• Businesses organise their work in various ways. Nevertheless, in interviewed small businesses it was often one single person using data, if used at all, and responding to NSI surveys. Data-related work is also frequently outsourced to accounting firms. In interviewed larger businesses, on the other hand, the two tasks were typically kept separated and assigned to people who may or may not know each other. Given that small and medium-sized businesses are typically not heavy users of NSI statistics while respondents in large businesses are distant from users of those statistics, the potential of motivating respondents with NSI statistics (direct feedback or other statistics) may be limited.
• The strongest motivator for responding (accurately) to NSI surveys seems to be legal obligation; others could be social responsibility and awareness of social contribution when responding to NSI surveys, agreement with the survey purpose etc. Facilitators of the response process seem to include an established reporting routine and the competence of respondents. Other suggestions for improvement of reporting include improving clarity of instructions in the questionnaires, availability of other and more advanced modes of data collection, tailoring to business characteristics like size and industry, and general conditions (e.g. web option, automatic data collection, business classifications etc.).

The main recommendations ensuing from WP3 include:

• Knowledge on business uses of NSI statistics should be improved, codified, become more structured and placed in two relevant contexts: in relation to business use of data in general and in relation to the use of NSI statistics in general.
• NSIs should consider moving beyond the publicity principle that currently characterises dissemination practices of NSIs towards the principle of dialogue with users of NSI statistics, especially businesses as their important partners.
• NSIs should make more efforts to raise motivation for better reporting in NSI surveys, also based on knowledge on business uses of NSI statistics.
• NSIs should upgrade their work on delivering accurate data.
• NSIs should actively work to improve statistical literacy among their users in general and their business users in particular.

----------------------

Work package 4 “Improve the use of administrative sources”

Coordinated by Piet Daas (CBS)

other partners involved: ISTAT, SSB, INFOSTAT, SCB

The focus and main objectives of WP4 were on improving the use of administrative data in official statistics. Since the production of high quality data largely depends on the quality of the input data, it is vital that NSIs develop robust procedures for determining the quality of administrative sources that are available and potentially usable as inputs for official statistics, in ways that are quick, straightforward and standardized. As yet, no standard instrument or procedure has been developed or exists for evaluating administrative data in a statistical context. The objective of WP4 was precisely to develop such a tool for quality assessment and allied indicators.

Work performed - WP4 activities have focused, first, on ways and means to “gauge and appraise the potential” of administrative sources, i.e. the quality and usability of available administrative data sources, whenever NSIs envisage using them as inputs in the statistical data production process; and, second, on optimal measurement methods for the indicators that needed to be identified and developed.

A report has been prepared on each of these topics, to provide both an overview and an assessment of the results. Finally indicators and methods have been reviewed, in view of their potential inclusion in a software tool.

Deliverable 4.1 (M7), by P. Daas et al. (2011), “List of quality groups and indicators identified for administrative data sources”. This report identifies a list of quality indicators for administrative data, when these are used as an input source in the statistical process of NSIs. Quality indicators are grouped according to five general dimensions of quality identified for administrative data sources: Technical checks, Accuracy, Completeness, Integrability, and Time-related dimensions. Whenever applicable, a distinction has been made in each dimension between quality indicators specific for objects (e.g. units and events) and for variables. Finally, the use, the potential, and the implications of the different quality indicators are gauged. Deliverable 4.1 has been reviewed by all WP4-member partners and three external reviewers.
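
To make the distinction between object-level and variable-level indicators more concrete, the following is a minimal sketch in Python. The records, the three indicators and their assignment to dimensions are assumptions chosen for illustration only; they are not the indicators defined in Deliverable 4.1.

```python
from collections import Counter

# Hypothetical records from an administrative source; None marks a missing value.
records = [
    {"unit_id": "A1", "employees": 12, "turnover": 340.0},
    {"unit_id": "A2", "employees": None, "turnover": 125.5},
    {"unit_id": "A2", "employees": 7, "turnover": None},
    {"unit_id": "A4", "employees": -3, "turnover": 80.0},
]


def missing_share(rows, variable):
    """Variable-level indicator: fraction of rows with no value for `variable`."""
    return sum(1 for r in rows if r.get(variable) is None) / len(rows)


def duplicate_share(rows, key):
    """Object-level indicator: fraction of rows whose key occurs more than once."""
    counts = Counter(r[key] for r in rows)
    return sum(1 for r in rows if counts[r[key]] > 1) / len(rows)


def invalid_share(rows, variable, is_valid):
    """Variable-level indicator: fraction of non-missing values failing a rule."""
    values = [r[variable] for r in rows if r.get(variable) is not None]
    if not values:
        return 0.0
    return sum(1 for v in values if not is_valid(v)) / len(values)


# Illustrative grouping by dimension (the grouping itself is an assumption).
report = {
    "completeness/employees_missing": missing_share(records, "employees"),
    "accuracy/duplicate_unit_ids": duplicate_share(records, "unit_id"),
    "accuracy/employees_invalid": invalid_share(
        records, "employees", lambda v: v >= 0
    ),
}

for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

Each indicator is a share between 0 and 1, so values from different sources or deliveries can be compared directly.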

Deliverable 4.2 (M18), by P. Daas et al. (2011), “Report on methods preferred for the quality indicators of administrative data sources”. This deliverable provides an overview of the measurement methods developed for the assessment of the quality indicators identified in Del. 4.1 on administrative data, when the latter are used as an input source in the statistical process of NSIs. Del. 4.2 reviews and assesses the methods which are most commonly applied for the measurement of the quality indicators dealt with in Del. 4.1. These measurement methods form the basis for the quality-indicator instrument and allied software tool due to be developed for administrative data sources in the remainder of this WP. The deliverable has been reviewed by several of the WP4 members and two external reviewers. As a result, it has finally become possible to measure the input quality of administrative data in a uniform way.

Deliverable 4.3 (M27), by P. Daas et al. (2012), “Quality Report Card for Administrative data sources including guidelines and prototype of an automated version”. In this report the authors consider, for every measurement method described in the previous deliverable, whether and how it can be implemented. For most of the measurement methods that can be implemented, preliminary examples of the results of the software package under development, based on test data, are provided. The report also contains a proposal for a Quality Report Card for Administrative data, i.e. a card that provides a comprehensive overview of the input quality of the data in an administrative source. Deliverable 4.3 has been reviewed by all WP4-member partners and was used as direct input for deliverable 8.2. The results of WP4 and the quality work included in WP7 were jointly presented at a special BLUE-ETS session at the European Conference on Quality in Official Statistics (Q2012), Athens, Greece, 30 May - 1 June 2012.

Main conclusions and recommendations under WP4 - The main messages ensuing from WP4 activities so far are:

• Measurement of the input quality of administrative data differs greatly from that of survey data. Hence there is a need for the establishment of a new framework with five general dimensions for the input quality of administrative data, viz. Technical checks, Accuracy, Completeness, Integrability, and Time-related dimension and also the need for new indicators and measurement methods.
• Determination of the input quality of administrative data is not a task on its own. It should always be preceded by evaluation of the quality of the metadata and delivery related quality components.
• Evaluation of the input quality of administrative data can be looked at from several points of view. It can focus on the quality of the data source itself and on the expected quality of the estimates based on the data in the source. Since this can affect the assessment of the data, the framework developed needs to be able to cope with that.
• It is very difficult to limit the number of indicators (and measurement methods) for the input quality of administrative data. It was found that each user or country representative had a need for indicators (or measurement methods) with a specific focus. To serve each user, the set of indicators (or measurement methods) was set up as a general ‘toolbox’ covering every general quality component of the data in each of the five dimensions established. From this ‘box’ each user can pick the tools needed.
• Visualization methods and Tableplots in particular (Tennekes, de Jonge and Daas 2011, Journal of Data Science 11, 43-58) are a highly effective, easy, and informative way to deliver information on the general content of administrative data (and other large data) sources to users.
• WP4-findings are not limited to administrative data sources used for business or economic statistics. They can be generally applied to any administrative or other secondary data source used for statistics.

The main recommendations ensuing from WP4 activities are:

• The increase in the use of administrative data makes NSIs vulnerable to sudden changes in the quality of the data collected by administrative data holders. Guarding the quality of administrative data when it enters the NSI on a regular basis is one of the best ways to get a grip on this dependency risk. This is especially important for statistics produced on a monthly (or more frequent) basis.
• A visualization method called a Tableplot is an easy and quick way to inspect the content and quality of administrative data used for statistical purposes.
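
The aggregation behind a tableplot can be sketched in a few lines. The following Python fragment is only an illustration under stated assumptions: the function name `tableplot_bins`, the sample records and the bin count are hypothetical, and this is not the implementation used in WP4. Rows are sorted by one variable and cut into row bins; for each bin, numeric variables are summarised by their mean and share of missing values, and categorical variables by category shares. A plotting layer would then draw one column per variable.

```python
from collections import Counter


def tableplot_bins(rows, sort_by, n_bins):
    """Sort rows by `sort_by` (descending) and summarise them in row bins.

    Assumes `sort_by` has no missing values. The last bin may be smaller,
    and fewer than `n_bins` bins are returned if there are too few rows.
    """
    ordered = sorted(rows, key=lambda r: r[sort_by], reverse=True)
    size = -(-len(ordered) // n_bins)  # ceiling division
    bins = []
    for start in range(0, len(ordered), size):
        chunk = ordered[start:start + size]
        summary = {"n": len(chunk)}
        for var in chunk[0]:
            values = [r[var] for r in chunk]
            present = [v for v in values if v is not None]
            if present and all(isinstance(v, (int, float)) for v in present):
                # Numeric variable: bin mean plus share of missing values.
                summary[var] = {
                    "mean": sum(present) / len(present),
                    "missing": (len(values) - len(present)) / len(values),
                }
            else:
                # Categorical variable: share of each category, missing included.
                counts = Counter("missing" if v is None else v for v in values)
                summary[var] = {k: c / len(chunk) for k, c in counts.items()}
        bins.append(summary)
    return bins


# Hypothetical business records; None marks a missing value.
rows = [
    {"turnover": 500, "employees": 40, "sector": "trade"},
    {"turnover": 100, "employees": 5, "sector": "trade"},
    {"turnover": 300, "employees": None, "sector": "construction"},
    {"turnover": 400, "employees": 30, "sector": None},
    {"turnover": 200, "employees": 10, "sector": "services"},
    {"turnover": 50, "employees": 2, "sector": "services"},
]

bins = tableplot_bins(rows, "turnover", n_bins=3)
for b in bins:
    print(b)
```

Because missing values are reported per bin, a sudden rise in missingness among, say, the smallest units becomes visible at a glance once the bins are plotted.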

----------------------

Work package 5 “New ways of collecting and analyzing information”

Coordinated by Miroslav Hudec (INFOSTAT)

other partners involved: UNINA

The focus and objectives of WP5 were on innovative methods, tools and procedures which need to be developed to exploit better and more efficiently the potential of statistical, administrative and textual data. These involve:

• data analysis: aspects related to new and more efficient ways of supporting statisticians in their tasks related to management of respondents, evaluation of algorithms for imputation, imputation and flash estimates under incomplete data;
• data collection: kindred aspects relating to new and more efficient ways to mine unstructured databases, such as information contained in administrative documents, e.g. in the BLUE-ETS specific case the annual reports of all companies listed in the Italian Stock Exchange, with a view to collecting both information (analysis of high-dimensional data and new ways and means to “control dimensionality”) and data (availing of promising Natural Language Processing tools, together with Social Network Analysis).

Work performed - Human capabilities of approximate reasoning and communication with words constitute a very powerful way of solving problems of interpretation. On the other hand, large amounts of data that contain valuable information, patterns and relations cannot be directly examined by humans due to information overload and therefore must be handled with other tools. In order to meet both goals, activities under this WP focused on (i) ways and means to study and apply theoretical approaches; and (ii) developing novel tools and mathematical equations, constructing new computational algorithms and testing these methods on real data. Foundations of fuzzy selection and classification were developed in a way which allows their integration into one tool. The second step focused on the review and the monitoring of existing and new experimental and promising practices in data collection.

Deliverable 5.1 (M30), by M. Hudec et al. (2012), Report on principles of fuzzy methodology and tools developed for use in data collection (Soft computing and text mining tools for Official Statistics). It has been positively reviewed by two consortium members and two external reviewers. The final version was submitted to the coordinator in September 2012. The report examined the applicability of fuzzy logic, neural networks, genetic programming (soft computing approaches) and text mining techniques to data sets in order to improve the collection and the quality of data for business and trade statistics. The mathematical foundations developed, and the tools built on these foundations, were tested on small-scale data sets from three countries.

Deliverable 5.2 (M34), by S. Balbi et al. (2013), Report on analysis of existing practices in the data collection field. It has been positively reviewed by two consortium members and two external reviewers. The final version was submitted to the coordinator in January 2013. The deliverable focused on the review of existing and promising practices in data collection, more precisely on the re-use of administrative data, secondary (alternative) sources, data and metadata standards with a discussion of future promising perspectives and issues.

In addition, WP5 partners have cooperated in the organization of the workshop “The potential for use of soft computing and text mining tools in NSIs” in February 2013 at CEPS, Brussels.

Main conclusions and recommendations under WP5:

The main messages from WP5 research are:

a) Statisticians possess knowledge of how to deal with their tasks, but this knowledge cannot always be expressed by precise and crisp rules. Linguistic terms and quantifiers are more flexible and less restrictive than crisp ones and they are directly usable on datasets;
b) Large complex data sets may exhibit features that cannot be detected by traditional tools. These features are mostly due to the large number of dimensions in these data sets; variety in data types, incomplete data sets and time restrictions;
c) Textual data sets could reveal valuable data and information if they are adequately mined, which could reduce response burden;
d) Genetic algorithms could prove useful for missing data estimation and help to produce flash estimates of foreign trade aggregates before the official data release;
e) A properly designed neural network enables classification of large datasets on the basis of similarity and can solve the problem of missing values;
f) Fuzzy logic supports direct application of linguistic terms, which allows statisticians to naturally express their tasks in selecting respondents for tailored reminders, evaluating algorithms for imputation and revealing rules of respondents’ behavior;
g) Flexible classification has the potential to classify respondents into overlapping classes using rules expressed in natural language, providing an easy way of recognizing key respondents/customers and ensuring that similar respondents are similarly treated (motivated);
h) Data collection and dissemination, although at two different ends of statistical data production, are intertwined and influence each other. Better data dissemination (by flexible queries) could motivate respondents to provide their own data in a timely and accurate manner and reduce the frequency of missing values, implying more efficient imputation (fewer missing values). Moreover, textual data mining tools can be useful to improve the communication between respondents and NSIs by producing business and trade statistics using respondents’ own words. If they can find information in an easily readable way, businesses will be more willing to provide their data.

The main recommendations ensuing from WP5 activities are:

a) The methods developed for fuzzy-logic-based data selection constitute an easy and quick way to inspect how current algorithms for imputation work, by evaluating rules from data, and to reveal relational knowledge (rules) of respondents’ behaviour;
b) A classification space based on fuzzy if-then rules and overlapping classes could be used for respondent/customer classification in order to detect key respondents and ensure that similar respondents (or other entities) are always similarly treated for motivation;
c) Fuzzy logic is not limited to data collection. Individuals rely on common sense and use linguistic terms when they query for data and information, select on the basis of several criteria at the same time and view the selected entities ranked from the best to the worst. User-oriented dissemination could improve the image of NSIs and indirectly motivate respondents to send their data. Equations developed in this WP are directly applicable; the only need is to program the functionality on websites;
d) There is considerable room for concentrating new research on computation methodologies and optimization techniques utilized for linking different databases and analysing large data sets;
e) Textual data sets are relevant sources of information and they should not be neglected by NSIs.
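
As an illustration of how a query with linguistic terms can select and rank entities, the following Python sketch evaluates the condition “high turnover AND long response delay” on a few records. Everything specific here is an assumption made for the example: the respondent data, the linear membership function, its thresholds, and the use of the minimum as the fuzzy AND. These are not the equations developed in WP5.

```python
def increasing(x, low, high):
    """Membership in a term such as 'high': 0 up to `low`, 1 from `high`,
    and linear in between."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


# Hypothetical respondents.
respondents = [
    {"id": "R1", "turnover": 900, "delay_days": 20},
    {"id": "R2", "turnover": 400, "delay_days": 30},
    {"id": "R3", "turnover": 650, "delay_days": 10},
    {"id": "R4", "turnover": 100, "delay_days": 25},
]


def degree(r):
    """Degree to which a respondent matches 'high turnover AND long delay'."""
    high_turnover = increasing(r["turnover"], low=300, high=800)  # assumed thresholds
    long_delay = increasing(r["delay_days"], low=5, high=25)      # assumed thresholds
    return min(high_turnover, long_delay)  # minimum as fuzzy AND


# Rank from best to worst match and drop records that do not match at all.
ranked = sorted(((degree(r), r["id"]) for r in respondents), reverse=True)
selected = [(rid, round(d, 2)) for d, rid in ranked if d > 0]
print(selected)  # [('R1', 0.75), ('R3', 0.25), ('R2', 0.2)]
```

Unlike a crisp filter such as “turnover > 800 and delay > 25”, which would return no respondent here, the fuzzy query returns partial matches ordered by how well they satisfy the condition.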

Additionally, the researchers have also come to the conclusion that approaches for mining textual data sets could be improved by fuzzy logic. The partners of this WP contributed to the creation of a new project consortium. The new consortium applied for an FP7 project (January 2013 call) with a proposal called EUPHORIA in order to produce a framework of statistical and computer science results for building a platform for the evaluation of diseases. The findings could also be valuable for official statistics.

----------------------

Work package 6 “Enhancing quality of business statistics”

Coordinated by Ralf Münnich (UT)

other partners involved: ISTAT, UNIBO, UoS, UoM

The focus and main objectives of WP6 were on introducing statistical methods that deal with the problems allied to the production of disaggregate estimates for business statistics. The WP has investigated on:

• estimation and confidence intervals in business surveys
• the impact of single and multiple imputation methods on business statistics
• small area estimation in business surveys

Work performed - The work within this work package has been split into two parts: a contribution to the R community (as part of R routines or a package), and a report. The report consists of the current state of the art as well as best practice recommendations derived from Monte-Carlo simulation studies.

The first issue in WP6 was the definition of a business database to be used for the construction of an artificial population and to carry out simulation studies. This has led to the creation of the fully synthetic dataset TRItalia, which will enable comparative research in the field of computational survey statistics. This synthetic data set aims at fulfilling two conflicting goals: to achieve statistical disclosure control and to preserve as much of the structure of the real data as possible. TRItalia was used to conduct simulation studies to analyse the empirical performance of small area estimators as well as to study the behaviour of variance estimators under complex survey designs.

Further, the impact of different imputation routines on variance estimators has been studied by means of a design-based simulation study. The results indicate that naïve methods may lead to erroneous inferences.
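
One well-known mechanism behind such erroneous inferences can be shown with a deterministic toy example, which is not the design-based simulation carried out in WP6. In the Python sketch below, all numbers are hypothetical: missing values are replaced by the mean of the observed values, and the standard error of the mean is then computed naively, as if the imputed values had been observed.

```python
import math
import statistics

# Hypothetical sample: four observed values and two missing ones.
observed = [10.0, 20.0, 30.0, 40.0]
n_missing = 2


def se_of_mean(values):
    """Standard error of the mean under simple random sampling
    (finite population correction ignored)."""
    return math.sqrt(statistics.variance(values) / len(values))


# Mean imputation: every missing value is replaced by the observed mean.
completed = observed + [statistics.mean(observed)] * n_missing

se_observed = se_of_mean(observed)  # based on the 4 observed values only
se_naive = se_of_mean(completed)    # treats the 6 completed values as real data

print(f"observed only: {se_observed:.2f}")  # 6.45
print(f"naive:         {se_naive:.2f}")     # 4.08
```

The point estimate is unchanged, but the naive standard error is smaller because the imputed values add no spread while inflating the apparent sample size, so confidence intervals built on it would be too narrow.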

In addition to that, the research on applying Information Theoretic (IT) methods and Hierarchical Bayesian estimation of small area parameters for business data indicated the tremendous potential of these techniques for the production of official statistics. In this context, the research focused on IT–based methods for estimating a target variable in a set of small areas, by exploring spatially heterogeneous relationships at the disaggregate level and by taking into account spatial dependencies between the small domains. Furthermore, the application of Bayesian methods yielded remarkable results for the estimation of small area parameters with skewed data.

Furthermore, composite estimators exploiting the correlation in rotational panels over time have been developed and the potential of these methods has been highlighted by simulation studies. In order to communicate statistical information to the users, the coherence between small area estimates and national estimates is considered an important task. Part of the research in this WP focused on benchmarking small area estimates so as to achieve coherence of the different figures.
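
The idea of benchmarking can be illustrated with the simplest possible adjustment, a ratio (pro-rata) scaling of the area estimates so that they add up to the national figure. The Python sketch below uses hypothetical area names and values, and it shows this simple ratio adjustment only, not the benchmarking method developed in WP6.

```python
def ratio_benchmark(area_estimates, national_total):
    """Scale all area estimates by one common factor so that their sum
    equals the national total."""
    factor = national_total / sum(area_estimates.values())
    return {area: est * factor for area, est in area_estimates.items()}


# Hypothetical model-based small area estimates and a direct national estimate.
area_estimates = {"North": 120.0, "Centre": 80.0, "South": 50.0}
national_total = 275.0

benchmarked = ratio_benchmark(area_estimates, national_total)
for area, value in benchmarked.items():
    print(f"{area}: {value:.1f}")
```

After the adjustment the area figures sum to the national estimate, while the relative sizes of the areas are left unchanged.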

Besides that, the impact of sampling design on small area estimates has been carefully studied. The simulation results in this respect revealed the importance of incorporating the design-weights into the statistical models. Moreover, the impact of different multiple imputation models on small area estimators has been analyzed.

Furthermore, spatial robust and robust estimators have been applied to business data, because the distributions of business data are often highly skewed and characterized by influential outliers, thus violating the assumptions of standard model-based small area estimators.

Major milestones have been achieved.

Deliverable 6.1 (M32), (Open source R package) Best practice recommendations on variance estimation and small area estimation in business surveys: Computer Codes

Deliverable 6.2 (M34), Best practice recommendations on variance estimation and small area estimation in business surveys

Main conclusions and recommendations under WP6 - The main messages from WP6 research are:

In view of the achievement of the main objective of introducing statistical methods that deal with the problems allied to the production of disaggregated estimates for business statistics, the research work has led to the following results:

1) An integrated business database has been defined using Italian business data coming from the small and medium enterprises (PMI) survey sample data as well as the large enterprises (SCI) census and the statistical business register (ASIA). This integrated business database was then used to generate a synthetic business data set of high quality that can provide a basis for further comparative research. The synthetic data set, after validation of its compliance with the existing legislation on confidentiality, will be made available under an open data policy, i.e. the dataset will be provided to interested users on the BLUE-ETS website and on the Trier dataserver. With regard to the case studies, a literature review on variance replication methods in the presence of single imputation and an exploratory analysis of the business database have been conducted.
2) A spatial robust empirical best linear unbiased predictor (SREBLUP) has been derived, which should be very suitable in the context of business statistics, since it combines the advantages of robust estimators with the ability to incorporate spatial information. Moreover, methodological developments in terms of composite Information Theoretic (IT) estimators have been produced. More specifically, IT estimators of variables and indicators at sub-area level, combining information at area (aggregate) and sub-area level, were developed.
3) The usefulness of small area estimators based on non-linear transformation for skewed data has been underlined from both a Bayesian and a Frequentist perspective. The results indicate that these estimators may be superior to methods based on linear predictive models in the presence of skewness.
4) Furthermore, a new benchmarking method for the nested error unit level regression model was introduced. This can be used after a back-transformation with bias-correction (necessary in the case of skewed variables as encountered in business surveys).
5) The potential advantages of appropriate imputation methods for variance estimators and small area estimators have been underlined in simulation studies.
6) The problems arising when complex survey designs are ignored in model-based small area estimation procedures have been studied by means of design-based simulation studies. The results shed further light on the importance of choosing the appropriate approach to incorporate design weights into the statistical modelling.
7) The potential benefits of exploiting the correlation in rotational panel designs over time have been underlined. Estimators using the correlation structure yield large reductions of the standard deviations of changes over time.

----------------------

Work package 7 “Business data integration, systematization and access”

Coordinated by Manlio Calzaroni (ISTAT)

other partners involved: CBS, UoS, IAB, UoM

The focus of WP7 was on improving business data integration, systematization and access. Its objectives were to:

1. Develop a Linked Employer-Employee Database or LEED from existing administrative data in order to get information on employers and employees and follow their link over time (tasks 1-2-3).
2. Develop methodologies for releasing samples of anonymised administrative and business microdata (task 4).
3. Address the problem of international access to microdata integrated from administrative sources (task 5).
4. Develop a common methodology for supporting quality improvement in producing economic statistics (task 6).

Work performed - The research work, organised in the six tasks mentioned above, has progressed throughout the project life towards the achievement of the WP’s critical objectives, that is, three deliverables.

The first three tasks concerned the work carried out to build up the Italian Linked Employer-Employee Database (LEED). It resulted in the construction of a first prototype database, based on administrative data currently accessible for economic study and research on the labour market. The work was performed through the following main steps: identification of the most suitable sources; selection of the variables useful for identifying and characterizing the employee work; study of the legislation in order to understand the meaning of the variables included in the selected administrative sources and to compare definitions and classifications with official ones; integration of administrative records, dealing with duplicate record issues and measurement of over-coverage and under-coverage of administrative sources; identification and selection of the 'core workers' universe; construction of the variables that characterize the enterprise (economic activity sector, size class, geographical area of localisation), the employee (gender, age, country of birth) and the employment relationship (professional status, working time, permanency of the job). Among the most relevant variables, the gross wage and the net wage were also calculated, exploiting the fiscal source.
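
The integration step (duplicate handling and coverage measurement) can be sketched as a comparison of identifier sets. This is an illustration under stated assumptions, not the procedure used for the Italian LEED: over-coverage is taken here as the share of administrative units absent from the reference register, under-coverage as the share of register units absent from the administrative source, and all identifiers are invented.

```python
def coverage_report(admin_ids, register_ids):
    """Compare an administrative source with a reference register.

    Both arguments are iterables of unit identifiers (hypothetical here).
    """
    admin_list = list(admin_ids)
    admin = set(admin_list)        # converting to a set collapses duplicate records
    register = set(register_ids)
    over = admin - register        # units only in the administrative source
    under = register - admin       # register units missing from the source
    return {
        "duplicates_removed": len(admin_list) - len(admin),
        "matched": len(admin & register),
        "over_coverage_rate": len(over) / len(admin) if admin else 0.0,
        "under_coverage_rate": len(under) / len(register) if register else 0.0,
    }


# Invented identifiers: "A2" is duplicated, "A9" is not in the register,
# and "A4" is missing from the administrative source.
report = coverage_report(
    ["A1", "A2", "A2", "A3", "A9"],
    ["A1", "A2", "A3", "A4"],
)
```

In practice the match would rely on harmonised identifiers or probabilistic linkage rather than exact string equality; exact matching is used here only to keep the sketch short.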

The database was built up in terms of job positions and has about 5 million records for both years (2005 and 2008). A portion of the prototype covering the economic areas of Made in Italy (i.e. the textile and clothing industry), consisting of around 300,000 records for both years, has been released to external users. A comparison has been carried out with the German LIAB, the existing Linked Employer-Employee Data model provided by the Research Data Centre of the Federal Employment Agency at the Institute for Employment Research.

Concerning the analyses of the disclosure risk related to the release of a sample of microdata, Istat has developed a strategy for releasing a sample of microdata stemming from administrative sources. More generally, a strategy to obtain a disclosure risk measure for business microdata has been proposed, based on the robust fitting of finite multivariate Gaussian mixtures using an unsupervised learning approach. Istat has also investigated the use of some theoretical tools to generate artificial data as an approach to provide effective protection to business microdata. In addition, a strategy has been developed for the release of public use files for business microdata stemming from the corresponding files for research purposes. Istat has released a public use file for the Italian sample of the Community Innovation Survey (available at http://www.istat.it/en/archive/87787) and is going to release a public use file for the Structure of Earnings Survey, a Linked Employer-Employee type of data.

The IAB group has developed a code to generate dummy datasets from the linked-employer-employee-dataset (LEED) of the Institute for Employment Research (IAB) to facilitate the development of analysis code for external users working in a remote execution environment. Dummy datasets can help researchers when developing the analysis code to be submitted to the research data center. Further work has dealt with the development of different nonparametric methods to generate synthetic datasets. In particular, the work implemented by the IAB group has dealt with evaluating and comparing four synthesizers based on nonparametric regression algorithms from the machine learning literature, namely classification and regression tree, bagging, random forests, and support vector machines.
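
The idea of a dummy dataset (same layout as the confidential file, but values carrying no real information) can be sketched as follows. This is not the IAB code: the schema format, column names and value ranges are hypothetical, and values are drawn independently and uniformly, so no relationship between variables is preserved.

```python
import random


def make_dummy_rows(schema, n_rows, seed=0):
    """Generate rows that follow a column schema but contain random values.

    schema maps a column name to ("int", low, high) or ("cat", [levels]).
    """
    rng = random.Random(seed)  # fixed seed so test runs are reproducible
    rows = []
    for _ in range(n_rows):
        row = {}
        for name, spec in schema.items():
            kind = spec[0]
            if kind == "int":
                _, low, high = spec
                row[name] = rng.randint(low, high)
            elif kind == "cat":
                row[name] = rng.choice(spec[1])
            else:
                raise ValueError(f"unknown column kind: {kind}")
        rows.append(row)
    return rows


# Hypothetical layout of a linked employer-employee file.
schema = {
    "establishment_id": ("int", 1, 50),
    "gender": ("cat", ["F", "M"]),
    "daily_wage": ("int", 20, 300),
}
dummy = make_dummy_rows(schema, n_rows=5)
```

A file of this kind lets an external researcher check that analysis code runs against the expected variable names and types before submitting it for remote execution; it is not suitable for drawing any substantive conclusion.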

The UoS and then the UoM group worked on assessing the disclosure risk of highly perturbed business microdata. The aim of the work was to quantify the disclosure risk so that the correct mode of access can be determined or the need for further perturbation of the dataset can be assessed.

The aim of task 5 was the implementation of an RDC-in-RDC approach for data access, which should make it possible to overcome the existing legal barriers and to bring micro data access in Europe closer to the ideal of remote access. The basic idea is to allow remote data access from designated institutions with comparable standards. The RDC-in-RDC approach may therefore be regarded as a first step towards remote access in Germany and may also represent a blueprint for intensified international data sharing.

Concerning task 6, which deals with managing the quality in producing economic statistics, the CBS group involved in the research reviewed the existing literature focusing on how to improve the quality of statistics, where a statistic is a chain of statistical processes that is interspaced with data storage points. Guidelines were provided in the form of a stepwise approach, to determine which indicators are needed within the production process, at which location, including the corresponding actions, given the requirements of users and the environment (NSI). The guidelines have been applied to (a chain of) end-to-end processes at two statistical institutes.

Deliverable 7.1 (M32), Report on the definition of the information sources and the variables to be included in the register; the quantitative analysis of the coverage by means of macro/micro comparisons with other information sources; the integration with the Business Register and implementation of a prototype of LEED; development of methodologies for the release of samples of business micro data from administrative registers and integrated micro data and on the progress in the area of methodologies for producing synthetic data. Case Study on: problems and solutions for the release of a Tax File for Italy; release of business micro data.

Deliverable 7.2 (M34), International accessibility: Investigation of ways to increase the accessibility at international level of integrated micro data with a view to harmonising dissemination strategies; both the release of samples and direct access to micro data will be tackled.

A technical concept for an RDC-in-RDC approach has been developed. It was decided to use a Citrix solution, as it is proven software that is used by other RDCs (the Netherlands, UK, US) and is also being considered by Eurostat. This report includes reporting on the implementation of the hardware test of the Citrix solution for the secure internet connection for remote access.

Deliverable 7.3 (M30), by Robert Griffioen, Arnout van Delden and Peter-Paul de Wolf (2012), Report on methods for managing the quality of statistical output and development of a set of indicator types for monitoring the quality of output and of its half-products. The deliverable is split into two sub-products: (a) Overview of quality frameworks for possible use in NSIs and (b) Guidelines for deriving indicators for process and chain management at NSIs.

Main conclusions and recommendations reached under WP7 - Important results have been reached by the methodological work on the release and access to microdata, that is: the development of a strategy for releasing a sample of microdata stemming from administrative sources; the use of nonparametric methods for reducing the modelling burden when generating synthetic datasets; the non feasibility of a synthetic data approach for generating dummy datasets of a LEED; initial testing shows that the disclosure risk measure is monotonic and comparable. On developing a technical concept for a RDC in RDC approach it was decided to use a Citrix solution.

----------------------

Work package 8 “Methodological case studies”

Coordinated by Miroslav Hudec (INFOSTAT)

other partners involved: ISTAT, CBS, SSB, UL, UNIBG, SCB, SORS

The focus of WP8 was on testing and evaluating the methodologies in NSIs’ practices, an essential part of the accomplishment of the BLUE-ETS goals. The work package was focused on the evaluation of case studies ensuing from previous research on the reduction of response burden and motivation enhancement, the measurement of the quality of administrative sources, and innovative tools and procedures developed for data collection and analysis.

Accordingly, WP8 consisted of the following undertakings:

• designing methodological case studies, to test and evaluate promising approaches;
• implementing tailored case studies.

Work performed - Within the research framework of WP2 and WP3 on response burden and motivation, analysis focused on empirically testing and evaluating promising approaches, tools and methods developed to (1) study causes and consequences of response burden, and (2) to reduce response burden and to motivate businesses for better reporting.

Concerning the work on the quality of administrative sources, reporting presents the findings of the evaluation of various administrative data sources used by Dutch, Italian and Swedish National Statistical Institutes. Results include also the manual for the methods implemented in the data quality R-package and the most recent version of the Quality Report Card for Administrative data (QRCA).

Regarding soft computing, the analysis focused on empirically testing and evaluating promising approaches developed in WP5. Data used in experiments were taken from Slovak Intrastat database, Slovak administrative data related to respondents’ duty and realized trades, and data from interviewing of businesses in Slovenia.

Deliverable 8.1 (M35), edited by D. Giesen, M. Bavdaž and I. Bolko (2013), Comparative report on integration of case study results related to reduction of response burden and motivation of businesses for accurate reporting. It has been positively reviewed by three consortium members and two external reviewers. The final version was submitted to the coordinator in February 2013.

Deliverable 8.2 (M34), by P. Daas et al. (2013), Guidelines on using the prototype of the computerized QRCA version and Report on the overall evaluation results. The Report has been positively reviewed by three consortium members and one external reviewer. The final version has been submitted to the coordinator in March 2013. The main purpose of this report is to demonstrate that the quality framework developed can actually be used on real world administrative data. It also allowed us to test the methods included in the ‘data-quality’ R-package.

Deliverable 8.3 (M32), by M. Kľúčik et al. (2012), Final report on the case study results on usage of IT tools and procedures developed for data collection. It has been positively reviewed by two consortium members and three external reviewers. The final version was submitted to the coordinator in November 2012. The main purpose of this report is to demonstrate on real data the applicability of soft computing methodologies to improve some aspects of the collection and the quality of data for business and trade statistics. The methodologies have been developed within the framework of Work Package 5.

Deliverable 8.4 (M35), edited by M. Hudec, Evaluation report on case studies

It has been positively reviewed by consortium members. The final version was submitted to the coordinator in February 2013. The aim of this report is to provide an overall synopsis of the case studies carried out in deliverables 8.1, 8.2 and 8.3.

Research findings were presented at the 4th International Conference on Establishment Surveys (ICESIV), Montreal, Canada, 11-14 June 2012 and the NTTS Conference in Brussels, 5-7 March 2013.

Main conclusions and recommendations under WP8

Response burden can affect the response behavior of respondents and, through that, the quality and costs of data collection; perceived response burdens are affected by both actual burden (time spent) and respondents’ attitudes (how they feel about the statistical agency and the usefulness of its statistics), thus it is important to manage and to monitor actual and perceived response burden (at the individual level of every business and not just at the aggregate). It is not straightforward to affect motivation and ensuing response behaviour with communication strategies, so NSIs should develop strategies to make respondents feel more positively about NSIs and about the usefulness of statistics. Research should investigate causal relationships between characteristics of survey design, perceived and actual burden and response behaviour as well as motivation in this framework. In order to enhance data use and influence motivation for participating in surveys, NSIs should promote a more positive image of themselves and the usefulness of official statistics. Interventions should be tailored to the needs of a specific business. Questionnaire design should be improved as well as the collection and analyses of data stored for statistical production purposes. Finally, it would be useful to have a more centralised and systematic registration of business requests about statistics to develop and promote statistical products that businesses find useful and interesting.

NSIs use administrative data with various intensities but the common problem is quality measurement of these sources. NSIs have to cope also with the fluctuating quality of administrative sources. The quality indicators developed could be applied to all data sources that potentially provide valuable information for official statistics. As such, new sources with data collected by others, such as so-called Big Data, can be evaluated by them.

Examination of statistical knowledge and tasks reveals that traditional tools cannot cope efficiently with all demands.

Tools based on neural networks and genetic programming could help in the estimation of missing values by searching for similar patterns in large incomplete databases. Moreover, soft computing approaches can estimate attributes which are not the focus of data collection (optional fields in questionnaires) but could be relevant for national and international statistical institutes.

The strength of reminder letters should be adapted to the length of non-response and the date of establishment of the reporting duty, but this is labour-intensive work. Flexible (fuzzy logic) selection offers a robust and easy way of selecting respondents for tailored reminders. An alternative is rule-based selection, which detects relational dependencies (rules) in databases for the evaluation of estimated values and extracts respondents’ behaviour.
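
A fuzzy selection of this kind can be sketched with two linear membership functions combined by a minimum (a common choice for a fuzzy AND). The thresholds, variable names and the use of the minimum operator are assumptions made for illustration; they are not taken from the WP5 or WP8 deliverables.

```python
def ramp(x, low, high):
    """Linear membership degree: 0 at or below low, 1 at or above high."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(1.0, max(0.0, (x - low) / (high - low)))


def reminder_strength(months_late, years_reporting):
    """Degree (0..1) to which a respondent should get a strong reminder.

    Combines "long non-response" AND "long-established reporting duty"
    with the minimum operator. All thresholds are hypothetical.
    """
    long_delay = ramp(months_late, low=1, high=6)
    established = ramp(years_reporting, low=0, high=3)
    return min(long_delay, established)


strong = reminder_strength(months_late=6, years_reporting=5)      # 1.0
partial = reminder_strength(months_late=3.5, years_reporting=3)   # 0.5
none = reminder_strength(months_late=1, years_reporting=5)        # 0.0
```

Unlike a crisp rule ("more than 3 months late"), the degree changes gradually, so respondents just below a threshold are not treated completely differently from those just above it.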

Fuzzy classification can help identify key data users and respondents and reveal their potential and weaknesses, so as to ensure that similar respondents/data providers are always treated similarly. Flexible classification provides significant support for tailored motivation strategies.

Conclusions in WP3 led to the proposal of a synergy between motivation of respondents and soft computing: as a result, a proposal for an ERC Synergy Grant (acronym FuzzyMotiv) has been sent to the EC.

----------------------

Work package 9 “New types of indicators - Applying results”

Coordinated by Donatella Fazio (ISTAT)

other partners involved: UNIBO, CEPS, UNIFI

Focus and main objectives of WP9 - This WP focused on linking data to indicators and issues, and tailoring the former to the latter, not just to generic averages or dimensions and concepts, which are not necessarily consistent or appropriate and do not necessarily support any analysis (no matter the level of specification or the dimension and, more generally, the issues which are being dealt with). The growing mismatch and heterogeneities between statistics, concepts and analyses make it increasingly important to develop practices and tools which permit flexibility and choice and, in particular, greater tailoring of the statistics and their matching with analytical requirements. The WP made it possible to look at the challenges and the state of the art in new, as yet mostly uncharted or badly charted, fields, to link the outputs and knowledge created under EU FP projects better with statistics and measurement, and to assess how official statistics need to and can adjust to reflect and capture structural jumps and change.

The scientific work under this WP addressed the following main objectives:

1. bringing new elements to our understanding of the scope for and the obstacles to, a more satisfactory handling of intangibles in a future revised system of national accounts, including also the feasibility of introducing extensions of the existing accounts so as to allow the calculation of alternative measures of intangible investment and assets in a consistent framework;
2. developing better indicators on business collaboration and Collaborative Working Environments (CWE);
3. discussing the need in economic and business statistics to portray and define “rural areas” appropriately, as their implications go beyond agriculture and cover many economic, social and environmental issues;
4. advancing on the measurement of competitiveness by means of an index for measuring the role of Public Institutions (and policies) on industrial competitiveness.

Work performed - In order to achieve the above mentioned objectives different tasks have been implemented.

The first task, dealing with a review of the present state of the art on intangible investment and intangible assets in the structural business statistics and the national accounts, has produced a broad review of measurement problems and approaches concerning intangibles and associated statistical issues.

With regard to the task on Collaborative Working environments, the work performed has dealt with an analysis of the current initiatives in official statistics to develop new statistical indicators on business collaboration and Collaborative Working Environment (CWE) as well as improvement of the data availability not only on the use of collaboration tools but also on the skills needed for this practice. Analysis describes two recent initiatives introduced by Istat. Among them the introduction of an ad hoc module in the Community Survey on ICT Usage and e-Commerce in Enterprises could be considered as “best practices” to be extended to all EU NSIs.

Concerning the task on competitiveness, the analysis has dealt with the construction of a multidimensional competitiveness index applied to enterprise micro-data and with the methodologies to decompose the multidimensional index. The index has been developed within the Information Theoretic framework, following an approach similar to that of the multidimensional inequality index of well-being. The proposed tool allows multidimensional competitiveness to be analysed at different levels of aggregation (country, region or sector) with a micro-level foundation. Further work has dealt with the development and implementation of a decomposable indicator for public institutions’ contribution to competitiveness, developing an index that has been applied in a case study for Italy. Results have shown that the index can provide a measure of effectiveness for major types of public policies such as Health care, Public Transport, Water and waste management, Education, Police, etc.
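
What "decomposable" means for an information-theoretic index can be illustrated with the one-dimensional Theil T index, which splits exactly into a within-group and a between-group component. This is a stand-in chosen for illustration: the project's index is multidimensional and is not reproduced here, and the group labels and firm-level values below are invented.

```python
import math


def theil(values):
    """Theil T index of a list of strictly positive values."""
    mean = sum(values) / len(values)
    return sum((x / mean) * math.log(x / mean) for x in values) / len(values)


def decompose(groups):
    """Split the overall Theil T index into (within, between) components.

    groups maps a label (e.g. a region or sector) to its list of values.
    Each group is weighted by its share of the overall total.
    """
    all_values = [x for vals in groups.values() for x in vals]
    total = sum(all_values)
    overall_mean = total / len(all_values)
    within = 0.0
    between = 0.0
    for vals in groups.values():
        share = sum(vals) / total
        group_mean = sum(vals) / len(vals)
        within += share * theil(vals)
        between += share * math.log(group_mean / overall_mean)
    return within, between


# Invented firm-level performance values for two hypothetical regions.
firms = {"north": [1.0, 2.0, 3.0], "south": [2.0, 4.0]}
within, between = decompose(firms)
overall = theil([x for vals in firms.values() for x in vals])
```

The two components add up to the overall index, so the share of heterogeneity attributable to differences between regions (or sectors) can be read off directly.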

With regard to the “rural areas” issue, first analysis dealt with rural development issues and definitions investigating the key statistical categories considered at National Accounts level. Then, the issues of “rural areas” identification and territorial disaggregation of indicators are specifically considered, and the applied methodology is explained. Finally, a summary of results and graphical representations based on Italian data from Italian National Statistical Office (Istat) have been reported, providing a useful tool for policy makers at a National, Regional and local level.

Deliverable 9.1 (M34), by J. Mortensen (2013), Report on the methodology and review of the current position as regards rules for handling intangibles in business and national accounting in the EU and in the international community; the implication of the findings of the FP7 projects INNODRIVE and COINVEST for the further development of permanent statistical tools; the findings and policy recommendations concerning the future handling of intangibles in business and national accounts.

Deliverable 9.2 (M34), by R. Bernardini Papalia, V. Patrizii, A. Righi et al. (2013), Report on decomposable indicators on industrial competitiveness and business collaboration and CWE

Deliverable 9.3 (M34), by C. Piccini and E. Pizzoli (2013), Report with statistical results, based on available data studies, and provision of suggested innovation on official Rural Development Statistics.

Main conclusions and recommendations under WP9

Conclusions and recommendations can be drawn from the work related to the measurement of intangibles in national accounts statistics and associated general measurement issues: the review of measurement problems and approaches concerning intangibles concludes without ambiguity that the compilation of all these and a number of other indicators poses a host of methodological questions that need further examination. In particular, there is a most pressing need to promote alternative indicators for the basic economic performance and, in particular, productivity.

Analysis on CWE has stressed the requirement for official statistics to investigate the opportunities offered to the enterprises by CWE tools and to monitor the use of these practices in the enterprises more frequently. Attention on these issues should be paid at a European and international level. ISTAT’s experiences on these issues can be exported in other European countries providing the basis for a new accumulation of knowledge that can be managed to the benefit of competitiveness and productivity of the economic system.

Analysis on competitiveness has shown that new indicators can be developed to translate the distribution of firm characteristics into measures of competitiveness designed to capture not only average performance but also the heterogeneity of firm performance. The decomposition of the index according to some characteristic of the units can supply information on the determinants of competitiveness. In this respect, the analysis should be extended to relevant characteristics such as the degree of internationalization or innovativeness. Further work is aimed at deriving methods to assess the relative importance of several potential explanatory factors simultaneously, as opposed to the traditional decomposition methods.

With regard to the analysis dealing with the development and implementation of a decomposable indicator for public institutions’ contribution to competitiveness, results show a large territorial variability in the productivity of Italian local public services and a differentiation in terms of both levels of government and types of services. This type of analysis makes it possible to identify, with some degree of precision, the areas, services and tiers of government whose lower efficiency constitutes a potential obstacle to growth and a cause of waste in terms of public expenditure.

Finally, concerning the work on rural areas, the analysis covers a wide spectrum of topics, which gives a broad view of the problems involved and an indication of possible developments in socio-economic data spatialization. Extended to wider areas and larger databases (including, for example, a larger set of economic activities), this analysis provides the basis for a further attempt to improve the performance of geo-statistical methods applied to socio-economic data.

----------------------

Work package 10 “Improve the Dialogue across EU NSIs and Stakeholders”

Coordinated by Jorgen Mortensen (CEPS)

other partners involved: ISTAT, CBS, UL, UT

Focus and main objectives of WP10 - The activities under this WP consisted mainly of meetings, e.g. conferences, seminars and workshops with involvement of all stakeholders: policy makers (national and EU), public administrations, businesses, academics and researchers (statisticians, but not only), NGOs and users of statistics in general. An essential feature of these activities was to prepare issues papers, syntheses of proceedings and ensure that contributions are disseminated widely and efficiently through the web sites of the consortium partners, including, in particular, the web site of CEPS.

Key objectives of this WP:

- Ensuring high quality of dissemination, by subjecting all deliverables and publications to internal and external refereeing.
- Fostering the dialogue and ensuring dissemination among the business community and allied national and international organizations, NSIs, academics and research/users, governments and national and international organizations in general;
- Establishing operational contacts with EUROSTAT, relevant DGs of the European Commission, the OECD; and to the largest possible extent coordinating the programme and scheduling of these events in very close consultation with these organizations.
- Effectively disseminating BLUE-ETS “advancements and innovations”, and ensure that its R&D outputs are appropriately disseminated and fed into EU and non EU NSIs and Eurostat “production processes”.

Work performed

Activities under this work package have been completed, in accordance with the work programme.

To effectively plan and organize meetings and conferences, ad hoc focus groups consulted and held active discussions with the different stakeholders and key experts. This has made it possible to better grasp the key issues and challenges and, more importantly, to focus conferences/workshops on exactly those subjects which have been deemed paramount under MEETS and are actively being pursued by NSIs and EUROSTAT with other means, e.g. ESSnets. To this effect, BLUE-ETS representatives have participated in several ESSnet workshops and NTTS conferences.

The following activities have been performed:

- Presentation of the main features and objectives by the scientific coordinator, Paolo Roberti, to the Business Statistics Directors Group (BSDG) meeting in June 2010 in Luxembourg, offering a first collection of comments and suggestions from this audience (deliverable 10.1).
- The collaboration with WP2 (NSIs’ practices concerning business burden and motivation) and WP3 (Business perspectives related to NSIs’ statistics), in the organization of a Conference on NSI practices concerning business burden and motivation and business perspectives. This conference was held on the premises of Statistics Netherlands, in Heerlen, the Netherlands, in March 2011. Synthesis of the proceedings of the Heerlen Conference has been prepared by CEPS for the general public and information of stakeholders.
- The preparation of the collection and dissemination of the proceedings of the Heerlen Conference by the teams of Statistics Netherlands and the University of Ljubljana.
- Participation of BLUE-ETS in ESSnet workshops in 2011 and 2012.
- Preparation of general workshops and proceedings with consortium partners and key stakeholders.
- The organization of a BLUE-ETS Session, in cooperation with Eurostat, plus posters at the NTTS Conference, 22-24 February 2011, Brussels.
- Participation of BLUE-ETS in the NTTS conference on March 6-7 2013 in Brussels.
- Organisation of the Final Conference of BLUE-ETS on March 8, 2013 in Brussels.

In addition BLUE-ETS intermediate findings were presented at the following events:

- SIS 2011 Statistical Conference – 8-10 June 2011, Bologna, where Istat researcher Luisa Franconi submitted the paper on “Current development and potential strategies for microdata access at Istat” (paper available in the private area of the Blue-ets website).
- 7th Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality (Tarragona, Spain, 26-28 October 2011), where Istat researcher Flavio Foschi presented the paper on “Disclosure risk for high dimensional business microdata”. The paper presents some simulation results from the project’s research work on the control of the disclosure risk, together with an application to business census-like microdata stemming from the Enterprises' System of Accounts Survey (paper available in the private area of the Blue-ets website).
- 2nd European Establishment Statistics Workshop - EESW11, 12-14 September 2011, Neuchatel, Switzerland, where A.R. Griffioen, A. van Delden and P.P. de Wolf (CBS, NL) made a presentation on “Key elements of quality frameworks, to be applied to statistical processes at NSI’s”.
- 4th IAB Workshop on Confidentiality and Disclosure, Nuremberg, Germany, June 2011 where Natalie Shlomo (UoS) made a presentation on “Assessing the Disclosure Risk of Perturbed Enterprise Data” (available in the private area of the Blue-ets website).
- Deliverable 10.1 (M2), “Report on the Initial Workshop” (presentation made by Paolo Roberti, the Scientific Coordinator of the Blue-ets project, to the Business Statistics Directors Group Meeting held in Luxembourg on June 10-11, 2010). Prepared by ISTAT. Submitted on time.
- Deliverable 10.2 (M11), Report (prepared by ISTAT) on the Conference on Burden and Motivation in Official Business Surveys. It includes an annex by Jorgen Mortensen, which is a very comprehensive paper on the proceedings of the Conference. Submitted on time.
- Deliverable 10.3 (M36), Final Conference. The event, organized by CEPS, was held in Brussels, European Commission, Charlemagne building, Jenk room, on March 8, 2013.

Main conclusions and recommendations under WP10 - The main recommendations ensuing from WP10 R&D activities include:

The presentation made by the Scientific Coordinator at the Business Statistics Directors Group (BSDG) mainly gave the opportunity to state the need for co-ordination between MEETS and the BLUE-ETS project. The BSDG took note of the BLUE-ETS initiative and thanked Mr Roberti for his presentation, expressing at the same time the need to define the way the project and MEETS relate to each other in order to clarify and coordinate the work in the two areas without overlapping.

The BLUE-ETS Conference on Burden and Motivation in Official Business Surveys was held March 22 and 23, 2011 at Statistics Netherlands’ premises in Heerlen, the Netherlands, which gathered several BLUE-ETS Consortium partners and many representatives of NSIs and other organisations. There were 69 participants from 35 organisations and 21, mostly European, countries. The conference provided an opportunity for presenting and discussing the findings of research on data collection methodology for business surveys. Some of the recommendations that emerged from the event were: the need to understand burden elements from the perspective of business; the public good nature of official statistics that should be assessed from a use/usefulness perspective; the need to switch attention from burdens to gratifications, from reactivity to pro-activity, and from traditional internal, administrative attitude to communication and learning from businesses; the need to specify operational definitions for the concepts of burden and quality. Finally, it was pointed out that future research should study together processes in business and burden, search for balance between time burden and cognitive burden in survey planning, and start cognitive testing of products offering tailored statistics.

The BLUE-ETS Final Conference was held on March 8, 2013 in Brussels at the European Commission, Charlemagne building. The Conference was organised by CEPS the day after the end of the NTTS conference, which hosted a session on the BLUE-ETS project. The Final Conference was the opportunity to present the main results of the project to a large audience of stakeholders and to discuss future research priorities in the field of modernisation and quality of business statistics.

----------------------

Work package 11 “Scientific coordination, Dissemination and Evaluation”

Coordinated by Paolo Roberti (ISTAT)

Other partners involved: CBS, UNIBO, INFOSTAT, UL, UNINA

The focus and main objectives of WP11 were:

- Scientific governance, quality-across-the-board and decision making, which cannot simply be expected but have to be actively and cooperatively sustained. They are key to smooth and successful implementation, quality, effective delivery and impact.
- Dissemination and take-up in a European dimension: not just a few NSIs, but the whole ESS.
- “Learning lessons”, setting standards, internal and external evaluation, and impact assessment which, as per the BLUE-ETS Annex I, cut across the board, from overall coordination and implementation to planning and core-group supervision, issues, quality, standards, timeliness, and overall and specific evaluation of BLUE-ETS deliverables, under the supervision of the BLUE-ETS Consortium and, more specifically, the:

o Steering Committee
o Project Management Board

and a commitment which goes well beyond fulfilling a contract or delivering high-quality and robust statistical information per se, extending to the delivery of new statistical information and tools which back the:

- renewed Lisbon strategy and Europe 2020 (i.e. the three EU Commission Communications on Better Regulations for Growth and Jobs; Action Programme for Reducing Administrative Burdens; and Reduction of the response burden, simplification and priority setting in the field of Community statistics); and
- EU Parliament and Council MEETS commitment(s) (on reducing the response burden; simplifying and setting priorities; cutting costs for enterprises stemming from red tape, over-regulation and duplication; making data collection less burdensome and providing more information; modernizing and re-engineering the methods for the production of statistics).

and fosters

- cooperation/collaboration and networking between WP leaders, NSI statisticians, academics, and national, international and kindred organizations and, more generally, excellence in BLUE-ETS delivery and dissemination.

Work performed

During the life cycle of the BLUE-ETS project, activities have overwhelmingly focused on issues and challenges allied to the MEETS Programme on business statistics, notably reducing response burdens, simplifying, cutting costs, modernizing and re-engineering methods, and providing more information.

Implementation and dissemination of the BLUE-ETS R&D have focused on the “spill-over” of new BLUE-ETS knowledge across the NSI community and the ESS and, in particular, on networking with EUROSTAT, with a view to joining forces and cooperating (e.g. complementing the work done with allied ESSnet R&D investments, and joint seminars and workshops) and thus having an impact.

Seven deliverables were scheduled under WP11.

• Deliverable 11.1 (M1), Project Prospectus - A first draft version of the BLUE-ETS Project Prospectus (produced by ISTAT, which was in charge of this deliverable) was presented at the kick-off meeting of the project held on 08-09 April 2010 in Rome and was officially delivered on the occasion of the two BLUE-ETS meetings (Steering Committee and Project Management Board) organised in Brussels on 21 February 2011. Copies were distributed at several national and international events in order to inform the scientific community about the project and its aims. The brochure is downloadable from the project’s website.
• Deliverable 11.2 (M6), Development of a website for the project - The first version was delivered on time. Since then the project website has been constantly maintained in compliance with the recommendations defined in the EC guidelines. In particular, the "Best Practices for Project Coordinators for the FP7 SSH project websites' development and maintenance", published by the EC at the end of 2010, required a reshaping of the website. The reshaped version was released at the end of October 2011. The website has been an efficient tool for disseminating the ongoing activities of the project and the diverse initiatives related to its topics which took place during the 36 months of BLUE-ETS.
• Deliverable 11.3 (M7), Workshop on pilot data collection at NSIs and businesses and to discuss other preliminary research results, challenges and directions - Held in Ljubljana on 7-8 October 2010. The workshop was organized by partners UL and CBS, and all the partners involved in WP2 and WP3 participated and contributed.
• Deliverable 11.4 (M16), Workshop to select ideas gathered on promising approaches to burden reduction and motivation enhancement for testing and to discuss phase 1, present tools and research results - Held in Heerlen on 24 March 2011. The workshop focused on the results from WP2 and WP3 and the preliminary plan for WP8.
• Deliverable 11.5 (M32), Workshop to discuss phase 2 advances, future tools and research results - Held in Rome on 8-9 November 2012. The workshop focused on issues related to data collection methodologies, statistical methods for the production of disaggregated estimates for business statistics, methodologies to deal with business data integration, access and disclosure risks, and new indicators to better tailor data to research needs. In particular, the workshop covered the work performed in work packages 5, 6, 7, 8, 9 and 10, presenting results in preparation also for the Final Conference of the project.
• Deliverables 11.6 and 11.7 (M36) refer to the Final plan for the use and dissemination of foreground and to the Report on Societal Implications, which are parts of the Final Report (Deliverable 1.2).

Main Conclusions and recommendations under WP11 - The main message

The scientific coordination implemented during the whole life cycle of the project was aimed at ensuring that the activities fully complied with the objectives stated in the work plan. All planned deliverables have been submitted and all the envisaged tasks have been performed. In this regard BLUE-ETS has successfully met its objectives, fulfilling its promise of supporting the EU MEETS commitment and ensuring valuable “spillovers”, a spread of benefits and valuable feedback across the ESS.

All deliverables have passed an external review and, once they had passed the EC evaluation, were made available on the project’s website. The website and the numerous dissemination activities, both those included in the project’s work plan and those external to the project, have helped to create fertile ground for stimulating research and debate, with the final aim of improving the quality of statistical information on businesses and of supporting policymaking and research in general.

Finally, the research effort performed by the project’s consortium, which has seen the participation of several EU NSIs, should also help to map the road for injecting more resources in FP calls in areas considered strategic for the development of official statistics.

One important lesson emerging from the review of the process of modernisation of European statistics is that this process involves innovation both of concepts and statistical methodology, extended use of administrative and accounting data coupled with constant efforts to ensure a certain degree of harmonisation between different sources of statistical information and, last but not least, exploitation of the scope for technological innovation both as regards the collection of data and the transmission within the European Statistical System.

The review however also suggests a need for enhanced attention to the consistency of the activities in the different areas and, in particular, a need for a high degree of harmonisation of concepts and methods with the aim of ensuring that the overall quality of statistical data is not lost in the process:

• Whereas the establishment of annual accounts constitutes a significant share of the administrative and regulatory burden on enterprises, available data show that pure statistical reporting is only a tiny part (about 1%) of this burden. In fact, regulation related to social issues, health and safety in the workplace, protection of consumers, the environment, chemical substances etc. remains the essential part of the regulatory burden and is often framed by EU directives.
• The endeavours to reduce the burden of statistical reporting through, notably, reduction of the obligation for SMEs to prepare annual accounts should be weighed against the risk of reducing the quality and quantity of available data and the potential increase in the burden on, and obligation for, the NSIs to undertake additional sample surveys to allow compilation of essential and high-quality information on different aspects of the economy.
• The detailed examination of differences between EU accounting standards and the definitions in the Structural Business Statistics has identified a number of differences of which some can be dealt with thanks to appropriate breakdown of the relevant items while others persist due to conceptual disparities. As such disparities cause either a considerable additional burden of work for NSIs or, worse, a persistent loss of quality of the data, these disparities should be eliminated through adjustment of the accounting standards or, if not feasible, adjustment of the SBS definitions.
• Approximation of concepts and methods is also a necessary condition for modernisation of statistics through increased use of administrative and accounting data, which requires added efforts with regard to integration of data sources and the setting up of “data warehouses”. As demonstrated by the experience of certain Member States with advanced application of ICT in the production of derived statistical data, efficient collaboration between various branches of public administration is essential and may justify the creation of a single “window” for collection of data from enterprises and households.
• In the perspective of the creation of a single window for collection of data from enterprises the need for harmonisation of concepts (including definitions and breakdown of series) is even more apparent as a main feature of modernisation.
• However, reconsideration of modernisation through harmonisation of concepts and methods will also need to take into account the need for renewal of paradigms and for elaborating new statistics for new activities or branches which, hitherto, have been the source of serious mismeasurement. This is the case with respect to the measurement of immaterial investments and assets, which have been in the focus of analysis and debates already for decades. However, more recently the needs for modernisation are also enhanced by the recognition of the pressure for better measurement of activities which hitherto have been neglected as being outside the conventional measurement of output (GDP).

The overall conclusion and recommendation is therefore that the various efforts to modernise European statistics would benefit from additional measures to ensure consistency between the various activities and concepts and with, possibly, enhanced concern for the quality and relevance for the decision-making process in an increasingly complex socio-economic environment.

Potential Impact:

In an ever more integrated world economy, statistics have to capture the fundamental changes associated with globalisation by reflecting the new phenomena adequately. The current economic crisis, moreover, has called on statisticians to provide policymakers and civil society with reliable indicators and impartial and objective statistical information. To this effect the MEETS Programme, adopted in December 2008, stated the necessity to adapt business statistics to the new needs and, at the same time, to adjust the production system to new sources of information and reduce the burden on businesses. It is against this background, and in line with the renewed Lisbon Strategy and the Europe 2020 strategy, that the BLUE-ETS project has made its contribution by demonstrating that the two challenges do not contradict each other: efforts to reduce statistical burden can be combined with the development of new methodologies that improve the quality of statistical information on businesses and the relevance of data for policy use. Scientific developments and new opportunities offered by BLUE-ETS can be viewed from different perspectives.

- From the perspective of data producers, the project has proposed new methodologies for measuring and reducing the statistical burden, stressing the need to establish international standards for measurement and to monitor and document the effects of actions intended to reduce response burden. Research on new methodologies to measure and improve the quality of administrative data used for official statistics on enterprises has developed a Quality Report Card and scripts to assist NSIs. The need to determine the effect of the quality of administrative data on the statistical products of NSIs and to develop a standardised methodology has been stressed. Moreover, the research activity has focused on the applicability of soft computing and text mining techniques to data sets in order to improve the collection and quality of data for business and trade statistics. Research has developed text mining tools for extracting useful information quickly and cheaply, a fuzzy logic tool for tailored reminders and classification of respondents, and a tool based on genetic programming, and has produced results on the use of neural networks.
- From the perspective of data users, the project has investigated and proposed ways to improve the availability of statistics from existing data by constructing disaggregated estimates for business statistics, by developing methodologies to deal with business data integration and by testing methodologies to increase the accessibility and release of business, administrative and integrated microdata. In more detail, the analysis has investigated variance estimation methods as well as small area estimation techniques in order to enhance the quality of business statistics, and has developed a synthetic database for conducting simulations in a realistic setting. Research on data integration has developed a first prototype of the Italian Linked Employer-Employee Database. As to access and release issues, new methodologies have been developed to release complex microdata, together with methods to create microdata presenting the same structure as the original ones and practical approaches to accessing microdata in foreign countries.
- From the perspective of enterprises, as data users and providers, the project has analysed and proposed ways to reduce statistical burden and to improve the motivation for using NSIs’ data and reporting in NSIs’ surveys, on the assumption that enterprises are an important stakeholder for official business statistics. Research has highlighted the need for NSIs to improve, codify and structure knowledge on business use of NSI statistics and to actively work on improving statistical literacy among (business) users. Further research should investigate ways of raising motivation for better reporting in NSI surveys and of moving from the publicity principle towards dialogue with users of NSI statistics.
- Finally, from the perspective of policy makers, the project has developed new indicators assessing how official statistics can adjust to reflect and capture structural jumps and changes, improving the quality and relevance of statistics for policy use. To this effect, research has dealt with measurement problems of intangible assets and with competitiveness issues, developing a multidimensional competitiveness index and a decomposable indicator for public institutions’ contribution to competitiveness. Moreover, the analysis has investigated indicators for business collaboration and Collaborative Working Environments, and indicators for rural areas and their territorial disaggregation.

Three broad main conclusions or ways forward for further modernisation of business statistics emerge from the BLUE-ETS project. They reflect the fact that modernisation of business statistics touches upon a multitude of themes and that, at present, there is not a clearly defined and broadly accepted list of priorities to guide the process.

First, data quality is important and should be a trademark of NSIs in the information market, regardless of where NSIs are placed in the current, much more nuanced information market compared to the monopoly of information provision that NSIs enjoyed earlier.

Second, collaboration and standardisation are key to efficient, cost-effective modernisation of the production of business statistics. Collaboration is needed both among NSIs, in order to avoid duplication of effort (the Commission and Eurostat should play a role in fostering this), and between NSIs and academia, in order to facilitate mutual learning and knowledge transfer and to improve research by building on each other’s strengths. Standardisation should facilitate the adoption of a plug-and-play environment which encourages the sharing of tools among NSIs. International collaboration should, however, be organised and fostered not only at high levels but also from the bottom up, with regard to implementation issues and the sharing of experience. This becomes even more important within standardisation and plug-and-play systems, in order to guarantee success.

Third, communication and dissemination are ever more important. Improved communication among NSIs, academia and the policy environment, notably the Commission and Eurostat, is needed to get the most out of collaborative projects. This should help to get everyone on “the same page” as regards issues and outcomes. Furthermore, communication with other fields, in particular within the social sciences, can help bring statistical concepts and classifications more in line with the real world and the requirements of policy research. Improving NSIs’ communication with businesses should also be a clear priority in moving forward with modernisation of business statistics. This calls for a holistic approach encompassing survey communication, communication from helpdesks and communication/dissemination on the usefulness of statistical products to businesses.

In this perspective the BLUE-ETS final conference offered a unique opportunity for exchange between major stakeholders in the process of innovation of tools and methods: the United Nations Economic Commission for Europe, the OECD, Eurostat and key NSIs and researchers.

The debate at the BLUE-ETS final conference in particular illustrated a need for priorities to be set and integrated into the current structure of on-going cooperation among NSIs and Eurostat within the ESSnet framework (as well as other initiatives).

The results attained by BLUE-ETS can hopefully be fed into the community of EU NSIs in the form of lessons and guidelines. The project’s website and the numerous dissemination activities have helped to create fertile ground for stimulating research and debate, with the final aim of supporting policymaking and research in general. In a nutshell, the research effort performed within BLUE-ETS has tackled some of the major challenges being faced in the field of European official business statistics.

There is thus a pressing need to set the priorities and get to work, since the future is already here!

List of Websites:

www.blue-ets.eu

final1-blue-ets-figure-1.pdf