Blog

We are glad to announce the establishment of the new group on Applying Data Science and Modern Methods in 2022.

In view of the increasing importance of new data sources and methods for the compilation of official statistics, this new group aims to identify data science initiatives and new methods needed for modernising existing business processes.

Potential activities include: (i) developing supporting materials to help implement case studies and good practices, (ii) organising workshops and training sessions to promote and ensure consistent use of the HLG-MOS supported data science initiatives and modern methods, and (iii) managing periodic reviews of new data science initiatives and methods to measure their impact.

The group has already started conducting a market landscape analysis to take stock of existing work in the field of data science and modern methods. It will help identify specific needs, challenges, and opportunities that support modernising statistical production and services.

Some potential areas of work that the group is considering include (i) use cases on data collection methods, (ii) a data platform that facilitates access to and integration of data from different sources, (iii) responsible AI, (iv) modelling methods for different types of data, and (v) automated dissemination with user-centric services.

We look forward to welcoming you to participate in the group and other activities.

Please contact Wai Kit Si Tou if you are interested in joining the group.

Machine learning for official statistics: UNECE helps statistical organisations harness the power of machine learning

The new UNECE publication Machine Learning for Official Statistics helps national and international statistical organizations harness the power of machine learning to modernize the production of official statistics.

Machine learning in the context of modernisation for statistical organizations

Statistical organizations produce crucial indicators that portray various aspects of the economy and society that we live in. These include measures such as gross domestic product, the inflation rate, population growth, and the unemployment rate, on which government and business alike depend when making important decisions. While these may look like just a few digits to the end user, statistical organizations employ a series of carefully designed and executed processes to distil this key information from vast amounts of raw data.

With increasing challenges arising from new data sources, technological developments and competition from private companies, statistical organizations have been striving to modernise every part of this production process to provide more relevant and detailed official statistics in a more timely and accessible manner. They use computer-assisted interviewing and web scraping tools to collect data more efficiently, and build data infrastructure and IT tools to manage data more easily across the organization.

Yet one area that remains difficult to modernise is the set of processes that require “human-like” decision-making, such as reading a textual description to assign a matching classification code, or looking at an image to identify what it represents. Traditionally, this has been done either manually or through a complex rule-based system, both of which are costly, time-consuming and hard to manage. This is particularly daunting when statistical organizations try to use big data sources (e.g., price information web-scraped from online stores), as the cost of processing such a large amount of data in the traditional manual way is simply prohibitive.

Machine learning holds great potential for statistical organizations

Recent developments in machine learning techniques are pushing the boundary between tasks considered suitable for humans and those suitable for machines: machines can now paint in the style of an old master and write an article much as a human would.

How does this technology work? In one of the most popular approaches called “supervised learning”, machines are first trained on the data that humans labelled, for example, images labelled as “urban” or “rural”. With this data, they figure out patterns associated with labels by adaptively improving their internal logic that maps from the input (image) to output (label). In this way, machines can determine whether an area shown in an image is urban or rural without us providing all possible rules explicitly.
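The supervised learning loop described above can be illustrated with a minimal sketch. It is not taken from the publication: the two numeric features (share of built-up pixels and share of vegetation pixels), the six labelled examples and the function names are all hypothetical, and a simple perceptron stands in for the far richer models used on real images. The point is only to show the "internal logic" being adjusted each time the machine mislabels a training example.

```python
# Hypothetical labelled examples: each "image" is reduced to two made-up
# features (share of built-up pixels, share of vegetation pixels).
TRAINING_DATA = [
    ((0.90, 0.05), "urban"),
    ((0.80, 0.10), "urban"),
    ((0.70, 0.20), "urban"),
    ((0.10, 0.85), "rural"),
    ((0.20, 0.70), "rural"),
    ((0.05, 0.90), "rural"),
]


def train_perceptron(examples, epochs=20, learning_rate=0.1):
    """Learn weights mapping features to a label by correcting mistakes."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            target = 1 if label == "urban" else -1
            score = sum(w * x for w, x in zip(weights, features)) + bias
            predicted = 1 if score > 0 else -1
            if predicted != target:
                # Wrong answer: nudge the parameters toward the human label.
                weights = [
                    w + learning_rate * target * x
                    for w, x in zip(weights, features)
                ]
                bias += learning_rate * target
    return weights, bias


def predict(model, features):
    """Apply the learned weights to an unseen example."""
    weights, bias = model
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return "urban" if score > 0 else "rural"


model = train_perceptron(TRAINING_DATA)
print(predict(model, (0.75, 0.15)))  # an unseen, mostly built-up area
print(predict(model, (0.15, 0.80)))  # an unseen, mostly vegetated area
```

No rule such as "more than half built-up means urban" is written anywhere in the code; the decision boundary emerges from the labelled examples alone, which is what distinguishes this approach from a hand-maintained rule-based system.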

Because machine learning techniques can carry out tasks that used to rely solely on manual work, they hold great potential to increase the efficiency of statistical organizations, just as machinery powered by steam engines brought a huge leap in productivity to the manufacturing industry a few centuries ago. In addition, their capability to process various types of data such as text, images and video gives statistical organizations the opportunity to take advantage of new data sources and produce new statistics that could meet the evolving needs of society.

Challenges in using machine learning for official statistics

As with any innovation, however, the journey of integrating machine learning into the organization abounds with challenges and setbacks. The technology itself is still relatively new and requires a different skill set that many statistical organizations do not possess; hence it needs to be built internally or acquired from outside.

The real difficulty, however, starts when the machine learning solution needs to move to production, meaning that it is seamlessly connected to existing processes and used for regular business, beyond an “experiment” stage. Unfortunately, even after successful pilot studies, many machine learning solutions end up being left on the shelf. The difficulty is experienced widely across sectors and domains. It is said that over 80% of machine learning projects never make it to production. Moving machine learning into production requires lasting changes in infrastructure, culture, organizational structure or business processes, none of which is a small task.

UNECE supports statistical organizations in advancing the use of machine learning

Based on two international initiatives, the UNECE High-Level Group for the Modernisation of Official Statistics (HLG-MOS) Machine Learning Project (2019-20) and the United Kingdom Office for National Statistics (ONS) – UNECE Machine Learning Group 2021, the publication Machine Learning for Official Statistics aims to help statistical organizations navigate the difficult journey of advancing the use of this new technology. It presents practical applications of machine learning in three working areas within statistical organizations and discusses their added value, challenges and lessons learned. The publication also includes a quality framework that could help guide the choice of methods, demonstrates key steps for moving machine learning from the experimental stage to the production stage, and offers key messages to facilitate the use of machine learning in statistical organizations.

The machine learning field is evolving fast, with new methods, platforms and approaches coming out every month. To keep up with the pace of change and avoid duplication of effort, there is a great need for knowledge sharing and collaboration within the official statistics community. UNECE continues its engagement in the international initiative this year, through the Machine Learning Group 2022 with the ONS, to support statistical organizations in harnessing the power of machine learning.


According to the 2021 survey on the use of “ModernStats” models[1], the Generic Statistical Business Process Model (GSBPM) is widely implemented, while the use of Generic Activity Model for Statistical Organisations (GAMSO), Generic Statistical Information Model (GSIM), and Common Statistical Production Architecture (CSPA) is more limited.

Notably, two-thirds of the 45 respondents answered that there is widespread awareness of, and a high level of familiarity with, the application of GSBPM. Of the organisations that took part in both the 2018[2] and 2021 surveys, half reported increased use of GSBPM to modernise their statistical production process. Many organisations reported that they have further developed a national version of GSBPM, modifying its phases and sub-processes and adding a finer level of detail based on the local context.[3] The top three benefits of using GSBPM identified by the respondents are (i) ease of internal communication, (ii) facilitation of quality management, and (iii) ease of comparing processes and identifying inefficiency within the organisation.

The main reasons for the limited use of GAMSO, GSIM, and CSPA are a lack of awareness and related knowledge, while some respondents reported that they have their own model and do not feel the need to adapt to these models. In addition, organisations generally lack the human resources and expertise to implement the models. This hinders their further application, despite the widely recognised benefits of facilitating the sharing of methods, services or capabilities within the organisation and easing internal communication.


The ModernStats models discussed above provide a common language and tool to map all activities within and between statistical organisations to a common approach. They enable statistical organisations to collaborate and facilitate exchange of information and the sharing of statistical services, thereby contributing to the advancement of official statistics at the national and international levels for evidence-based policymaking and assessing progress towards achieving the Sustainable Development Goals.


To promote the use of the ModernStats models, a proactive communication strategy and a capacity building programme are essential. UNECE will also continue to work closely with statistical organisations to update the models and to improve the linkage and consistency among them, in support of wider integration and implementation.


[1] The online survey was conducted in February and March 2021.

[2] Results of the 2018 survey can be found at https://unece.org/fileadmin/DAM/stats/documents/ece/ces/ge.58/2018/mtg1/UNECE_CHOI_Presentation.pdf

[3] Under the High-Level Group for the Modernisation of Official Statistics (HLG‑MOS), the Supporting Standards Group provides support for the implementation of the “ModernStats models”. A new task team was created in 2021 to compile examples of GSBPM activities described at a finer level, identify commonalities among them, and prepare a supplementary document to support countries using GSBPM at a more detailed level, as well as provide input for the next GSBPM revision.

We have combined the output of the HLG-MOS Strategic Communication Framework projects (2018-2019) and additional use cases and examples into a paper publication. Please find below a reprint of an article kindly prepared by our colleague Fiona Willis-Nuñez for UNECE news. (click here for the original)

01 July 2021

Statistics are all around us. All the more since the COVID-19 pandemic began. Rarely do we see a news story, political debate, press conference or even a social media debate that doesn’t reference statistics. In this environment, the custodians of the figures—the national statistical offices (NSOs) which produce, curate and publish official statistics—have an unparalleled role to play in communicating with their users.

The new Strategic Communication Framework for Statistical Institutions serves as a guide to help them navigate the changing demands of this role.

Strategic Communications Framework for Statistical Institutions

Statistical communication is about more than writing press releases or answering user questions and requests. NSOs need a modern, proactive communication strategy with clearly defined key messages, and must use different channels to reach various target audiences. This may seem obvious, but it is a relatively new concept for many NSOs, which have traditionally focused their efforts and resources on dissemination—publishing their figures in tables, databases and sometimes analytical reports and leaving users to get on with the task of finding what they need, processing it and making sense of it. Dissemination has often been designed principally for expert users who know what they are looking for and how to interpret it.

Increasingly, though, NSOs are embracing the idea that their ‘target audience’ includes all types of users, or even non-users, and that two-way communication with citizens and improving statistical literacy fall within their remit.

With this in mind, the High-Level Group for the Modernization of Official Statistics (HLG-MOS), a group of chief statisticians reporting to UNECE’s Conference of European Statisticians (CES), decided back in 2018 to make strategic communication a key priority. They launched a project to develop a common framework that can serve as a guide for individual NSOs as they rethink their approaches to communicating with data users and with the public. The resulting framework, developed by a group of experts from 11 countries - Australia, Bosnia and Herzegovina, Canada, Croatia, Ireland, Italy, Mexico, Netherlands, Poland, United Kingdom and the United States - plus OECD and Eurostat and endorsed by CES, highlights examples of success and lessons learned from countries in a range of aspects of communication: rebranding in Canada and Poland; issue and risk management in Australia; and crisis communications in the United States of America.

The framework is designed to give NSOs the tools they need to build communication into their overall corporate strategy, helping to increase their visibility, relevance, and brand recognition. It covers the full range of communications considerations for NSOs, with recommendations on such elements as conducting stakeholder engagement, evaluating external communications, building a mission and vision, and enhancing communication with employees.

The framework offers guidance not only on communicating statistical information but also on communicating about values, purpose and the unique role of official statistics within the broader statistical landscape. The need for this has been highlighted during the pandemic, as producers of official statistics have demonstrated and defended the importance of making trustworthy, politically independent, rapid and comparable figures available for policymakers in response to ever-changing demand.

An evolving collection of case studies, published alongside the framework and still growing as countries’ experiences continue to be shared, demonstrates how NSOs have communicated with their users during the COVID-19 crisis—both about the impact of the pandemic on their operations and with tailored statistical communications about the pandemic itself.  

The work will not end here. An HLG-MOS Task Team devoted to Capability and Communication is now extending the Framework, developing practical guidelines, examples and tools for managing crises and for brand management.

Statistical data combined with location information can provide critical knowledge through the integration with other data in the data ecosystem to understand multi-faceted issues of the current society. To address the information needs of the users in an increasingly complex and intertwined society, there is a great need for statistical data to be geospatially enabled using consistent and common geographies, in an accessible and usable format.

The production of geospatially enabled statistics should be a routine operation for statistical organisations, not just a one-off exercise. Crises such as the global pandemic have highlighted that statistical organisations should be prepared to produce them in an efficient and timely manner. To ensure this occurs, geospatial-relevant activities and considerations should be integrated into the regular production processes of statistical organisations, so that the design and production of geospatially enabled statistics can be conducted in a systematic and consistent way.

Using two global frameworks, the Generic Statistical Business Process Model (GSBPM) and the Global Statistical Geospatial Framework (GSGF), the Geospatial Task Team under the Supporting Standards Group of the High-Level Group for the Modernisation of Official Statistics (HLG-MOS) developed a Geospatial view of GSBPM (GeoGSBPM). It describes the geospatial-related activities and considerations needed to produce geospatially enabled statistics, using the structure of GSBPM while taking into account GSGF principles, so that the resulting statistics have a higher level of standardisation and geospatial flexibility, as well as a greater capacity for data integration.

More details about the GeoGSBPM can be found on this wiki page.

Upcoming Expert meetings

Various expert meetings and workshops are organized as part of the programme of work of the High-Level Group for the Modernisation of Official Statistics. These events, organized around the statistical process, bring together experts to discuss innovative approaches, best practices and future work in these areas. We welcome the submission of abstracts, and you can pre-register for the events.


Not only were there more participants in workshops (as these were moved from in-person to online), but groups and projects also saw an increase in participants. We thank all of you for joining the ModernStats community and hope to see you all back in 2021!

UPDATE:

2020 involvement in HLG-MOS Activities:


2019 involvement in HLG-MOS Activities:

HLG-MOS promotional video

That's Where We Come In: the High-Level Group for the Modernisation of Official Statistics (HLG-MOS).


Statistics Canada has kindly produced an excellent video showcasing the importance of the HLG-MOS for Official statistics in the UNECE region and beyond. In a nutshell, it explains the role of the HLG-MOS and the range of groups, projects and activities taking place on this collaboration platform to move our statistical community into the future. 

What is the role of Statistical Agencies in data-driven societies? How do we adapt to a dynamic and ever-changing environment? In a world moving faster than ever, citizens need timely, relevant and quality information from a source they can trust without being overburdened. That's where we come in. We are the High-Level Group for the Modernisation of Official Statistics. Guided by the Conference of European Statisticians, we are a network for leaders of statistical agencies; a place to collaborate on the global modernisation agenda. We originate from the UNECE. That is the United Nations Economic Commission for Europe, which represents Europe, North America and parts of Asia. But our work is shared with all. Our groups are open to all and our network is global. Statistical agencies are facing new challenges: a rapidly evolving environment, new channels, increasing costs and difficulty of acquiring data, competition for skilled resources, changing expectations of citizens vis-à-vis data, the need for new standards for the management of digital information, and new ethical questions. These problems might be too big for a single organisation to handle on its own. That's where we come in. Collaboration is key. We coordinate global efforts to modernise statistics by providing experts with the venue to develop strategies and solutions. Our mission is to work collaboratively to identify trends, risks and opportunities as well as to provide a common platform to develop solutions.

Stuck in the Past? I don't think so. Together we are moving the Statistical Community into the Future. You can count on it!

Top year for HLG-MOS

2019 again saw an increase in participation in the work organized under the HLG-MOS. In its ninth year of existence, the ModernStats Community for statistical modernisation grew to new heights. Over 750 colleagues participated in its activities, and the various groups and projects had over 150 members. Results and plans will be presented at the HLG-MOS Workshop on the Modernisation of Official Statistics, Geneva, 18-20 November 2019.


The 2019 Workshop on the Modernisation of Official Statistics will be held on 19-20 November 2019 in Geneva. The purpose of the workshop is to ensure that the work is community driven and that activities and initiatives are aligned with the implementation of the vision of the High Level Group for the Modernisation of Official Statistics (HLG-MOS), avoiding duplication and maximising efficiency. The expected outputs are a set of agreed and prioritised implementation actions.

The target audience are statisticians with technical knowledge of modernisation of statistics combined with a broader understanding of the statistical process and its modernisation, including some knowledge about international developments in this area. Members of the following groups are invited: The Statistical Modernisation Community, the CES Bureau, the HLG‑MOS, the Executive Board, Modernisation Groups and task teams under the auspices of the HLG‑MOS. Other staff working in areas related to modernisation of official statistics or interested in the work of the HLG‑MOS are also welcome to participate.

Register now: https://uncdb.unece.org/app/ext/meeting-registration?id=abpRrO

Workshop on Culture Evolution, Geneva, 11-13 September:

The purpose is to share knowledge and effective practices in human resources management related to the creation and maintenance of organisational cultures that are ‘fit for purpose’. Importantly, this includes the associated change management challenges. The target audience of the workshop is mid to senior level staff members responsible for human resources management, training, risk management and other related areas in their respective organisations. Register here: https://uncdb.unece.org/app/ext/meeting-registration?id=pzEtwv

Statistical Data Collection Workshop 'New Sources and New Technologies', Geneva on 14-16 October 2019.

The objective of this workshop was to identify innovative ways and best practices in statistical data collection, and to provide a platform for practitioners to exchange experiences and foster collaboration in this area. In addition to the more traditional presentations, the agenda of the workshop included target-driven small group discussions to identify best practices and new opportunities. The target audience for the workshop includes senior and middle-level managers responsible for data collection activities and new data sources, across all statistical domains from Statistical Offices and other agencies from national and international statistical systems. Register here: https://uncdb.unece.org/app/ext/meeting-registration?id=WjfsPc

Work Session on Statistical Data Confidentiality 2019, hosted by Statistics Netherlands in The Hague, on 29-31 October. 

The main objectives of the meeting are to facilitate the exchange of experience and identify the best practices in dealing with technical issues related to statistical data confidentiality in statistical offices. The meeting is primarily intended for experts from national and international statistical offices as well as invited academics dealing with statistical disclosure limitation. Register here: https://statswiki.unece.org/x/3gNqDg



The latest versions of the Generic Statistical Business Process Model (GSBPM 5.1) and the Generic Activity Model for Statistical Organisations (GAMSO 1.2) were endorsed by the Conference of European Statisticians at their annual conference in June 2019. Both models were reviewed during 2018 and brought up to date with the needs of Statistical Organisations. This work was carried out by a Task Team of specialists under the Modernisation Group on Supporting Standards, working under the High-Level Group for the Modernisation of Official Statistics, and all CES countries were consulted twice in this process.

The changes made in the new version of GSBPM aim to make the model more applicable to new data sources and improve the clarity of the description. The most notable changes made to the model are the following:
• Descriptions of the phases and the sub-processes were updated to be less survey-centric. Activities related to working with non-statistical data providers were added where necessary.
• Descriptions were expanded to include tasks needed to use geospatial data, in recognition of the growing importance of integrating statistical data with geospatial data.
• Examples and descriptions were updated and expanded to improve clarity.
• The duplication between GSBPM (version 5.0) and GAMSO was resolved by removing corporate-level overarching processes from GSBPM as these are covered by GAMSO.

The review approach adopted for GAMSO differed from that for GSBPM, as the latest version of GAMSO had been released only in 2017 and usage of the model was still limited compared to GSBPM. The changes were minor and concerned improvements to descriptions in GAMSO to maintain its alignment with GSBPM.






The UNECE Workshop on Dissemination and Communication of Statistics was held on 12-14 June in Gdansk, Poland. Over 80 experts in communication and dissemination from statistical organisations, including Central Banks, gathered to exchange their experiences, share their lessons learned and discuss the future of communication and dissemination of statistics. The quality of the presentations was high and the colleagues from Statistics Poland were excellent hosts. Click here for more. #disscomm2019

The ModernStats World Workshop was held on 26-28 June in Geneva. This is the platform for learning about the ModernStats standards and recent developments as well as for getting input to the current work being done on the models. There were six interactive sessions where participants gathered in small groups to discuss progress. Additionally, there were voting rounds and quizzes, as well as soapbox presentations and a marketplace where participants could promote their products or invite others to join their activities. Click here for more information. #ModernStatsWorld



ModernStats World Workshop