UNECE – HLG-MOS Machine Learning Project

Project Report

PDF version

Author: Claude Julien (UNECE Project Manager)*









* with assistance from the UNECE Secretariat (InKyung Choi) and work package leaders (Eric Deeben, Wesley Yung and Alex Measure).

This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. If you re-use all or part of this work, please attribute it to the United Nations Economic Commission for Europe (UNECE), on behalf of the international statistical community.


Table of Contents 

Introduction

Modernisation of statistical organisations

Proposal for a machine learning project

About the Machine Learning Project

Lessons learned and other thoughts

Key aspects to accepting ML solutions

Key aspects to facilitating ML solutions

Conclusion – Is Machine Learning a buzz, a must or bust?

1. Introduction

The Machine Learning project was launched by the UNECE High-Level Group for the Modernisation of Official Statistics (HLG-MOS) in March 2019 and concluded its work in December 2020. During this period, over 120 participants from 23 countries, 33 national organisations and 4 international organisations collaborated on advancing the use of machine learning (ML) in the production of official statistics. They did so by demonstrating the added value of ML in the production of official statistics, developing a quality framework to guide its further development, and identifying and addressing challenges in integrating ML solutions into production processes. Reports, documents, code, data and numerous references were released on the public UNECE Statistics Wiki on November 13, 2020 for the benefit of the official statistics community. This release was quickly followed by a webinar held on November 16 and 17.

Based on the knowledge, experience and insights gained during the project, it is clear that machine learning is no longer just a buzzword. The studies conducted demonstrate that it can be integrated into coding and classification operations to produce better quality results at the same or lower cost. It produces positive results for edit and imputation in some contexts, but more studies and developments are needed in this area. It is essential for exploiting big data, for example in the analysis of satellite or aerial images. Its success depends heavily on combining the knowledge and efforts of experts from varied disciplines, notably to produce and maintain a sufficient quantity of quality data to train the algorithms and to monitor the performance of ML-assisted operations efficiently.

Although the pilot studies and other recent developments elsewhere have demonstrated the added value of ML, integrating ML solutions into production processes remains a challenge. The project proposes a quality framework for statistical algorithms and addresses other integration challenges to facilitate the development and acceptance of ML in organisations.

This report provides the background to the launch of the project and describes how it was conducted. After listing the project's main outputs, it shares the main lessons learned on accepting and facilitating the advancement of machine learning, highlights related project outputs and suggests future work.

...


Table 1: Work packages investigated by the ML 2020 project (hyperlinks) and potential themes to be explored by the ML 2021 project (red text)

The Journey

Moving from idea to valid solution (Demonstration)

Moving from valid solution to production (Operationalisation)

Ensuring production robustness (Maintenance)

All WP1 pilot studies

Some WP1 pilot studies

Very few WP1 pilot studies

Other applications of Machine Learning

Some other applications of Machine Learning

Very few other applications of Machine Learning

WP3 Integration (Q5 & Q6)

WP3 Integration (Q5)


Workstream 1: Support current studies towards production; welcome new studies in other processes (e.g. record linkage) and/or data sources (e.g. satellite data)

Supported by

Quality (accuracy, timeliness, efficiency, explainability and reproducibility)

Good Training Data

Skills/Competences

Computing Infrastructure

Interoperability / Business Process

Ethics and Legal

Security

WP2 Quality


WP3 Integration (Q3 & Q4)





Workstream 2: Experiment with practices and methods on some dimensions of QF4SA (WP2)

Workstream 3: Review and improve the Framework 

Workstream 4: How to get good training data, how to keep it up to date, when to relearn a model, what does 'good' mean, how to measure that?

Workstream 5: What skills? How to learn? Where to find them?

To be defined

To be defined

Workstream 6: Ethics handbook, regulations, etc.

To be defined

Facilitated by

Organisation

Sharing and Collaboration

WP3 Integration (Q1 & Q2)

HLG-MOS Machine Learning Project

Initiatives to accelerate the integration of machine learning solutions

ML Studies and Codes

Workstream 7: Create/maintain a network of data science unit leaders
Workstream 8: Beyond 2021: How can we better prepare for next 2-5 years? What technology and data sources can we expect? What skills will we need?

Learning and Training

HLG-MOS ML Project webinar


...
