This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. If you re-use all or part of this work, please attribute it to the United Nations Economic Commission for Europe (UNECE), on behalf of the international statistical community.
The Machine Learning project was launched by the UNECE High-Level Group for the Modernisation of Official Statistics in March 2019 and concluded its work in December 2020. During this period, over 120 participants from 23 countries, 33 national organisations and 4 international organisations worked together to advance the use of machine learning (ML) in the production of official statistics. They did so by demonstrating the added value of ML in the production of official statistics, developing a quality framework to guide its further development, and identifying and addressing challenges in integrating ML solutions into production processes. Reports, documents, code, data and numerous references were released on the public UNECE Statistics Wiki on November 13, 2020 for the benefit of the official statistics community. This release was quickly followed by a webinar held on November 16 and 17.

Based on the knowledge, experience and insights gained during the project, it is clear that machine learning is no longer just a buzzword. The studies conducted demonstrate that it can be integrated into coding and classification operations to produce better quality results at the same or lower cost. It produces positive results for edit and imputation in some contexts, but more studies and developments are needed in this area. It is essential for exploiting big data, for example in the analysis of satellite or aerial images. Its success depends heavily on combining the knowledge and efforts of experts in varied disciplines, notably to produce and maintain a sufficient quantity of quality data to train the algorithms and to monitor the performance of ML-assisted operations efficiently. Although the pilot studies and other recent developments elsewhere have demonstrated the added value of ML, integrating ML solutions into production processes remains a challenge.

The project proposes a quality framework for statistical algorithms and addresses other integration challenges to facilitate the development and acceptance of ML in organisations. This report provides the background to the project's launch and describes how it was conducted. After listing its main outputs, the report shares the main lessons learned on accepting and facilitating the advancement of machine learning, highlights related project outputs and suggests future work.
...
Table 1: Work packages investigated by the ML 2020 project (hyperlinks) and potential themes to be explored by the ML 2021 project (red text)
| The Journey | Moving from idea to valid solution (Demonstration) | Moving from valid solution to production (Operationalisation) | Ensuring production robustness (Maintenance) | | | | |
|---|---|---|---|---|---|---|---|
| Workstream 1: Support current studies towards production; welcome new studies in other processes (e.g. record linkage) and/or data sources (e.g. satellite data) | | | | | | | |
| Supported by | Quality (accuracy, timeliness, efficiency, explainability and reproducibility) | Good Training Data | Skills/Competences | Computing Infrastructure | Interoperability / Business Process | Ethics and Legal | Security |
| | Workstream 2: Experiment with practices and methods on some dimensions of QF4SA (WP2); | Workstream 4: How to get good training data, how to keep it up to date, when to relearn a model, what does 'good' mean, how to measure that? | Workstream 5: What skills? How to learn? Where to find them? | To be defined | To be defined | Workstream 6: Ethics handbook, regulations, etc. | To be defined |
| Facilitated by | Organisation | Sharing and Collaboration | | | | | |
| Initiatives to accelerate the integration of machine learning solutions | | | | | | | |
| Workstream 7: Create/maintain a network of data science unit leaders; | | | | | | | |
| HLG-MOS ML Project webinar | | | | | | | |
...