• Under Work Package 1 of the HLG-MOS ML Project, a total of 21 pilot studies were conducted under three broad themes: coding and classification, edit and imputation, and imagery analysis.
  • The Work Package 1 report provides an executive summary of all three application areas.
  • Each theme report provides an overview of the context, methods, practices and lessons learned from the pilot studies under that theme.
  • Each pilot study paper contains details about that study; please see the Studies and Codes page for information about the programming languages and code used.
WP1 | Pilot Study Theme | Pilot Study Paper | Presentation
Work Package (WP) 1 - Pilot Studies Executive Summary Report (to be updated)

Mexico - Occupation and Economic activity coding using natural language processing

Presentation (April 2020)

Canada - Industry and Occupation Coding

Presentation (April 2020)

Belgium Flanders - Sentiment Analysis of Twitter data

Presentation (April 2020)

Serbia - Coding textually described data on economic activity collected from Labour Force Survey

Presentation (April 2020)

USA - Coding Workplace Injury and Illness

Presentation (April 2020)

Poland - Production description to ECOICOP

Presentation (April 2020)

IMF - Automated Coding using the IMF’s Catalog of Time Series

Presentation (April 2020)

Iceland - Automatic coding of occupation and industry in social statistical surveys

Presentation (April 2020)

Norway - Standard Industrial Code Classification by Using Machine Learning

Presentation (April 2020)

Italy - Imputation of the variable “Attained Level of Education” in Base Register of Individuals


Poland - Imputation in the sample survey on participation of Polish residents in trips


Germany - Machine learning for imputation 


Belgium VITO - Early estimates of energy balance statistics using machine learning


UK - Editing of Living Cost and Food Survey Income data


Italy - Editing in the Italian Register of the Public Administration


Italy - Machine Learning for Data Editing Cleaning in NSI: Some ideas and hints


Australia - Address Register Automated Image Recognition (AIR) model


Netherlands - Learning statistical information from images: a proof of concept


Switzerland - Arealstatistik Deep Learning (ADELE)


Mexico - Use of Landsat satellite data for the mapping of urban areas in non-census years 


UNECE - Generic Pipeline for Production of Official Statistics Using Satellite Data and Machine Learning