• Under Work Package 1 of the HLG-MOS ML Project, a total of 21 pilot studies were conducted under three broad themes: coding and classification, edit and imputation, and imagery analysis.
  • The Work Package 1 report provides an executive summary of all three application areas.
  • Each theme report provides an overview of the context, methods, practices and lessons learned from the pilot studies under that theme.
  • Each pilot study paper contains details about the study; see the Studies and Codes page for information about the programming languages and code used.
WP1 | Pilot Study Theme | Pilot Study Paper | Presentation (April 2020 Sprint)
Work Package (WP) 1 - Pilot Studies Executive Summary Report (to be updated)

Mexico - Occupation and Economic activity coding using natural language processing


Canada - Industry and Occupation Coding


Belgium Flanders - Sentiment Analysis of Twitter data


Serbia - Coding textually described data on economic activity collected from Labour Force Survey


USA - Coding Workplace Injury and Illness


Poland - Production description to ECOICOP


IMF - Automated Coding using the IMF’s Catalog of Time Series


Iceland - Automatic coding of occupation and industry in social statistical surveys


Norway - Standard Industrial Code Classification by Using Machine Learning


Italy - Imputation of the variable “Attained Level of Education” in Base Register of Individuals


Poland - Imputation in the sample survey on participation of Polish residents in trips


Germany - Machine learning for imputation 


Belgium VITO - Early estimates of energy balance statistics using machine learning


UK - Editing of Living Cost and Food Survey Income data


Italy - Editing in the Italian Register of the Public Administration


Italy - Machine Learning for Data Editing Cleaning in NSI: Some ideas and hints


Australia - Address Register Automated Image Recognition (AIR) model


Netherlands - Learning statistical information from images: a proof of concept


Switzerland - Arealstatistik Deep Learning (ADELE)


Mexico - Use of Landsat satellite data for the mapping of urban areas in non-census years 


UNECE - Generic Pipeline for Production of Official Statistics Using Satellite Data and Machine Learning