• This page lists pilot studies conducted under the HLG-MOS Machine Learning Project, together with their programme code where available. To have your study or code added, please contact UNECE.
  • You can search by Theme, ML method, Programme code availability, and Programming Language using the filter below.
| Theme | Title | Country/Organisation | Data Source | ML methods | Programme code availability | Programming Language | Note |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Coding & Classification | Occupation and Economic activity coding using natural language processing | Mexico | Survey data | Extra tree, Naive bayes, XGBoost, Support vector machine, Multilayer perceptron, Decision tree, Random forest, K-nearest neighbors, Logistic regression, Ensemble | Yes (Click File attachment) | Python | |
| Coding & Classification | Industry and Occupation Coding | Canada | Survey data | FastText | Yes (Click GitHub link) | Python | |
| Coding & Classification | Sentiment Analysis of Twitter data | Belgium Flanders | Social media data | Word embedding, Logistic regression, XGBoost, Random forest | Yes (Click GitHub link) | Python | |
| Coding & Classification | Coding textually described data on economic activity collected from Labour Force Survey | Serbia | Survey data | Random forest, Support vector machine, Logistic regression | | | |
| Coding & Classification | Coding Workplace Injury and Illness | USA | Survey data | Neural network | Yes (Click GitHub link) | Python | |
| Coding & Classification | Production description to ECOICOP | Poland | Web scraping data | Naive bayes, Logistic regression, Random forest, Support vector machine, Neural network | Yes (Click GitHub link) | Python | |
| Coding & Classification | Automated Coding using the IMF’s Catalog of Time Series | IMF | | | | | |
| Coding & Classification | Automatic coding of occupation and industry in social statistical surveys | Iceland | Survey data | Deep learning | Yes (See section 5 of the report) | R | |
| Coding & Classification | Standard Industrial Code Classification by Using Machine Learning | Norway | Administrative data | Logistic regression, Random forest, Naive bayes, Support vector machine, FastText, Neural network | | Python | |
| Edit & Imputation | Imputation of the variable “Attained Level of Education” in Base Register of Individuals | Italy | Administrative data, Survey data, Census data | Multilayer perceptron, Log linear | Yes (Click GitHub link) | Python | |
| Edit & Imputation | Imputation in the sample survey on participation of Polish residents in trips | Poland | Survey data | CART, Random forest, Optimal weighted nearest neighbor, Support vector machine | | R | |
| Edit & Imputation | Machine learning for imputation | Germany | Survey data | K-nearest neighbors, Bayesian network, Random forest, Support vector machine | | R | |
| Edit & Imputation | Early estimates of energy balance statistics using machine learning | Belgium VITO | | Lasso regression, Linear regression, Neural network, Random forest, Ridge regression | Yes (Click GitHub link) | Python | |
| Edit & Imputation | Editing of Living Cost and Food Survey Income data | UK | Survey data | Decision tree, Random forest, Neural network | | | |
| Edit & Imputation | Editing in the Italian Register of the Public Administration | Italy | Administrative data | Decision tree, Random forest | | R | |
| Edit & Imputation | Machine Learning for Data Editing Cleaning in NSI: Some ideas and hints | Italy | | | | | |
| Imagery Analysis | | Australia | Aerial imagery | Convolutional neural network | | R | |
| Imagery Analysis | Learning statistical information from images: a proof of concept | Netherlands | Aerial imagery, Satellite imagery | Convolutional neural network | | Python | |
| Imagery Analysis | Arealstatistik Deep Learning (ADELE) | Switzerland | Satellite imagery, Administrative data | Convolutional neural network, Random forest | To be made available | Python | Land cover statistics, Land use statistics |
| Imagery Analysis | Use of Landsat satellite data for the mapping of urban areas in non-census years | Mexico | Satellite imagery | Convolutional neural network, Extra tree | | Python | |
| Imagery Analysis | Generic Pipeline for Production of Official Statistics Using Satellite Data and Machine Learning | UNECE | | | | | |