Theme

Title

Country/Organisation

ML methods

Data Source

Programming Language

Programme code

Note
Imagery Analysis

Address Register Automated Image Recognition (AIR) model

Australia

Convolutional neural network 

Aerial imagery

R

Imagery Analysis

Learning statistical information from images: a proof of concept (UPDATED)

Netherlands

Convolutional neural network 

Aerial imagery, Satellite imagery

Python

Imagery Analysis

Arealstatistik Deep Learning (ADELE) (UPDATED)

Switzerland

Convolutional neural network, Random forest

Satellite imagery, Administrative data

Python

Land cover statistics, Land use statistics

Imagery Analysis

Use of Landsat satellite data for the mapping of urban areas in non-census years (UPDATED)

Mexico

Convolutional neural network, Extra tree

Satellite imagery

Python

Imagery Analysis

Generic Pipeline for Production of Official Statistics Using Satellite Data and Machine Learning (UPDATED)

UNECE


Not applicable

Not applicable

Edit & Imputation

Imputation of the variable “Attained Level of Education” in Base Register of Individuals (UPDATED)

Italy

Multilayer perceptron, Log linear

Administrative data, Survey data, Census data

Python

GitHub link

Edit & Imputation

Imputation in the sample survey on participation of Polish residents in trips (UPDATED)

Poland

CART, Random forest, Optimal weighted nearest neighbor, Support vector machine

Survey data

R

Edit & Imputation

Machine learning for imputation (UPDATED)

Germany

K-nearest neighbors, Bayesian network, Random forest, Support vector machine

Survey data

R

Edit & Imputation

Early estimates of energy balance statistics using machine learning (UPDATED)

Belgium VITO

Lasso regression, Linear regression, Neural network, Random forest, Ridge regression


Python

GitHub link

Edit & Imputation

Editing of Living Cost and Food Survey Income data (UPDATED)

UK

Decision tree, Random forest, Neural network

Survey data


Edit & Imputation

Editing in the Italian Register of the Public Administration (UPDATED)

Italy

Decision tree, Random forest

Administrative data

R

Edit & Imputation

Machine Learning for Data Editing Cleaning in NSI : Some ideas and hints (NEW)

Italy




Coding & Classification

Occupation and Economic activity coding using natural language processing - with comments (UPDATED)

Mexico

Extra tree, Naive Bayes, XGBoost, Support vector machine, Multilayer perceptron, Decision tree, Random forest, K-nearest neighbors, Logistic regression, Ensemble

Survey data

Python

Coding & Classification

Industry and Occupation Coding (UPDATED)

Canada

FastText

Survey data

Python

GitHub link

Coding & Classification

Sentiment Analysis of twitter data (UPDATED)

Belgium Flanders

Word embedding, Logistic regression, XGBoost, Random forest

Social media data

Python

GitHub link

Coding & Classification

Coding textually described data on economic activity collected from Labour Force Survey (UPDATED)

Serbia

Random forest, Support vector machine, Logistic regression

Survey data


Coding & Classification

Coding Workplace Injury and Illness (UPDATED)

USA

Neural network

Survey data

Python

GitHub link

Coding & Classification

Production description to ECOICOP (UPDATED)

Poland

Naive Bayes, Logistic regression, Random forest, Support vector machine, Neural network

Web scraping data

Python

GitHub link

Coding & Classification
Australia






Coding & Classification

Automated Coding using the IMF’s Catalog of Time Series (UPDATED)

IMF




Coding & Classification
Iceland




Coding & Classification

Standard Industrial Code Classification by Using Machine Learning (UPDATED)


Norway

Logistic regression, Random forest, Naive Bayes, Support vector machine, FastText, Neural network

Administrative data

Python