| Title | Theme | Statistics Area | Country/Organisation | Reports | ML methods | Data Source | Data Type | Programming Language | Code Availability | Note |
|---|---|---|---|---|---|---|---|---|---|---|
| Address Register Automated Image Recognition (AIR) model | Imagery Analysis | Geospatial statistics | Australia | To be uploaded | Convolutional neural network | Aerial Imagery | Imagery data | R | Ask for availability | |
| Learning statistical information from images: a proof of concept | Imagery Analysis | Geospatial statistics, Income-based Poverty statistics | Netherlands | To be uploaded | Convolutional neural network | Aerial Imagery, Satellite Imagery | Imagery data | Python | ?? - GitLab link (Joep: not public yet?) | |
| Arealstatistik Deep Learning (ADELE) | Imagery Analysis | Geospatial statistics | Switzerland | To be uploaded | Convolutional neural network, Random forest | Satellite Imagery, Administrative data | Imagery data | Python | Ask for availability | |
| Use of Landsat satellite data for the mapping of urban areas in non-census years | Imagery Analysis | Geospatial statistics, Urban statistics | Mexico | To be uploaded | Convolutional neural network, Extra tree | Satellite Imagery | Imagery data | Python | Ask for availability | |
| Generic Pipeline for Production of Official Statistics Using Satellite Data and Machine Learning | Imagery Analysis | Not applicable | UNECE | To be uploaded | | Not applicable | Not applicable | Not applicable | Not applicable | |
| Imputation of the variable “Attained Level of Education” in Base Register of Individuals | Edit & Imputation | Education statistics | Italy | To be uploaded | Multilayer perceptron, Log linear | Administrative data, Survey data, Census data | | Python | GitHub link | |
| Imputation in the sample survey on participation of Polish residents in trips | Edit & Imputation | Tourism statistics | Poland | To be uploaded | CART, Random forest, Optimal weighted nearest neighbor, Support vector machine | Survey data | | R | Local, not public | |
| Machine learning methods for imputation | Edit & Imputation | ? | Germany | To be uploaded | K-nearest neighbors, Bayesian network, Random forest, Support vector machine | Survey data | | R | Not available | |
| Early estimates of energy balance statistics using machine learning | Edit & Imputation | Energy statistics, Economic and Financial statistics, Weather statistics | Belgium VITO | To be uploaded | Lasso regression, Linear regression, Neural network, Random forest, Ridge regression | ? | | Python | GitHub link | |
| | Edit & Imputation | | UK | To be uploaded | | | | | | |
| Editing in the Italian Register of the Public Administration | Edit & Imputation | Economic and Financial statistics | Italy | To be uploaded | Decision tree, Random forest | Administrative data | | R | | |
| Occupation and Economic activity coding using natural language processing | Coding & Classification | Demographic and Social statistics, Economic and Financial statistics, Labor statistics | Mexico | To be uploaded | Extra tree, Naive bayes, XGBoost, Support vector machine, Multilayer perceptron, Decision tree, Random forest, K-nearest neighbors, Logistic regression, Ensemble | Survey data | Text data | Python | | |
| Industry and Occupation Coding | Coding & Classification | Labor statistics, Business statistics | Canada | To be uploaded | | Survey data | Text data | Python | GitHub link | |
| Sentiment Analysis of Twitter data | Coding & Classification | Life statistics | Belgium Flanders | To be uploaded | Word embedding, Logistic regression, XGBoost, Random forest | Social media data | Text data | Python | GitHub link | |
| | Coding & Classification | | Serbia | To be uploaded | | | | | Not available | |
| Coding Workplace Injury and Illness | Coding & Classification | Labor statistics | USA | To be uploaded | | Survey data | Text data | Python | GitHub link | |
| Product Description to ECOICOP | Coding & Classification | Price statistics | Poland | To be uploaded | Naive bayes, Logistic regression, Random forest, Support vector machine, Neural network | Web Scraping data | Text data | Python | GitHub link | |
| | Coding & Classification | | Australia | To be uploaded | | | | | | |
| Automated Coding of IMF's Catalog of Time Series | Coding & Classification | | IMF | To be uploaded | | | | | | |
| | Coding & Classification | | Iceland | To be uploaded | | | | | | |
| Standard Industrial Code Classification by Using Machine Learning | Coding & Classification | Business Registration statistics? | Norway | To be uploaded | Logistic regression, Random forest, Naive bayes, Support vector machine, FastText, Neural network | Business Registration data? | Text data | Python | GitHub link (ask for availability) | |