Comment
- Intended purpose: a central list of use cases and the code available for each case (something like the GSBPM Resources Repository).
- A repository of this kind has good potential, but it has to be designed in advance so that it takes minimal effort to maintain.
| Theme | Title | Country/Organisation | Data source | ML methods | Programme code availability | Programming language | Note |
|---|---|---|---|---|---|---|---|
| Imagery Analysis | | Australia | Aerial imagery | | | R | |
| Imagery Analysis | Learning statistical information from images: a proof of concept | Netherlands | | | | Python | |
| Imagery Analysis | Arealstatistik Deep Learning (ADELE) | Switzerland | Satellite imagery, Administrative data | | To be made available | Python | Land cover statistics, Land use statistics |
| Imagery Analysis | Use of Landsat satellite data for the mapping of urban areas in non-census years | Mexico | Satellite imagery | | | Python | |
| Imagery Analysis | Generic Pipeline for Production of Official Statistics Using Satellite Data and Machine Learning | UNECE | Not applicable | Not applicable | | | |
| Edit & Imputation | Imputation of the variable “Attained Level of Education” in Base Register of Individuals | Italy | Administrative data, Survey data, Census data | | Yes (GitHub link) | Python | |
| Edit & Imputation | Imputation in the sample survey on participation of Polish residents in trips | Poland | Survey data | | | R | |
| Edit & Imputation | Machine learning for imputation | Germany | Survey data | | | R | |
| Edit & Imputation | Early estimates of energy balance statistics using machine learning | Belgium VITO | | | Yes (GitHub link) | Python | |
| Edit & Imputation | Editing of Living Cost and Food Survey Income data | UK | Survey data | Decision tree, Random forest, Neural network | | | |
| Edit & Imputation | Editing in the Italian Register of the Public Administration (UPDATED) | Italy | Administrative data | | | R | |
| Edit & Imputation | Machine Learning for Data Editing Cleaning in NSI: Some ideas and hints | Italy | | | | | |
| Coding & Classification | Occupation and Economic activity coding using natural language processing | Mexico | Survey data | | | Python | |
| Coding & Classification | Industry and Occupation Coding | Canada | Survey data | | Yes (GitHub link) | Python | |
| Coding & Classification | Sentiment Analysis of Twitter data | Belgium Flanders | Social media data | | Yes (GitHub link) | Python | |
| Coding & Classification | Coding textually described data on economic activity collected from Labour Force Survey | Serbia | Survey data | Random forest, Support vector machine, Logistic regression | | | |
| Coding & Classification | Coding Workplace Injury and Illness | USA | Survey data | | Yes (GitHub link) | Python | |
| Coding & Classification | Production description to ECOICOP | Poland | Web scraping data | | Yes (GitHub link) | Python | |
| Coding & Classification | Automated Coding using the IMF’s Catalog of Time Series (UPDATED) | IMF | | | | | |
| Coding & Classification | Automatic coding of occupation and industry in social statistical surveys (UPDATED) | Iceland | Survey data | Deep learning | Yes (see section 5 of the report) | R | |
| Coding & Classification | Standard Industrial Code Classification by Using Machine Learning | Norway | Administrative data | | | Python | |

ML methods reported across the listed studies, not matched to individual rows: Convolutional neural network, Random forest, Extra tree, Multilayer perceptron, Log linear, CART, Optimal weighted nearest neighbor, Support vector machine, K-nearest neighbors, Bayesian network, Lasso regression, Linear regression, Neural network, Ridge regression, Decision tree, Naive bayes, XGBoost, Logistic regression, Ensemble, FastText, Word embedding. One further entry gives Aerial imagery, Satellite imagery as its data source, and one gives Yes (File attachment) for programme code availability.