- This page lists pilot studies conducted under the HLG-MOS Machine Learning Project, along with their program code where available. To have your study or code added, please contact UNECE.
- You can search by Theme, ML method, Programme code availability, and Programming Language using the filters below.

| Theme | Title | Country/Organisation | Data Source | ML methods | Programme code availability | Programming Language | Note |
|---|---|---|---|---|---|---|---|
| Imagery Analysis | | Australia | Aerial imagery | Convolutional neural network | | R | |
| Imagery Analysis | Learning statistical information from images: a proof of concept | Netherlands | Aerial imagery, Satellite imagery | Convolutional neural network | | Python | |
| Imagery Analysis | | Switzerland | Satellite imagery, Administrative data | Convolutional neural network, Random forest | To be made available | Python | Land cover statistics, Land use statistics |
| Imagery Analysis | Use of Landsat satellite data for the mapping of urban areas in non-census years | Mexico | Satellite imagery | Convolutional neural network, Extra tree | | Python | |
| Imagery Analysis | Generic Pipeline for Production of Official Statistics Using Satellite Data and Machine Learning | UNECE | | | Not applicable | Not applicable | |
| Edit & Imputation | Imputation of the variable “Attained Level of Education” in Base Register of Individuals | Italy | Administrative data, Survey data, Census data | Multilayer perceptron, Log linear | Yes (Click GitHub link) | Python | |
| Edit & Imputation | Imputation in the sample survey on participation of Polish residents in trips | Poland | Survey data | CART, Random forest, Optimal weighted nearest neighbor, Support vector machine | | R | |
| Edit & Imputation | | Germany | Survey data | K-nearest neighbors, Bayesian network, Random forest, Support vector machine | | R | |
| Edit & Imputation | Early estimates of energy balance statistics using machine learning | Belgium VITO | | Lasso regression, Linear regression, Neural network, Random forest, Ridge regression | Yes (Click GitHub link) | Python | |
| Edit & Imputation | | UK | Survey data | Decision tree, Random forest, Neural network | | | |
| Edit & Imputation | Editing in the Italian Register of the Public Administration | Italy | Administrative data | Decision tree, Random forest | | R | |
| Edit & Imputation | Machine Learning for Data Editing Cleaning in NSI: Some ideas and hints | Italy | | | | | |
| Coding & Classification | Occupation and Economic activity coding using natural language processing | Mexico | Survey data | Extra tree, Naive Bayes, XGBoost, Support vector machine, Multilayer perceptron, Decision tree, Random forest, K-nearest neighbors, Logistic regression, Ensemble | Yes (Click File attachment) | Python | |
| Coding & Classification | | Canada | Survey data | FastText | Yes (Click GitHub link) | Python | |
| Coding & Classification | | Belgium Flanders | Social media data | Word embedding, Logistic regression, XGBoost, Random forest | Yes (Click GitHub link) | Python | |
| Coding & Classification | Coding textually described data on economic activity collected from Labour Force Survey | Serbia | Survey data | Random forest, Support vector machine, Logistic regression | | | |
| Coding & Classification | | USA | Survey data | Neural network | Yes (Click GitHub link) | Python | |
| Coding & Classification | | Poland | Web scraping data | Naive Bayes, Logistic regression, Random forest, Support vector machine, Neural network | Yes (Click GitHub link) | Python | |
| Coding & Classification | Automated Coding using the IMF’s Catalog of Time Series | IMF | | | | | |
| Coding & Classification | Automatic coding of occupation and industry in social statistical surveys | Iceland | Survey data | Deep learning | Yes (See section 5 of the report) | R | |
| Coding & Classification | Standard Industrial Code Classification by Using Machine Learning | Norway | Administrative data | Logistic regression, Random forest, Naive Bayes, Support vector machine, FastText, Neural network | | Python | |