As it progressed, the Machine Learning project was informed about other developments in the use of ML to produce official statistics. In particular, during a series of virtual sessions held in October 2020, several speakers were invited to introduce ML developments conducted in their statistical organisations. It is important to note that these developments were not carried out within the ML project. The presentations are shared to further highlight the interest in advancing the use of ML. The list also includes abstracts and presentations of papers delivered at the BigSurv20 - Big Data Meets Survey Science virtual conference held in 2020.


| Main statistical process | Development | Data source |
| --- | --- | --- |
| Edit and Imputation | Australia - Repairing Big Data sets using KNN | Combination of sources |
| Edit and Imputation | Australia - Census Occupancy Imputation for Census 2021 | Census |
| Edit and Imputation | Canada - Investigating the use of ML methods in Banff and G-Sam.pdf | All sources |
| Edit and Imputation | Switzerland - Data validation with machine learning - Plausi++ (document) | Administrative |
| Edit and Imputation | Switzerland - Improving Data Validation using Machine Learning (presentation) | |
| Estimation and Analysis | Canada - Deploying Machine Learning Techniques for Crop Yield Prediction | Combination of sources |
| Estimation and Analysis | Germany - Using administrative data and machine learning to address nonresponse bias in establishment surveys | Administrative |
| Estimation and Analysis | OECD - Nowcasting Services Trade | Aggregates |
| Estimation and Analysis | Switzerland - ML_SoSi: Individual trajectories in the social security system | Combination of sources |
| Data collection | BigSurv20 - Automated double-barreled question classification using machine learning | Survey |
| Record linkage or matching | Canada - Machine Learning for Record Linkage at Statistics Canada | All sources |
| Record linkage or matching | BigSurv20 - A novel approach to combine survey and bibliometric data for science policy research | Survey and register |
| Record linkage or matching | USA BLS - Matching fatal injury records with supervised machine learning | Survey and administrative |
| Text classification | Belgium Flanders - A better statistic on innovative companies in Flanders using web scraping and machine learning | Web scraped |
| Text classification | Canada - Automation of Information Extraction from Financial Statements using Graph-Based Techniques | Paper documents |
| Text classification | European Central Bank - Estimating the institutional sectors of legal entities on a large scale European Central Bank - A supervised learning approach | Register/Administrative |
| Text classification | France - Coding occupations and coding products: two NLP applications for Official Statistics | Scanner and census |
| Text classification | Israel - Development of ML model for coding of economic activities and occupations in household surveys | Survey and census |
| Text classification | Detecting innovative companies via the text on their website (Piet Daas) | |
| Text classification | OECD - 2020-10 Presentation_OCDE_SDG_FinLab_UNECE_Oct13.pdf | Administrative and metadata |
| Text classification | Switzerland - Automation of General Classification of Economic Activities coding - NOGAuto | All sources |
| Text classification | UK - Automated classification of web scraped clothing data in consumer price statistics | Web scraped |
| Text classification | USA USCB - Comparison of Machine Learning Algorithms to Build a Predictive Model for Classification of Survey Write-in Responses | Write-in responses |
| Text classification | USA USCB - Shared AI Services Hosting Application | Write-in responses |
| Text classification | BigSurv20 - A framework for using machine learning to support qualitative data coding | Write-in responses |
| Estimation and Analysis | BigSurv20 - A novel approach to combine survey and bibliometric data for science policy research | Combination of sources |
| Text classification | BigSurv20 - A text mining and machine learning platform to classify businesses into NAICS codes | Combination of sources |
| Sampling | BigSurv20 - Advances in big data and the impact on grid based sampling | Imagery and other |
| Quality and Ethics | BigSurv20 - Big data: big claims or bigger calamities? Getting real about survey research in the fourth era | Not applicable |
| Estimation and Analysis | BigSurv20 - Boosted kernel weighting - Using statistical learning to improve inference from nonprobability samples | Not defined |
| Data collection | BigSurv20 - Can recurrent neural networks code interviewer question-asking behaviors across surveys? | Survey |
| Image classification | BigSurv20 - Complimenting agricultural surveys in Rwanda: Using deep learning to classify crops from drone imagery | Imagery |
| Data collection | BigSurv20 - Detecting difficulty in computer-assisted surveys through mouse movement trajectories: A new model for functional data classification | Survey |
| Text classification | BigSurv20 - Evaluating and improving a text classifier for subpopulations: the case of cyber crime | Survey and Administrative |
| Text classification | BigSurv20 - Explaining and predicting web survey response with time-varying factors and incidental data | Web survey |
| Estimation and Analysis | BigSurv20 - How unequal is Croatia? Results from combined survey and administrative tax data | Survey and Administrative |
| Estimation and Analysis | BigSurv20 - Identifying depression related behaviour in Facebook – an experimental study | Survey and Web |
| Text classification | BigSurv20 - Improving SHARE translation verification | Text |
| Estimation and Analysis | BigSurv20 - LEARN4SDGis–A machine learning based poverty mapping exercise in Austria | Survey |
| Text classification | BigSurv20 - Measuring the validity of open-ended questions: Application of unsupervised learning methods | Survey |
| Estimation and Analysis | BigSurv20 - Application of data mining methods using demographic survey data: Analyses of attitudes towards gender roles and domestic violence in Turkey | Survey |
| Text classification | BigSurv20 - Predicting basic human values from digital traces on social media | Web |
| Estimation and Analysis | BigSurv20 - Subjective wellbeing and the intention to emigrate: A cross-national analysis of 157 countries, 2006-2017 | Survey |
| Several processes | BigSurv20 - Use of big data and machine learning at statistics Canada | Several sources |
| Text classification | BigSurv20 - Prediction of author’s educational background using text mining | Text |
| Image classification | BigSurv20 - Remote sensing in support of agricultural surveys: use of Sentinel images and UAV-acquired ground truth for crop mapping in Rwanda | Imagery |
| Quality and Ethics | BigSurv20 - The challenges of legal analysis, between text mining and machine learning | Text |
| Text classification | BigSurv20 - Training deep learning models with active learning framework to classify “other (please specify)” comments | Survey |
| Estimation and Analysis | BigSurv20 - Using administrative data and machine learning to address nonresponse bias in establishment surveys | Administrative |
| Data collection | BigSurv20 - Using generative adversarial active learning to identify poor closed-ended survey responses | Survey |
| Text classification | BigSurv20 - Using supervised classification for categorizing answers to an open-ended question on panel participation motivation | Survey |
| Estimation and Analysis | BigSurv20 - What can be predicted from a national health survey? Is cancer one of them? | Survey |