- This page lists pilot studies conducted under the HLG-MOS Machine Learning Project, together with their program code where available. If you would like your study or code to be added, please contact UNECE.
- You can search by theme, ML method, code availability, and programming language using the filter below.

Report Title Country/Organisation Reference

- Poster Canada Crop Analysis ML techniques Schnaubelt, Matthias (2019): A comparison of machine learning model validation schemes for non-stationary time series data, FAU Discussion Papers in Economics, No. 11/2019, Friedrich-Alexander-Universität Erlangen-Nürnberg, Institute for Economics, Nürnberg. http://hdl.handle.net/10419/209136
- Canada Coding & Classification ML application Justin J. Evans, Isaac Ross, Julie Portelance. StatisticsCanada_CCHS_ML_Production_Report. [Online] 2020. https://statswiki.unece.org/display/MLP/Working+documents?preview=/244092601/256970399/Statistics_Canada_FastText_Techniques_Report.pdf
- Canada Coding & Classification ML code and data https://github.com/UNECE/CodingandClassification_Statcan
- Canada Coding & Classification ML techniques YanPeng Gao, Isaac Ross, Justin J. Evans. Statistics_Canada_FastText_Techniques_Report. [Online] 2019. https://statswiki.unece.org/download/attachments/244092601/Statistics_Canada_FastText_Techniques_Report.pdf?version=2&modificationDate=1567626783886&api=v2
- Flanders Coding & Classification ML code https://github.com/jmaslankowski/WP7-Population-Life-Satisfaction
- Flanders Coding & Classification ML code https://github.com/mireusen/hlmos-statistiek-vlaanderen-twitter
- Flanders Coding & Classification ML code https://github.com/wimulkeman/dutch-sentiment-analysis
- Flanders Coding & Classification ML model https://github.com/wietsedv/bertje/blob/master/README.md
- Flanders Coding & Classification ML model https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/3
- Poland Coding & Classification ML code https://colab.research.google.com/drive/1Epn2NeFRuFC_XyXtQ4qezGVBA5aAzqIh
- Poland Coding & Classification ML code and data https://github.com/statisticspoland/ecoicop_classification
- Poland Coding & Classification ML library https://scikit-learn.org/stable/index.html
- Poster Flanders Coding & Classification ML application
https://www.cbs.nl/nl-nl/over-ons/innovatie/project/innovatieve-hotspots
- Theme report Coding & Classification ML library https://en.wikipedia.org/wiki/FastText
- Theme report Coding & Classification ML tutorial https://machinelearningmastery.com/types-of-classification-in-machine-learning/
- Theme report Coding & Classification ML tutorial https://www.analyticsvidhya.com/blog/2017/09/common-machine-learning-algorithms/
- Theme report Coding & Classification Naive Bayes https://www.analyticsvidhya.com/blog/2017/09/naive-bayes-explained/
- Theme report Coding & Classification Random Forest https://builtin.com/data-science/random-forest-algorithm
- Theme report Coding & Classification Random Forest https://towardsdatascience.com/understanding-random-forest-58381e0602d2
- Theme report Coding & Classification Subject matter https://www.ons.gov.uk/methodology/classificationsandstandards/standardoccupationalclassificationsoc/soc2010/soc2010volume2thestructureandcodingindex#electronic-version-of-the-index
- Theme report Coding & Classification XGBoost https://machinelearningmastery.com/gentle-introduction-xgboost-applied-machine-learning/
- US BLS Coding & Classification ML application https://www.bls.gov/iif/deep-neural-networks.pdf
- US BLS Coding & Classification ML application https://www.bls.gov/iif/deep-neural-networks.pdf
- US BLS Coding & Classification ML application https://www.bls.gov/osmr/research-papers/2014/pdf/st140040.pdf
- US BLS Coding & Classification ML application https://www.bls.gov/osmr/research-papers/2014/pdf/st140040.pdf
- US BLS Coding & Classification ML code https://github.com/USDepartmentofLabor/soii_neural_autocoder
- US BLS Coding & Classification ML tutorial https://github.com/ameasure/autocoding-class/blob/master/machine_learning.ipynb
- Extra Edit & Imputation Terminology https://www.analyticsvidhya.com/glossary-of-common-statistics-and-machine-learning-terms/
- Germany Edit & Imputation Bayesian Networks Cheng J., Greiner R., Kelly J., Bell D. A., & Liu W. (2002).
Learning Bayesian Networks from Data: An Information-Theory Based Approach. Artificial Intelligence, 137, 43–90.
- Germany Edit & Imputation Bayesian Networks Di Zio M., Sacco G., Scanu M., & Vicard P. (2004). Multivariate Techniques for Imputation Based on Bayesian Networks. Compstat 2004 Symposium.
- Germany Edit & Imputation Bayesian Networks Di Zio M., Scanu M., Coppola L., Luzi O., & Ponti A. (2004). Bayesian Networks for Imputation. Journal of the Royal Statistical Society Series A, 167(2), 309–322.
- Germany Edit & Imputation Bayesian Networks Jensen F. V. & Nielsen T. D. (2007). Bayesian Networks and Decision Graphs. Second edition. Springer.
- Germany Edit & Imputation Bayesian Networks Kalisch M., Bühlmann P. (2007). Estimating High-Dimensional Directed Acyclic Graphs with the PC-Algorithm. Journal of Machine Learning Research, 8, 613–636.
- Germany Edit & Imputation Bayesian Networks Lauritzen S. L. (1995). The EM Algorithm for Graphical Association Models With Missing Data. Computational Statistics and Data Analysis, 19, 191–201.
- Germany Edit & Imputation Bayesian Networks Moore A. & Wong W. (2003). Optimal Reinsertion: A New Search Operator for Accelerated and More Accurate Bayesian Network Structure Learning. In Proceedings of the Twentieth International Conference on Machine Learning (ICML 2003), 552–559.
- Germany Edit & Imputation Bayesian Networks Rey del Castillo P. (2012). Use of Machine Learning Methods to Impute Categorical Data. Conference of European Statisticians WP. 37.
- Germany Edit & Imputation Bayesian Networks Riggelsen C. (2006). Learning parameters of Bayesian networks from incomplete data via importance sampling. International Journal of Approximate Reasoning, 42(1-2), 69–83.
- Germany Edit & Imputation Bayesian Networks Spirtes P., Glymour C., & Scheines R. (2000). Causation, prediction, and search. Second edition. MIT Press.
- Germany Edit & Imputation Bayesian Networks Tsamardinos I., Brown L. E., & Aliferis C. F. (2006).
The Max-Min Hill-Climbing Bayesian Network Structure Learning Algorithm. Machine Learning, 65, 31–78.
- Germany Edit & Imputation K-nearest neighbour Beretta L. & Santaniello A. (2016). Nearest Neighbor Imputation Algorithms: A Critical Evaluation. Medical Informatics and Decision Making, 16, 197–208.
- Germany Edit & Imputation K-nearest neighbour Cucala L., Marin J. M., Robert C. P., & Titterington D. M. (2009). A Bayesian Reassessment of Nearest-Neighbor Classification. Journal of the American Statistical Association, 104, 263–273.
- Germany Edit & Imputation K-nearest neighbour Devroye L., Györfi L., & Lugosi G. (1996). A Probabilistic Theory of Pattern Recognition. Springer.
- Germany Edit & Imputation K-nearest neighbour Liao S. G., Lin Y., Kang D. D., Chandra D., Bon J., Kaminski N., Sciurba F. C., & Tseng G. C. (2014). Missing Value Imputation in High-Dimensional Phenomic Data: Imputable or not, and how? Bioinformatics, 15, 346.
- Germany Edit & Imputation K-nearest neighbour Troyanskaya O., Cantor M., Sherlock G., Brown P. O., Hastie T., Tibshirani R., Botstein D., & Altman R. B. (2001). Missing Value Estimation Methods for DNA Microarrays. Bioinformatics, 17, 520–525.
- Germany Edit & Imputation ML application Beck M., Dumpert F., & Feuerhake J. (2018). Proof of Concept Machine Learning – Abschlussbericht. Available online (in German): https://www.destatis.de/GPStatistik/receive/DEMonografie_monografie_00004835
- Germany Edit & Imputation ML application Bertsimas D., Pawlowski C., & Zhuo Y. D. (2017). From predictive methods to missing data imputation: an optimization approach. The Journal of Machine Learning Research, 18(1), 7133–7171.
- Germany Edit & Imputation ML application Park S., Pannekoek J., & van der Loo M. P. J. (2018). Imputation of Economic Data based on Random Forest. Technical Report. Available online on statswiki.
- Germany Edit & Imputation ML application Richman M. B., Trafalis T. B., & Adrianto I. (2009).
Missing data imputation through machine learning algorithms. In Artificial Intelligence Methods in the Environmental Sciences (pp. 153–169).
- Germany Edit & Imputation ML application Yang B., Janssens D., Ruan D., Bellemans T. & Wets G. (2013). A data imputation method with support vector machines for activity-based transportation models. In Computational Intelligence for Traffic and Mobility (pp. 159–171).
- Germany Edit & Imputation ML code Crookston N. L. & Finley A. O. (2007). yaImpute: An R Package for kNN Imputation. Journal of Statistical Software, 23(10), 1–16.
- Germany Edit & Imputation ML code Mayer M. (2019). missRanger: Fast Imputation of Missing Values. Online: https://cran.r-project.org/web/packages/missRanger/index.html
- Germany Edit & Imputation ML code Scutari M. (2010). Learning Bayesian Networks with the bnlearn R Package. Journal of Statistical Software, 35(3), 1–22.
- Germany Edit & Imputation ML code Steinwart I. & Thomann P. (2017). liquidSVM: A Fast and Versatile SVM package. Online: https://arxiv.org/abs/1702.06899.
- Germany Edit & Imputation ML code van Buuren S. & Groothuis-Oudshoorn K. (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1–67.
- Germany Edit & Imputation ML code Wright M. N. & Ziegler A. (2017). ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R. Journal of Statistical Software, 77(1), 1–17.
- Germany Edit & Imputation ML techniques Hamner B., Frasco M., & LeDell E. (2018). Metrics: Evaluation Metrics for Machine Learning. Online: https://CRAN.R-project.org/package=Metrics.
- Germany Edit & Imputation ML techniques Honghai F., Guoshun C., Cheng Y., Bingru Y., & Yumei C. (2005). A SVM regression based approach to filling in missing values. In International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (pp. 581–587).
- Germany Edit & Imputation ML techniques Mikhchi A., Honarvar M., Kashan N. E. J., & Aminafshar M.
(2016). Assessing and comparison of different machine learning methods in parent-offspring trios for genotype imputation. Journal of Theoretical Biology, 399, 148–158.
- Germany Edit & Imputation ML techniques Stekhoven D. J. & Buehlmann P. (2012). MissForest – non-parametric missing value imputation for mixed-type data. Bioinformatics, 28(1), 112–118.
- Germany Edit & Imputation ML techniques van Buuren S. (2018). Flexible Imputation of Missing Data. 2nd edition. CRC.
- Germany Edit & Imputation ML tutorial Torgo L. (2010). Data Mining with R: Learning with Case Studies. Chapman and Hall/CRC. Online: http://www.dcc.fc.up.pt/~ltorgo/DataMiningWithR.
- Germany Edit & Imputation Not published Dumpert F., Hansen M., Peters F., & Spies L. (2018). Bericht zur Maßnahme Machine Learning Methodik. Internal paper, as yet unpublished, in German.
- Germany Edit & Imputation R library //cran.r-project.org/
- Germany Edit & Imputation Random Forest Athey S., Tibshirani J., & Wager S. (2019). Generalized Random Forests. The Annals of Statistics, 47(2), 1148–1178.
- Germany Edit & Imputation Random Forest Biau G. & Scornet E. (2016). A random forest guided tour. Test, 25(2), 197–227.
- Germany Edit & Imputation Random Forest Breiman L. (2001). Random forests. Machine Learning, 45(1), 5–32.
- Germany Edit & Imputation Random Forest Burgette L. F. & Reiter J. P. (2010). Multiple imputation for missing data via sequential regression trees. American Journal of Epidemiology, 172(9), 1070–1076.
- Germany Edit & Imputation Random Forest Caiola G. & Reiter J. P. (2010). Random Forests for Generating Partially Synthetic, Categorical Data. Trans. Data Privacy, 3(1), 27–42.
- Germany Edit & Imputation Random Forest Ding Y. & Simonoff J. S. (2010). An investigation of missing data methods for classification trees applied to binary response data. Journal of Machine Learning Research, 11, 131–170.
- Germany Edit & Imputation Random Forest Doove L. L., Van Buuren S., & Dusseldorp E. (2014).
Recursive partitioning for missing data imputation in the presence of interaction effects. Computational Statistics & Data Analysis, 72, 92–104.
- Germany Edit & Imputation Random Forest Feelders, A. (1999). Handling missing data in trees: surrogate splits or statistical imputation? In European Conference on Principles of Data Mining and Knowledge Discovery (pp. 329–334).
- Germany Edit & Imputation Random Forest Mentch L. & Hooker G. (2016). Quantifying uncertainty in random forests via confidence intervals and hypothesis tests. Journal of Machine Learning Research, 17(1), 841–881.
- Germany Edit & Imputation Random Forest Reiter J. P. (2005). Using CART to generate partially synthetic public use microdata. Journal of Official Statistics, 21(3), 441–462.
- Germany Edit & Imputation Random Forest Saar-Tsechansky M. & Provost F. (2007). Handling missing values when applying classification models. Journal of Machine Learning Research, 8, 1623–1657.
- Germany Edit & Imputation Random Forest Wager S., Hastie T., & Efron B. (2014). Confidence intervals for random forests: The jackknife and the infinitesimal jackknife. Journal of Machine Learning Research, 15(1), 1625–1651.
- Germany Edit & Imputation Statistics Bankier M., Lachance M., & Poirier P. (2000). 2001 Canadian census minimum change donor imputation methodology. UNECE Work Session on Statistical Data Editing 2000, Working Paper No. 17. Online: http://www.unece.org/fileadmin/DAM/stats/documents/ece/ces/2000/10/sde/17.e.pdf
- Germany Edit & Imputation Statistics Breiman L. (2001). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science, 16(3), 199–231.
- Germany Edit & Imputation Statistics Chambers R. (2001). Evaluation Criteria for Statistical Editing and Imputation. Available online: https://www.cs.york.ac.uk/euredit/
- Germany Edit & Imputation Statistics Little R. J. & Rubin D. B. (1987; 2002). Statistical analysis with missing data. Wiley.
- Germany Edit & Imputation Statistics Little R. J. (2011). Imputation. In: Lovric M., International Encyclopedia of Statistical Science. Springer.
- Germany Edit & Imputation Statistics Rubin D. B. (1987). Multiple imputation for nonresponse in surveys. Wiley.
- Germany Edit & Imputation Support Vector Machine Boser B. E., Guyon I. M., & Vapnik V. N. (1992). A training algorithm for optimal margin classifiers. Fifth Annual ACM Workshop on Computational Learning Theory, 144–152.
- Germany Edit & Imputation Support Vector Machine Chechik G., Heitz G., Elidan G., Abbeel P., & Koller D. (2007). Max-margin classification of incomplete data. In Advances in Neural Information Processing Systems (pp. 233–240).
- Germany Edit & Imputation Support Vector Machine Cortes C. & Vapnik V. N. (1995). Support-vector networks. Machine Learning, 20, 273–297.
- Germany Edit & Imputation Support Vector Machine Drechsler J. & Reiter J. P. (2011). An empirical evaluation of easily implemented, nonparametric methods for generating synthetic datasets. Computational Statistics & Data Analysis, 55(12), 3232–3243.
- Germany Edit & Imputation Support Vector Machine Drechsler J. (2010). Using support vector machines for generating synthetic datasets. In International Conference on Privacy in Statistical Databases (pp. 148–161).
- Germany Edit & Imputation Support Vector Machine Hable R. (2012). Asymptotic normality of support vector machine variants and other regularized kernel methods. Journal of Multivariate Analysis, 106, 92–117.
- Germany Edit & Imputation Support Vector Machine Honghai F., Guoshun C., Cheng Y., Bingru Y., & Yumei C. (2005). A SVM regression based approach to filling in missing values. In International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (pp. 581–587).
- Germany Edit & Imputation Support Vector Machine Pelckmans K., De Brabanter J., Suykens J. A., & De Moor B. (2005). Handling missing values in support vector machine classifiers.
Neural Networks, 18(5-6), 684–692.
- Germany Edit & Imputation Support Vector Machine Rogers S. D. (2012). Support Vector Machines for Classification and Imputation. Master thesis. Brigham Young University.
- Germany Edit & Imputation Support Vector Machine Smola A. J., Vishwanathan S. V. N., & Hofmann T. (2005). Kernel Methods for Missing Variables. In AISTATS 2005 – Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics (pp. 325–332).
- Germany Edit & Imputation Support Vector Machine Steinwart I. & Christmann A. (2008). Support Vector Machines. Springer.
- Germany Edit & Imputation Support Vector Machine Stewart T. G., Zeng D., & Wu M. C. (2018). Constructing support vector machines with missing data. Wiley Interdisciplinary Reviews: Computational Statistics, 10, 1–16.
- Germany Edit & Imputation Support Vector Machine Wen Z., Shi J., Li Q., He B., & Chen J. (2018). ThunderSVM: A fast SVM library on GPUs and CPUs. Journal of Machine Learning Research, 19(21), 1–5.
- Germany Edit & Imputation Support Vector Machine Yang B., Janssens D., Ruan D., Bellemans T., & Wets G. (2013). A data imputation method with support vector machines for activity-based transportation models. In Computational Intelligence for Traffic and Mobility (pp. 159–171).
- Germany Edit & Imputation Support Vector Machine Zhang Y. & Liu Y. (2009). Data imputation using least squares support vector machines in urban arterial streets. IEEE Signal Processing Letters, 16(5), 414–417.
- Italy-E Edit & Imputation ML application Martin Beck, Florian Dumpert, Joerg Feuerhake (2018). Machine Learning in Official Statistics (shorter English version available on arXiv: https://arxiv.org/abs/1812.10422)
- Italy-E Edit & Imputation Standards GSBPM (2019). Generic Statistical Business Process Model. Version 5.1, January 2019, UNECE. Available at: https://statswiki.unece.org/display/GSBPM/Generic+Statistical+Business+Process+Model.
- Italy-E Edit & Imputation Standards GSDEM (2019).
Generic Statistical Data Editing Models - GSDEMs, Version 2.0, April 2019, UNECE. Available at: https://statswiki.unece.org/display/sde/GSDEM
- Italy-E Edit & Imputation Standards GSIM (2019). Generic Statistical Information Model, Version 1.2, May 2019, UNECE. Available at: http://www1.unece.org/stat/platform/display/gsim.
- Italy-E Edit & Imputation Statistics EDIMBUS (2007). Recommended Practices for Editing and Imputation in Cross-sectional Business Surveys, EDIMBUS project report, https://ec.europa.eu/eurostat/documents/64157/4374310/30-Recommended+Practices-for-editing-and-imputation-in-cross-sectional-business-surveys-2008.pdf.
- Italy-E Edit & Imputation Statistics MEMOBUST (2014). Handbook on Methodology of Modern Business Statistics, CROS-portal, Eurostat, https://ec.europa.eu/eurostat/cros/content/handbook-methodology-modern-business-statistics_en.
- Italy-E Edit & Imputation Statistics Van der Loo M. (2015). A Formal Typology of Data Validation Functions, UNECE, Conference of European Statisticians, Budapest. Available at: http://www.markvanderloo.eu/files/statistics/WP_5_Netherlands_A_formal_typology_of_data_validation_functions.pdf
- Italy-E Edit & Imputation Statistics Waal, T. de, Pannekoek, J. and Scholtus, S. (2011). Handbook of Statistical Data Editing and Imputation. Wiley, Hoboken.
- Italy-I Edit & Imputation ML application [1] Di Zio M., Di Cecco D., Di Laurea D., Filippini R., Massoli P., Rocchetti G. “Mass imputation of the attained level of education in the Italian System of Registers”, Workshop on Statistical Data Editing, Neuchâtel, Switzerland, 18-20 September 2018
- Italy-I Edit & Imputation ML application [2] Di Zio M., Filippini R., Rocchetti G. “An imputation procedure for the Italian attained level of education in the register of individuals based on administrative and survey data”, Workshop on Statistical Data Editing, Neuchâtel, Switzerland, 31 August - 2 September 2020
- Italy-I Edit & Imputation ML application [3] Bernasconi, Eleonora, et al. "Satellite-Net: Automatic Extraction of Land Cover Indicators from Satellite Imagery by Deep Learning." arXiv preprint arXiv:1907.09423 (2019).
- Italy-I Edit & Imputation ML application [4] De Fausti Fabrizio, Pugliese Francesco and Diego Zardetto. "Toward Automated Website Classification by Deep Learning." arXiv preprint arXiv:1910.09991 (2019).
- Italy-I Edit & Imputation ML code https://github.com/defausti/MLP_Imputation.git
- Italy-I Edit & Imputation ML techniques [6] Yoon, Jinsung, James Jordon, and Mihaela Van Der Schaar. "Gain: Missing data imputation using generative adversarial nets." arXiv preprint arXiv:1806.02920 (2018).
- Italy-I Edit & Imputation Statistics [5] Cybenko, George. "Approximation by superpositions of a sigmoidal function." Mathematics of Control, Signals and Systems 2.4 (1989): 303-314.
- Poster Canada GenSyst Edit & Imputation ML code Stekhoven, D. J. (2015). missForest: Nonparametric missing value imputation using random forest. Astrophysics Source Code Library
- Poster Canada GenSyst Edit & Imputation Statistics Gray, D. (2019). A Generalized Framework to Evaluate Imputation Strategies: Recent Developments. In JSM Proceedings, Government Statistics Section. Alexandria, VA: American Statistical Association. 1861-1870
- Poster Canada GenSyst Edit & Imputation Statistics Gray, D. (2020). Evaluating Imputation Methods using ImpACT: First Case Study, United Nations Statistical Commission and Economic Commission for Europe – Workshop on Statistical Data Editing
- Poster Canada GenSyst Edit & Imputation Statistics Stelmack, A. (2018). On the Development of a Generalized Framework to Evaluate and Improve Imputation Strategies at Statistics Canada, United Nations Statistical Commission and Economic Commission for Europe – Workshop on Statistical Data Editing.
- Theme report Edit & Imputation Data Science Cao L. (2017). Data science: a comprehensive overview. ACM Computing Surveys, 50(3), 1–42.
- Theme report Edit & Imputation Statistics Chambers R. (2001).
Evaluation Criteria for Statistical Editing and Imputation.
- VITO Edit & Imputation Big Data Daas, P.J.H., Puts, M.J., Buelens, B. and van den Hurk, P. (2015). Big data as a source for official statistics. Journal of Official Statistics, 31, 249–262.
- VITO Edit & Imputation Big Data Hassani, H., Saporta, G. and Silva, E.S. (2014). Data mining and official statistics: the past, the present and the future. Big Data, 1, 34–43.
- VITO Edit & Imputation ML code https://github.com/VITObelgium/energy-balance-ml
- VITO Edit & Imputation ML tutorial Hastie, T., Tibshirani, R., Friedman, J. & Franklin, J. (2009). The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd ed. New York: Springer.
- VITO Edit & Imputation Random Forest Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
- VITO Edit & Imputation Statistics Claeskens, G. & Hjort, N. L. (2008). Model Selection and Model Averaging. Cambridge: Cambridge University Press.
- VITO Edit & Imputation Statistics Gelman, A. & Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models, Vol. 1. New York: Cambridge University Press.
- Mexico Imagery Data https://ieeexplore.ieee.org/document/8518312
- Mexico Imagery Data https://www.opendatacube.org/
- Netherlands Imagery Data https://www.cbs.nl/nl-nl/dossier/nederland-regionaal/geografische-data/kaart-van-100-meter-bij-100-meter-met-statistieken
- Netherlands Imagery Data Persian cat, Model T, Granny Smith; http://image-net.org/challenges/LSVRC/2015/browse-synsets
- Switzerland Imagery ML application https://www.bfs.admin.ch/bfs/de/home/statistiken/raum-umwelt/erhebungen/area.assetdetail.5687737.html
- Theme report Imagery Big Data Curzi, G., Modenini, D., & Tortora, P. (2020). Large Constellations of Small Satellites: A Survey of Near Future Challenges and Missions. Aerospace, 7, 133. doi:10.3390/aerospace7090133
- Theme report Imagery Big Data Safyan, M. (2020).
Handbook of Small Satellites, Technology, Design, Manufacture, Applications, Economics and Regulation. 1057-1073. doi:10.1007/978-3-030-36308-6_64
- Theme report Imagery Data http://aws.amazon.com/es/public-data-sets/landsat/
- Theme report Imagery Data http://landsat.gsfc.nasa.gov/?p=10221
- Theme report Imagery Data https://eur-lex.europa.eu/eli/reg_del/2013/1159/oj
- Theme report Imagery Data Toth, C., & Jóźków, G. (2016). Remote sensing platforms and sensors: A survey. ISPRS Journal of Photogrammetry and Remote Sensing, 22-36.
- Theme report Imagery ML application Ferreira, B., Iten, M., & Silva, R. G. (2020). Monitoring sustainable development by means of earth observation data and machine learning: a review. Environmental Sciences Europe, 32, 120. doi:10.1186/s12302-020-00397-4
- Theme report Imagery ML application Holloway, J., & Mengersen, K. (2018). Statistical Machine Learning Methods and Remote Sensing for Sustainable Development Goals: A Review. Remote Sensing, 10, 1365. doi:10.3390/rs10091365
- Theme report Imagery ML application Youssef, R., Aniss, M., & Jamal, C. (2020). Machine Learning and Deep Learning in Remote Sensing and Urban Application: A Systematic Review and Meta-Analysis. Proceedings of the 4th Edition of International Conference on Geo-IT and Water Resources 2020, Geo-IT and Water Resources 2020. New York, NY, USA: Association for Computing Machinery. doi:10.1145/3399205.3399224
- Theme report Imagery ML techniques Bishop, C. M. (2006). Pattern Recognition and Machine Learning. USA: Springer.
- UNECE Imagery Big Data [1] Conference of European Statisticians (2019) In-depth Review on Satellite Imagery and Earth Observation Technology in Official Statistics
- UNECE Imagery Big Data [1] United Nations Global Working Group on Big Data (2017) Satellite Imagery and Geospatial Data Task Team Report
- UNECE Imagery Big Data Committee on Earth Observation Satellites (2015) Satellite Earth Observations in Support of Climate Information Challenges
- UNECE Imagery Data [1] Lewis, A. et al. (2017) Remote Sensing of Environment
- UNECE Imagery Data [1] UCS Satellite Database (accessed Feb. 2020)
- UNECE Imagery Data Roberts, D., Dunn, B. and Mueller, N. (2018) Open Data Cube Products Using High-Dimensional Statistics of Time Series
- UNECE Imagery Standards United Nations Economic Commission for Europe (2019) Generic Statistical Business Process Model (version 5.1)
- UNECE Imagery Statistics [1] United Nations Statistics Division (2019) Guidelines on the use of electronic data collection technologies in population and housing censuses
- WP2 Quality Quality Framework Australian Bureau of Statistics (2005). Data Quality Framework, Australian Bureau of Statistics, https://www.abs.gov.au/websitedbs/D3310114.nsf//home/Quality:+The+ABS+Data+Quality+Framework
- WP2 Quality Quality Framework Eurostat (2017). European Statistics Code of Practice, Eurostat, https://ec.europa.eu/eurostat/web/quality/european-statistics-code-of-practice.
- WP2 Quality Quality Framework Statistics Canada (2017). Quality Assurance Framework, Statistics Canada, https://www150.statcan.gc.ca/n1/pub/12-539-x/12-539-x2019001-eng.htm
- WP2 Quality Quality Framework United Nations (2019). National Quality Assurance Frameworks Manual for Official Statistics, United Nations, https://unstats.un.org/unsd/methodology/dataquality/
- WP2 Quality Quality Framework United Nations (2012). Guidelines for the template for a generic national quality assurance, United Nations, https://unstats.un.org/unsd/statcom/doc12/BG-NQAF.pdf.
- WP2 Quality Quality ML application Luque, A., Carrasco, A., Martín, A. and de las Heras, A. (2019). The impact of class imbalance in classification performance metrics based on the binary confusion matrix. Pattern Recognition, 91, 216–231.
- WP2 Quality Quality ML application Pepe, M.S. (2003). The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press.
- WP2 Quality Quality ML application Vanwinckelen, G. and Blockeel, H. (2014). Look before you leap: Some insights into learner evaluation with cross-validation. JMLR Workshop and Conference Proceedings, 1, 3–19.
- WP2 Quality Quality ML techniques Goldstein, A., Kapelner, A., Bleich, J., and Pitkin, E. (2014). Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation. arXiv
- WP2 Quality Quality ML techniques Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning. 2nd edition. Springer.
- WP2 Quality Quality ML techniques Japkowicz, N. and Shah, M. (2011). Evaluating Learning Algorithms. Cambridge University Press.
- WP2 Quality Quality ML techniques Stothard, C. (2020). Evaluating Machine Learning Classifiers: A review. Australian Bureau of Statistics, available upon request.
- WP2 Quality Quality Practices Arrieta, B.A., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., Garcia, S., Gil-Lopez, S., Molina, D., Benjamins, R., Chatila, R. and Herrera, F. (2020). Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82–115
- WP2 Quality Quality Practices Begley C, Ioannidis J. (2015). Reproducibility in science: Improving the standard for basic and preclinical research. Circ. Res. P 116-126.
- WP2 Quality Quality Practices Bhatt, U., Xiang, A., Sharma, S., Weller, A., Taly, A., Jia, Y., Ghosh, J., Puri, R., Moura, J.M.F. and Eckersley, P. (2020). Explainable machine learning in deployment.
arXiv
- WP2 Quality Quality Practices Goodman, S., Fanelli, D. and Ioannidis, J. (2016). What does research reproducibility mean? Science Translational Medicine, p 341-353
- WP2 Quality Quality Practices Hanson, B., Sugden, A. and Alberts, B. (2011) Making data maximally available. Science, p 331-649.
- WP2 Quality Quality Practices Molnar (2019) Interpretable Machine Learning - A Guide for Making Black Box Models Explainable
- WP2 Quality Quality Practices Petkovic (2020) AI and trust: explainability, transparency. Ethical implications of AI and AI Tools Lab, Frankfurt Big Data Lab, Goethe University
- WP2 Quality Quality Practices Ribeiro, M.T., Singh, S. and Guestrin, C. (2016) “Why Should I Trust You?” Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144
- WP2 Quality Quality Practices Stodden, V., Seiler, J. and Ma, Z. (2018). An empirical analysis of journal policy effectiveness for computational reproducibility. Proc Natl Acad Sci USA p 2584–2589.
- WP2 Quality Quality Practices Szabo, L. (2019) Artificial intelligence is rushing into patient care—and could raise risks. Scientific American, December 2019
- WP2 Quality Quality Practices Vilone, G. and Longo, L. (2020) Explainable artificial intelligence: a systematic review. arXiv
- WP2 Quality Quality Statistics Bengio, Y. and Grandvalet, Y. (2004). No Unbiased Estimator of the Variance of K-Fold Cross-Validation. Journal of Machine Learning Research, 5, 1089–1105.
- WP2 Quality Quality Statistics Bickel, P. J. and Freedman, D. A. (1981). Some Asymptotic Theory for the Bootstrap. The Annals of Statistics, 9(6), 1196–1217.
- WP2 Quality Quality Statistics Biemer, P.P. (2010). Total Survey Error – Design, Implementation, and Evaluation. Public Opinion Quarterly, 74(5), 817–848.
- WP2 Quality Quality Statistics Borra, S. and Di Ciaccio, A. (2010). Measuring the prediction error.
A comparison of cross-validation, bootstrap and covariance penalty methods. Computational Statistics and Data Analysis, 54, 2976–2989.
- WP2 Quality Quality Statistics DiCiccio, T. and Efron, B. (1996). Bootstrap confidence intervals. Statistical Science, p 189-212
- WP2 Quality Quality Statistics Efron, B. (1979). Bootstrap Methods: Another Look at the Jackknife. The Annals of Statistics, 7(1), 1–26.
- WP2 Quality Quality Statistics Eurostat (2014). Handbook on Methodology of Modern Business Statistics, CROS-portal, MEMOBUST, https://ec.europa.eu/eurostat/cros/content/handbook-methodology-modern-business-statistics_en.
- WP2 Quality Quality Statistics Groves, R.M. and Lyberg, L. (2010). Total Survey Error – Past, Present, and Future. Public Opinion Quarterly, 74(5), 849–879.
- WP2 Quality Quality Statistics Hand D.J. (2012) Assessing the performance of classification methods. International Statistical Review, 80(3), 400–414.
- WP2 Quality Quality Statistics Kim, J.-H. (2009). Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap. Computational Statistics and Data Analysis, 53, 3735–3745.
- WP2 Quality Quality Statistics Platek, R. and Särndal, C.-E. (2001). Can a Statistician Deliver? Journal of Official Statistics, 17(1), 1–20.
- WP2 Quality Quality Statistics Quenouille, M.H. (1956). Notes on Bias in Estimation. Biometrika, 43, 353–360.
- WP2 Quality Quality Statistics Stone, M. (1974). Cross-validatory Choice and Assessment of Statistical Predictions. Journal of the Royal Statistical Society, Series B, 36, 111–147.
- WP2 Quality Quality Statistics Wolter, K. M. (2007). Introduction to Variance Estimation. 2nd edition. Springer.
- Poster Canada RecLink Record Linkage ML application Christen, P. (2007). “A two-step Classification to Unsupervised Record Linkage”, in Proceedings of the 6th Australian Conference on Data Mining and Analytics, 70, 111-119.
- Poster Canada RecLink Record Linkage ML library De Bruin, J. (2019).
“Python Record Linkage Toolkit: A toolkit for record linkage and duplicate detection in Python”. Zenodo. https://doi.org/10.5281/zenodo.3559043
- Poster Canada RecLink Record Linkage Statistics Fellegi, I.P., and Sunter, A.B. (1969), “A theory of record linkage”, Journal of the American Statistical Association, 64, 1183–1210