UNECE – HLG-MOS Machine Learning Project

Imagery Theme Report

PDF version











This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. If you re-use all or part of this work, please attribute it to the United Nations Economic Commission for Europe (UNECE), on behalf of the international statistical community.

...

Generic Pipeline for Production of Official Statistics Using Satellite Data and Machine Learning

The Imagery Theme team noted that there is no generalized approach describing how national statistical offices (NSOs) can use satellite data. The issue is further complicated because using satellite data often requires machine learning (ML) techniques, which many NSOs are still experimenting with and have not yet integrated into their production processes. Developing a generic process pipeline was therefore one of the team's first deliverables. A generic process model describes the high-level activities that need to be followed to achieve a certain objective or to deliver a specific output. This pipeline focuses on the specific use of satellite data to produce official information.

This pipeline aims to address the following issues:

  • There is a lack of understanding of the business process needed to use satellite data for statistical production.
  • Processing and analysing satellite data require techniques that are not in the traditional skill set of statistical organizations.
  • There is a lack of common reference points to consolidate and link the growing body of work on the use of satellite data for the production of official statistics.

The pipeline, shown in the diagram, outlines six main stages (business understanding, data collection and preparation, ML modelling, prediction, dissemination, evaluation) and the main specialized roles (thematic experts, EO scientists, data scientists, statisticians and computer scientists) involved in each stage.

The diagram of the pipeline is provided below. A more detailed description of this activity, as well as additional examples related to the pilot projects of the Imagery Theme team, can be found in the specific report.

Diagram of the pipeline


Motivation

To explore the potential of data sources beyond those already established in official statistics (censuses, surveys and administrative records), or to enrich existing projects, several pilot projects were carried out to take advantage of satellite images using Machine Learning (ML) techniques.

This document is intended to summarize the pilot projects carried out by Australia, the Netherlands, Switzerland and Mexico.

Machine Learning involves the automatic discovery of patterns in data using computational algorithms and the use of those regularities to carry out tasks such as classifying data into categories (Bishop, 2006). When the patterns are learned from a training set of labeled examples, this is called Supervised Learning. The pilot projects reported in this document belong to this category. They apply various classification algorithms that relate the implicit or explicit patterns found in data carefully labeled by experts to equivalent patterns in unlabeled data, so that the algorithms learn generalization rules for assigning categories to objects that have not been manually analyzed. Once the algorithms assign the “predicted” category, it is important to evaluate their ability to generalize using labeled test sets that were never used in training, and to report the corresponding performance metrics for each project.

Each country wrote a detailed report of its work and corresponding experiments. We invite the reader to consult those reports for the specific details; this document presents the essential aspects.
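The supervised learning workflow described above (train on expert-labeled data, predict categories, then evaluate on labeled data never used in training) can be illustrated with a minimal sketch. It is not taken from any of the pilot projects: the synthetic two-band "pixels", the class names, the nearest-centroid classifier and all function names are assumptions chosen only to keep the example self-contained.

```python
import random

def make_synthetic_pixels(n_per_class, seed=0):
    """Hypothetical stand-in for expert-labeled pixels: [red, near-infrared] values."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_per_class):
        data.append(([rng.gauss(0.30, 0.03), rng.gauss(0.25, 0.03)], "urban"))
        data.append(([rng.gauss(0.05, 0.03), rng.gauss(0.50, 0.03)], "vegetation"))
    rng.shuffle(data)
    return data

def split(data, test_fraction=0.3):
    """Hold out the last part of the shuffled data; it is never used for training."""
    cut = int(len(data) * (1 - test_fraction))
    return data[:cut], data[cut:]

def train_centroids(samples):
    """Learn one mean feature vector (centroid) per labeled class."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Assign the class whose centroid is nearest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )

def accuracy(centroids, samples):
    """Share of held-out samples whose predicted class matches the expert label."""
    hits = sum(predict(centroids, x) == y for x, y in samples)
    return hits / len(samples)

data = make_synthetic_pixels(100)
train, test = split(data)
centroids = train_centroids(train)
test_accuracy = accuracy(centroids, test)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Accuracy is only one possible performance metric, and real imagery projects would typically use richer features and models. The point of the sketch is the separation between the data used for training and the data used for evaluation.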

Problem to solve

Each statistical office defined the characteristics of its pilot project, in which satellite images were used with Machine Learning to solve specific problems of the office. As stated by the NSOs themselves, these problems concern reducing human intervention in updating the Address Register (AR), measuring statistical variables such as poverty or the expansion of urban areas, and detecting change in land use and land cover (LULC). Linking statistics to satellite images usually requires some type of geographically referenced statistical information, as well as field work for validation, which together form the basis for training automatic classification algorithms. The participating countries have a georeferenced source for such training.

Each country identified the main motivation for its pilot project. This allows it to explore the validity of the approach by executing the pilot and then evaluating the outcome against that original motivation once the project is completed. Some countries are still in the preliminary stages, so definitive results are not yet available in all cases.

The expectations of the participants involve the need to create a new process that complements the activities of the NSOs or simply to improve existing processes. Either way, progress will be based on the application of Machine Learning techniques to satellite images.

Australia

  • Problem to solve: Use an ML model to reduce the amount of manual intervention required during regular Address Register (AR) maintenance processes.
  • Contribution: Reduce costs (time) by improving the current process, which is resource intensive.
  • Value assessment: The number of automatically classified addresses.

Netherlands

  • Problem to solve: Explore the potential of ML for detecting poverty and population distribution from aerial or satellite imagery.
  • Contribution: Learn how to use machine learning to exploit imagery as a new data source in the production of official statistics, and assist other countries that do not have income data in measuring poverty from imagery.
  • Value assessment: A working computer prototype.

Switzerland

  • Problem to solve: Facilitate land use and land cover classification and improve change detection.
  • Contribution: Improve the existing process to reduce costs (time). At present, internal resources are almost entirely allocated to visual interpretation, at the expense of other activities.
  • Value assessment: A working computer prototype that demonstrates the innovative potential of the FSO in the use of artificial intelligence to process images.

Mexico

  • Problem to solve: Detect the extent of urban areas nationwide using ML.
  • Contribution: Reduce time and money. Generate information products that contribute to the cartographic update. It will also be possible to incorporate urban growth data into the population estimation models. Finally, it will be possible to generate new types of statistics that show how the extent of Mexico's cities evolves.
  • Value assessment: Clear objectives with links to potential impacts on existing and future data products.

...
