Synthetic Data Guide

Progress

Draft guide delivered at the HLG-MOS workshop

Successful presentations and workshop on the guide

Next Steps

Data Challenge from January 24 to 28, 2022. 

Registration has opened and closes on January 21, 2022 

Finalizing details on evaluation of the challenge 

Targeting final guide by end of March 2022

Risks and Issues

Issue | Mitigation



Input Privacy-preserving Techniques 

Progress

Successful webinar in November.

Presentation at the Expert meeting on Statistical Data Confidentiality

Deepening and further expanding the mini pilot of Private Machine learning.

Preparing the public consultation.


Next Steps

Conducting the public consultation.
Collect and discuss new use cases.
Deepening the mini pilots.

Risks and Issues

Issue | Mitigation
-    The decision on the extension of the project will be taken in March. The question is how to plan for a possible extension: focus on writing the final report, or on plans to implement next year.
-    The public consultation has not yet started. With the potential extension, there is more time to implement it properly.





News from the Groups

Blue-skies Thinking

Identifying Topics/Opportunities


CONTINUOUS

Project proposals:

Following pitching at BSTN, the following 2022 project proposals were approved at the November HLG workshop (along with the extension of the IPP project that was proposed by the project team):

  • Meta-Academy for the Modernization of Official Statistics
  • Data Governance Framework

Activity proposals:

In addition, the following activities were identified:

-    Digital Twins. Proposal available (BSTN)
-    Future of Work. Proposal upcoming (CapComm)
-    Interacting with Generation Y. Proposal available (CapComm)
-    Mobile Survey Data Collection for Climate Change. Not yet articulated
-    Nowcasting. Not yet articulated

Community proposal:
-    Data Virtualisation. New community idea modelled loosely after ML community.

Network Data

IN PROGRESS

Work is ongoing.
Covid-19 Hotspot Joint Biosecurity Centre Platform

IN PROGRESS

This is still in the scoping phase, although recent meetings have been held to develop the basis for this work.
User Research for Official Statistics

IN PROGRESS

Rapid survey systems

IN PROGRESS

From experimentation to implementation in official statistics

IN PROGRESS

The project proposal 'Meta-Academy for the Modernization of Official Statistics' was prepared for 2022 and approved at the November HLG workshop. The purpose of the meta-academy is to raise the standards of virtual learning on topics that are necessary for the modernization of statistics but are missing from, or inconsistently covered by, academic, commercial or in-house offerings. The project sets out to create a benchmark to better map existing initiatives and offerings, in order to coordinate efforts, reduce duplication and fill training gaps. It will facilitate the sharing of skills strategies, catalogues of content and pedagogical artefacts, and more generally good practices and standards in that space, so that opportunities for reuse or co-creation in learning capabilities can be spotted and leveraged more easily and systematically by all National Statistical Offices (NSOs).
Microdata for understanding declining response rates

IN PROGRESS

Currently postponed: Has been identified as an element of the 2022 work programme.
Facebook survey covid19 related symptoms and behaviour

IN PROGRESS

On hold

Capabilities and Communication







Future of work, future workplace and future skills

COMPLETED

The last call of the Task Team was on 29 November. All three activity proposals for next year were submitted to the Workshop and accepted, so the team needs to find resources for all of them: (1) Reaching youths, (2) Job of the future and (3) Future of work toolkits. The team will consider whether the first two activities should be merged into one.

The Task Team will start by finalizing the activity proposal on toolkits and populating the wiki pages with the information. Afterwards the team will refine the other two proposals and will start work around April 2022, depending on the resources available.

A call for new members for all HLG-MOS Groups will be sent to the countries in January by the head of the HLG-MOS.

Next call of the Task Team will be on 25 January 2022.

Ethical leadership as part of culture evolution

COMPLETED

The last call of the Task Team was on 24 November. The activity proposal submitted to the Workshop on the Modernisation of Official Statistics was accepted, and the group is starting to plan the next steps of its work.

It was agreed that, for the next call, a first draft of the placemat/picture linking ethical activities to GAMSO and GSBPM will be prepared; case studies from the countries will be added afterwards.

UNECE will prepare a list of all groups/teams that work with Ethics.

Next call of the Task Team is on 19 January 2022. 

Role of market research, digital marketing & communication strategies and tools in managing a crisis communication situation and in promoting public engagement in surveys

IN PROGRESS

The last call of the Task Team was on 2 December.

Some final changes to the Guidelines have been made; they will soon be circulated to the Task Team members for comments. Afterwards Statistics Canada will harmonize the document and add graphics. It was decided that use cases from the countries will be posted only on the wiki, with a reference to the wiki in the original document. A new wiki space was created for this purpose and will be populated with the relevant materials.

When work on the guidelines and the wiki with use cases is finalized, the Task Team will start scoping its activity proposal and deciding what to work on next.

Next call of the Task Team will be on 20 January 2022.

Strategic Communication Framework Publication

COMPLETED

HRMT Workshop 2022

IN PROGRESS


Other

Supporting Standards

Linking GSBPM and GSIM

IN PROGRESS

The last remaining sub-process descriptions have been prepared and are currently being discussed by the group. The final report is yet to be fully updated, but the main bulk (Chapter 3) and introduction part (Chapter 1) are more or less ready. The implementation level examples have been updated based on what was agreed on the specification level.

All the sub-processes are updated on this information flow diagram: https://statswiki.unece.org/display/GSBPM/Information+flow+within+GSBPM+using+GSIM

Core Ontology for Official Statistics

IN PROGRESS

Work is progressing as scheduled. The expert review is now finished, and the group received valuable feedback from the experts. One of the next steps is to review the feedback, update the deliverables and define solutions for the issues raised. This is included in "Phase 2" of COOS for next year.

You can find the COOS main document at https://linked-statistics.github.io/COOS/coos.html. It includes links to the formal vocabulary, the governance document and the URI policy.

The next phase of the work will focus on the development of use cases, operationalization and further development of COOS outputs.

Updating GSIM

IN PROGRESS

Work is progressing as scheduled. The Task Team has recently closed the issues associated with the Business Group. The task team will start the full revision process of GSIM from next year: the task team will be divided into sub-teams that will work on GSIM Groups in parallel and the new version will be shared for feedback.

GSBPM Task

IN PROGRESS

Work is progressing as scheduled. The Task Team has divided its work into 8 smaller groups for the 8 Phases of the GSBPM. The work of the task team will continue in 2022 with a planned closing in June 2022.
Application of GSBPM for Geospatial Information

COMPLETED

CSPA

NOT STARTED

No success in finding a potential chair for the group yet. Need support from the Executive Board.
ModernStats World Workshop 2022

IN PROGRESS

Members of the Organising Committee have been nominated from among the Supporting Standards Group members, and discussions on a potential date for the workshop have started. Target period: first half of June to first half of July 2022 (3 days).
Other

The group discussed the activity proposals for 2022 in light of the conclusions of the workshop, made some minor changes in scoping and prioritized the proposals. The final planned activities for 2022 are:

Continuation:

  • GSIM revision / with significant scope changes. Finish: June 2023
  • Core Ontology for Official Statistics: Phase 2. Finish: December 2022
  • GSBPM Tasks. Finish: June 2022

New activities:

  • CSPA capacity building. Start: July 2022 (?) / Finish: December 2022
  • Further development of GSBPM overarching processes. Start: July 2022 / Finish: December 2022
  • Relationship between SDMX/DDI and GSBPM. Start: July 2022 / Finish: December 2023. Need to clarify the cooperation with the SDMX/DDI community.

Our group will also collaborate and coordinate work with the Statistical Data Governance Framework project and start an internal discussion on how CSDA is interrelated with other ModernStats models and can be reviewed considering recent developments. 

The generic revision policy (ModernStats Governance Guide) for the ModernStats models is available here: https://bit.ly/2XOrdhe. The Supporting Standards Group will consider it final by the end of the year and make it publicly available.

Machine Learning 2021





  

WS1 – Pilot studies: from Idea to Valid solutions

IN PROGRESS

Coding and classification was the most popular application area in this year’s research of ML applications. New application areas investigated included modelling and route optimization. One study highlighted the benefit that NSOs gain from allowing their ML projects to be replicated by other NSOs.
WS2 – From Valid Solution to Production 

IN PROGRESS

The workstream explored how to make the operationalization of ML solutions smooth and efficient, for example how to develop a user-friendly interface and how to build a data lake that data scientists can draw data from efficiently. It also produced a paper outlining the typical steps that statistical organizations take, and the challenges they face, in moving from ML experiment to deployment.

WS3 – Data Ethics and Governance

COMPLETED

The group produced high-level guidance on ethical considerations that arise in ML projects to support analysts, researchers, data scientists, and statisticians. It has been published by the UK Statistics Authority (see link).
WS4 – On The Quality of Training Data

IN PROGRESS

The workstream carried out a simulation study exploring how to identify the circumstances under which an ML model should be retrained in order to maintain the predictive power and quality of the model.  It also held a discussion panel at the all-group meeting in October where different NSOs shared their experiences and approaches to the topic. 
WS5 – On The Quality Framework for Statistical Algorithms

IN PROGRESS

A framework was developed to see how well ML performs compared to other methods. This year the framework was tested using a real NSO use case. It reaffirmed the importance of having a holistic view, with quality dimensions having different priority for different stakeholders at different stages of the production cycle. 
Other

Update to UNECE HLG MOS EB meeting  - November / December 2021 

Overview 

The Group ended a successful year by presenting highlights of its research to the global official statistics community at its annual webinar on 19 November, on the sidelines of the UNECE HLG MOS workshop. Over 279 attendees from 48 different countries joined for presentations of the group’s research highlights and a discussion of the way forward for ML in official statistics involving national data science leads from Canada, Sweden and the UK. A post-event survey showed a 96% satisfaction rate and that 74% of attendees were now interested in taking part in the ML Group activities next year.

Research and knowledge sharing. In the workstreams, activity leads have been writing up their research results and preparing final reports. These are being published on the new public page for the 2021 activities. The last all-group meeting, held at the end of October, featured some interesting presentations, including Supervised Text Classification with Leveled Homomorphic Encryption from Saeid Molladavoudi, Statistics Canada, and MLOps in the Australian Taxation Office from James Beck, Australian Taxation Office. A discussion panel on model retraining saw representatives of different NSOs (US, UK, Norway, Canada) present their perspectives on the challenges and possible approaches to model retraining.

Evaluation and strategy. The Coordination Team ran a survey evaluating the activities of the group in 2021 and seeking to understand priorities for next year. Feedback showed high levels of satisfaction with the group’s activities and with the benefits of connecting with others to exchange ideas, information and experience while many NSOs are still in the early stages of their machine learning journey. As one respondent commented, “The project has created the opportunity to spread the knowledge of ML among other members of the institution, benefiting the institution as a whole.” The survey flagged interest in more training activities, such as the Coffee and Coding sessions, and in looking at the challenges of moving ML models into production. It also highlighted some obstacles to the group’s plans to expand next year, notably that many members found it difficult to contribute to activities in a sustained way due to the pressures of their day-to-day work commitments.

ML 2022: The Team has developed detailed plans to deliver a bolder, bigger and more collaborative project next year. It will launch the next stage with a series of training sessions, presentations and workshops at the UN CEBD / UAE Big Data conference at Expo 2020 in Dubai in January 2022. We will issue a call for proposals for activities next week and encourage NSOs to promote the call and put forward their ideas for research and knowledge-sharing activities they could contribute.


