Generative AI

The Dublin Sprint is scheduled for September 25-27 at the Central Statistics Office (CSO). The event has attracted attention from various sectors, including colleagues from the OECD who plan to attend. We are also arranging presentations by domain experts on the governance of GenAI, with a focus on practical insights into its application. Further information on the agenda and speakers will be shared as it becomes available. A presentation on the work was also delivered at the Irving Fisher Committee on Central Bank Statistics on August 22.

Open-Source Software Project

The Open-Source Software Project (OSSP) has been conducting sub-team and plenary meetings. The two sub-teams are "Governance and Maintenance" (chair Kate Burnett-Isaacs) and "Repositories and Discoverability".

Drafting work is continuing on the Project Report. Documents and links can be found on the Project homepage.

The Sprint meeting will be held on September 18-20 in Belgrade, hosted by the local NSO, SORS.

Updates from the Modernization Groups

Blue-skies Thinking

Identifying Topics/Opportunities


  • Data Integration Sprint
    • The BSTN group has been arranging a sprint on the topic of Data Integration (see: proposal paper). The sprint will take the form of two three-hour online meetings, on September 9 and 17.
    • Planning of the meetings has begun and is almost complete for the 1st meeting. The timetable for the 2nd meeting will be finalised after the 1st meeting.
    • The 1st meeting will focus on background and analysis of established work and practices. The 2nd meeting will narrow the discussion with a view to determining possible work avenues for 2025.
  • The June BSTN meeting heard a pitch on AI Governance at NSO from Marie Haldorson, and the group discussed the topic of Data Integration.
  • The July BSTN meeting heard an update on the Future of NSIs exercise, and further discussed the Data Integration sprint proposal. The group then discussed recent news concerning security risks, and suggested that an activity could be created for a series of presentations on common risks for NSOs.
  • The August BSTN meeting will hear a pitch from Jeremy Visschers and Luca Mancini on Trust and the Public, and Jon Wylie will provide an update on Work Package 1 of the ModernStats Carpentries project (i.e. the development of OS lessons for statisticians).
Future of NSIs


All chapter workshops have been completed, and all section drafts were submitted to the General Editors for review.

The General Editors reviewed the drafts at their August meeting and will provide comments to the section editors.

Osama will summarise the key points of the section drafts in a single document and share it in a few weeks' time.

Andrew is creating a single document to combine the section drafts.

The event in Japan will not go ahead; otherwise, plans remain unchanged.

Applying Data Science and Modern Methods

General Overview


The two Task Teams have made good progress. In the last report I noted the potential for an HLG-MOS meeting in Japan; unfortunately, due to various challenges, this meeting will not proceed for now.

We have unfortunately paused work on the Applied Data Science Collaboration initiative due to current resource pressures in several NSOs. We hope to be able to pick it up again in the next financial year. We will keep the three short-listed topics (PETs – EO/Mobile Data/Geospatial – Nowcasting) on the priority list for when we’re able to restart.

Task Team #1: Uncertainty Quantification


The project team have held a few meetings since the last update, although they skipped the monthly task team meeting in August.

Here are some of the highlights:

Task Team Meetings and Presentations

  • 2024-07-02: Mohammed gave a presentation on Prediction-Powered Inference (PPI).
    • Title: Prediction-Powered Inference: A Review of the Original Paper and Recent Key Papers.
    • Overview: PPI is a framework for performing valid statistical inference when an experimental dataset is supplemented with predictions from a machine learning system.
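As a rough illustration of the idea summarised above, the sketch below estimates a population mean by combining a small labelled sample with model predictions on a larger unlabelled sample. It is a minimal sketch of the basic PPI mean estimator with a normal-approximation interval, not the task team's own implementation. The function name `ppi_mean`, its argument names, and the choice of a mean as the target quantity are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def ppi_mean(labeled_y, labeled_pred, unlabeled_pred, alpha=0.05):
    """Illustrative PPI estimate of a mean (hypothetical helper).

    labeled_y      -- observed outcomes on the small labelled sample
    labeled_pred   -- model predictions for the same labelled units
    unlabeled_pred -- model predictions for the large unlabelled sample
    Returns (estimate, (lower, upper)) for a 1 - alpha confidence level.
    """
    n = len(labeled_y)
    big_n = len(unlabeled_pred)

    # Rectifier: average prediction error measured on the labelled data.
    residuals = [y - p for y, p in zip(labeled_y, labeled_pred)]
    rectifier = mean(residuals)

    # Point estimate: mean of the predictions, corrected by the rectifier.
    estimate = mean(unlabeled_pred) + rectifier

    # Normal-approximation interval: the two samples are treated as
    # independent, so their variance contributions add.
    std_err = sqrt(variance(unlabeled_pred) / big_n + variance(residuals) / n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate, (estimate - z * std_err, estimate + z * std_err)
```

For example, with labelled outcomes `[10.0, 12.0, 14.0]`, matching predictions `[9.0, 11.0, 13.0]` and unlabelled predictions `[8.0, 10.0, 12.0, 14.0]`, the model under-predicts by 1.0 on average, so the estimate is the prediction mean of 11.0 shifted up to 12.0.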

Subgroup Meetings

  • 2024-06-21: The "Conformal Prediction" subgroup met to discuss the literature review and next steps.
  • 2024-06-24: The "Traditional Methods" subgroup met to review completed tasks, provide updates, and discuss next steps.

Task Team #2: Advancing Responsible AI


The project has been divided into 9 modules, each with designated responsible persons. Work is being advanced independently within each module. Over the summer, substantial content has already been developed for almost every module. The plan for the summer also included identifying the target group or groups to whom each module will be specifically directed. Additionally, the aim was to consider what methods each module will use to ensure that the content is easily understood by the target groups. The next joint meeting is scheduled for September 12th, where the progress of each individual group will be reviewed.

The deadline for all modules (except for the regulatory framework module) is the end of August, by which time a draft of the content must be ready. Similarly, a draft of the methods to be used should also be prepared. In the next phase, in addition to the methods, the time each module will need to advance its topic should also be determined. There should also be a preliminary designation of the individuals who will be responsible for practically implementing the modules.

Capabilities and Communication

Work and Job of the Future - Extended work on Generic Growth Model


A message requesting wider distribution and use of the Generic Growth Model was sent to countries at the end of May together with the surveys. We have received responses from a few countries confirming that they will distribute the models in their offices, but no comments on the model were provided.

Data Analytics


The Task Team is working on a paper on data analysis in HR. It should be ready for presentation at the HRMT workshop and the HLG-MOS workshop.

Evaluation of blended (hybrid) working


The Task Team (TT) is analysing the survey responses. The TT is also working on a paper, which will be presented at the HRMT workshop and the HLG-MOS workshop.

Employer branding


The Task Team (TT) is analysing the survey responses. The TT is also working on a paper, which will be presented at the HRMT workshop and the HLG-MOS workshop.

Ethical management (Data and Business)


The Task Team is working on the Reference Book on Ethics, making necessary revisions to six chapters.

Ethics Workshop 2024 (26-28 March 2024)


Workshop on HRMT OC


The Task Team is working on the agenda and contributions.

AI for official statistics - communication perspective


The Task Team is working on a paper on AI for the communication of official statistics.

Supporting Standards


Activity proposals for 2025


A number of possible activity ideas were discussed, including:

  • AI and standards: There is a firm belief among SSG members that GenAI systems could benefit from the structuring of metadata and semantic context to facilitate their interpretation of data, noting Gartner's recent assertion that "at least 30% of GenAI projects will be abandoned after proof of concept due to [inter alia] poor data quality...". Such standards may also make GenAI results less of a black box. However, to make headway in establishing the interplay between standards and AI, case study examples are needed, for which we would need assistance from AI experts. This area will be discussed in upcoming SSG calls, and there'll be some AI-related presentations at the ModernStats World workshop, which should generate active debate there, though it's unclear whether this activity proposal will be solidified in time for a 2025 activity.
  • Contributing to the anticipated HLG project on Data Integration: The SSG has been asked by the CES to align its work to include Data Integration, and it has been suggested that the anticipated HLG project on Data Integration will task the SSG with making specific contributions, for example in relation to data architecture, etc.
  • CSPA for the Cloud: While the existing version of CSPA has not enjoyed widespread adoption due to technical implementation barriers, it has been suggested that recent developments in the cloud domain (Kubernetes, Docker, Onyxia, etc.) might go some way towards overcoming these obstacles, while developments in the use of open source may provide fresh impetus for looking at CSPA again (since sharing of code requires its modularization). Gary also mentioned CSPA in the last EB call. If SSG members are able to formulate a clear proposal, this will be submitted to the EB/HLG workshop.
  • GAMSO revision: To be rolled over from the currently active GSBPM/GAMSO revision activity, as GSBPM is nearing the end of its work, while the GAMSO revision is the next stage of that work to commence.
  • CSDA revision: This activity did not start in 2024 as an activity leader couldn’t be found (people were interested but busy). We intend to propose the same activity for 2025.

ModernStats World Workshop


  • Unlike previous workshops, this year's workshop will focus less on introducing statisticians to ModernStats models and more on taking a strategic view of the role people see the models playing in the broader modernisation effort, and of how to steer our future direction given recent developments such as cloud and generative AI. This is reflected in the abstracts now available on the webpage: https://unece.org/statistics/events/MWW2024
  • Given the short period of time between this workshop and the HLG workshop, it cannot be the place where 2025 activity proposals are conceived. (For this reason, the process for discussing 2025 activity proposals has already started.)
  • There will be a side-meeting after the workshop (exact topic tbd).

Revision of GSBPM and GAMSO activity


  • Going backwards through the GSBPM phases, we have nearly finished examining the feedback received on phase 2 (Design). After that, we need to consider phase 1 (Specify Needs) and the overarching processes.
  • We are aiming to finalise GSBPM by the end of the year, but with 5 remaining calls before the end of the year, this is going to be tight.



Work has slowed down somewhat, but the output document is taking shape and isn’t too far from completion. The next phase of work on GSIM in the context of DDI and SDMX will not start until the work on GSBPM is completed.

Common Statistical Data Architecture


On hold until next year, for the reason previously mentioned (finding a chair).

Core Ontology for Official Statistics


It has been decided not to start this work until the GSBPM revision is released (next year).

