Sub-Teams Questionnaire:
https://forms.gle/GK6L8agRVayX4Kqe9


 

Background

The idea behind this project is to develop a better common understanding of the pros and cons, and the do’s and don’ts, of moving towards more comprehensive use of open source software as a cornerstone of official statistics production, based on concrete experiences drawn from use cases of broad interest.

The project is currently in its early stages and the focus is on scoping (i.e. determining exactly which topics to focus on and which deliverables to pursue).

Work Packages
The following is a preliminary plan, based on the initial project pitch: 
  • WP0. Scoping and ambition level
    • Agreed scope and ambition.
  • WP1. Generic aspects
    • Guidelines, principles, frameworks.
  • WP2. Use cases
    • Focus on co-development and community building.
  • WP3. Management, synergies and communication
    • Communication plan.


 



Resources

Initiatives in the official statistics community

Name | Description | URL
Awesome List | A list of open-source software, organised into categories and regularly maintained. |
HLG-MOS Carpentries Project | Ongoing work on an introduction to Python, R, and Git for NSOs. |
OECD .stat suite | |
ESS's OS4OS | |





Participants: Andrew Tait; Aleksandra Skoko; Nevena Pavlovic; Pierpaolo Massoli; Carlo Vaccari; Inkyung Choi; Pubudu Senanayake; Kevin Townend; Samanta Pietropaoli; Martin Ralphs; Ken Rennoldson; Christie Glover; Jonathan Challener; Francesco Isidori; Karl McKenzie; Olav ten Bosch; Lorenzo Asti

  1. Project Manager Hired
    • Carlo Vaccari has been hired as project manager.
    • Carlo previously worked for Istat and has been a manager on other UNECE projects such as Big Data, Data Governance and Common Statistical Data Architecture (CSDA), and has worked on open source topics for many years.
    • Please feel free to contact Carlo as you see fit: vaccaricarlo@gmail.com
  2. Scheduling of Meetings
    • The main group will meet at least once a month.
    • Sub teams will meet more frequently.
    • It is noted that we wish to produce deliverables by November, which leaves 7 months to do so.
  3. Andrew Tait's Draft Group Structure
    • Andrew created a draft group structure based on the discussions from the first meeting (see: Draft Group Structure (dated 16.04.2024)).
      • Andrew's draft structure proposed having two sub-teams:
        • Sub-Team 1: Governance
          • This group would explore questions of maintenance, licensing, and open source culture.
        • Sub-Team 2: Repository
          • This group would take stock of existing repositories such as the Awesome List, and provide recommendations.
  4. Group Discussion
    • The group was in favor of the idea of having two sub-teams.
    • Team members are welcome to join either or both sub-teams, especially given the overlap in topics.
    • Previous Work:
      • Lessons should be drawn from past OS project failures (why did they fail when others succeeded? e.g. CSPA)
    • Current Work:
      • The group agreed that it is important to not "re-invent the wheel".
      • Jonathan Challener referred to work already done on licensing options.
      • The Awesome List for Official Statistics Software was provided as an example of an OS repository.
      • It was suggested that the work that has been done on OS could be collected and referred to in one place such as a paper.
    • Culture:
      • How do OS projects reach a critical mass of support where they become self-sustaining?
        • I.e. How do feedback loops arise?
      • How to best facilitate technology adoption?
      • How do NSOs learn about how OS is used in other organisations? How can that knowledge be shared?
    • Trust:
      • Trust in OS software sometimes arises once interest in a piece of software has reached a critical mass.
      • There can be concerns about the trustworthiness of OS software, but there are also positive aspects:
        • The code behind OS software can be viewed and evaluated, unlike with proprietary software.
          • E.g., R source code is often published alongside the work of methodologists and statisticians.
      • How can trust be measured?
      • Trust is something that is earned, and there are ways of doing so, e.g.:
        • Committing to transparency (including airing "dirty laundry")
    • Documentation and Discoverability:
      • The Awesome List is a public list that people can search and to which they can suggest changes.
      • The list captures knowledge of what OS software is available for official statistics, along with metadata on the software.
      • Likes on GitHub repos increase their visibility on Google, which in turn increases the likes the repos receive, and so on.
      • The list takes a bottom-up approach, i.e. it's not intended as an authoritative source on all things open-source.
      • In the past Olav experimented with badges for OS software on the list that showed how many times the software was downloaded.
        • Badges still exist for licensing information, version number, and last update, but the downloads badge was removed.
        • Downloads don't correspond to popularity one-to-one because of web scrapers.
        • Popularity relates to trust, but indicating popularity is still an open question for the list.
    • Sharing:
      • How many statistical institutions are sharing data, and how are they sharing it?
        • E.g. Sharing OS software that is uploaded to organisation websites vs CRAN.
      • How is the adoption and use of OS software monitored?
  5. Decisions:
    1. Two sub-teams will be created:
      1. Governance and Maintenance:
        • This team will explore questions of governance, maintenance, and culture, drawing upon successes and failures.
      2. Discoverability and Sharing:
        • This team will take stock of existing repositories, review them against best practices, make recommendations based on that review, and improve the repositories where needed. The team will also look at the ways information on OS is shared.
    2.  Deliverables:
      • Create a paper on OS.
  6. Actions:
    • Andrew to send out a questionnaire asking members which sub-teams they want to be in.
    • Andrew to create a Google Doc for work on the OS paper.
  7. Suggested Reading


Participants: Andrew Tait; Barteld Braaksma; Christie Glover; Craig Lindenmayer; Francesco Isidori; Inkyung Choi; Jonathan Challener; Jonathan Wylie; Karl McKenzie; Kate Burnett-Isaacs; Kevin Townend; Li Wang; Lorenzo Asti; Marcello D'Orazio; Pierpaolo Massoli; Pubudu Senanayake; Samanta Pietropaoli; Vytas Vaiciulis

Open-Source Software Project – Kick Off Meeting Minutes

Quick Summary:

The discussion primarily revolved around questions of governance, culture, maintenance, and discoverability in relation to open-source (OS) software.

Participants expressed a desire not to retread covered ground, but to explore something new.

Common concerns were about OS maintenance (one’s own code and other dependencies) and understanding the culture of OS.

Potential deliverables could include:

  • Developing general guidance (i.e., a list of best principles) for managing your OS software.
  • Creating standards for code commenting and general OS knowledge management among NSOs.
  • Exploring OS culture: Learn from current examples (Linux, R etc.) and analyze how they arise, and how they are maintained.
  • Determining best practices for managing dependencies, including understanding links between various packages and their dependencies, and handling the risk of dependencies losing support.
  • Developing a discoverable list of OS software where the relations between the software are indicated and/or methods for searching for OS software (e.g., standards around OS metadata, query development or guidance on querying).

Full Notes:

1       Initial Placeholder Project Outline (subject to potential complete revision)

  1. Explore Generic Aspects
    • Guidelines, Principles, Frameworks.
  2. Investigate Use Cases
    • Co-development, community building
  3. Management, synergies, and communication


2       Previous & Current OS Work

3       Housekeeping

  • The project will continue until the next HLG-MOS workshop, which most likely will be in November this year.
    • The project can be extended if needed.
  • A wiki space has been set up for the project: https://statswiki.unece.org/display/OSSP
    • Those without access to the wiki should let Andrew know (tait1@un.org)
    • Andrew will create wiki user accounts for Li Wang, Marcello D’Orazio, Pierpaolo Massoli, Kevin Townend, Francesco Isidori, and Lorenzo Asti.
    • Andrew will update Kate Burnett-Isaacs’s user details.

4       Common Understanding

  • The project should ideally cover new ground. For example:
    • How can we go beyond merely sharing OS code and repos, and instead share and collaborate on best practices relating to OS use, development, and maintenance?
    • Part of OS culture is not just usage of OS, but also giving back to the OS community. How is this done and how is this done best?
    • How can support for OS be provided at the enterprise level?
      • Note: This can be a particularly difficult issue for smaller organisations.
    • When does it make sense to invest in OS compared to buying software where the maintenance and security burdens are managed externally?
    • Should practices be standardized across NSOs? If so, then how?

5       Topics Raised In The Meeting

5.1      Governance

  • Licensing:
    • What licenses are best to use?
    • How to ensure that code developed by offices is not taken and monetized by the private sector?
  • Models of Governance:
    • What examples already exist of governance structures for OS software?
    • How can others best learn from these examples?
  • Support:
    • The move to OS comes with a change of responsibility in terms of software management, from IT to statisticians, analysts etc. How can this challenge be managed?
    • How do we avoid being tied to anchors of support?
      • E.g. how do we avoid one person being key to the continued functioning of an OS package?
    • What is a reasonable amount of support to provide for your OS packages?
      • Should that support be ringfenced? I.e. limited somewhat within the NSO community?
    • How do you set reasonable expectations for your OS packages?

5.2      Community

  • How does OS culture emerge?
    • E.g. Linux, CRAN
    • The R community is another example. However, it is run by volunteers, whereas we wish to create such a community with staff, which may inform the lessons drawn from such an example.
  • How can we foster the emergence of OS culture?
  • How does Peer Review of OS work?
    • It is noted that one benefit that NSOs have is that they are not in competition with each other, so there is no conflict of interest in co-development.
  • How is OS culture maintained?
  • How to approach OS culture development with Executives?
    • No “free lunch” (OS requires investment).
    • How best to combat OS myths? (OS culture doesn’t necessarily happen organically).

5.3      Discoverability

  • Is it possible to create a discoverable list of OS software for NSOs?
    • This already exists in at least one form: CBS’s Awesome List
      https://github.com/SNStatComp/awesome-official-statistics-software
    • Ideally, what is desired is a searchable list where relations between OS solutions are indicated (e.g., which packages depend on which other packages)
    • The list should have greater functionality: e.g., there should be a way to indicate when a tool is in trouble (lack of updates, full of bugs etc. [i.e., signs of no support])
    • How would the list be maintained?
  • How do you ensure that your published OS is seen by others?
  • How do you check for OS solutions so that you don’t accidentally re-invent the wheel?
  • Is there metadata that can be applied to OS software that could facilitate discoverability?
    • Is/can this metadata be standardized?
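
The ideas raised above (a searchable list, relations between packages, and a warning when a tool shows signs of no support) can be illustrated with a minimal sketch. It is not a description of how the Awesome List works. The field names, the two-year staleness threshold, and all catalogue entries are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Tool:
    """One catalogue entry; every field name here is a hypothetical choice."""
    name: str
    last_update: date
    depends_on: list[str] = field(default_factory=list)


def stale_tools(catalogue, today, max_age_days=730):
    """Return the names of tools not updated within max_age_days (assumed threshold)."""
    return sorted(
        t.name for t in catalogue
        if (today - t.last_update).days > max_age_days
    )


def affected_by(catalogue, name):
    """Return the names of all tools that depend, directly or indirectly, on `name`."""
    # Build a reverse index: dependency name -> tools that list it in depends_on.
    dependents = {}
    for t in catalogue:
        for dep in t.depends_on:
            dependents.setdefault(dep, set()).add(t.name)
    # Walk the reverse index to collect indirect dependents as well.
    seen = set()
    stack = [name]
    while stack:
        current = stack.pop()
        for d in dependents.get(current, ()):
            if d not in seen:
                seen.add(d)
                stack.append(d)
    return sorted(seen)


# Made-up entries for illustration only; these are not real packages.
CATALOGUE = [
    Tool("base-io", date(2019, 3, 1)),
    Tool("survey-edit", date(2024, 1, 15), depends_on=["base-io"]),
    Tool("dissemination-ui", date(2023, 11, 2), depends_on=["survey-edit"]),
    Tool("seasonal-adj", date(2024, 2, 20)),
]

if __name__ == "__main__":
    for stale in stale_tools(CATALOGUE, today=date(2024, 4, 16)):
        print(stale, "looks unsupported; affected tools:", affected_by(CATALOGUE, stale))
```

Storing only direct dependencies and deriving the indirect ones on demand keeps each entry small, which may make the maintenance question raised above easier to answer. Whether a date threshold is a good signal of lost support remains an open question.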

5.4      Maintenance

  • Knowledge Management:
    • How can code commenting & documentation be standardized across NSOs?
    • What practices can facilitate OS maintenance? E.g., coding in the open.
  • Risk Management:
    • How to deal with dependencies?
      • What happens when they’re unsupported?
      • Note: Old code isn’t a problem unique to OS. Old code written in SAS, VBA etc. can also develop issues from lack of maintenance.
    • How to avoid over-reliance on any one individual to maintain OS code/packages?

6       Potential Deliverables:

  • Developing general guidance (i.e., a list of best principles) for managing your OS software.
  • Creating standards for code commenting and general OS knowledge management among NSOs.
  • Exploring OS culture: Learn from current examples (Linux, R etc.) and analyze how they arise, and how they are maintained.
  • Determining best practices for managing dependencies, including understanding links between various packages and their dependencies, and handling the risk of dependencies losing support.
  • Developing a discoverable list of OS software where the relations between the software are indicated and/or methods for searching for OS software (e.g., standards around OS metadata, query development or guidance on querying).



Name | Organisation
Kevin Townend | Stats NZ
Samanta Pietropaoli | ISTAT
Kate Burnett-Isaacs | Infrastructure Canada
Karl McKenzie | ONS
Ken Rennoldson | ONS
Craig Lindenmayer | ABS
Christie Glover | Statistics Canada
Jonathan Challener | OECD
Li Wang | Statistics Canada
Jonathan Wylie | Statistics Canada
Lorenzo Asti | ISTAT
Martin Ralphs | ONS
Marcello D'Orazio | ISTAT
Francesco Isidori | ISTAT
Pierpaolo Massoli | ISTAT
Pubudu Senanayake | Stats NZ
Jean-Marc MUSEUX | Eurostat
Matyas Tamas MESZAROS | Eurostat
Barteld Braaksma | CBS
Vytas Vaiciulis | CSO
Marc-Philippe St-Amour | Statistics Canada
(Olav) K.O. ten Bosch | CBS
(Mark) M.P.J. van der Loo | CBS
Nevena Mitrovic | SORS
Aleksandra Skoko Despenic | SORS
Pascal Heus | Postman
Iraj Namdarian | CREA
Carlo Vaccari - Project Manager | Independent Expert
Deraj Wilson-Aggarwa | ONS (Economic Statistics Change)
Giuseppe Sindoni | ISTAT
InKyung Choi | UNECE
Andrew Tait | UNECE





