Machine Learning


Progress

Engagement in the group continues to be very high. Over 20 projects were proposed, either to test ML techniques or to share knowledge and experience.

A sprint was held on May 13 (see objectives and agenda). It was attended by 12 participants from 9 countries, and hosted by the ONS.  Good progress on Work packages 1 and 2 was made:

  • WP1 - Pilot Study on Coding and Classification: Building on the US Bureau of Labor Statistics' successful implementation of machine learning to autocode injuries and illnesses, the code and practices have been shared and will be tested on different types of data in Serbia, Poland and Belgium. The types of data include data collected on the Web, notably to measure sentiment. Other applications may join this pilot study.
  • WP1 - Pilot Study on Edit and Imputation: A sub-group will investigate the potential of ML in automating the editing process and determine how the statistical foundations of ML and traditional techniques differ, where ML techniques can add value, and in what contexts that added value would be most beneficial. Examples from the UK and Italy will be used to conduct these investigations. Knowledge and experiences in imputation will be shared. Communication with the Statistical Data Editing (SDE) expert group will be assured by members sitting on both groups, as well as by a member of the SDE group on the ML project's distribution list.
  • WP1 - Pilot Study on Imagery: The scope of this pilot study has not been finalized because some key project members were not in attendance. At the sprint, the UK presented a successful application of ML that uses street images to produce relevant statistics at a relatively local scale (two cities). It is relevant to the needs and interests of Belgium and the Netherlands. We discussed producing a document to assist new users of imagery data. It would describe the processing pipeline for such data (which calls on ML), the accompanying high-level ML questions and aspects to consider, and proposed ML solutions or applications, or places to find them. The pilot study may look at using satellite data to measure population (density, change). It will likely not look at satellite data for land use, as this topic, including its ML aspects, was extensively covered by a UN Task Team a couple of years ago.
  • WP2 - Quality: This work package will identify quality indicators and performance indicators in two contexts: ML applied to carry out traditional processes on traditional data, and ML applied to non-traditional data sources. To do this, it will consider quality features (definitions, dimensions, indicators) from the official statistics and ML communities. It will seek to apply some of these indicators in one or two of the WP1 pilot studies. One of the challenges will be to remain focused on ML issues and not the broader issues that come with the various data sources. These are covered more extensively by the ESSnet Big Data II WPK on Methodology and Quality. Communication with this group will be assured by two members sitting on both groups and through our respective wiki spaces.
  • WP3 - Lessons learned: This WP was not discussed directly. One of the intentions is to combine the experiences of organisations that have implemented, or are close to implementing, ML techniques with those of organisations that will advance their implementation through the WP1 pilot studies. The result would be lessons learned on topics such as facilitators, obstacles, the importance and costs of creating and maintaining learning datasets, and the role of manual operations.

The documents produced at the sprint will be posted on the wiki and shared with all project members. 

Next Steps

  • Share documents presented at the sprint with all project members
  • Finalize and share the plans for each WP and pilot study: objectives, projects, deliverables, success indicators, timelines (most of this was set at the sprint; they now need to be written down, shared and agreed by all)
  • Get the pilot studies in motion 

Risks and Issues

Issue: Lack of time dedicated to the project. Team members are very engaged, but now the "real" work will start. This will require their time, as well as that of others in their organisations. The risk level is currently low, but the time needed to work on this project was mentioned by some participants at the sprint.
  • Although intended for the broad official statistics community, the work of the group must also be relevant to the participating organisations. Participants at the sprint were asked to return to their respective organisations to seek feedback on the scope of the project.
  • The project's scope remains ambitious and will remain agile throughout its execution. Any reductions or changes to the scope will be communicated to the EB. The participants at the sprint recognized and raised some concerns about the project's high ambitions, but their high level of energy at the end of the three-day sprint indicates that they are up to the challenge.
Mitigation: Assure that the ML projects remain relevant to the participating organisations.

Issue: Access to data. This issue was raised at different times early in the sprint. For now, there appears to be no need to create a common work space where data and ML tools need to meet. Most of the work will consist of bringing in ML techniques and algorithms to be tested within each organisation and sharing "macro-level" results. That said, some participants mentioned that it may even be a challenge to get access to the required data within their own organisations. The risk level on this issue is low, but the negative impacts on the project would be very high.
Mitigation: The project manager will offer his support to project participants in getting access to the data that they need and raise access issues to the EB, as needed.



Strategic Communication Framework phase 2

Progress

UNECE hosted a three-day sprint session from April 30 to May 2 to launch phase 2 of the project. Eight participants from seven countries made good progress on Work packages 1 and 2 of the project plan. The documents produced at the sprint have been posted on the wiki and shared with all project members. Work package 3 was discussed, but no real progress was made because not all of the appropriate project members were in attendance.

Next Steps

The project team held its regular monthly Webex meeting on Tuesday, May 21. Workplans were discussed and members agreed on their contributions.

A two-day face-to-face meeting is scheduled for Gdansk, Poland on June 10 and 11. The location and dates were chosen to maximize participation among those already attending the Workshop on Dissemination and Communication in Gdansk on June 12-14. Seven members from seven countries have confirmed their participation; two additional countries have indicated interest but have not yet registered.

The project members thank Statistics Poland for agreeing to host the face-to-face project meeting. The objectives of this meeting will be to make progress on Work packages 1 and 2 and to begin work on Work package 3.

Risks and Issues

Successful delivery of the ambitious work plans requires full participation from all members. I will be in a better position following the Gdansk session to know if we have the right skill sets within the project to deliver on Work package 3.

Discussions continue with Australia and New Zealand to encourage their participation, as their knowledge and experience would contribute significantly to all three work packages, particularly Work package 3.




News from the Groups

Blue-skies Thinking

Identifying Topics/Opportunities


IN PROGRESS

Follow-up selected topics

IN PROGRESS

Developing Organisational Capability

Skills and Capability Framework

IN PROGRESS

DOC group members are working on a paper showing the connection between technical and complementary skills, with the aim of increasing awareness of the issue. A very preliminary document has been prepared for further discussion among group members; it includes, to some extent, alignment with GAMSO, which is not an easy task. The deadline is 11 June, before the next Webex call.


Promotion Forum

IN PROGRESS

We prepared a one-page draft flyer for the CES session in June. It is aimed at senior and mid-level management, to draw attention to the outputs of our group.



Setting vision in NSOs

IN PROGRESS

After the Communications Sprint that took place in Geneva at the end of April, it was decided that the Strategic Communications Team will take over the work on the paper on setting vision in NSOs. Our group will have the opportunity to comment on the draft paper.

Other

The Organising Committee for the workshop on Culture Evolution will have its first formal call at the beginning of June, just after the deadline to submit abstracts (end of May).

Supporting Standards

Linking GSBPM and GSIM

IN PROGRESS

The task team is meeting regularly every three weeks. A template for the mapping has been agreed. The mapping is being done at two different levels of GSIM: a) a more conceptual level, corresponding to the specification level in GSIM; b) a less conceptual level, corresponding to the execution level in GSIM. The task team is concentrating on phase 5 of the GSBPM at both the design and implementation levels. The mapping exercise, including examples from different countries, is quite demanding. For the time being, the task team will be able to do the mapping for phases 5 and 4 and maybe one additional GSBPM phase. However, the task team will probably not be able to complete the mapping by November.

Core Ontology

IN PROGRESS

The development of the core ontology is progressing at a steady pace, via virtual meetings (six of them in the first semester) and offline exchanges. The construction of the model first focused on the integrated view between GSBPM and GAMSO, with discussions on the connection between the notions of activity and process. More recently, modelling of statistical organisations and products was undertaken. A first version of the ontology will be available for presentation at the HLG meeting in November and will afterwards be submitted for public review.

Alignment GSBPM and GAMSO

IN PROGRESS

The task team has produced a document specifying the work to be done, and agreement has been reached on it. At its next meeting in May, the task team will prepare and discuss first draft descriptions of overarching processes in GSBPM and their relationships to GAMSO corporate support activities.

Metadata Glossary

IN PROGRESS

The work of the task team is proceeding as planned.

Other

The Supporting Standards Group is highly involved in the preparation of the June ModernStats World Workshop.

Sharing Tools

Digitizing/editing CSPA document

IN PROGRESS

Adding Services to Catalogue

IN PROGRESS

Communication restated CSPA

NOT STARTED

Other

