Janusz Dygaszewicz – CSO Poland                                                                                            Warsaw, 08-06-2017

 

Proposal for improving the GSBPM model with respect to the spatial component of the statistical production process

 

 

At Statistics Poland, the GSBPM has been compared with the national process model. Implementing the relevant spatial data processes in practice and mapping them to the GSBPM showed that some areas are not included in the model, although they are necessary in the actual production process. Identifying these areas revealed gaps in the GSBPM, which essentially concern the spatialisation of statistical data: designing the data collection, geocoding, analysis, and providing spatial characteristics of statistical products.

 

Therefore, Statistics Poland suggests modifying the current GSBPM model by adding the following five new sub-processes, spread across four phases:

• Phase 2: subprocess "2.5a Design geocoding frame, sample & data collection";

• Phase 4: subprocesses "4.1a Geocode frame & sample" and "4.3a Geocode collection";

• Phase 6: subprocess "6.2a Prepare spatial analyzes & maps";

• Phase 7: subprocess "7.2a Manage spatial analyzes & maps using GIS".
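The numbering convention used in the list above (a letter suffix marking a "twin" of an existing sub-process) can be sketched as a small data structure. This is only an illustration: the dictionary copies the names from the list, while the helper names `sort_key` and `group_by_phase` are assumptions and not part of the GSBPM.

```python
import re
from collections import defaultdict

# Proposed "twin" sub-processes, copied from the list above.
PROPOSED = {
    "2.5a": "Design geocoding frame, sample & data collection",
    "4.1a": "Geocode frame & sample",
    "4.3a": "Geocode collection",
    "6.2a": "Prepare spatial analyzes & maps",
    "7.2a": "Manage spatial analyzes & maps using GIS",
}


def sort_key(code):
    """Split a code such as '2.5a' into (phase, sub-process, suffix).

    An empty suffix sorts before 'a', so a twin such as '2.5a'
    lands directly after the existing sub-process '2.5'.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)([a-z]?)", code)
    if match is None:
        raise ValueError(f"not a sub-process code: {code!r}")
    return (int(match.group(1)), int(match.group(2)), match.group(3))


def group_by_phase(codes):
    """Group sub-process codes by phase number, in model order."""
    phases = defaultdict(list)
    for code in sorted(codes, key=sort_key):
        phases[sort_key(code)[0]].append(code)
    return dict(phases)


if __name__ == "__main__":
    for phase, codes in group_by_phase(PROPOSED).items():
        for code in codes:
            print(f"Phase {phase}: {code} {PROPOSED[code]}")
```

Because the suffix is part of the sort key, the twins can be merged into any existing list of sub-process codes without renumbering the originals.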

 

Introducing the above sub-processes would enrich the model with a spatial component of the statistical production process. This would allow a better understanding of spatial data and of its role and place in the statistical production process, and would support the standardization of methodologies for merging statistical data with spatial data. This issue is still poorly understood by the statistical community and is at an early stage of development. For obvious reasons, spatial elements could not be fully taken into account when the model was constructed. Introducing the proposed changes would help strengthen the position of geospatial statistics in the statistical production process.

 

Finally, the updated version of the graphic GSBPM model could look as follows:

 

13 Comments

  1. Input made by Mexico (Manuel Cuellar-Rio):

     

    Hello All,

    Please find attached a first draft of some ideas to include spatial components into the GSBPM. As you will see, it shares some of the elements proposed by Janusz.

    Best regards,

    Manuel Cuellar-Rio

    Acting Deputy Director General of Integration of Information
    DG of Integration, Analysis and Research
    National Institute of Statistics and Geography (INEGI), Mexico

  2. Input made by ILO (Edgardo Greising):


    Dear Colleagues,

    Looking at both proposals from Manuel and Janusz, I would like to ask you some questions and provide some comments.

    First of all, I would like to better understand the criteria used to number the sub-processes/activities proposed, since in Janusz’s proposal they have been identified as “twin sub-processes” to already existing ones, and in Manuel’s some of them have been assigned a 3rd-level number which would not be part of the Generic model.
    I think that beyond the obvious similarity or correlation with existing sub-processes, and that some of these processes can be considered particular cases of the 2nd-level already existing sub-process (e.g. “2.1.1 Design process for geographical information production” might be included in “2.1 Design outputs”), these new activities require a different expertise normally provided by additional people joining the team or taking over this particular activity, so I would propose considering them separate 2nd-level sub-processes (at least for most of the cases).

    Looking at the proposed sub-processes for each phase, I find Manuel’s proposal for Design (2.1.1 and 2.4.1) a decomposition of Janusz’s 2.5a, where the “Design geocoding frame” activity is lost. I agree with splitting the sub-processes (Manuel’s approach) and would add a third one for “Design geocoding frame”. My hesitance is whether this is a sub-process of Phase 2 “Design” or should be called “Define geocoding frame” and be part of the “Specify Needs” phase, considering that normally this frame already exists.

    I agree with Manuel’s proposal for Build phase.

    For Collect, both proposals look largely aligned; we should review and agree on the respective descriptions.

    I would humbly suggest adding a sub-process in Process phase (that none of the proposals considered) for “Set up Geographical Linkage” (or something like that) consisting of the activities required to link the collected geo-referenced data to other information based on their geographical coordinates using GIS. Would you agree?

    In the Analyse phase, I would split Janusz’s 6.2a into Manuel’s 6.1.1 and another sub-process for “Perform spatial analysis”.

    Both proposals for Disseminate phase seem to be quite the same.

    Finally, I wonder whether we should add a sub-process in phase 8, “Conduct spatial-driven evaluation” (or something like that), to take advantage of GIS tools for the evaluation of surveys.

    Thank you.

    Best regards,
    Edgardo.

    Edgardo Greising
    Head of Unit
    Microdata and Knowledge Management Unit
    Department of Statistics (STATISTICS)
    International Labour Organization (ILO)


  3. Input made by Turkey (Nilgün Dorsan)

    Dear All,

    I would like to comment on the proposals to add GIS-based sub-processes to the GSBPM.

    GSBPM is a generic model and can be applied to any kind of product/output. We think that GIS is a way of collecting and disseminating data.

    For example, in the “2.4 Design frame and sample” process, we do not need to differentiate how we prepare our frame, because both ways serve the same process.

    In the same manner, “3.1 Build collection instrument” is a more generic sub-process. Do we really need to customize these processes? (proposed as 3.1 Build tools for geographical information production)

    It seems that “6.1 Prepare draft outputs” and the proposed “6.1.1 Prepare GIS outputs” describe the same process. The main point here is the preparation of any kind of output.

    When we mention “7.2 Produce dissemination products”, we can mean all kinds of dissemination products: a table, map, database, news bulletin, etc., as written in the explanatory notes of the GSBPM.

    The instruments used in the processes can vary. In a process approach we should focus on the outputs. I think that we do not need to mention GIS specifically in the GSBPM. If we would like to map the production processes for GIS-based products, we can first define the product name, then define the output, and then draw the process map. The main thing here is what kind of system this product is using.

    Best regards,

    Nilgün Dorsan

    Turkish Statistical Institute

    Head of Metadata and Standards Department

  4. Input made by Australia (Alistair Hamilton)

     

    Dear Nilgün

    What you're proposing below sounds similar to the perspective of my colleague Martin, who looks after geospatial solutions at the ABS.

    Martin is broadly in favour of the approach proposed in Annex 4 of the GEOSTAT 2 project.

    I agree with Martin that if the wording can be "tweaked" (eg made more inclusive), while keeping a sub-process generic, that may be preferable to subdividing the sub-process into finer grained structural components.

    Taking one of the examples you mention, my colleagues in the Dissemination area could break down 6.1 and 7.2 into many finer grained categories beyond 6.1.1 and 7.2.1 based on expected consumption (eg GIS vs machine to machine consumption of statistics on a non GIS basis, vs packaged data products for analysts, vs packaged articles at a more summary/explanatory level). For a particular Statistical Program several, to all, of the finer grained categories might be relevant.

    One risk might be that different NSOs categorise dissemination components differently, meaning they can easily map to GSBPM at the level of 6.1 and 7.2 but find it harder to map at the finer grained level.

    Another risk may be that by (eg) identifying 8 GSBPM subprocesses that often have particularly prominent geospatial aspects - and giving each of those a specific finer grained component - the fact that other GSBPM subprocesses can also have geospatial aspects (eg as described in Annex 4 above) becomes less prominent.

    Similarly to your last paragraph, if it was desired to describe in more detail - and more specifically - the geospatial aspects across all GSBPM phases this could be done as an "overlay" on the more generic "base GSBPM" (eg specific business processes and data and metadata flows) rather than adding specific categories into the "base GSBPM".

    This might be similar to the way the ABS describes how "families" of statistical programs (eg Administrative Data vs Business Surveys vs Household Surveys) typically work their way through "GSBPM" rather than adding detailed family specific components into "GSBPM" itself (eg index aggregation vs time series aggregation vs other forms of output estimation including for non standard geographies). (References to "GSBPM" are actually to the more detailed Statistical Production Activity Model, descended from GSBPM, used by the ABS.)

    Cheers

    Al

     

  5. Input made by Poland (Janusz Dygaszewicz)


    Dear Colleagues,

    Obviously, the production environment and the NSIs' roles in geospatial statistical data differ from country to country, which causes large differences in production processes. This is why I agree that, at the moment, the different opinions and approaches should be seen as a collection of relevant views rather than as one view that covers the whole of the geospatially related production processes. I expect that we will have an interesting discussion about it, and I am sure that we will finally find a common solution.

    At the CSO of Poland, the compliance of spatial data processing with the processes recommended by the generic model was assessed by comparative analysis: the business functions in each GSBPM phase and sub-process were compared in turn with the actual processes and business functions implemented in CSO statistical production, with particular emphasis on spatial data.
    After the last census, implementing the relevant spatial data processes in practice and mapping them to the GSBPM showed that important geospatial components are not included in the model, although they are necessary in the actual production process. Identifying these components revealed gaps in the GSBPM, which essentially concern the spatialisation of statistical data: designing the data collection, geocoding, analysis, and providing spatial characteristics of statistical products.


    To summarize, the comparative analysis of the spatially referenced statistical data production processes implemented in the CSO against the GSBPM model led to the following conclusions:
    1. The real statistical production process, implemented and tested in practice in the last census round, appears to be more comprehensive than the theoretical generic GSBPM model.
    2. Implementing the relevant spatial data processes in practice and mapping them to the GSBPM pointed out areas that are not included in the model, although they are necessary in the actual production process.
    3. Identifying these areas revealed the potential shortcomings of the GSBPM model.
    4. The gaps in the GSBPM model essentially concern the spatialisation of statistical data: designing the data collection, geocoding, analysis, and providing spatial characteristics of statistical products.
    5. The diagnosed shortcomings are important enough for the ongoing work on combining spatial data with statistical data that the GSBPM model should be modified accordingly in its geospatial aspects.

    Introducing new sub-processes would enrich the model with a spatial component of the statistical production process. This would allow a better understanding of spatial data and of its role and place in the statistical production process, and would support the standardization of methodologies for merging statistical data with spatial data. This issue is still poorly understood by the statistical community and is at an early stage of development. Of course, I agree with other colleagues that a wider description of the existing sub-processes would probably be enough to describe spatial components in the current GSBPM model. However, a real risk exists that spatial elements would then not be fully taken into account when constructing the model and preparing the business process of statistical production. Introducing the proposed changes would help strengthen the position of geospatial statistics in the statistical production process and would better communicate the importance of geospatial statistics to our community, which is also important for educational reasons.

    Best regards
    Janusz

    Director
    Programming and Coordination Department
    Central Statistical Office
    00-925 Warsaw

  6. Input made by Italy (Marina Signore)


    Dear all

    I think that we should try to adapt the GSBPM to the new ways (and challenges) of producing statistics that are becoming more and more in use in our offices while keeping it as a generic model. This means that we should work on the level of "names" of phases and sub-processes, and on the descriptions without adding new sub-processes as much as possible. I have the impression that this would be feasible for almost all issues concerning geospatial data but also for big data and data integration.

    Best regards
    Marina

  7. Here are the comments from the Eurostat colleagues managing geospatial information:

    • Integrate the management and use of geospatial information into various processes and sub-processes of the GSBPM, and support related proposals from countries. The ESSnet GEOSTAT 2 has prepared detailed instructions (see http://www.efgs.info/wp-content/uploads/2017/03/GEOSTAT2ReportAnnex-4.RevisionGSBPM.pdf and other support material, e.g. illustrations) on how to extend the GSBPM model in this direction. This approach will be presented by several members of the GEOSTAT consortium led by Finland.
    • The necessary enhancement would mainly work by revising the descriptions of the processes and sub-processes concerned. Eurostat therefore mainly supports the proposal made by Finland. However, for visibility reasons and to underline the special contribution of geospatial information to statistics, it would be extremely useful if the term 'geospatial' featured in at least a few places in the most used diagram of the GSBPM, which shows the processes and sub-processes in boxes. Poland is suggesting adding geospatial sub-processes for that purpose. As an alternative, titles of existing processes and sub-processes could be modified (e.g. 2.5 'Design processing, data integration, geocoding and analysis', 4.1 'Create and geocode frame & select sample', etc.) to highlight the prime importance and special role of geospatial information. Another option is to work with footnotes in the diagram (e.g. 1.5 'Check data availability*', with the footnote '*including geospatial and alternative data').
  8. Dear Colleagues,

    In our opinion, it is vital to combine geospatial information and its processes with the GSBPM model in one way or another. However, we do not support the idea of including several new sub-sub-phases in the model. The results of GEOSTAT 2 and the tests done in Finland should be taken into account when adding the geospatial approach to the GSBPM.

    We suggest that this issue be further elaborated at the Workshop on Integrating Geospatial and Statistical Standards in November 2017 in Stockholm. The solution could be to work with the descriptions so that they are either widened to include the geospatial approach or, in some cases, name the geospatial issues specifically.

    Essi Kaukonen, Standards and Methods
    Rina Tammisto, ICT Management
    Statistics Finland

  9. (Feedback from INEGI; 29 September, 2017)

    • INEGI has been working on incorporating references to geographic information in the statistical production process into the GSBPM. This work could be replicated in the standard, given the interest of the statistical office community in incorporating this kind of information
    • Taking the proposal made by Edgardo Greising as a basis, we have some comments:
      • Include the geocoding frame activity in the "Specify Needs" phase as "Define geocoding frame", because it is fundamental to have a definition of the territorial coverage for the statistical project
      • It is convenient to include "Define geocoding frame" in the “Design” phase to set the geostatistical set of codes which will serve as a basis for geo-referencing statistical information
      • The “Design of geostatistical frame” subprocess in the “Design” phase will help to set the basis for further geostatistical analysis
      • In the “Collect” phase, the suggested “Create geostatistical frame sample” subprocess is a way to ensure representative territorial coverage
      • “Geospatial linking” in the “Process” phase is essential to prepare maps and geospatial analysis of the statistical information, and to develop services based on georeferenced statistical information
      • “Perform spatial analysis” in the “Analysis” phase is needed to reap the benefits of having georeferenced statistical information
      • “Perform spatial evaluation” in the “Evaluation” phase will serve to review the geospatial frame, to make the corrections and updates found to be needed, and to capture other knowledge, such as zones which are difficult to reach
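The “Geospatial linking” activity mentioned in the list above can be illustrated with a minimal sketch: each collected record carries coordinates and is assigned the code of the area that contains it. Everything here is an assumption for illustration only (the function names, the record layout with `x` and `y` keys, and the use of a plain ray-casting test); a production system would normally rely on GIS software and would handle projections, boundary cases and large volumes.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices in order. Points exactly
    on a boundary are not treated specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def link_records(records, areas):
    """Attach an area code to each record based on its coordinates.

    `records` is a list of dicts with 'x' and 'y' keys (hypothetical layout);
    `areas` maps an area code to its polygon. Records that fall in no
    area get the code None.
    """
    linked = []
    for record in records:
        code = None
        for area_code, polygon in areas.items():
            if point_in_polygon(record["x"], record["y"], polygon):
                code = area_code
                break
        linked.append({**record, "area": code})
    return linked


if __name__ == "__main__":
    areas = {
        "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
        "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
    }
    records = [{"id": 1, "x": 1, "y": 1}, {"id": 2, "x": 3, "y": 1}]
    for row in link_records(records, areas):
        print(row)
```

Once every record carries an area code, statistical tables can be aggregated by area and joined to other information that uses the same geography.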
  10. (Feedback from Statistics Sweden; 2 October 2017)

    The work within GEOSTAT 2 (http://www.efgs.info/geostat/geostat2/) regarding use of the GSBPM for geospatial data should be incorporated into GSBPM. It should not be seen as a separate implementation of GSBPM but as a part of GSBPM in the same way that use of administrative registers for the production of statistics is not a process model in itself but a part of GSBPM. The suggested additions specified in annex 4 of the report mentioned above should be considered as additions.

  11. (Feedback from Australian Bureau of Statistics (ABS); 2 October, 2017)


    Better reflecting geospatially oriented techniques/methods in statistical production as characterised by GSBPM

    The ABS strongly supports the idea that the GSBPM should more explicitly describe the (increasing) role of geospatial data and metadata, and geospatially oriented techniques/methods, in the production of official statistics.
    ABS supports the recommendations of GEOSTAT 2 in this regard. 
    In other words, rather than adding a small number of sub processes (or "sub-sub processes") that are explicitly geospatial in nature, we would recommend 

    • updating existing sub process descriptions more broadly (in the "vanilla" version of GSBPM), and
    • adding richer "applied use" information for key sub-processes, eg as an annex or separate artefact, that provides additional information for readers who are particularly interested in geospatial topics related to statistical production          

    The ABS provided some comments on the specifics of GEOSTAT 2 Annex 4 but our comments missed the GEOSTAT 2 cut off for comments on the draft.
    Here are those comments for information: (See attached file: Annex_GSBPM_remarks_13012017_ABS_comments.xlsx)

    The ABS would, however, be very comfortable with Annex 4 as it currently exists being used as the starting point for updates. 
    The main point is that we favour wider scope, but more general, updates to GSBPM in this regard rather than focusing "geospatial considerations" into a small number of additional "specific and separate" sub-processes / sub-sub processes.
    We believe the GEOSTAT 2 proposal is both 

    • more in keeping with the generic nature of GSBPM (for agencies that are either very active with geospatially oriented techniques/methods or more "traditional"), and
    • more reflective of the wide contribution modern geospatially oriented techniques/methods can make to statistical production- and are making in some agencies.      
  12. (Feedback from Statistics Estonia; 5 October, 2017)

    Geographical data can be considered a type of data used in GSBPM processes. GIS specialists have analysed the field in more detail; please see:
    http://www.efgs.info/wp-content/uploads/2017/03/GEOSTAT2ReportAnnex-2.NationalExcercisesGSBPM.pdf

  13. user-8e470

    Of the countries/organisations that commented, there are:

    • two who make a proposal to add new subprocesses
    • one who favours a mix of new sub processes and better descriptions
    • seven who favour improvements to descriptions only (many referring to the proposal by GEOSTAT2)

    Based on this, the additional text suggested by GEOSTAT has been implemented in the current draft, subject to further discussion.

    Meeting 15/5: Should not include examples of data types (geospatial, administrative) if we can avoid it. These will change over time and including them could date the model. Also review text from Juan to see how to modify the GEOSTAT2 proposal based on other revision decisions. Change "areal" to "geographic".