Child pages
  • Issue 5: Representing GSBPM in different ways




The DDI community have undertaken some work to create "a generic model that can serve as the basis for informing discussions across organizations conducting longitudinal data collections, and other data collections repeated across time." They call this model the Generic Longitudinal Business Process Model (GLBPM). If you are interested, you can read the paper here:


The main diagram for the GLBPM looks very similar to the GSBPM diagram. However, they have some other views as well, like the one shown below.

Should we consider adding something like this diagram to the GSBPM documentation?




See this cool visualisation from ONS: GSBPM Examples.pptx 




  1. Comments from Jay Greenfield on the 'circle view' and GSBPM: 

    The circle view is synonymous with the helical view. It too imagines cycles. In the GLBPM these cycles model a longitudinal study. In the GSBPM these cycles model taking repeated measures over time.
    The core of these views is the archive into which packages of metadata and data are folded. The organization of this type of archive is probably a manifold. In DDI there are comparison objects that use metadata to measure the distances between data in the manifold.
    This view of an archive qualifies it under the rules of Jules Berman as "big data". See
    There are a couple of big data principles at work in the helix/circle. The big data principle of immutability is a "big way" of talking about SIPs, AIPs and DIPs. The other big data principle at work in the circle is that "metadata descriptors must be organized under a classification or ontology that will permit heterogeneous data to be shared and merged", which in turn "drives down complexity". This classification or ontology consists of the DDI study objects that appear in the circle/helix, including the comparison objects DDI uses to measure distances between data things in the manifold.
  2. From the StatCan Data Quality Secretariat:

    I have used the GSBPM in several different contexts, mostly related to managing quality.  I have found the model to be very useful and easily understood.  That said, there is always room for improvement.  Below is my humble feedback.
    • The model looks linear, or at most two-dimensional. The steps across the top are ordered, as are the sub-processes in the columns. I have observed that some people take this representation too literally and conclude that their processes should also be linear, and worse, that their work units should be assigned tasks that appear in a single column. The supporting documentation clearly explains that the model is not intended to be interpreted this way, and that in fact many sub-processes can occur simultaneously and work units can cut across several processes; however, the visual model does not convey this. I’m not sure what to suggest, but to me this was an interesting observation.
  3. Istat comments and suggestions are provided below.

    In general, the GSBPM graphical representation is static although it illustrates an iterative process. It could be useful to add some changes that convey this idea of movement, for example through graphical symbols such as arrows.

  4. I think it is easier to understand the non-linearity of sub-processes in the GSBPM when you understand how the model relates to GSIM. When you have a clear sense of what statistical objects a sub-process outputs, then you can see the range of sub-processes that could potentially use these outputs, and the visual linearity of GSBPM just gets blown away.

    For example, take sub-process 4.4 Finalize collection. The main output of this process is a data set. This data set could be an input into pretty much any sub-process in phases 5 and 6, into sub-processes 7.1 and 7.2, sub-processes 8.3 and 8.4, and sub-process 1.5.

    Maybe this approach to the way the two models interact can be visualized to show the non-linearity of GSBPM?

  5. I think we have to send several messages:

    • the process is not linear and the actual process can skip certain phases: I like the presentation with the arrows for that.
    • different parts have different rhythms; steps 4 to 7 can be repeated several times without redesign or build activities in between. This can be indicated with different circles (cylinders).
    • the multi-source case, where the different sources can partly have their own collect and process phases, and the processes are combined after data linking. My feeling is that this can be presented as layers.
  6. I agree on having different kinds of diagrams to represent GSBPM and give the user a better idea of a non-linear (non-sequential) model, but I think that we need to discuss the kinds of diagrams that we're going to use to spread the message.
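
As a possible starting point for the visualisation suggested in comment 4, the 4.4 example given there can be written down as a small input/output map. This is only a sketch: the dictionary restates the consumers named in that comment, the "5.*" shorthand for "any sub-process in phase 5" is an assumption made here, and the function names are hypothetical rather than part of GSBPM or GSIM.

```python
# Hypothetical sketch: the mapping restates the 4.4 example from comment 4.
# It is not an official GSBPM or GSIM artefact.

# Producer sub-process -> (output object, consumers of that output).
# "5.*" is an assumed shorthand for "any sub-process in phase 5".
OUTPUT_FLOWS = {
    "4.4": ("data set", ["5.*", "6.*", "7.1", "7.2", "8.3", "8.4", "1.5"]),
}


def phase_of(identifier: str) -> int:
    """Return the phase number of an identifier such as '7.1' or '5.*'."""
    return int(identifier.split(".")[0])


def consumer_phases(producer: str) -> list[int]:
    """Sorted, de-duplicated phases that can use the producer's output."""
    _, consumers = OUTPUT_FLOWS[producer]
    return sorted({phase_of(c) for c in consumers})


def non_adjacent_consumers(producer: str) -> list[str]:
    """Consumers outside the phase that directly follows the producer's phase."""
    next_phase = phase_of(producer) + 1
    _, consumers = OUTPUT_FLOWS[producer]
    return [c for c in consumers if phase_of(c) != next_phase]


if __name__ == "__main__":
    print(consumer_phases("4.4"))
    print(non_adjacent_consumers("4.4"))
```

Listing the consumers that do not sit in the next phase makes the non-linearity explicit: a strictly linear reading would expect the output of 4.4 to flow only into phase 5, whereas the map also points to later phases and back to phase 1. The same structure could feed a diagram with one arrow per producer/consumer pair.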
