Issue 15: Make GSIM a "simple, easy to understand view" of statistical information objects - Generic Statistical Information Model

Jenny Linnerud

Possible ways to divide up the GSIM elephant, suggested at METIS 2013, were a) by group (B,P,S,C), b) by GSBPM Phase and c) by audience (top management, metadata specialists, IT, stat. methodologists etc)

Combinations would also be possible: for example, get business people to verify that B includes their top 20, get IT to verify that P includes their top 20, and get metadata specialists to verify that C includes their top 20. Who would verify S: archivists? Perhaps we could conduct a Google poll (like, dislike, neutral, give a comment) on an information-object-by-audience basis?

21 May, 2013

user-8e470

At UNECE we tried to create a first version of what this picture might look like. The picture is far from perfect (especially as we had to take some liberties with the relationships). However, it puts something on paper for us to find out if this is what we want.

I have attached it as a picture and as an editable PowerPoint file.

higher level picture of GSIM.pptx

02 Jul, 2013

user-af201

At the IMF, we have also had a go (see diagrams below and GSIM Level 2 description v0.13).

We started by analyzing GSBPM processes in the “work” section (processes 4 through 7), to see what objects these processes consumed and created (see GSIM GSBPM interface diagrams). By this path we arrived at 11 objects which fully describe the “work” objects part of the GSIM model, at least when considered as each having three layers (standard, program and data). We then reorganized the layout of the 11 objects, which fell into a 3 x 4 grid that serendipitously grouped the objects according to 7 important and useful characteristics (real world, interface, structure and meaning on one dimension; relevance, identification and domain on the other).

We then brought in three process and two knowledge objects, which we feel complete the model. Next, we abstracted these three groups of objects (work objects, process objects and knowledge objects) into a Level 1 diagram.

We validated this model by looking at what we call the “change” processes in GSBPM (processes 9 and 1 through 3) to see if there were any inputs or outputs that were inconsistent with our proposed model (also in GSIM GSBPM interface diagrams). We haven’t had a chance to explicitly validate process 8 (Archive) against the model, but are confident that it will fit comfortably. We also looked informally to see if any of the objects in GSIM v1.0 did not appear to fit into the model as lower level objects of the Level 2 objects we identified. They all seem to fit ok.

We have included all our working outputs in both PDF format (for viewing/printing) and in their original editable format (Word for the document and Visio for the diagrams), in case you want to play with any of the diagrams or see the evolution of the model as we worked on it.

PDFs

GSIM Level 2 description v0.13

Level 2 Diagrams up to v0.13

Level 1 Diagrams up to v0.13

Layers Diagrams up to v0.13

GSIM GSBPM interface diagrams

Word file

GSIM Level 2 description v0.13

Visio files (.vsd)

31 May, 2013

Justin Lynch

Before we delve too much further into the design of a new middle level diagram for GSIM, we think it is important to first fully understand what it is that we are trying to achieve with this middle level diagram. Is it meant as a light look into GSIM for senior staff within an organisation who do not need to look at or understand the lower level information models or the detailed UML (i.e. people who are not information modellers)?

The proposed new middle level diagram (below) for GSIM would seem to go into more detail than is required for communicating GSIM to senior staff, and will add to the difficulties in grasping the concept of GSIM rather than aiding understanding. At the other extreme, those people within an organisation who will be tasked with implementing GSIM will find this model lacking in information, and not a useful artefact at all. In trying to achieve a middle level diagram that meets all audiences’ requirements, we are ending up with a diagram that meets no-one's requirements.

On the other end of the scale, if we are saying that the GSIM model at the UML level is too complicated for even experienced information modellers to understand, then we need to address this and put our efforts into making GSIM simpler, whilst ensuring that it still fully describes the statistical information domain. Creating middle level views to simplify an overly complex model is not the right approach.

The middle level view for GSIM should provide a general overview of the breadth of the model, not the depth. The current "middle level" views of GSIM (below) do seem much better at attempting to communicate this point. They are easily printed and read on a single page, and are concise enough as to not daunt people who are not familiar with information models. They are not meant to be "implemental" views of GSIM, rather they are for communication and education only.
The proposed new middle level diagram is a much larger diagram, with a lot more relationships exposed, making it a much more difficult diagram to traverse and digest for non-information modellers.

The underlying problem is that GSIM as an Information Model is not layered. There are Abstract Objects within GSIM which do drill down to lower level objects, however this is not at all akin to the GSBPM Phase / Sub Phase layering, and attempting to structure GSIM in the same way is just not practical. We need to move away from the idea of representing GSIM as a layered model entirely. Thinking about GSIM as a layered model, and/or attempting to represent it as a layered model is an incorrect approach and should be abandoned.

An issue with drawing sufficient meaning and understanding from middle level diagrams is that they are static, and non-interactive. Readers cannot readily transition from these diagrams down to further detail around a GSIM group or object, without having to leaf through large volumes of paper in the specification layer, or traverse the UML.

One option would be to provide a list of GSIM objects categorised into their GSIM groups, which could be drilled down into, giving the reader a view of the GSIM model centred around the selected object. In this way we would have a specific view of GSIM for every Object, some more detailed than others as required, providing a greater educational experience for the reader. It is rather like having the entire GSIM model printed on a large floor mat and passing a magnifying glass over the specific area of interest to understand what is happening in that isolated region, rather than attempting to take in the model as a whole.
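
To make the drill-down idea concrete, here is a minimal sketch in Python. It is an illustration only, not part of GSIM: the object names are ones mentioned in this discussion, but the group assignments (B, P, S, C) and the relationships between objects are placeholders invented for the example, as are the function names `objects_by_group` and `view_centred_on`.

```python
from collections import defaultdict, deque

# Hypothetical catalogue: object name -> group letter (B, P, S or C).
# These group assignments are placeholders for illustration only.
CATALOGUE = {
    "Instrument": "B",
    "Question": "C",
    "Classification": "C",
    "Dissemination Service": "P",
}

# Hypothetical relationships between objects, treated as undirected links.
# They are invented for the example and are not taken from the GSIM UML.
RELATIONSHIPS = [
    ("Instrument", "Question"),
    ("Question", "Classification"),
    ("Classification", "Dissemination Service"),
]


def objects_by_group(catalogue):
    """Return the list of objects categorised by group (the entry page)."""
    grouped = defaultdict(list)
    for name, group in catalogue.items():
        grouped[group].append(name)
    return {group: sorted(names) for group, names in sorted(grouped.items())}


def view_centred_on(selected, relationships, radius=1):
    """Return the objects within `radius` links of `selected`.

    The result maps each object in the view to its distance from the
    selected object, which is the "magnifying glass" over one region.
    """
    neighbours = defaultdict(set)
    for left, right in relationships:
        neighbours[left].add(right)
        neighbours[right].add(left)

    distance = {selected: 0}
    queue = deque([selected])
    while queue:
        current = queue.popleft()
        if distance[current] == radius:
            continue  # do not expand beyond the edge of the view
        for other in neighbours[current]:
            if other not in distance:
                distance[other] = distance[current] + 1
                queue.append(other)
    return distance


if __name__ == "__main__":
    print(objects_by_group(CATALOGUE))
    print(view_centred_on("Question", RELATIONSHIPS))
```

A larger `radius` widens the view around the selected object, so a single catalogue could serve both a quick overview and a more detailed look at one region of the model.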

This option does involve a lot more work: creating a view for each object (though some objects may have the same view?) that isn't just the base UML, but a relational diagram such as those shown in the Specification layer. However, there would be great benefits from this work. Given the resourcing required, this may not be an achievable option, but we would be keen to have others' opinions on this idea.

Other options for educating people in GSIM could involve other forms of media, such as educational videos, where the basics of the GSIM model could be discussed by experts in information modelling, in a language that reflects the intended audience. This would seem a great way of imparting essential knowledge to the audience directly, rather than leaving them to abstract meaning from a textual document alone. It is a very similar idea to the instructional videos supplied on the SDMX websites.

In summary, the main points we need to agree on are:
      1. GSIM is not a layered model, and all attempts to represent GSIM as layered should be ceased.
      2. The current middle level views meet their intended purpose of showing the breadth of GSIM, not its depth.
      3. We need to clearly identify the audiences that we are trying to communicate GSIM to, and what level of detail is appropriate for each audience; and then
      4. Identify and pursue other forms of GSIM knowledge sharing, such as interactive representations of GSIM, and instructional videos to meet the requirements for each audience.

IMF Diagram

With regards to the diagrams put forward by the IMF, we would see this as more a local implementation of GSIM, designed to meet specific business requirements of that organisation. It would not be considered a generic model that another organisation could pick up and implement. The diagram seems to blend together the GSBPM and GSIM, along with some new terms to meet IMF requirements.

The ABS is also looking at creating our own "views" of the GSIM model, specifically targeted to meet the needs of different business areas. We would not see these "views" as being fit for purpose for inclusion in the GSIM Communication documentation because they are scenario specific.

These diagrams would, however, be fantastic additions to the GSIM User Guide, as examples of how GSIM can be tailored to meet individual organisational requirements, giving others an idea of how they too could practically use GSIM within their own organisations.

02 Jul, 2013

user-af201

I’d like to thank Justin for contributing such a thorough post on this topic. I fully agree that “… it is important to first fully understand what it is that we are trying to achieve with this middle level diagram.”

I attempted to set this out in the original post, the attached METIS paper, and subsequent discussion in GIG meetings, and thought we had developed a shared understanding of the audiences and objectives of the simple and easy to understand view of statistical information objects (I am assuming this is what Justin means by “middle level view”), but it seems further clarification is needed, so I will have a go at doing this.

Audience: Statisticians, methodologists and IT specialists. (I couldn’t find anywhere in high-level GSIM documentation that modeling experts are a target audience for GSIM, even though there seems to be an implicit assumption by some that they are the primary audience. This may be a source of differing views on approaches to the model.)

Purpose: To bring together statisticians, methodologists and IT specialists to modernize and streamline the production of official statistics, by providing, in this order:

The basis of a shared language to help these groups communicate with each other;
A framework for these groups to build tighter communities through educating each other and collaborating with each other;
The basis for work by these groups validating existing practices and innovating new practices for producing official statistics; and, ultimately
The desired results of reduced cost and improved quality from automating statistical processes.

Justin makes three other points that “we need to agree on”. It’s not clear to me if it means that we need to agree to the statements Justin made, or if it means we need to agree on our approach to these issues. I will assume the latter (otherwise there is no point providing any comments!)

1. GSIM is not a layered model, and all attempts to represent GSIM as layered should be ceased.

My view is that we are too early in the life of GSIM to make such absolute statements. GSIM v1 is not perfect. It needs improvement. Attempts to identify layers may yield important insights into better approaches to modeling GSIM. Or they may not. In any case, I’m not aware of anyone proposing a GSBPM-style hierarchical layering for GSIM. I’m open to the possibility that there may be some more nuanced layering concepts at play in GSIM, though.

2. The current middle level views meet their intended purpose of showing the breadth of GSIM, not its depth.

At METIS, and the informal workshop on CSPA that followed it, a number of people articulated that none of the existing mid-level diagrams substantially meet the audience needs articulated above. There seemed to be fairly broad agreement that the four object groups are not particularly helpful, and it would be useful to look for more useful presentations of the structure at the high and middle levels.

4. Identify and pursue other forms of GSIM knowledge sharing, such as interactive representations of GSIM, and instructional videos to meet the requirements for each audience

I agree that activities like these are critical to communicating, educating and collaborating using GSIM. However, it seems to me that there is no point undertaking these activities until we are satisfied that the content of what we are communicating is appropriate. I think this means we need to sort out items 1 and 2 first.

Finally, Justin comments on the IMF diagram, stating “we would see this as more a local implementation of GSIM, designed to meet specific business requirements of that organisation.”

The IMF ideas were put forward as a contribution to thinking about the generic model, particularly aiming to meet the GSIM objectives in the form I have set out above, and that we set out in our METIS paper. The process we used to arrive at the model was transparent, and based on GSIM principles. There is nothing specific to the IMF about the model. That’s not to say the ideas are necessarily more or less useful, but I think it would be helpful to have a debate about the ideas.

I think some of the difference in perspective may revolve around what we each mean by generic model and implementation model. I’d like to understand what Justin means when he says “It would not be considered a generic model that another organisation could pick up and implement.” My understanding is that you don’t implement a generic model. You use it as a basis for collaboration, innovation etc, and as the basis for an implementation model. Perhaps what we need to discuss is whether the IMF ideas for GSIM could be the basis for an implementation model. Or maybe implementation models are more related to the detailed level of GSIM in any case.

Justin also says “The diagram seems to blend together the GSBPM and GSIM, along with some new terms to meet IMF requirements.” It seems to me that GSIM v1 already blends with aspects of GSBPM. We attempted to make the pivot points between GSIM and GSBPM more explicit. I think this is an interesting area for discussion – it’s certainly not clear in my mind what the best approach to this is.

I look forward to continuing the discussion in a few hours!

02 Jul, 2013

user-8e470

A small note on audiences:

On the GSIM public page we have this:

Audience	Suggested documents
Top & Senior managers	GSIM Brochures
Middle Managers, Subject Matter Statisticians and Methodologists	GSIM Brochure; GSIM Communication document; GSIM User Guide
Architects, business analysts and metadata specialists	GSIM Communication document; GSIM User Guide; GSIM Specification
Solution Architects	GSIM Specification; GSIM User Guide; Enterprise Architect file

On the front page of the Specification document we have:

About this document: This is aimed at metadata specialists, information architects and solutions architects.

02 Jul, 2013

user-8e470

At the CSPA Sprint we used the following posters to help with the GSIM implementation work. They are essentially one poster per GSIM group, with all the objects in that group represented on the page. I can't comment on how useful they were compared to other options but perhaps the sprint participants could.

4 IN ONE.pptx

02 Jul, 2013

Alistair Hamilton

If the target (which is totally necessary in order to achieve modernisation of statistical production) is

Purpose: To bring together statisticians, methodologists and IT specialists to modernize and streamline the production of official statistics, by providing, in this order:

The basis of a shared language to help these groups communicate with each other;
A framework for these groups to build tighter communities through educating each other and collaborating with each other;
The basis for work by these groups validating existing practices and innovating new practices for producing official statistics; and, ultimately
The desired results of reduced cost and improved quality from automating statistical processes.

then I'd be inclined to look first to the CSPA work because that is a package which spans Processes and Services (Technology), and should span Methodology, in addition to Information. I think we'll have a very difficult time seeking to achieve this purpose starting with a diagram in GSIM. I'd prefer to see a flow from CSPA into a view of GSIM which shows how it supports modernisation.

A foundational parameter for GSIM was that it should support the information needs of agencies who choose at this time to persist with a more "traditional" approach to statistical production, as well as enabling agencies to more efficiently communicate, plan and collaborate in support of modernisation. This implies to me that the core design of GSIM - e.g. features such as layering - shouldn't be premised on modernisation. At the same time, however, there does need to be an agreed way of harnessing GSIM (and other cornerstones such as GSBPM) to support modernisation.

02 Jul, 2013

Steven Vale

An interesting discussion!

Regarding the point on layers of GSIM: I started out (first sprint) assuming a layered approach, mainly because that worked for GSBPM. However, around the time of the second sprint it started to become increasingly clear that this was not the best approach, and that GSIM itself should be fairly flat. I say "fairly" because there are some instances of objects being specific cases of a more generic object, though the vast majority of objects can be seen as being at the same level.

However, even if the model is flat, the documentation doesn't need to be, and that is an important point. The reading guide table that Thérèse copied in the post above shows clearly the layered approach to explaining GSIM to different types of audiences. All documents combine words and pictures, as different people respond better to different ways of presenting information. I am not saying that these words and pictures are perfect - there is always room for improvement!

In my experience (working with the CES group on Statistical Communication), the best approach is to identify target audiences and then decide what are the key messages for each group. The GSIM Communication task team, a mixture of GSIM experts and communication experts, adopted a similar approach last year, with the resulting higher level documents being more influenced by the communications people and the more detailed documents more influenced by the GSIM people.

I still have some doubts about the utility of the 4 groups from a modelling perspective, but I think they are useful from a communication perspective. I often use them as a tool for explaining and giving examples of different types of information object when communicating to generalist audiences.

I agree that middle-layer documentation such as the GSIM communication document should be more focused on breadth than depth, explaining what GSIM covers and can be used for. It should also contain pointers to lower level documentation with the detail on how to use GSIM, for the rather limited, but nevertheless important group that want that sort of thing.

I also agree that we need to look at using more interactive communication tools to help get the message across, particularly to non-specialists. Development of videos, webinars, etc. for GSIM (and GSBPM) is already on the "to do" list. Problem is, it is never high enough, and we never seem to have the resources to actually do it! However, Stats Netherlands has recently shown an interest in this sort of work, so perhaps we can make some progress this year.

Finally, I agree that there are many different ways of showing the same thing in a picture, and that the GSIM User Guide is a good place for this. However, presenting a raft of different views could be rather confusing for users, so a way of organising them is needed, and by agency seems a reasonable starting point - a sort of "this is what we think GSIM means for our agency" approach.

02 Jul, 2013

user-8e470

2/7/13 meeting: Issue to wait for further steps to be taken by CSPA project

02 Jul, 2013

user-af201

I know everyone agreed to this action at the meeting, but it's not entirely clear to me what we are waiting for the CSPA project to do and when we would be expecting them to do it by, and how they will know that we are waiting. Would someone be able to clear this up for me (I was suffering from a cold and it was very early in the morning in the timezone I was in, so I wasn't at my sharpest in the meeting)?

07 Jul, 2013

user-f9ef6

I missed the previous meeting when this issue was discussed. I'd also find a clarification of what we are waiting for from CSPA guys and how CSPA relates to this discussion very helpful.

I don't want to comment in detail on the above postings now as it was decided we defer this discussion, but overall I don't understand why we would now postpone work that was considered highly relevant by a considerably large group of people/organizations at the METIS meeting (e.g. the "metadata flows" group), including NSOs. It didn't seem to me that this was merely an IMF implementation issue.

15 Jul, 2013

user-8e470

This discussion started with “let’s make views of GSIM that will help communicate, educate and collaborate”. I think I should emphasise that this is not a new discussion, but one that has been worked on since before version 1.0 was released. The views expressed at METIS, while important, should not be considered the only views on this topic. Let me try to summarise, provide more context and propose a way forward.

Middle Level View: Not the best investment?

The IMF Proposal

I would classify myself as one of the less technical people in the group and to be very honest, the IMF proposal confused me. I did not understand why the rows and columns were useful and the introduction of terms not currently in GSIM did not help me understand GSIM better.

However, this view makes sense to, and helps, the IMF, so I tend to agree with Justin that these types of views should be put in the user guide (although the language used needs to align to what is in GSIM).

Why is it so hard to find a view that works?

I think the problem that we have come up against is that there is no “one view to rule them all”.

The problem is that there is not one audience for GSIM. GSIM is a big model - it has a lot of information objects. Different people are interested in different subsets of the objects, and they are also interested in different levels of detail.

For example, people working in Collection (GSBPM Phase 4) are going to be very interested in objects like question, instrument etc. and less interested in objects like Dissemination Service. Equally, there will be some people, like the Neuchatel Working Group, who in that role are particularly interested in the Classification related objects.

We could break it up by GSBPM phases or by audience (senior manager, IT people etc) or in any number of different ways. There is no consistency between organizations as to which subsets or which audiences are important. Speaking as the person who has had to draw many of those pictures, I think we could draw endless pictures and still not meet everyone’s needs.

The four high level Groups

This reasoning is also why when we have discussions about changing the 4 high level groups, there is no one answer. This discussion was not new at the METIS meeting. Previous discussions have suggested having high level groups based on GSBPM, others have suggested having 6 groups. The issue is that for every proposal there is an equal number of people who do not like it.

As Steve says “[there are] some doubts about the utility of the 4 groups from a modelling perspective, but …they are useful from a communication perspective [you can] use them as a tool for explaining and giving examples of different types of information object, when communicating to generalist audiences.” I have to agree that this has been my experience as well. There is no one proposal for the high level groups that meets everyone’s needs. What we have now is an agreed compromise.

A way into using GSIM: a proposal

I agree with the statement that GSIM is not a layered model. As Steve says, it is a flat model (although there are some objects which have subtypes).
What was mentioned a number of times at METIS was that we need to give people “an easy way into GSIM”, i.e. give them a way to start understanding and using GSIM.

I am frequently asked by organizations to explain GSIM and how they should use it. The advice that has proved most effective is this:

1) Have a look at the poster section of the GSIM website – in particular this picture, which will help you understand at a broad level what is in GSIM.

2) Decide what part of the model is most relevant to you.

3) Having decided what your focus is, have a look at the toolkit page on the website. This will give you an idea of the more detailed objects you might be interested in.

This has allowed people to get an easy start into looking at GSIM.

Advantages of this approach:

People are not starting by trawling through hundreds of pages of documentation (i.e. potential users do not have to “buy-in” at the most detailed level of the model).
At the other end of the spectrum, people are not using just the subset of information objects represented in the puzzle piece drawing. (This is what the Metadata Flows Group did.)
People are only exposed to the information that is relevant to them and it is a gentle way down into the level of detail that they need.

Recommendations for improving this way into using GSIM (these come from real users)

It would be very useful to have hyperlinks to the GSIM specification sections from the structures in the PowerPoint; it would make it easier to get more details.
An example mapping to the diagrams themselves as an annex to the user guide.
Some kind of GSIM registry where organisations can upload their mapping and see the equivalent in other systems? That could help with gap analysis and how to align to other systems including SDMX and DDI.
A list of GSIM objects categorised into their GSIM groups, which could be drilled down into, giving the reader a view of the GSIM model centred around the selected object.

Why the architecture project might help

CSPA is also a HLG project. The links between it and this project are well documented and, apart from having the same project manager, there are a number of people who are involved in various aspects of both projects. The discussion here has moved between the need to link GSIM to GSBPM at a high level, and the need to implement GSIM. See the quote from CSPA v0.2 for how it creates a link between these.

“Information Architecture connects information assets to the business processes that need them and the IT systems that use and manage them. It includes relating the coherent and consistent definition of information assets at an enterprise level to the information needs of specific business processes and IT systems in practice. Forrester characterises this as Information Architecture connecting definition of information on “macro” (enterprise level) and “micro” (practical use for specific business and IT purposes) levels.”

The CSPA uses GSIM at the reference model level and as an implementation model. As Al says it is an agreed way for harnessing GSIM to support modernisation. It was thought that by starting from CSPA, we might have an agreed starting place to look at.
The catalogues that will form part of CSPA will start to provide one of the recommended actions for improving the usability of GSIM (some kind of GSIM registry where organisations can upload their mapping and see the equivalent in other systems).

My recommendation:

We should not be creating another mid level picture.
We should wait to see what comes out of CSPA.
If we want to continue progressing work while waiting for CSPA, we should look at some of the other recommended actions from users (hyperlinks, example mappings etc.).

16 Jul, 2013

user-af201

Thanks Thérèse for your detailed post. I have re-read the CSPA project scope document and the CSPA v0.1 document, and, this is possibly my lack of understanding, but I still don't see how CSPA work will help us address the concerns coming out of the Metadata Flows work and our experiences at the IMF. You quote from a CSPA v0.2 document - is this available to look at? In terms of the mid-level diagram, I still think this is an area that needs work, not to help people understand GSIM better, but to help people better understand the statistical information objects they use. Given how young GSIM is, I think we need to be open to the possibility that changes may be required (but, obviously, only with a very strong business case).

In any case, I think that the IMF and ABS have had a lot to say on this issue. I'd be interested to hear what others think.

18 Jul, 2013
