2.1 Statistical business process

Since the GSBPM (Generic Statistical Business Process Model) reached full maturity with the release of V4.0, agreed in April 2009, it has been regarded as the preferred reference model for the statistical business process within the ABS. This status was confirmed by the area leading the enterprise architecture initiative within the ABS at that time, and confirmed again by the IMT Program launched in February 2010.

A number of reference models for the statistical business process existed, and were harnessed, within the ABS prior to the development of the GSBPM.

A particularly broadly applied model was affectionately known within the ABS as "The Caterpillar". The Caterpillar was developed by the ABS as a reference model to support the Business Statistics Innovation Program (BSIP) launched in 2002. It allowed a disparate range of surveys and other statistical activities, whose processes were (especially prior to BSIP) very different in detail, to describe what they did, why, and how (eg what systems and data stores were used) in terms of a common high level reference point for the statistical life cycle. It later allowed "leading practice" to be identified in different parts of the statistical cycle.

The broad relationship between the Caterpillar and GSBPM, documented previously in this section of the case study, has been moved to a supporting page.

Extensive process documentation, together with categorisations of information and even software interfaces, was developed during the course of initiatives such as BSIP and ISHS (both described in BHM). This activity was based on the "pre GSBPM" reference models for the statistical business process associated with those initiatives. It has been agreed that existing process documentation within the ABS will not be rewritten for the sole purpose of referring to the GSBPM. Formalising, and making readily available, mappings between the GSBPM and the local reference models has therefore been particularly important.

The paper Applying the GSBPM within an NSI: Experiences and examples from Australia, prepared for the METIS Work Session in March 2010, provides more information on the ABS's utilisation of the GSBPM and on the statistical business process models that preceded it. Annex 1 of that paper provides a full description of the Caterpillar.

Key points include

  • GSBPM is harnessed, within the business domain of ABS Enterprise Architecture, as the primary reference model for the statistical business process.
  • Recognition within business architecture allows GSBPM to serve (among many other roles) the purpose for which it was originally commissioned internationally, namely linking metadata management (and statistical information management more generally within the data/information domain of Enterprise Architecture) to the statistical business process.
  • GSBPM is seen as a vital enabler of practical, purposeful and collaborative external engagement at the international level, national level and sub-national level.
  • GSBPM is harnessed as a reference model for the statistical business process for corporate planning and management purposes. A number of specific ABS applications of the GSBPM in this regard are listed in Section VI of the paper for the METIS work session.
  • Experience with BSIP and ISHS highlighted the practical value of a common (within each initiative) reference model for the statistical business process prior to GSBPM maturing to fulfil this role across the ABS as a whole, as well as internationally. (See Annex 2 in the paper for the METIS work session)
  • While ABS waited until GSBPM achieved the level of maturity, and breadth of international recognition, associated with V4.0 prior to adopting it formally, it has had an active interest in GSBPM since the inception of its development. (See Annex 3 in the paper for the METIS work session)

2.2 Metadata System(s)

There are currently many systems within the ABS that encompass significant metadata definition and management aspects.

The MRR (Metadata Registry/Repository) associated with IMT is by far the most significant metadata system currently under development. The MRR's Registry capabilities will act as a "central nervous system" for systems across the ABS that define, manage and use metadata. At this stage the MRR is at the Proof of Concept phase.

An early activity associated with IMT was the first phase of a "metadata census". As the term "census" suggests, it was originally hoped that this activity would provide a much clearer and more comprehensive "as is" picture of metadata management at a local level within the ABS as well as at a corporate level. GSBPM was an important point of reference for indicating which phases and sub-processes each system was supporting with which metadata.

An early issue encountered was the ability for those responsible for systems to describe in a clear and consistent manner the "types" of metadata managed within their systems. For example, if one system was said to work with "variables", another with "data items" and another with "data elements" were all three systems talking about the same "type" of metadata, or about different "types" of metadata (that maybe related to each other in some way)? Work associated with GSIM and the MRR should lead to this issue being more tractable in future.
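The kind of reconciliation this implies can be sketched as a simple lookup from each system's local term to a common reference type. All system names, type names and the mapping itself below are hypothetical illustrations of the idea, not actual ABS systems or GSIM content:

```python
# Hypothetical sketch: reconciling local metadata "type" names against a
# common reference vocabulary, as GSIM and the MRR are intended to enable.
# System names and terms are illustrative only.

# Each local system describes its metadata in its own terms.
LOCAL_TYPE_NAMES = {
    "SurveySystemA": "variable",
    "ProcessingSystemB": "data item",
    "DisseminationSystemC": "data element",
}

# A registry-level mapping asserts how each local term relates to a
# common type (here, all three happen to map to the same conceptual type).
COMMON_TYPE = {
    "variable": "DataElement",
    "data item": "DataElement",
    "data element": "DataElement",
}

def common_type(system: str) -> str:
    """Resolve a system's local metadata term to the common reference type."""
    return COMMON_TYPE[LOCAL_TYPE_NAMES[system]]

# With the mapping registered, the census question "are these systems
# talking about the same type of metadata?" becomes answerable:
assert len({common_type(s) for s in LOCAL_TYPE_NAMES}) == 1
```

Once such mappings are registered centrally, the "as is" picture can be assembled mechanically rather than by interviewing each system owner about terminology.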

The "as is" picture is also complicated by the fact that many local systems currently need to "replicate", possibly in a specialised format with local content additions, metadata held in existing corporate repositories. The difficulty of consistently typing metadata compounds the difficulty of distinguishing systems that manage metadata simply because they currently can't source it from elsewhere from systems that should be considered an authoritative source for that metadata within the ABS. A significant number of processing systems currently have a secondary role as a "metadata system" only because, for a variety of reasons, they can't source the metadata they need systematically from elsewhere.

The second phase of the "metadata census" focused in more depth on metadata associated with core corporate stores and systems. The outputs have already contributed to the design of the Proof of Concept for the MRR and will contribute more broadly to the development of the MRR in future as well as being used as one practical test of the scope and nature of metadata requirements addressed by GSIM.

Early work on the metadata census confirmed that some current metadata systems are

  • fully corporate.
  • "shadow systems" which extend corporate systems to supplement the standard content with attributes of local interest.
    • The need for "shadow systems" should be eliminated once, via the MRR, modelling of information objects, attributes and relationships addresses the needs of the organisation as a whole.
      • Some systems may still not be able to use only "standard" metadata, but the metadata actually used by these systems will be able to be described, and registered, together with the relationship of that metadata to "standard" metadata.
    • Some of the "shadow systems" have been designed and maintained to ensure they can be easily reintegrated with corporate systems in future while others have not.
  • truly "local" systems
    • These exist for a variety of legitimate and not so legitimate reasons.
    • The best of them source relevant content from the Corporate Metadata Repository (CMR) as a properly maintained snapshot but then reformat that content to meet local needs (eg to support systems that cannot "read" the metadata directly and require it to be translated/packaged in a special way).
    • The worst of these update, evolve and create new metadata for local use independently of the CMR.
    • Others deal with classes of metadata (eg methodological parameters to drive specific processes) which are not currently managed within the CMR.

Information about key existing corporate metadata systems, documented previously in this section of the case study, has been moved to a supporting page.

In regard to systems envisaged for the future, the Statistical Workflow Management (SWM) facility designed to work with the MRR is expected to provide a source for information related to, for example,

  • reusable process specifications
  • the assembly of processes into specific workflows
  • results from executing defined workflows
  • reusable business rules for driving and chaining processes and workflows
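
A minimal sketch of how these kinds of information might be structured together is given below. The class and field names are assumptions for illustration, not the actual SWM or MRR design:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the kinds of objects SWM might register with the
# MRR: reusable process specifications, workflows assembled from them,
# and a crude stand-in for a chaining rule. Names are illustrative only.

@dataclass
class ProcessSpec:
    name: str                       # eg "Impute missing values"
    gsbpm_subprocess: str           # eg "5.4" for Edit & impute
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)

@dataclass
class Workflow:
    name: str
    steps: list[ProcessSpec] = field(default_factory=list)

    def add_step(self, spec: ProcessSpec) -> None:
        self.steps.append(spec)

# Assemble reusable specifications into a specific workflow.
edit = ProcessSpec("Detect errors", "5.3",
                   inputs=["raw unit records"], outputs=["flagged records"])
impute = ProcessSpec("Impute missing values", "5.4",
                     inputs=["flagged records"], outputs=["clean records"])

wf = Workflow("Monthly survey processing")
wf.add_step(edit)
wf.add_step(impute)

# A simple chaining check: each step's inputs must be satisfied by the
# initial inputs or by an earlier step's outputs.
available = set(edit.inputs)
for step in wf.steps:
    assert set(step.inputs) <= available
    available |= set(step.outputs)
```

Registering specifications and workflows as distinct but related objects is what allows a process specification to be reused across many workflows, and execution results to be traced back to the exact specification that produced them.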

Earlier conceptual and exploratory work identified seven types of "process metadata", ranging from "configuration" metadata about the IT environment and the user running the process, through metadata which is a formal "input to" or "output from" the process, to metadata which describes the process itself and how chains of processes fit together. (None of these seven types of process metadata corresponded to "process metrics" as described below. Given there are already more than enough types of "process metadata", the ABS tends not to favour using the term to also denote "process metrics".)

Achieving a clearer path forward in regard to structuring and managing "process" metadata is seen as an important enabler to having other metadata (eg the structural definition of data elements) actively drive statistical processes.

It is intended that the work related to structural definition and description of processes harness appropriate standards such as BPMN (Business Process Model and Notation) and BPEL (Business Process Execution Language).

It is anticipated that, through SWM working with the MRR, it will become possible to specify and analyse detailed information related to the statistical information used by, and produced from, specific process steps.

A further priority is to better capture and store (for automated and interactive analysis and reporting) "process metrics" related to how statistical processes are performing (eg response rates, imputation rates, edit rates etc). Such data about the outcomes of processes is sometimes referred to as "process metadata", "operational metadata" or (typically in specific circumstances) "paradata" by others. Process metrics can be useful for internal monitoring, management and tuning of processes as well as generating data quality indicators for external dissemination.
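For illustration, the process metrics named above reduce to simple ratios over counts captured during a collection cycle. The function, count names and figures below are a hypothetical sketch, not an ABS system or real data:

```python
# Hypothetical sketch: deriving the process metrics mentioned above
# (response rate, imputation rate, edit failure rate) from counts
# captured during a collection cycle. All figures are invented.

def rate(numerator: int, denominator: int) -> float:
    """A metric expressed as a percentage; 0.0 if the denominator is zero."""
    return 100.0 * numerator / denominator if denominator else 0.0

counts = {
    "units_selected": 5000,
    "units_responding": 4350,
    "items_imputed": 180,
    "items_collected": 43500,
    "records_failing_edits": 620,
    "records_edited": 43500,
}

metrics = {
    "response_rate": rate(counts["units_responding"], counts["units_selected"]),
    "imputation_rate": rate(counts["items_imputed"], counts["items_collected"]),
    "edit_failure_rate": rate(counts["records_failing_edits"],
                              counts["records_edited"]),
}

for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

Capturing the underlying counts in a consistent store, rather than the ratios alone, is what allows the same data to serve both internal process tuning and externally disseminated quality indicators.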

2.3 Costs and Benefits

Section 2.2 details infrastructure delivered as the result of diverse projects, some of which first delivered outputs more than a decade ago. Lifecycle costs and benefits are extremely difficult to even estimate meaningfully.

Costs and benefits for new developments and redevelopments were estimated when developing business cases. While much better than a vacuum for planning purposes, past experience suggests these cost benefit analyses were seldom borne out with any precision in practice. Often this was because decisions were made over time to diverge from the original project plan in some way rather than just because the original estimation process was flawed or based on imperfect information.

IMTP is instituting a much more rigorous approach to

  • estimation of costs and benefits during the planning stage
  • establishing compelling evidence that the planned benefits are achievable in practice, together with establishing well defined outcome realisation plans to ensure the benefits will be achieved
  • managing projects to ensure the implications for planned costs and benefits are understood in regard to, and refactored as a result of, any variation from the original plan
  • managing related projects as a coherent program to ensure any benefits which rely on successful completion, and co-ordination, of multiple projects are realised (and that dependencies between projects are understood and supported appropriately)

For future developments, therefore, it should be possible to report more concrete information in this section.

During formulation of the detailed business case for IMT, however, it is not appropriate for the ABS to release to the public domain the details of estimated costs and benefits associated with the program.

2.4 Implementation Strategy

Information related to the implementation strategy can be gleaned from the description of IMT (including resources linked to the page) and from Section 1.1.

The challenges that provide the drivers for IMT must be addressed in one form or another. In order to achieve the transformation in a timely manner (eg in well under a decade), and realise maximum benefits for users of ABS (and other NSS) statistics, significant resources in addition to those allocated to undertaking and supporting current "business as usual" activities within the ABS will be required. This approach

  • achieves greatest efficiency overall (a more protracted approach requires a smaller budget each year but stretches over many more years and ends up costing more in total)
  • reduces, through a focused approach, risks to business continuity and sustainability during the transition period

The first generation of the information management framework and other enabling infrastructure such as the MRR, together with generic tool sets, is required before the main transformation (including re-engineering) across statistical production streams can begin in earnest. As has been the case for all elements of IMT, the main transformation period will be planned in detail prior to commencement (eg which re-engineering for which statistical business process will occur at which time during, eg, a four year period).

In terms of metadata management, the swinging of a pendulum can be seen to some extent in the BHM.

Developments in the 1990s tended to be on a "big bang" basis. These were sometimes pejoratively referred to as "Cathedral Projects" for being too grandiose in ambition and design, and for taking much longer, and costing much more, to complete than originally expected. Nevertheless, many of the results of these projects have proved to be of enduring value - so much so that many outputs have lived on long beyond their prime.

The next strategy (eg as formulated in 2003) was "opportunistic" and "incremental". There was notionally a "master plan" of what should exist in the longer term, but individual "construction projects" were much more modest in scale. Progress toward the "master plan" was much slower, less direct and more difficult than anticipated and hoped.

IMT is establishing a much clearer, more compelling, more widely shared and more actionable "master plan" together with the active corporate mandate and governance to achieve progress. Where the cathedrals of the 1990s tended to be largely designed and built in isolation, the IMT approach focuses on collaborative and sharable solutions underpinned by common standards and frameworks.

A consistent lesson has been that a well developed and managed implementation strategy (in addition to a development strategy) is essential. New capabilities are being delivered into a complex context of existing processes and infrastructure. Uptake of those new capabilities needs to be managed and promoted appropriately. (The simple "Field of Dreams" approach of "Build it and they will come!" has never yet worked for us.) Often the new capability, and/or the implementation and communication strategy for it, needs to be refined based on early uptake experience. Whether it is managed by the development team or some other team, every major project requires a well planned and actively managed "Outcome Realisation" phase after it has finished delivering its major outputs.
