1.1 Metadata strategy
Preface to most recent update (2011.1)
The previous major update to this case study occurred in the first half of 2009. As recorded in the document entitled A Brief History of Metadata (in the ABS) (referenced simply as BHM hereafter), which is attached to this Case Study, the second half of 2009 saw fundamental decisions made by the ABS leading to initiation of the ABS Information Management Transformation Program (IMTP) in February 2010. While IMTP designates a specific program within the ABS, including a specific top level unit within the organisation chart, the aim is for the ABS to achieve IMT (Information Management Transformation). All staff within the ABS have a role in achieving IMT.
IMT will include fundamental reshaping of policies and strategies related to metadata management developed by the ABS over the past two decades. At this stage, however, IMT remains an early "work in progress".
IMT can be seen as focused on the "to be" environment for the ABS, in terms of business architecture, data/information architecture and other elements of enterprise architecture, as well as on the process for achieving the transformation (including business process re-engineering) required to realise the "to be" state. At this time many details contained in the previous version of this case study continue to accurately describe the "as is" environment for metadata management within the ABS.
In the initial update for 2011 it has been decided to focus the main body of the case study on the "to be" state and initial steps toward that state. Many details of the "as is" environment have been moved into supporting documents. Other aspects can be found by referring to the 2009 version of the case study. It is possible within the wiki to view earlier versions of each page. For convenience, PDF versions of the 2009 editions of this case study and of BHM have also been made available.
The result of this approach is that the Case Study document is now shorter than the 2009 edition. Additional details will be added to the documentation as IMT progresses, particularly in regard to its new approach to statistical information management, which encompasses metadata management.
It should also be noted that, as with all content in the METIS wiki maintained by ABS practitioners, this is an informal working document shared with colleagues in the field of statistical information management. Unless unambiguously indicated otherwise, no content accessed via these wiki pages should be considered to represent a formal statement on behalf of the Australian Bureau of Statistics.
Ultimately any strategy exists to support the ABS mission and objectives as set out in the organisation's corporate plan. In particular, the availability of appropriate metadata and the application of sound statistical information management practices are critical to supporting informed use of statistics and the quality of the statistical services we deliver to the nation.
BHM provides information on the evolution of ABS strategies related to metadata over time, including extensive information on the Strategy for End-to-End Management of ABS Metadata established in 2003.
IMT can be seen as superseding the 2003 strategy, although at this stage there is no "direct replacement" strategy focused specifically on metadata and its management. IMT focuses instead on strategies related to "statistical information" management which spans metadata (in its broadest sense) and data.
Although IMT supersedes the 2003 strategy, most of the fundamental ideas contained in the 2003 strategy remain relevant. For example, none of the twelve cornerstone principles outlined in the 2003 strategy have been disavowed as irrelevant or inappropriate. In this example, however, IMTP seeks principles
- focused on statistical information management rather than simply metadata management
- rationalised where relevant with principles underpinning other frameworks applied within the ABS (eg enterprise architecture, quality management of statistical processes)
- that take account of relevant standards and frameworks associated with the global and national "industry" of producing official statistics, as well as standards and frameworks relevant to data providers and to users of official statistics
- expressed concisely in terms meaningful, and motivating, to business staff
- supported by relevant guidelines and training to assist in applying them appropriately to specific design processes and decision making
This process of rationalisation is underway currently and it is expected the updated set of principles will be added to the case study once available.
More generally, strategic planning for IMT can be seen as learning from experience with the 2003 strategy (eg much slower progress, and much more mixed success, in putting the strategy into effect than had been anticipated).
Well defined, corporately accepted and supported governance for information management is much more of a foundational consideration for IMT. This includes clearly established norms/principles/expectations, clearly established authority and accountability, and clearly established processes for assessing compliance and actively managing non-compliance. The corporate positioning of IMTP (eg reporting directly to the head of the organisation and independent of any one operational or support division) improves its ability to address governance requirements successfully, compared with the implementation of the 2003 strategy.
The IMT strategy starts with the Metadata Registry Repository (MRR) as the key enabling infrastructure, including its integration with Statistical Workflow Management capabilities. This can be seen as establishing a "central nervous system" to support the new environment, including its relationship with "legacy" applications and repositories. The 2003 strategy, by contrast, primarily targeted developing new repositories and redeveloping existing repositories without such a well developed strategy for achieving "business integration" in practice.
Compared with the 2003 strategy, IMT work on the Statistical Information Management Framework also includes much greater integration with external frameworks such as GSBPM (Generic Statistical Business Process Model) and GSIM (Generic Statistical Information Model) as well as with other frameworks applied within the ABS (eg Enterprise Architecture).
In October 2009, the ABS Executive formally agreed on Statistical Data and Metadata Exchange (SDMX) and Data Documentation Initiative (DDI) as the standards that will form the core of the ABS's future directions and developments with regard to statistical information management. This means strategic engagement with the two standards communities, including encouraging them to co-ordinate their work in order to support NSIs and others who seek to use both standards, is a high priority.
Participation in the Statistical Network strategy is the primary, but far from only, example of IMT's strategic focus on collaboration (internationally and/or nationally) when it comes to statistical information management.
More detailed formal statements of IMT strategy in regard to statistical information management are still being reviewed within the ABS. Any encapsulation of strategies which is agreed for general release beyond the ABS will be added to this case study once available.
1.2 Current situation
BHM describes how the current situation has evolved within the ABS. Documentation of IMT outlines the current situation.
The majority of data collection and input processing activities for business and household surveys have moved toward implementation of high level metadata frameworks informed by ISO/IEC 11179. These frameworks were developed over the past decade and postdate the ABS specific metadata framework implemented for the corporate output data warehouse, which was developed during the 1990s.
Key elements of current metadata infrastructure, which predate initiation of IMTP in 2010, include major repositories related to
- statistical activities
- Termed "collections" by the ABS, these activities include surveys, censuses, statistical analysis of administrative data sources and statistical "compilation" activities such as preparing the national accounts.
- These are specific structured data files, data cubes and tables associated with statistical activities. Examples include various "unit record files" and aggregate outputs.
- This is a "legacy" system based on an ABS specific data model.
- data elements
- This is a more recent development based on the metamodel found in ISO/IEC 11179 Part 3.
- questions and question modules
- This was developed more recently for household surveys with an aim to generalise the facility in future.
- collection instruments
- This was developed more recently for household surveys with an aim to generalise the facility in future.
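As an illustration of the ISO/IEC 11179 Part 3 style of modelling mentioned for the data element repository, the following is a minimal sketch in which a data element pairs a data element concept (an object class plus a property) with a value domain. The class names, fields and example content are assumptions made for illustration only; they do not describe the actual ABS repository or the full Part 3 metamodel.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataElementConcept:
    """What is being described: an object class plus a property."""
    object_class: str   # e.g. "Person"
    property_name: str  # e.g. "Labour force status"


@dataclass(frozen=True)
class ValueDomain:
    """How values are represented: a datatype and, optionally, permitted values."""
    datatype: str
    permitted_values: Tuple[str, ...] = ()  # empty tuple means "not enumerated"


@dataclass(frozen=True)
class DataElement:
    """A data element pairs one concept with one value domain."""
    concept: DataElementConcept
    value_domain: ValueDomain

    def accepts(self, value: str) -> bool:
        # Non-enumerated domains are not checked in this sketch.
        allowed = self.value_domain.permitted_values
        return not allowed or value in allowed


# Hypothetical example content, not taken from any ABS repository.
labour_force_status = DataElement(
    DataElementConcept("Person", "Labour force status"),
    ValueDomain("code", ("E", "U", "N")),
)
```

Separating the concept from its representation means the same concept could, in principle, be reused with a different value domain (for example a more detailed code list) without redefining what is being measured.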
The more recent developments also incorporate an approach to metadata registration based on ISO/IEC 11179 Part 6. Even if some of the older repositories cannot be completely replaced in the next few years it is anticipated that a common high level metadata registration framework, harnessing the MRR, can be implemented across the ABS for all classes of metadata. This does not imply that all classes of metadata will undergo exactly the same registration workflow, but the workflows for each class of metadata will be consistent with a higher level "metamodel" for registration.
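The idea of class-specific registration workflows that remain consistent with a higher level "metamodel" can be sketched as follows. This is a hedged illustration only: the status names are loosely modelled on the kind of registration statuses described in ISO/IEC 11179 Part 6 and are simplified, while the metadata classes, the transition rules and the function names are hypothetical and do not describe the MRR itself.

```python
from enum import Enum


class Status(Enum):
    # Simplified, illustrative registration statuses (an assumption).
    CANDIDATE = "candidate"
    RECORDED = "recorded"
    QUALIFIED = "qualified"
    STANDARD = "standard"
    RETIRED = "retired"


# Higher level rule shared by every class of metadata:
# which status transitions are permitted at all.
ALLOWED = {
    Status.CANDIDATE: {Status.RECORDED, Status.RETIRED},
    Status.RECORDED: {Status.QUALIFIED, Status.RETIRED},
    Status.QUALIFIED: {Status.STANDARD, Status.RETIRED},
    Status.STANDARD: {Status.RETIRED},
    Status.RETIRED: set(),
}

# Class-specific workflows (hypothetical): each is an ordered path of statuses.
WORKFLOWS = {
    "data_element": [Status.CANDIDATE, Status.RECORDED,
                     Status.QUALIFIED, Status.STANDARD],
    "collection_instrument": [Status.CANDIDATE, Status.RECORDED],
}


def conforms(path):
    """True if every consecutive step in the path is a permitted transition."""
    return all(b in ALLOWED[a] for a, b in zip(path, path[1:]))


def next_status(metadata_class, current):
    """Next status in the class-specific workflow, or None at the end."""
    path = WORKFLOWS[metadata_class]
    i = path.index(current)
    return path[i + 1] if i + 1 < len(path) else None
```

In this sketch the two classes follow different workflows, but both can be checked against the same shared transition rules with `conforms`, which is the sense in which they are consistent with one higher level model.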
Interoperability of the current ABS metadata models, including the legacy "output" model, with third party software (eg SAS, Blaise, SuperCROSS) continues to be an issue.
The increasing focus of the ABS and other agencies on the National Statistical Service (NSS) requires development of metadata models and capabilities which are usable beyond the ABS. The NSS needs to interoperate with agencies whose data content is more "administrative", "geospatial" or "research oriented" than "statistically" oriented. This provides additional challenges and issues in regard to metadata modelling.
While many of those agencies are at least as passionate about metadata as the ABS - but from a different "school" - the NSS also needs to support content producers and users for whom metadata is much less of an interest and priority. This raises questions about minimum metadata content and quality standards.
Understandably, metadata is a particular area of focus for the NSS. This includes a simplified and generalised set of principles for managing metadata.
Challenges associated with the current situation, such as achieving a coherent "end to end" metadata driven environment(s) within the ABS and better supporting the NSS, underpin IMT.