Statistical metadata systems (SMS) should serve statistical organizations as tools for the efficient management and performance of statistical information systems. Globalization has brought several issues in statistical production to greater prominence. Users increasingly need official statistics to be comparable and easily available at both national and international levels. Clearly, the SMS has a lead role in this endeavour. The efficiency of the SMS in this respect is significantly influenced by the standards, methods and techniques used.
The Common Metadata Framework (CMF) initiative aims to assist statistical organizations in the adoption, modelling, usage, and implementation of statistical metadata systems and practices across all phases of their statistical business process. Since the process for statistical surveys is generally the same everywhere, it is possible to build a common business process model for survey work. As a result, the Generic Statistical Business Process Model was developed and is presented in Part C of the CMF: "Metadata and the Statistical Business Process".
While Part A of the CMF, "Statistical Metadata in a Corporate Context: A guide for managers", is focused on the corporate governance of metadata projects and its target audience is senior managers in statistical organizations, the target audience for Part B is SMS designers and experts responsible for SMS implementation.
The standardization of metadata has been on the agenda of various international groups and organizations for many years (examples include the development of the ISO/IEC 11179 standard, the Data Documentation Initiative, and the SDMX standard, ISO 17369). Based on requests from UNECE member countries, there has also been an ongoing discussion of many aspects of international standards for statistics and related metadata within the Joint UNECE/Eurostat/OECD Work Sessions on Statistical Metadata.
There is a common understanding in statistical organizations that the use of common standards related to statistics and metadata is indispensable. The number and diversity of existing standards, however, make it a challenge for statistical experts to understand them and incorporate them efficiently in the overall SMS architecture.
The aim of Part B is therefore to offer SMS designers an overview of existing resources (standards, concepts, models, best practices and other methodological materials), which are likely to be applicable when designing and implementing SMS. It is designed primarily as an Internet publication, so that it can be kept as up to date as possible.
When designing an SMS at the national level, both national and international standards should be taken into consideration. Bearing in mind that national standards often address very specific requirements, this publication is focused on international standards. However, it also includes information on some of the more important internationally available and/or applicable national models and practices.
It is unremarkable to note that every survey and statistical program in each statistical office produces its own data with its own definitions. The consequence of this, however, is remarkable, because this is why we need metadata. One cannot know for sure, a priori, what data mean, so metadata describe the data each survey produces, and metadata describe the designs and processes that produce the data. Without metadata, it is not possible to understand and to use data.
In general, statistical surveys are conducted in the same way. They follow the same business process, and in fact, Part C of the CMF is devoted to describing this in the form of the Generic Statistical Business Process Model. From the metadata perspective, this means that a single model for statistical metadata, covering all aspects of the survey life-cycle, is possible. However, agreement on a single model is very unlikely, and it may not even be practical. What is far more likely is that each program office in each statistical agency will devise its own way of handling metadata. In this case, since metadata are data, too, understanding the metadata for each survey or program will require their own metadata! This replicates the problem, which is what we aim to avoid.
Fortunately, standards offer a way around this. A system specification built by an office for its own use may satisfy the needs of that office better than a standard can, but using standards has several advantages over building specifications locally. First, a standard represents a solution to a business problem that has already been thought through, reviewed, and implemented elsewhere. The time needed to develop a specification is saved, and systems are built more cheaply. Second, use of a common specification means that information can be shared through the standard rather than through pair-wise agreements. This greatly reduces the burden of achieving interoperability and of sharing data or metadata across agencies. Third, standards are known outside each office that uses them, so tools for using a standard may be built by other organizations, systems for implementing a standard may be shared, and knowledge about the use of a standard is readily available. Fourth, standards have conformity statements that set out the criteria for claiming that a standard is faithfully implemented. Conformity is a strong claim, and it is usually a sufficient condition for establishing interoperability. Finally, although no single standard will fulfill all the system requirements of an organization, and even a group of standards may not solve every problem, standards are often designed for use with others. The more standards are used to specify an implementation, the greater the savings in development and interoperability costs.
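The second advantage above, sharing through a standard rather than through pair-wise agreements, can be illustrated with simple counting. The sketch below is only an illustration under stated assumptions: it assumes that every pair of agencies would otherwise need its own bilateral agreement, and that with a standard each agency needs a single mapping from its local format to the shared one. The function names are hypothetical and are not part of the CMF.

```python
def pairwise_agreements(n_agencies: int) -> int:
    """Bilateral agreements needed if every pair of agencies
    negotiates its own exchange format (assumption: one per pair)."""
    return n_agencies * (n_agencies - 1) // 2


def standard_mappings(n_agencies: int) -> int:
    """Mappings needed if each agency maps its local format
    to one shared standard (assumption: one per agency)."""
    return n_agencies


if __name__ == "__main__":
    # Compare the two approaches for a few illustrative group sizes.
    for n in (2, 5, 10, 50):
        print(n, pairwise_agreements(n), standard_mappings(n))
```

Under these assumptions the number of pair-wise agreements grows quadratically with the number of agencies, while the number of mappings to a standard grows only linearly.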
This Part B of the CMF addresses these issues.
This publication is a unique source of information on existing statistical metadata standards. It aims to provide a single point of reference, giving SMS designers and other potential SMS users basic information about standards related to statistical metadata, as well as links to more detailed materials and resources.
The basic functions of the SMS in the statistical information system are:
(a) to uniquely and formally define the content and links between statistical objects;
(b) to uniquely and formally describe the content and links between statistical processes; and
(c) to determine all related technical parameters.
These functions are explained in more detail in Part A of the CMF.
To help SMS designers decide in which areas of the statistical information system metadata standards should be implemented, the overview of existing statistical metadata standards is presented according to the following groups of standards:
- Statistical concepts;
- Technical standards;
- Models and statistical practices;
- Methodological guidelines and recommendations.
The four groups of standards above should be taken into consideration when designing and implementing an SMS. In terms of the Generic Statistical Business Process Model (see Part C of the CMF), the integration of standards into the SMS should be ensured in the following phases:
- Phase 1: Specify Needs;
- Phase 2: Design;
- Phase 3: Build.
Part B is organized into the following chapters:
(a) Statistical Concepts
This group of metadata standards refers to the content of the statistics. It encompasses internationally accepted statistical standards and/or recommendations that refer to:
- concepts and definitions used for compiling, disseminating and exchanging statistics;
- statistical classifications;
- statistical units;
- statistical subject matter domains;
- other standards related to statistical content.
(b) Technical Standards
This group of metadata standards provides technical specifications for the exchange, storage, documentation and retrieval of statistical data and metadata, as well as for other ICT-supported activities dealing with the use of metadata for the production of statistics. The ISO standards on Statistical Data and Metadata Exchange (SDMX) and on metadata registries, the Data Documentation Initiative (DDI), geographic information system (GIS) standards, and other standards are introduced in this chapter.
(c) Models and Statistical Practices
This chapter presents internationally developed models related to statistical metadata, as well as nationally developed models that are recognized and applicable internationally. They include the Neuchâtel Model on Statistical Classifications and Variables, the Corporate Metadata Repository model, and the IMF Data Quality Assessment Framework, among other widely recognized metadata models.
(d) Methodological Guidelines and Recommendations
Many methodological materials and recommendations related to statistical metadata have been developed in the framework of international cooperation organized by the UNECE together with the OECD, Eurostat and other international organizations. These materials have repeatedly proved to be an asset for national and international statistical organizations when building their SMS. "Guidelines for Statistical Metadata on the Internet" and "Best Practices in Designing Websites for Dissemination of Statistics" are examples of such documents. These and others are introduced in this chapter.
The coverage of these four chapters (including links among standards) is presented graphically in Figure 1.
Figure 1. Metadata standards, concepts and models