- Created by Benutzer-51b47, last modified by Essi Kaukonen on 15 Sep, 2016
| Contact person* | Essi Kaukonen |
|---|---|
| Job title | Planning Officer / Metadata services |
| Email | essi.kaukonen@stat.fi |
| Telephone | +358 50 314 2311 |
Metadata strategy
Statistics Finland does not have a specific metadata strategy but metadata tasks are steered by the objectives of Statistics Finland's operational strategy such as the usability and reusability of data, reliability of statistics and standardization of processes. A common metadata system supports these objectives by providing a uniform way of describing statistical resources, such as classifications, data sets and variables.
The main principle is that metadata are created and maintained in the common metadata system and made available for whatever use they are needed in business processes, from data collection to data dissemination. The common statistical metadata system will be developed further in order to rationalize statistical business processes and support their harmonization.
The metadata system is built following the principles of service-based information system architecture defined in Statistics Finland's ICT strategy. The metadata system is developed one metadata entity at a time.
Metadata work takes international and national standards into account and aims to use them instead of in-house solutions. This supports the interoperability of systems, facilitates data transfer and improves customer service. Statistics Finland takes part in international and national metadata development work.
Metadata are developed as part of the wider development of Statistics Finland's information architecture. Smaller-scale development work done outside projects is planned and coordinated in change management groups.
Current situation
At the moment, we are developing our metadata system and standard ways to utilize it in several projects.
Metadata Classification
The following categorization of metadata will be further developed in the national ISAACUS project. See Ongoing development projects.
1) Statistical metadata.
Statistical metadata consist of:
- descriptions and definitions of statistical data and variables
- classifications
- variable formulas and units of measurement
2) Statistical data quality.
Statistical data quality reports consist of:
- statistical method descriptions
- relevance of data
- validity, reliability and accuracy of data
The above elements of the report are evaluated with quality indicators that are based on international recommendations.
3) Metadata of statistical documents or products.
Document and product metadata consist of information about:
- producers
- publication information
- identification information of the publications or products
- field or subject area glossary
- keywords
4) Process metadata. Process metadata are divided into technical and conceptual metadata:
a) technical metadata
- technical metadata guide the process of data production: data collection, data management and data dissemination. For instance, they make it possible to follow data production phase by phase. They also document the process.
b) conceptual process metadata
- conceptual process metadata consist of the technical information of data and variables which are used in producing data. For example, they can be minimum or maximum values, various calculation rules or use of certain classification values.
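As a rough illustration of conceptual process metadata, the sketch below stores a minimum value, a maximum value and a set of permitted classification codes for a variable and checks single values against them. The class name VariableRule, the function check_value and the example variables are hypothetical and are not taken from Statistics Finland's systems.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class VariableRule:
    """Hypothetical conceptual process metadata for one variable."""
    name: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    allowed_codes: Optional[FrozenSet[str]] = None


def check_value(rule: VariableRule, value) -> List[str]:
    """Return a list of rule violations for one value (empty if the value is valid)."""
    problems = []
    if rule.allowed_codes is not None and value not in rule.allowed_codes:
        problems.append(f"{rule.name}: code {value!r} is not in the classification")
    if rule.minimum is not None and value < rule.minimum:
        problems.append(f"{rule.name}: {value} is below the minimum {rule.minimum}")
    if rule.maximum is not None and value > rule.maximum:
        problems.append(f"{rule.name}: {value} is above the maximum {rule.maximum}")
    return problems


# Invented example rules: a numeric range and a two-code classification.
AGE = VariableRule("age", minimum=0, maximum=120)
SEX = VariableRule("sex", allowed_codes=frozenset({"1", "2"}))
```

Calling check_value(AGE, 130) returns one message about the maximum, while check_value(SEX, "1") returns an empty list.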
Metadata system(s)
Statistics Finland's metadata system consists of metadata pools, which are maintained using metadata editors. The content of the centralised metadata system is currently used in certain parts of the statistical production process, for example in the reception and archiving of register-based data.
The data descriptions stored in the metadata warehouse are created with the variable editor. As far as possible, data maintained in other systems (the classification and concepts databases, the operational guidance and planning system STOJ) are linked to the descriptions. The data description warehouse includes data descriptions of the published macro data sets as well as of disseminated micro data.
The data descriptions in the metadata warehouse are also utilised in trilingual tabulation in SAS and PX-Edit.
The concepts and definitions published in the concepts section of the stat.fi website are produced from the concepts database. Concepts are pushed to the internet through the eXist database. In the variable editor and the archiving database system, concepts can be retrieved from the concepts database and combined with variable descriptions.
The statistical document metadata are also maintained in eXist with the Arbortext text editor. Arbortext reads trilingual variable data into the tables inside the publications.
The content of the classification database has long been utilised as SAS formats and to an extent in the statistics production processes. In the variable editor and the archiving database system, variable-specific classifications can be retrieved from the classification database and they can be added to the data description. The classifications published on the classifications pages of the stat.fi service are also produced from the classification database.
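The idea of using classification content as formats in tabulation can be sketched as a simple lookup from codes to labels in several languages. This is only an illustration: the classification AREA_TYPE, its codes and labels, the language keys fi, sv and en, and the functions label and tabulate are all invented here and do not reproduce the classification database or the SAS formats.

```python
from collections import Counter
from typing import Dict, Iterable

# Invented two-code classification with labels in three languages.
AREA_TYPE: Dict[str, Dict[str, str]] = {
    "1": {"fi": "Kaupunki", "sv": "Stad", "en": "Town"},
    "2": {"fi": "Maaseutu", "sv": "Landsbygd", "en": "Rural area"},
}


def label(classification: Dict[str, Dict[str, str]], code: str, lang: str) -> str:
    """Return the label of a code in the given language, or the code itself if unknown."""
    return classification.get(code, {}).get(lang, code)


def tabulate(codes: Iterable[str], classification: Dict[str, Dict[str, str]], lang: str) -> Dict[str, int]:
    """Count the codes and present the counts under labels in the chosen language."""
    counts = Counter(codes)
    return {label(classification, code, lang): n for code, n in sorted(counts.items())}
```

The same coded data can then be tabulated in any of the languages by changing only the lang argument, for example tabulate(["1", "2", "1"], AREA_TYPE, "sv").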
The new metadata system elements already in use are the eXist-XML database acting as the metadata warehouse, the variable editor and the Arbortext text editor. The classifications and concepts are automatically copied from SQL Server databases to the new metadata warehouse.
Only part of the metadata generated at Statistics Finland is currently updated in the common metadata warehouses. Much metadata is stored in the data systems of specific sets of statistics, or in SAS, Word and Excel files, which makes them available only to the statistics concerned or even only to a certain expert. Deficient and non-uniform descriptions of metadata restrict their retrievability and usability.
Ongoing development projects
The new metadata system aims to enhance the connections of the metadata warehouse to the statistics production process by making the metadata maintenance tools easy to use, by improving the connections of the statistical information systems to the centralised metadata warehouse, and by increasing services related to the use of metadata contents.
ISAACUS (Mikko)
Renewal of the website (Mikko)
Quality (Saija)
New archiving system for microdata (unit level data), implementation project
In a project carried out in 2015-2016, Statistics Finland renewed the process for archiving microdata and built some new tools for the process. The new process is based on combining a data description (made with the Variable editor) with data sets (SAS files) into an archive-ready xdf file.
The work now continues with an implementation project. The most essential tasks of this project are to make archiving a solid part of the statistical process; to write clear instructions on the selection criteria, storing and disposal of archived statistical data; to ease planning and quality control for archived data; and to make archived data good enough for immediate active use in research and for other purposes.
New Classification System, implementation project
The new GSIM-based Classification Editor and Classification Services, as well as the Classification Warehouse, were launched on 16.02.2016. The system is already used for classification maintenance. The classification data are synchronised into the old classification system. During 2016, plans will be made on how and when the other systems, such as SAS and the Internet, will be connected to the new Classification System.
Renewal of archiving
Quality reporting
The project examines the relationship of the present quality reporting to the coming requirements, reviews the connection of Statistics Finland's metadata warehouse and metadata model to Eurostat's extended Metadata Standard (SIMS) and makes a plan for introducing the new quality reporting model. The aim is to perform quality reporting so that quality reports are no longer made separately for the EU, other international organisations and domestic users, but one quality report is used as far as possible in reporting. The project will start in May 2013.
Other data systems related to metadata
TILKUT is a description database of statistics that contains basic data on statistics (name, description, topics, keywords, publication frequency and contact persons). The data from the TILKUT database are used in the stat.fi web service, the operational guidance and planning system STOJ, and the variable editor.
The operational guidance and planning system STOJ includes information on the names, publication times and contact persons of publications. The contact details of persons needed for data descriptions are retrieved from STOJ to the variable editor.
Starting from 2006, the data collection register contains data related to Statistics Finland's data collections. The system was originally built to serve metadata needs connected to direct data collections. The system was later extended to cover administrative data sets as well. In principle, the register should have all Statistics Finland's data collections described, but especially for administrative data sets, this objective has not been reached. The data are used in stat.fi’s services to data providers, in the register of enterprise respondents and in Statistics Finland's planning and monitoring process. The register contains estimates of the burden caused by an individual data collection.
The register of enterprise respondents is a register intended for managing data collections, in which samples, response data and respondent data are stored. The register is used to check whether a response has been received from a data provider, and it gives a rough estimate of the response burden.
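The two uses of the register of enterprise respondents mentioned above, checking for received responses and roughly estimating response burden, could be sketched as follows. The record layout SampledRespondent, the field minutes_per_response and both functions are assumptions made for this example; the actual register structure is not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SampledRespondent:
    """Hypothetical register row: one sampled data provider in one data collection."""
    respondent_id: str
    responded: bool
    minutes_per_response: float  # assumed rough burden estimate per response


def missing_responses(sample: List[SampledRespondent]) -> List[str]:
    """List the identifiers of data providers from whom no response has been received."""
    return [r.respondent_id for r in sample if not r.responded]


def total_burden_hours(sample: List[SampledRespondent]) -> float:
    """Rough response burden in hours, summed over the responses actually received."""
    return sum(r.minutes_per_response for r in sample if r.responded) / 60


# Invented sample of three data providers.
SAMPLE = [
    SampledRespondent("A", True, 30),
    SampledRespondent("B", False, 45),
    SampledRespondent("C", True, 90),
]
```

With this sample, missing_responses(SAMPLE) lists the one provider to be reminded, and total_burden_hours(SAMPLE) sums the burden of the two received responses.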
For personal data collections, data on samples have already been collected for some time, but there is no actual register of them.
Costs and Benefits
Implementation strategy
The implementation strategy of the metadata system is step-wise.
IT Architecture
Statistics Finland's common metadata system has been built up since the beginning of the 1990s. The strategy today is to move step-wise towards a service-based architecture.
The different editors (interfaces) have been built over several decades and, therefore, the technologies used also differ. We currently have .NET applications and PowerBuilder applications. The newest, the Classification Editor, is a web application created with the ASP.NET MVC framework. The editor uses Classification Services, which is created with the ASP.NET Web API (2.0) framework.
The databases behind them are both SQL databases (SQL Server) and XML databases (eXist).
Metadata Management Tools
Statistics Finland's metadata system is operated with the following tools:
- The Concept Editor (in use since the 1990s, will be renewed soon)
- The Classification Editor (the renewed system and editor have been in production since February 2016)
- The Variable Editor (in use since 2013, will be developed further in the coming years)
These tools are used for metadata management. The metadata themselves (concepts, classifications, variables) are used in several systems in the statistical production process.
Standards and formats
Statistics Finland aims to implement UNECE standards (GSBPM, GSIM, CSPA, LIM) in all development projects. The standards, especially GSIM and LIM, will be used as the base for the information architecture developed in upcoming years. Statistics Finland is planning to renew the data description system starting from 2016 onwards. The conceptual base for this work is GSIM. The main idea is to re-use metadata elements and show interactions between core metadata elements in different process phases.
Currently, Statistics Finland uses its own information model, CoSSI ("a Common Structure of Statistical Information"), in most parts of the metadata system. The GSIM Statistical Classification Model is used in the new Classification System.
CoSSI was developed in accordance with international standards such as Dublin Core and CALS at the beginning of this millennium. The first version was published in 2003. CoSSI is a modular information model for describing statistical tables, classifications, concepts, variables, general information on statistical documents, quality descriptions, etc. The way in which data are produced (process metadata) is not described in the CoSSI information model.
CoSSI documentation on the web: http://www.stat.fi/org/tut/dthemes/drafts/cossi_en.html
Version control and revisions
Outsourcing versus in-house development
The user interfaces and the applications for the databases have been mainly developed and built in-house. The applications developed at Statistics Finland can in principle be shared free of charge with other statistical organizations. Where necessary, details regarding test use and access to more precise descriptions etc. may be agreed upon separately.
Sharing software components of tools
Overview of roles and responsibilities
A high-level organisation structure map can be found on the Statistics Finland website: http://tilastokeskus.fi/org/tilastokeskus/organisaatio_en.html
In connection with Statistics Finland's organisational change, the Metadata Services unit was transferred on 1 January 2013 from the Information Technology Department to the new Standards and Methods Development Department. The task of the Standards and Methods Development Department is to steer the statistics production process, to support statistical methodology in statistics production, to promote uniform application of metadata and classifications and quality work in statistics production and to intensify project work.
The main guidelines for the development of the metadata systems at Statistics Finland are coordinated and processed in cooperation with the Standards and methods and IT Departments. The statistical departments lay out the main demands and needs for metadatabases and their use in various phases of the statistical processes.
The Metadata Services unit maintains classification standards, concepts and the archiving metadata system. The statistical departments maintain their own (statistical) metadata in the centralised metadata systems according to the instructions made by the Metadata Services unit. The Metadata Services unit also trains and consults statistical departments in metadata issues and is in charge of controlling quality in the metadata systems.
The CoSSI model steering group is in charge of managing and developing the model according to user needs in a manner that will not expose its main structure to risk.
Metadata management team
Training and knowledge management
Training and knowledge management of metadata experts:
New experts working at the Metadata Services unit have been trained mainly by mentoring and guidance provided by senior experts. ESTP courses on metadata training have been provided.
Training and knowledge management of metadata of statistical departments:
Statistics Finland provides structured basic courses (for new recruits) and advanced courses dealing with statistical production. These courses also contain basic information about the present metadata systems, mainly classifications, concepts and archive data management.
The Metadata Services unit provides the personnel with informative briefings whenever major modifications are made to the metadata system, either in content or in application development. More systematically organised training is needed when new tools are brought into use. Half-day metadata seminars are organised yearly to present topical metadata issues.
As an example, in connection with the implementation project of the variable editor, an extensive training programme was carried out, in which one-day training sessions were planned and organised for statistical experts on making data descriptions and using the common tools for it. During the project, training days were held around twice a month. In the future, new modes of training are to be considered, especially self-training by following Lync recordings and special clinics where participants can work with their own material.
In practice, much of the training for users of the metadata system is still done side by side. This leads to good results but is inefficient, as each client is trained individually and the resources used here would also be needed for developing the metadata systems.
Partnerships and cooperation
Statistics Finland cooperates with organisations and participates in working groups that define standard classifications or standards at both international and national levels. Metadata experts regularly attend Eurostat's Metadata Working Group and Classification Group meetings as well as METIS meetings. Statistics Finland also has representatives in the PC-Axis Reference group and Eurostat's Quality Working Group. Spatial metadata experts contribute to INSPIRE metadata work and attend national working groups on the implementation of the INSPIRE metadata process.
Other issues
Lessons learned
A metadata system complying with the uniform architecture is not just a technological renewal, but its implementation will require change in work procedures, responsibilities and organisation of tasks. The change in work procedures above all means timely recording of metadata in connection with data planning and production – a move from irregular retrospective description to regular and up-to-date description. The change is a challenge to the systems and applications, because work procedures change only if the technology allows it. An optimal procedure can be realised only if the users feel the applications are easy to use and serve their work. Commitment by the Management and their support to the work is crucial for the statistical units to be able to provide the contribution needed to the development work and for ensuring that the work will be sustained. The centralised metadata system should support the harmonisation of the production of statistics to a sufficient degree, thus making it more effective, but it should also be flexible enough to a certain extent to serve what statistics specifically call for. Involving statistics in the planning is needed.