Contact person* | |
---|---|
Job title | Expert: General Methodology Department |
Telephone | +420 27504 4216 |
Metadata strategy
A redesign of the Statistical Information System (SIS) was launched in the CZSO in 2004. The project has been centrally managed and monitored by the top management of the CZSO.
The first important step was the design of a new SIS architecture. The architecture copes with users' increasing requirements at both national and international level. It allows statistical data and metadata to be acquired and completed effectively.
Major goals of the SIS Redesign:
- reducing response burden and boosting respondent motivation;
- optimising production of statistical information in the CZSO;
- designing a conceptual model of SIS and Statistical Metainformation System (SMS);
- defining a unified architecture of statistical tasks;
- improving quality of statistical information;
- increasing users' comfort.
The SIS model encompasses a statistical business process (SBP) in all its phases, starting from assessment of users' requirements up to the dissemination of statistical information. The basis for the model has been a life cycle of statistical tasks. Recently the model was compared and updated in accordance with the Generic Statistical Business Process Model (GSBPM).
Core principles of the SIS Redesign are as follows:
- systematic assessment and evaluation of statistical data requirements,
- increasing share of administrative data,
- increasing use of data modelling,
- implementation of SMS,
- implementation of statistical data warehouse,
- freeze of statistical surveys for 2-3 years,
- avoiding redundancy in statistical surveying.
The global architecture of SIS (GASIS) is composed of three components:
1. Content component of SIS
There is a significant shift in the content component from the statistical survey approach towards a statistical object-oriented approach.
The content component identifies data sources, links between surveys of different periodicity and different purposes; defines modelling methods, stratification of samples etc. The following three types of statistical variable are distinguished:
- fundamental variable (used for calibration and/or modelling),
- standard variable (predefined set of the statistically most important variables),
- complementary variable (supporting fundamental and standard variables).
2. Metainformation component (SMS)
Systematic use of metainformation inside and outside the SIS as a tool for internal and external integration. SMS is focused on the SBP. The model used to define a statistical variable ensures its standard description from the beginning to the end of the SBP. A model and a metadata description of the components of a statistical task have also been designed.
3. ICT component
Software and hardware support for SBP. Standardisation of application software used in all stages of SBP. Tools for mathematical models and mathematical and statistical methods. Tools for data approval, release and dissemination. Statistical data warehouse and public database.
Statistical task - a set of statistical activities needed to fulfil a user's request for statistical information. A statistical task can be composed of one or more statistical surveys.
Statistical survey - a set of activities connected with designing a statistical questionnaire, preparing a sample, printing and distributing questionnaires, collecting completed questionnaires, data entry (including electronic collection of data) and data validation. A statistical survey is always part of a statistical task.
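The relationship between the two definitions above can be sketched as a small data model. This is only an illustration: the class and field names are hypothetical and do not come from the SMS itself. The one rule it encodes is that a survey is always created through, and belongs to, a statistical task.

```python
from dataclasses import dataclass, field


@dataclass
class StatisticalSurvey:
    """One survey; it always belongs to exactly one statistical task."""
    name: str
    # Back-reference to the parent task, excluded from repr/eq to avoid cycles.
    task: "StatisticalTask" = field(repr=False, compare=False)


@dataclass
class StatisticalTask:
    """Activities needed to fulfil one user request for statistical information."""
    name: str
    surveys: list = field(default_factory=list)

    def add_survey(self, survey_name: str) -> StatisticalSurvey:
        # Surveys are only created via their parent task, so no survey
        # can exist outside a task (hypothetical enforcement choice).
        survey = StatisticalSurvey(name=survey_name, task=self)
        self.surveys.append(survey)
        return survey
```

A task with no surveys is allowed in this sketch; whether the SMS permits that (for example, a task fed only by administrative data) is not stated in the text.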
Current situation
The CZSO management approved the SMS strategy in early 2005.
SMS is composed of mutually interlinked subsystems. At the time of updating this case study, the following subsystems have been tested under the pilot project of annual labour statistics:
- statistical classifications,
- statistical variables,
- statistical tasks,
- statistical quality.
The global architecture of SMS defines principles that are obligatory for all SMS subsystems.
In the years 2005-2010, the following SMS subsystems were designed and implemented:
(a) Statistical classifications (CLASS),
(b) Statistical variables (VAR), and
(c) Statistical tasks (TASKS).
At present, the CLASS and VAR subsystems are in regular operation and the TASKS subsystem is in semi-production operation.
Metadata Classification
Statistical metadata include content-oriented and technological metadata. Both groups are needed for the design, implementation and running of statistical tasks (STs).
Content-oriented metadata are presented in the following groups:
1. Metadata on statistical concepts and models
This group describes models of statistical classifications and statistical variables.
2. Metadata on statistical methods
This group describes imputation methods of missing values/data, grossing up to the whole population of observed units, methods for time series conversions, seasonal adjustment, methods for expert estimates, analytical, mathematical and statistical methods of evaluation, etc.
3. Metadata on processing procedures
This group describes processing procedures for individual stages of the ST life cycle, for example data collection, respondent burden measurement, preparation of statistical questionnaires, data validation, quality assessment, aggregation and preparation of statistical tables.
4. Metadata on use of statistical information
This group describes user satisfaction, use of statistical information by respondents, analysis of users' requirements for information, FAQ, users' opinions, use of web pages, etc.
5. Metadata on SBP assessment and evaluation
This group provides source materials for assessment and evaluation of effectiveness in individual phases of SBP and source materials for financial controlling in the CZSO.
The table below shows a placement of the above groups of metadata in the SMS architecture.
Metadata group | Classifications | Variables | Tasks | Quality | Dissemination | Users | Respondents | Time series | Data fund |
---|---|---|---|---|---|---|---|---|---|
Statistical models | x | x | | | | | | | |
Statistical methods | | | x | | | | | | |
Processing procedures | | | x | x | x | x | | x | x |
Use of statistical information | | | | | x | x | x | x | |
Assessment and further development | x | x | x | x | x | x | x | x | x |
Metadata system(s)
SMS architecture is modular. It is composed of relatively self-sustainable, mutually interlinked subsystems as presented in the diagram below.
CLASS | Statistical Classification | RESP | Respondents |
VAR | Statistical Variables | D-FUND | Data Fund |
ST | Statistical Tasks | T-SERIES | Time Series |
REG | Registers | DISSEM | Dissemination |
QUALITY | Statistical Quality | | |
SMS subsystems
Statistical Classification (CLASS) - maintenance and update of statistical classifications/codelists; the module has been in full operation. It contains about 1000 active code-lists and all international statistical classifications.
Statistical Variables (VAR) - maintenance and update of the catalogue of statistical variables.
The description of VAR is based on the metadata model used for VAR in all stages of SBP; the module is in full operation. It comprises descriptions of over 4000 variables.
Statistical Tasks (ST) - maintenance of metadata related to the design and processing of ST (basic characteristics, statistical questionnaires, statistical surveys, other input data, decree on annual programme of statistical surveys, data validation, definition of statistical samples, imputation methods, quality requirements, aggregations, specification of users, time-tables for data collection, applied code-lists, legislation, provider of ICT services, specification of ICT services, etc.); the module is in semi-production operation. At present it contains descriptions of the building blocks of statistical questionnaires, validation rules for selected tasks of economic statistics, and validation rules, automated corrections, derivations and transformations of variables for the 2011 population census, etc.
Statistical Quality (QUALITY) - maintenance and update of qualitative characteristics and methods for statistical data assessment; the module is in the design phase.
Statistical Time Series (T-SERIES) - maintenance and update of metadata on current statistical time series; there is an intention to launch a project for this module.
Dissemination (DISSEM) - maintenance and update of metadata linked to dissemination of statistical information (statistical publications, electronic outputs, web site, data security etc.); the module has been designed and implemented for formal specification of population census outputs (the first phase of implementation).
Respondents (RESP) - maintenance and update of metadata on respondents (respondent burden, respondent opinions, reporting duty, links to statistical surveys, etc.); this module has been partially implemented, mainly for the registration of respondents and for monitoring progress in the collection of economic statistics questionnaires.
Users (USERS) - maintenance and update of metadata on the SIS external users (users' opinions, FAQ, etc.); the implementation of the module is still open.
Data Fund (D-FUND) - maintenance and update of metadata on the contents and structure of data files included in SIS. A data warehouse has been implemented as the central storage of approved micro and aggregated data; this implementation constitutes the first phase of the module.
iSMS - Internet presentation of SMS - a new application for the presentation of statistical classifications and statistical variables has been developed for external users. There is an intention to extend this application to present selected parts of the TASKS module.
SMS is interlinked with the system of Statistical Registers. The main registers in this system are the following:
- Business Register,
- Register of Census Districts and Buildings, and
- Population Register.
Core principles for SMS implementation
- unified internal users' interface (search, update, administration),
- unified external users' interface (navigation, selection, interpretation),
- unified data interfaces between SMS subsystems,
- preserving history of SMS objects,
- update of metadata elements in one place only,
- single authoritative source (registration authority) for each metadata element,
- registration process associated with each metadata element, so that there is a clear identification of ownership, approval status, date of operation etc.,
- reuse of metadata where possible for statistical integration as well as efficiency reasons,
- unique storage and update of metadata,
- unified user documentation,
- unified technical documentation,
- standard data protection model,
- consistency of metadata inside the SMS subsystem and between subsystems,
- unified technological tools for implementation.
Steps in implementation of SMS subsystems and responsibility for them
- business system options (BSO) by CZSO,
- technical system options (TSO) by external supplier,
- programming by external supplier,
- testing by CZSO and external supplier,
- pilot processing using selected ST by CZSO and external supplier,
- operational running by CZSO.
Costs and Benefits
a) SMS financing
Principles of SMS financing:
- BSO are prepared by the CZSO and financed from the CZSO budget.
- TSO are prepared by external suppliers and financed partly from the CZSO budget and partly from resources provided by the EU (Transition Facility programmes and Integrated Operational Programme).
b) SMS benefits
- interlinking of statistical data and metadata from the beginning to the end of SBP allows unified and clear data interpretation,
- strengthening the role of methodology throughout SBP,
- systematic data quality assessment,
- upgrading of data dissemination and interpretation to users,
- integration with other ISs of public administration,
- integration with ISs of international organisations (Eurostat, OECD, UN, IMF, etc.),
- tool for defining phases of SBP,
- tool for management of ST processing.
The current progress of the project makes it obvious that SMS strengthens the role of methodology in defining the content, size and coordination of statistical surveys.
The introduction of project management and organisation of work during SMS implementation increased the CZSO research potential without increasing staff numbers. More staff from different professional groups (management, methodologists, statisticians, IT specialists) became involved. Training courses improved the knowledge of SMS subsystems in all professional groups. Communication barriers were reduced between departments involved in SBP (subject-matter departments, methodology, IT).
Implementation strategy
The SMS implementation is, in fact, a 'big-bang' approach. At this time, only statistical classifications are maintained and updated electronically. Statistical tasks are defined without using metadata. Current application processing tools use one identification of statistical variables for central processing and a different meta-identification for variables stored in the output database.
Introduction of the SMS into practice implies a change in the process of preparing and designing STs by statistical departments. These activities will rely on work with metadata and hence on using SMS tools. A prerequisite for the use of SMS functions is availability of an updated metadata base. What has to be done further is to bring into being all functions and organisational measures related to metadata administration. Adequate training of all participating actors should precede the SMS implementation.
The main condition for introducing SMS into the operational running of the SIS is its functionality in all stages of SBP. An effective and viable interlinking of SMS subsystems, represented in a unified metadata base, is a necessary precondition for that. This requirement determines the priorities for the design and implementation of SMS subsystems.
In view of the project comprehensiveness and complexity, SMS should be developed step by step. The step-wise approach, however, has a clearly defined framework.
The first stage of the SMS introduction into practice (2008-2009)
Pilot project
Subsystems CLASS, VAR, ST and QUALITY have been tested on the Annual Labour Costs Survey.
The aim of the pilot project
The pilot project is to test the functionality of SMS, namely for the following activities:
- definition of ST,
- design of statistical questionnaires,
- data validation (logical control specification),
- design of samples (response duty specifications),
- aggregate specifications,
- output specifications,
- preparation of timetables,
- specification of quality attributes of a statistical task.
The pilot project has the following prerequisites:
- to complete a database of statistical classifications (SMS-CLASS),
- to unify methodologically a content of statistical survey(s) for the pilot project,
- to complete a description of statistical variables relevant to the pilot and to ensure their storage in the database (SMS-VAR),
- to create a database for definition of statistical tasks (SMS-ST),
- to develop and test an SMS application program package,
- to develop and make operational statistical data warehouse,
- to establish and make operational an SMS administration,
- to accomplish training of personnel for all professions needed for the pilot project (methodology, subject-matter departments, SMS administration, project preparation, IT applications).
Building up and loading of an SMS database has been for the CZSO an entirely new task. In the newly established SMS-CLASS database, the links to the existing (old) e-system of statistical classifications should be maintained until a complete transition of statistical tasks into the new SIS is accomplished.
The second stage of the SMS introduction into the practice (from 2010 on) will be focused on development, implementation and gradual introduction into practice of SMS subsystems for monitoring of quality, time series, dissemination, respondents and users of statistical information. The second stage will comprise also the completion of SMS-CLASS, VAR, TASKS and D-FUND, namely in terms of their links to the newly prepared SMS subsystems.
IT Architecture
IT architecture of SMS is an integral part of IT architecture of SIS. The SMS is a necessary precondition for all statistical data warehouse operations. Data warehouse will finally become the only place to store all statistical data with their completely structured metadata description.
Technological infrastructure
The computing centre relies on servers running the UNIX operating system and Oracle database technology. Technological equipment is grouped into Unix clusters on which Oracle database and application servers operate. As part of the implementation of the new SIS architecture, the set of technological tools will be extended with data warehouse technology.
Client stations are personal computers running Microsoft Windows 2000 or higher, with the Internet Explorer and Mozilla Firefox browsers, the Microsoft Office package and other utilities.
SMS application programme package
- will not depend on the platform of users' workstations, which will run MS Windows (version 2000 or higher) or Linux,
- metadata will be viewed through the Internet browser without installing supplementary products on the internal user's workstation,
- a "thick client" solution can be used for metadata administration,
- will be implemented in the following technological environments:
  - Oracle Forms and Reports (three-layer architecture). Oracle Application Server is expected to be used. In this case the client is the Internet browser;
  - Java (possible thick clients);
  - Java Server Pages (JSP) for thin clients outside Oracle Forms and Reports, for those parts of SMS where the Oracle Forms and Reports technology may not be suitable;
- access to individual subsystems will be unified via the SMS access portal, which will form part of the CZSO internal portal,
- Java Server Pages (JSP) technology will be used for metadata presentation on the Internet.
Metadata Management Tools
The following two technologies will be used for communication between SMS subsystems and between SMS and the other applications:
- technology of direct communication between Oracle data tables of SMS subsystems and tables of other applications,
- XML technology for defining unified communication interfaces between individual SMS subsystems and interfaces between SMS subsystems and other applications.
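As an illustration of the XML option above, the following minimal sketch serialises a code-list to XML and reads it back. The element and attribute names (`codelist`, `item`, `code`, `version`) are hypothetical: the actual SMS interface schema is not described in this case study, so this only shows the general shape of such an exchange.

```python
import xml.etree.ElementTree as ET


def codelist_to_xml(codelist_id, version, items):
    """Serialise a code-list to an XML string (hypothetical layout)."""
    root = ET.Element("codelist", {"id": codelist_id, "version": version})
    for code, label in items:
        item = ET.SubElement(root, "item", {"code": code})
        item.text = label
    return ET.tostring(root, encoding="unicode")


def codelist_from_xml(xml_text):
    """Parse the XML produced by codelist_to_xml back into Python values."""
    root = ET.fromstring(xml_text)
    items = [(el.get("code"), el.text) for el in root.findall("item")]
    return root.get("id"), root.get("version"), items


# Illustrative data only; not an actual CZSO code-list.
xml_text = codelist_to_xml("CL_EXAMPLE", "1", [("1", "First"), ("2", "Second")])
restored = codelist_from_xml(xml_text)
```

Because both subsystems agree only on the XML layout, either side could change its internal tables without affecting the other, which is presumably the point of a unified interface.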
User interface to SMS subsystems will be developed in Oracle Forms. Standard rules were defined for design of communication windows to keep unified appearance and distribution of function keys. The user's basic tool is the Internet browser in which applications of individual subsystems are started.
The link of data with metadata will be established in data warehouse, using ETL processes. Structured metadata to statistical data for data warehouse will be taken from subsystems VAR, CLASS and TASKS.
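The linking step can be pictured as a simple enrichment pass in the transform stage of an ETL process. The sketch below is an assumption about the general idea only: the record layout, the dictionary keys and the function name are all hypothetical, and a real data warehouse load would be done with database tooling rather than in-memory dictionaries.

```python
def attach_metadata(records, variables, codelists):
    """Return copies of data records enriched with VAR and CLASS metadata.

    records   -- list of dicts with 'var_id' and 'value' (hypothetical layout)
    variables -- variable descriptions keyed by variable id (stand-in for VAR)
    codelists -- code -> label mappings keyed by code-list id (stand-in for CLASS)
    """
    enriched = []
    for rec in records:
        # A record whose variable is not described in the metadata is rejected
        # with a KeyError rather than loaded without a description.
        var = variables[rec["var_id"]]
        row = dict(rec, var_name=var["name"], unit=var.get("unit"))
        codelist_id = var.get("codelist")
        if codelist_id is not None:
            # Coded values get their label from the referenced code-list.
            row["value_label"] = codelists[codelist_id][rec["value"]]
        enriched.append(row)
    return enriched
```

Failing on an undescribed variable is a design choice of this sketch; it reflects the stated goal that all warehouse data carry a structured metadata description.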
Standards and formats
The SMS subsystem uses:
- XML interface or exports for other SMS or SIS applications,
- off-line work with copy of the data (created with the help of XML),
- SMS applications developed in Oracle Forms 10g can be started at work stations with MS Windows or Linux,
- Internet mirror of subsystem CLASS will be created with the help of Java Server Pages (JSP),
- backup on the central database server using Oracle ARCHIVELOG mode,
- SDMX technical standards for data interchange (implementation in near future).
Version control and revisions
The description of an object's state in SMS is stable over a time interval (valid from, valid to). From a certain point in time, the description is changed or its validity is terminated. The concept of an object version is used for this stable state of an object within a given validity range.
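The idea of an object version with a validity range can be sketched as follows. The field names, and the convention that an open-ended version has no end date and that both ends of the range are inclusive, are assumptions made for illustration; the case study does not specify how SMS stores validity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ObjectVersion:
    object_id: str
    description: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "still valid" (assumption)

    def is_valid_on(self, day: date) -> bool:
        # Both ends of the validity range are treated as inclusive.
        return self.valid_from <= day and (self.valid_to is None or day <= self.valid_to)


def version_on(versions, object_id, day):
    """Return the version of an object valid on a given day, or None."""
    for version in versions:
        if version.object_id == object_id and version.is_valid_on(day):
            return version
    return None
```

With non-overlapping ranges, a lookup for any date returns at most one version, which is what lets historical data be interpreted with the description that applied at the time.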
Object version may be subject to changes, which are required to be registered in the system. To observe the history of object version changes, the following rules are to be followed:
- Until an object version is approved, changes are not registered, i.e. only the current state of the object is stored in the system.
- Once an object version has been approved, the resulting state is registered in the system permanently.
If there is a need to change an object version that has already been approved, a so-called object version revision is made. On approval of an object version revision, the former objects are designated as cancelled and replaced by the new ones.
Each object version may go through the following states:
- under preparation;
- for approval;
- approved;
- revised;
- revised for approval.
Under preparation - new object or new version of already existing object is created.
For approval - no changes of object version can be made. Object version is prepared for approval. The result may bring an object version into the state 'Approved' or back into 'Under preparation'.
Approved - the object version is valid and available for other systems. No changes of object version can be made.
Revised - in case there is a need to change an object version that is in the state 'Approved'. The state before the revision remains stored and changes are made on a new copy of a given object version. The revised object version may be:
- approved - by which it replaces the former state of a given object version. The former state of object version is also registered in the system but designated as cancelled.
- rejected - changes made by the revision are cancelled and the former state remains unchanged.
Revised for approval - object version revision is completed and submitted for approval. No changes can be made to object version revision. The result may bring the object version revision into the state 'Approved' or back into 'Revised'.
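The states and transitions described above can be summarised as a small state machine. This is a sketch of one reading of the text, not the SMS implementation: in particular, rejecting a revision is modelled here as a transition from 'Revised' back to 'Approved' (the former state remains in force), and the names `State`, `ALLOWED`, `transition` and `can_edit` are hypothetical.

```python
from enum import Enum


class State(Enum):
    UNDER_PREPARATION = "under preparation"
    FOR_APPROVAL = "for approval"
    APPROVED = "approved"
    REVISED = "revised"
    REVISED_FOR_APPROVAL = "revised for approval"


# Allowed transitions, read off the state descriptions in the text.
ALLOWED = {
    State.UNDER_PREPARATION: {State.FOR_APPROVAL},
    State.FOR_APPROVAL: {State.APPROVED, State.UNDER_PREPARATION},
    State.APPROVED: {State.REVISED},
    # REVISED -> APPROVED models a rejected revision (assumption).
    State.REVISED: {State.REVISED_FOR_APPROVAL, State.APPROVED},
    State.REVISED_FOR_APPROVAL: {State.APPROVED, State.REVISED},
}

# Changes are only possible while a version or revision is being worked on.
EDITABLE = {State.UNDER_PREPARATION, State.REVISED}


def transition(current: State, target: State) -> State:
    """Return the target state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


def can_edit(state: State) -> bool:
    return state in EDITABLE
```

For example, a new version moves from `UNDER_PREPARATION` through `FOR_APPROVAL` to `APPROVED`, after which `can_edit` is false until a revision is opened with a transition to `REVISED`.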
Outsourcing versus in-house development
The following procedure was adopted for preparation and implementation of SMS subsystem:
- content and functional specifications are prepared by multidisciplinary project teams set up of employees from different CZSO departments (methodology, subject-matter statisticians, IT experts),
- technology proposal and design, implementation and running of applications will be dealt with by outsourcing,
- applications are tested by project teams in cooperation with the SMS department and supply and delivery company,
- initial filling of the subsystem will be made by staff of subject-matter departments in cooperation with the SMS department and supply and delivery company,
- principal documents such as design proposal or technical specification are subject to opposition discussions,
- takeover of program applications of SMS subsystems is completed by the acceptance protocol,
- in routine operation, individual subsystems will be used by statistical subject-matter departments, general methodology department, SMS department and data processing departments,
- in the end SMS tools should also be available to workplaces of the state statistical service with the aim to integrate methodology and contents of official statistics.
Sharing software components of tools
Overview of roles and responsibilities
The organisation model of SMS project management must be presented in the context of the model of SIS Redesign management.
The organisation structure of SMS is composed of the Project Steering Committee (PSC), Task Force SMS and the project teams appointed by the CZSO top management for development of individual SMS subsystems. The top management of the CZSO supervises the whole project and reviews progress reports on SMS subsystems presented by the PSC. Achieved results and/or proposals for changes are subject to consideration and approval by the CZSO top management. Furthermore, the CZSO top management appoints members of the PSC, Task Force SMS and the project teams.
The nature of SMS project requires participation of diverse professions in the project teams. Members of the project teams are methodologists, subject-matter statisticians, IT specialists, programmers, specialists on statistical dissemination, users etc. Composition of working teams is flexible, depending on the nature of problems to be solved.
For operational running, the SMS administration must be established and will be an integral part of the organisational structure of the CZSO. As things stand, the SMS administration will fulfil the following roles:
- central administration of SMS,
- administration of the subsystem CLASS,
- administration of the subsystem VAR,
- administration of the subsystem TASKS,
- administration of the subsystem QUALITY.
The set of administration roles will be increased in future in accordance with further development of the SMS.
Organisation model of SMS project management is shown in the following diagram:
Metadata management team
Activities focused on SMS subsystems are organised as follows:
1. Steering committee (SC) for development projects SIS Redesign, SMS and Public Database is headed by the First Vice-President of the CZSO. Members of the SC: management of subject-matter statistics and methodology, IT management and advisor to the President of the CZSO for the SIS and SMS.
The Steering Committee provides conceptual management of the work on the projects in the CZSO. It regularly reviews the results of the work (at least once every three months) and takes principal decisions on how the work should proceed.
2. Members of the project teams (PT) are heads of sections, selected directors and subject-matter experts of the CZSO. The top management appoints the heads of the teams.
The project teams work on design of individual SMS subsystems and cooperate with external project workers in development, testing and putting the subsystem into the CZSO practice.
3. Administrative coordination/organisation of SMS project teams is the responsibility of Task Force SMS in cooperation with the heads of project teams for individual SMS subsystems.
The work includes the preparation of regular progress reports and creation of conditions for activities of external project workers inside the Office.
4. Operational running of the SMS will be incorporated in the CZSO's organisation structure, which will meet the requirements of SMS administration.
Separate CZSO intranet web pages have been built and regularly updated to comply with the need of documentation and information sharing between project teams of SIS Redesign, SMS and Public Database. Documents approved by the project teams are available to all CZSO staff members. These web pages proved to be an important tool for dissemination of SMS information inside the Office.
Training and knowledge management
Partnerships and cooperation
Other issues
Lessons learned
The most important experiences and conclusions from our practice:
- the SMS strategy, in terms of contents and methodology, must be fully the responsibility of the statistical office,
- SMS design and implementation should be organised in multidisciplinary working teams,
- design and implementation of the SMS project must be managed and systematically monitored by the top management,
- it is necessary to adhere consistently to the SMS system principles and to maintain the positive motivation of the widest possible circle of subject-matter statisticians and methodologists; in this respect the CZSO benefited from the involvement of an external expert as a consultant to the Office President,
- time-scheduled workloads in the SMS project, the SIS Redesign project and the current activities of the Office must be consistently coordinated,
- the use of financial funds must be systematically monitored by the statistical office in relation to the stage of project implementation, on the basis of the functional specification and a qualified estimate of man-hours. It is important to use all potential sources of funding (external and internal),
- financial costs of the operational running of the SMS should be covered from the Office budget.
Abbreviations used in the text (alphabetically ordered):
BSO | Business System Options |
ETL | Extract, Transform and Load |
GA SIS | Global Architecture of Statistical Information System |
ICT | Information and Communication Technology |
LCST | Life Cycle of Statistical Task |
PDB | Public Data Base |
PSC | Project Steering Committee |
SBS | Structural Business Statistics |
SI | Statistical Information |
SIS | Statistical Information System |
SMS | Statistical Metainformation System |
SPP | Statistical Production Process |
ST | Statistical Task |
TSO | Technical System Options |