5.1 IT Architecture
Before the system was developed and implemented, CSB used a classic stovepipe approach to data processing, with all the technical incompatibilities that follow from a wide range of technology solutions being in use.
An analysis of processes, data flows, user requirements and the situation described above showed that most statistical surveys share the same main data processing steps, from survey design through to the dissemination of statistical data. A division was necessary between surveys filled in by the respondent and surveys filled in with the assistance of an interviewer. The main differences lie in the data collection methods and in the data aggregation algorithms used for data from businesses and for data from persons and households. Business respondents fill in questionnaires and either mail them to CSB or enter the data in the electronic survey system. Data from persons and households are obtained by interviewers. The structuring of statistics in the Central Statistical Bureau of Latvia is shown at a high level in Figure 3 - Statistics Structuring in CSB based on the Process Oriented data processing.
A typical high-level statistics production workflow is shown as a very simple diagram in Figure 4 - Typical statistics production high level workflow.
The corporate data warehouse of CSB is presented in Figure 7.
The theoretical basis for the system architecture was "Information systems architecture for national and international statistical offices", elaborated by Professor Bo Sundgren (Statistics Sweden), issued by UNSC and ECE, and approved by the Conference of European Statisticians as a statistical standard.
The new system contributes to harmonization and standardization and is developed as a centralized system in which all data are stored in a corporate data warehouse. The approach uses advanced IT tools to rationalize, standardize and integrate the statistical data production processes.
An important task during system design was to foresee and include the interfaces needed for data export to and import from existing standard statistical data processing packages and other generalized software available on the market, whose functionality it would have been irrational to recode and include as a system component.
The system is divided into the following business application software modules, which together cover and support all phases of statistical data processing:
- Metadata base module;
- Registers module;
- Data checking, editing and derivation module;
- Missing data imputation module;
- Web-based data collection and administration module;
- Data aggregation module;
- Output tables module;
- Data analysis module;
- Data dissemination module;
- User administration module;
- DEA module;
- Respondents response and reminder system.
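The way several of these modules hand data to one another can be illustrated with a minimal sketch. It is not the CSB implementation: the `Record` type, the `turnover` field, the validity rule (negative values are rejected) and the imputation rule (regional mean) are all hypothetical and chosen only to show checking, imputation and aggregation applied in sequence.

```python
from dataclasses import dataclass, replace
from statistics import mean
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical survey record: one respondent, one reported value."""
    respondent_id: str
    region: str
    turnover: Optional[float]  # None marks a missing value


def check_and_edit(records: List[Record]) -> List[Record]:
    """Checking/editing step (assumed rule): negative turnover is invalid
    and is reset to missing so that it can be imputed later."""
    return [
        replace(r, turnover=None)
        if r.turnover is not None and r.turnover < 0
        else r
        for r in records
    ]


def impute_missing(records: List[Record]) -> List[Record]:
    """Imputation step (assumed rule): fill a missing value with the mean
    of the values reported in the same region, if there are any."""
    reported: Dict[str, List[float]] = {}
    for r in records:
        if r.turnover is not None:
            reported.setdefault(r.region, []).append(r.turnover)
    result = []
    for r in records:
        if r.turnover is None and r.region in reported:
            r = replace(r, turnover=mean(reported[r.region]))
        result.append(r)
    return result


def aggregate(records: List[Record]) -> Dict[str, float]:
    """Aggregation step: total turnover per region, skipping missing values."""
    totals: Dict[str, float] = {}
    for r in records:
        if r.turnover is not None:
            totals[r.region] = totals.get(r.region, 0.0) + r.turnover
    return totals


def run_pipeline(records: List[Record]) -> Dict[str, float]:
    """Apply the three steps in processing order."""
    return aggregate(impute_missing(check_and_edit(records)))
```

Keeping each step as a separate function with the same input and output type mirrors the modular division described above: a step can be replaced, or delegated to an external statistical package, without changing the others.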
5.2 Metadata Management Tools
All metadata management tools are provided by IMD SDMS, through the modules that cover and support all phases of statistical data processing (see the description of modules in Section 4).
5.3 Standards and formats
The following metadata standards and file formats are used within CSB metadata systems:
1. The ADS. This system is currently under implementation. The ESS documents on quality reporting (Standard Quality Report and Standard Quality Indicators) were used as the basis for developing the structure of ADS projects.
2. IMD SDMS. Based on the guideline "Information systems architecture for national and international statistical offices, guidelines and recommendations" (United Nations, Geneva, 1999), which CSB applies to metadata production, in particular: the fundamental concepts "statistical characteristic" and "estimated statistical characteristic"; aspects of the metadata infrastructure of a statistical organization; and the strategy for the development and implementation of a metadata infrastructure for a statistical organization. Complies with ISO/IEC 11179 (Information technology - Specification and standardization of data elements), national standards on metadata, and the SDMX standard. File formats: *.px; *.xls; *.dbf; *.xml; *.html; *.doc
3. Data and Metadata Dissemination subsystem. File-structured storage. The reference metadata structure is based on SDDS; a standard template is used to prepare reference metadata within a publication table. File formats: *.px; *.xls; *.xml; *.html
5.4 Version control and revisions
Metadata systems are continuously monitored and revised by the responsible staff. There are no set rules for versioning the system; instead, a system update project may be launched when there are reasonable requirements that the system does not meet. As for version control of metadata descriptions, a questionnaire version in IMD SDMS is defined for a one-year period, so each version and its associated metadata are revised once a year.
CSB currently uses the fourth version of IMD SDMS. Compared with the first version there are significant differences: new functionality has been built and a more user-friendly interface has been provided.
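The one-year versioning of questionnaires can be pictured as a registry keyed by questionnaire and year. The sketch below is only an illustration under that assumption: the class names, the `variables` field and the `diff` helper are hypothetical and do not describe the internal data model of IMD SDMS.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class QuestionnaireVersion:
    """Hypothetical yearly version of a questionnaire and its variables."""
    questionnaire_id: str
    year: int
    variables: Tuple[str, ...]


class VersionRegistry:
    """Stores at most one version per questionnaire per year."""

    def __init__(self) -> None:
        self._versions: Dict[Tuple[str, int], QuestionnaireVersion] = {}

    def register(self, version: QuestionnaireVersion) -> None:
        key = (version.questionnaire_id, version.year)
        if key in self._versions:
            # A version is defined for a one-year period, so a second
            # registration for the same year is treated as an error here.
            raise ValueError(f"version already registered for {key}")
        self._versions[key] = version

    def get(self, questionnaire_id: str, year: int) -> QuestionnaireVersion:
        return self._versions[(questionnaire_id, year)]

    def diff(
        self, questionnaire_id: str, old_year: int, new_year: int
    ) -> Dict[str, List[str]]:
        """List variables added and removed between two yearly versions."""
        old = set(self.get(questionnaire_id, old_year).variables)
        new = set(self.get(questionnaire_id, new_year).variables)
        return {"added": sorted(new - old), "removed": sorted(old - new)}
```

A yearly review could then start from `registry.diff("Q1", 2008, 2009)` (with a hypothetical questionnaire identifier `Q1`) to see which variables changed since the previous version.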
5.5 Outsourcing versus in-house development
IMD SDMS was developed by an outsourcing company. After eight years of successful operation of IMD SDMS, we found that the system's functionality should be reasonably extended.
A project to extend IMD SDMS to cover the social statistics domain has been under way since 2009.
5.6 Sharing software components of tools
5.7 Additional materials
Technical platform and standard software used
For the first version of the system, Microsoft SQL Server was chosen to handle the system databases, in line with the CSB IT strategy and the existing computer and network infrastructure. All applications follow the client/server model, with data processing performed mostly on the server side. Microsoft Office components are used as well. Microsoft OLAP technology is used for multidimensional statistical data analysis. PC-AXIS, a software product developed by Statistics Sweden and widely used by statistical organizations in many countries, was chosen as the data dissemination tool.
The latest version of the system has been upgraded to MS Server 2003 and MS SQL Server 2005, and the applications were reprogrammed in .NET in 2008.