9. In most countries, official statistics are collected not just for governments but for the use of the community. This is particularly the case in democracies where official statistics can be used to assess the effectiveness of governments' policies and programmes - they provide a mirror on society.
10. To quote a 1993 White Paper on Open Government in the United Kingdom:
"Official statistics are collected by government to inform debate, decision making and research both within government and by the wider community.
"They provide an objective perspective of the changes taking place in national life and allow comparisons between periods of time and geographical areas.
"Open access to official statistics provides the citizen with more than a picture of society. It offers a window on the work and performance of government itself, showing the scale of government activity in every area of public policy and allowing the impact of public policies and actions to be assessed."
11. The research community plays a particularly important role in stimulating policy analysis and debate and in assessing the effectiveness of government programmes. Researchers need access to good-quality statistical data if their analyses are to be effective. If they do not have access to relevant official statistical data, they will often seek to collect their own. As well as imposing additional costs on both the data collector and the respondent, these collections will often be of lower quality than official collections.
12. Providing researcher access to microdata can also be a way of extracting additional value from the cost of collecting official statistics, and of obtaining valuable insights into the quality of the data and how statistical surveys might be improved or extended.
13. What is the research community? It includes those working in academic institutions, of course. It also includes researchers working in non-government organizations and international agencies. Furthermore, some researchers requiring access to microdata will work within government-funded agencies and institutions. For the purposes of these Guidelines, all of these researchers are regarded as part of the "research community". However, as will be seen from the Guidelines, the pertinent issues may vary somewhat between the different elements of the research community.
14. The following sections bring together the perspectives of national statistical offices (NSOs) and the research community, with the aim of finding arrangements that largely satisfy the needs of both groups. These are considered in more detail in Chapter 6.
The perspective of the National Statistical Office
15. NSOs must maintain the trust of respondents if respondents are to continue to cooperate in data collections. Confidentiality protection is the key element of that trust. If respondents believe or perceive that an NSO will not protect the confidentiality of their data, they are less likely to cooperate or provide accurate data. A single incident, particularly if it receives strong media attention, could have a significant impact on respondent cooperation and therefore on the quality of official statistics.
16. This is the dominant issue from the point of view of NSOs but there are other concerns. A key one is whether they have sufficient authority to support researcher access to microdata, either through a legal mandate or some other form of authorisation.
17. Some NSOs are concerned that the quality of their microdata may not be good enough for further dissemination. Whilst the data may be sufficiently accurate to support aggregate statistics, this may not be the case for very detailed analysis. In some cases, adjustments are made to aggregate statistics at the output editing stage without amendment to the microdata. Consequently, there may be inconsistencies between research results based on microdata and published aggregate data.
18. NSOs may also be concerned about costs. These include not only the costs of creating and documenting microdata files, but also the costs of creating access tools and safeguards, and of supporting and authorising enquiries made by the research community; new users of data files need help to navigate complex file structures and variable definitions. Although NSOs bear these costs, they are usually not given additional budget for the work. On the whole, researchers do not have the funding to contribute substantially to these costs.
19. On the other hand, NSOs are increasingly recognising the importance of supporting the research community, and the additional value that effective use of their data for research adds to their data collection and processing effort. Specifically, it is in the public interest that the insights the data can provide are made available to decision makers and the public. Furthermore, more extensive use of survey data in this way can provide an extra level of protection against budget reductions to the statistical programmes concerned.
The perspective of the research community
20. From the perspective of the research community, supporting research based on microdata should be an important component of any official statistical system. The benefits include the following:
- (i) microdata permit policy makers to pose and analyse complex questions. In economics, for example, analysis of aggregate statistics does not give a sufficiently accurate view of the functioning of the economy to allow analysis of the components of productivity growth;
- (ii) access to microdata permits analysts to calculate marginal rather than just average effects. For example, microdata enable analysts to do multivariate regressions whereby the marginal impact of specific variables can be isolated;
- (iii) broadly speaking, widely available access to microdata enables replication of important research;
- (iv) access to microdata for research purposes, and the resulting feedback, can facilitate improvements in data quality. For example, the US Bureau of the Census has formalised the documentation it requires from researchers to assist it in improving the quality of its surveys;
- (v) it increases the range of outputs derived from statistical collections and hence the overall value for money obtained from these collections.
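The distinction drawn in item (ii) between average and marginal effects can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records are synthetic, the variable names (education, experience, wage) and the wage rule are hypothetical, and the least-squares routine is a bare-bones version written only to keep the example self-contained. It is not drawn from any NSO data set or method.

```python
# Illustrative sketch only: synthetic records and hypothetical variable names.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    design = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def mean(values):
    return sum(values) / len(values)


# Synthetic unit records: (years of education, years of experience).
records = [(10, 2), (10, 6), (12, 4), (12, 10),
           (14, 8), (14, 14), (16, 12), (16, 20)]
# Assumed rule: each year of education adds 2.0 and each year of experience 0.5.
wage = [5.0 + 2.0 * educ + 0.5 * exper for educ, exper in records]

# Aggregate view: compare published-style group means only.
low = [(r, w) for r, w in zip(records, wage) if r[0] <= 12]
high = [(r, w) for r, w in zip(records, wage) if r[0] > 12]
aggregate_slope = (
    (mean([w for _, w in high]) - mean([w for _, w in low]))
    / (mean([r[0] for r, _ in high]) - mean([r[0] for r, _ in low]))
)

# Microdata view: multivariate regression isolates each variable's marginal effect.
intercept, educ_effect, exper_effect = ols(records, wage)

print("wage difference per year of education from group means:", aggregate_slope)
print("marginal effect of education from regression:", educ_effect)
```

In this synthetic example the comparison of group means attributes about 3.0 wage units to each year of education, because the more educated group also has more experience. The regression on the unit records recovers the assumed marginal effect of about 2.0. A researcher with access only to the group means could not separate the two influences.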
21. Furthermore, lack of access to microdata may result in researchers developing and conducting their own statistical collections, adding to the reporting burden imposed on the community. As well as the cost involved (to the collector as well as the respondents), the collections will usually be of inferior quality and have smaller samples than official surveys. This will lead to lower-quality research results. There are benefits from having an accepted, authoritative and high-quality data source for all analysis, compared with the alternative of researchers using different data sets to analyse particular topics. NSOs can play a very useful role in this respect.
22. Researchers point out that they are not interested in identifying individuals, and the evidence indicates that this is indeed the case. Given this, they feel that NSOs have generally been too conservative in the access they provide to microdata.
23. At a 2003 Workshop on Confidentiality Research hosted by the United States National Science Foundation, Peter Madsen referred to the Privacy Paradox. He argued that "the rush to ensure complete levels of privacy in the research context paradoxically results in less social benefit, rather than in more". He suggested that including the concept of utility may lead to different outcomes:
"Perhaps through this additional concept of utility, people will recognise that while they surely have the right to privacy, they may also come to the realisation that they have a duty to share information, if the common good is to be furthered."
Some use the term "privacy deficit", recognising that there are privacy issues associated with microdata release. The discussion can then focus on whether the benefits of a proposal outweigh any privacy deficit.
24. The research community also sees the importance of research into improved methods of confidentiality protection that increase the usefulness of the underlying data. NSOs would agree with the importance of this research. However, this research is only likely to lead to a partial answer to the desire for improved access to microdata for research purposes and researchers would remain frustrated if NSOs relied solely on improved statistical methods for confidentiality protection.