UNITED NATIONS


STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE
CONFERENCE OF EUROPEAN STATISTICIANS
Forty-third plenary session
(Geneva, 12-15 June 1995)
CES/1995/R.12/Add.3
8 March 1995

Integration Issues for the 1990's

Submitted by Statistics New Zealand


Introduction

1. Integration is a vital issue for a national statistical office. The measures that are needed to inform business, the community, and government about vital phenomena not only identify but explain variability and the influence of economic and social phenomena. This comprehensive interest can very rarely be met by any single data source.

2. Given that official statistics are used not only to judge public policies but also to inform their development and evaluation, statistical data needs to be able to reflect contemporary issues and concerns. The information base will tend to reflect the philosophical orientation behind public policy. In New Zealand we have seen a radical change in the role of the state, for example, with a major reorientation of economic philosophy. A significant emphasis on economic efficiency must, in turn, place new demands on the nature of our statistics.

3. The demand for integration needs to be focussed on the following critical areas. Firstly, key statistical frameworks (system of national accounts, balance of payments, demographic statistics, and inter-industry) most explicitly attempt to provide comprehensive and complete measures of facets of the economy and the population. It is the completeness and comprehensiveness of these measures which necessitates a significant emphasis on integration. At a regional level, the requirements of the electoral process and the concerns of communities and local government require a matching of statistical information about populations, business, social, environmental, and other phenomena, so that the inter-relationships between population and economic change can be well understood.

4. Many public policies focus on issues relating to industry, commodity or region. These are core cross-cutting variables where we seek to explain the growth or other change in particular industries or traded items. It is the ability to match information across data collections that often allows us not only to measure a particular phenomenon but also to identify what it is associated with, and what it may cause or be caused by.
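
As a minimal sketch of such matching, assume two hypothetical collections that both code industry and region to the same standard classification. All codes, names and figures below are invented for illustration. Because the cross-cutting variables are shared, output from one collection can be related to employment from the other:

```python
# Hypothetical records from two separate collections; all codes and values are invented.
production_survey = [
    {"industry": "A01", "region": "Waikato", "output": 120.0},
    {"industry": "A01", "region": "Otago", "output": 80.0},
    {"industry": "C11", "region": "Waikato", "output": 200.0},
]
employment_survey = [
    {"industry": "A01", "region": "Waikato", "employees": 40},
    {"industry": "C11", "region": "Waikato", "employees": 50},
    {"industry": "C11", "region": "Otago", "employees": 10},
]


def match_on_keys(left, right, keys):
    """Join two lists of records on the shared classification variables."""
    index = {tuple(rec[k] for k in keys): rec for rec in right}
    matched = []
    for rec in left:
        other = index.get(tuple(rec[k] for k in keys))
        if other is not None:
            # Only cells present in both collections can be related.
            matched.append({**rec, **other})
    return matched


matched = match_on_keys(production_survey, employment_survey, ["industry", "region"])
for row in matched:
    row["output_per_employee"] = row["output"] / row["employees"]
```

Cells that appear in only one collection drop out of the match, which is one reason consistent coverage and classification across collections matters.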

5. Much public policy is about sub-populations of interest. For those sub-populations, the difficulty is specific measurement through surveys designed to measure the characteristics of individual groups. Measuring sub-populations often requires us to relate data from a plethora of surveys carried out for other purposes, or extract from surveys and censuses of the total population. It is most important, therefore, that the variables which define populations, both the explicit ones in terms of the sub-populations of interest, and the conceptual variables that define residency, location, labour force status, and other characteristics, are themselves also measured on a consistent basis.

6. Integration leads to measures of the impact of public policy, or of the long-term effects on the social position of particular sub-populations that may result from individual public programs. Similarly, the key statistical units for data collection are also very important structural units in society. Families, enterprises, iwi and economic sectors are amongst the most important. With these units we are concerned not only with the characteristics of, say, families, but also with the dynamics of change of families themselves. This information can only come about by the effective matching of data from a wide range of sources and by ensuring that each individual data collection provides a consistent basis for the measurement of these units.

7. Performance indicators may exist at a macro level, for example savings rates, or at a micro level, where we may look at floor space per worker or similar measures. In marketing and commercial activities, and the measurement of public programs, there is the ability to create a range of performance indicators and contrast them over time or between sectors. These allow us not only to assess trends and indicators but to understand how reliable those trends are. Statistical data at a macro level, with prepared economic frameworks, is often the key output of microeconomic simulation studies which seek to understand the impact of policy changes on major macro variables.

8. The greater the amount of adjustment and interpolation that is required to create key statistical measures from survey data, the less likely it is that one can readily integrate micro and macro data. This makes it difficult to perceive the impacts of simulation studies on macro measures. Similarly, the micro effects in terms of the distributional impacts of significant changes in incomes, consumption, or other major variables, cannot be measured with adequate reliability as adjustments of an iterative nature are required to make macro series consistent with each other. The greater the adjustment, the greater the loss of cohesion between macro and micro data.

9. The relevance of statistical concepts used in different data collections is a vital determinant of the cohesion between them. Some of the concepts that can most determine that cohesion are residency, location, social arrangements, ethnicity, labour force status, or industry. For each of these concepts it is possible for a data collection to give emphasis to one particular facet, in such a way that the integration of data that could otherwise be achieved is significantly reduced.

Benefits of Integration

10. The benefits of integration involve a significant increase in the scope of the statistical measures that can be produced, either by users of statistics or by Statistics New Zealand for users. Ultimately, a sound integration policy reduces the cost and increases the quality of the statistical measures that are undoubtedly going to be produced by inference, calculation, design, or plain cussedness by the consumers of statistics who have the most significant need to relate, explain, and assess the quality of the statistical data that is published. Where the official statistical office doesn't recognise that integration, the lack of coherence between its statistics must be met head-on by the users. Inevitably that lack of cohesion is seen as a significant loss in the quality and integrity of both the statistics that are being used and official statistics generally.

11. As a consequence, the benefits of integration are:

increase in scope of statistics
increase in comprehensiveness of statistics
increase in quality of statistics
more efficient statistical processes

Contributors to Integration

12. Statistical integration requires a consistent and balanced application of a wide range of principles and practices which span many areas of the statistical process. The existence of standard classifications, for example, is only part of the process of integration. The classifications themselves must reflect the characteristics of society and the economy, and they must be able to be measured across the range of quite different measurement processes that are used in official statistics, from administrative sources of statistical data through direct measurement and respondent assessment.

13. Development of integrated survey frames can provide a common base for the design of statistical estimates across a range of variables. That common base may, of course, be fundamentally flawed unless the design variables have sufficient association with the causes of variability of all the phenomena being estimated. For example, employment is stored on the New Zealand business directory for use as the main design variable. It is undoubtedly flawed when it comes to measuring change in stocks, or change in capital expenditure, as these two variables, which are both highly critical to ensuring the integration of macroeconomic statistics, may well be more fundamentally linked to other characteristics.

14. The concepts used in statistical data collections may not be able to be administered in documents used primarily for other purposes. For example, the concept of income preferred by economic statisticians may be impossible to obtain where information is collected primarily to determine a tax assessment.

15. Common processing systems may equally have their limits where, for example, the level of classification is different although the concept is very similar. Most particularly, this occurs when contrasting imports and exports, where the level of detail of exports of agricultural products from New Zealand is vastly greater than the level of detail required for the import of agricultural products into New Zealand. The use of a common classification system therefore greatly increases the cost of classification, because it provides a level of detail that is unnecessary in one part of the process.
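
One way to picture the trade-off is the following sketch, which is an illustration only and not a description of any Statistics New Zealand system. It assumes exports are coded at a detailed level, imports at a broader level, and a hypothetical concordance maps the detailed codes up to the broader ones so that the two can still be compared. All codes and values are invented:

```python
from collections import defaultdict

# Hypothetical concordance: detailed export codes -> broader codes used for imports.
CONCORDANCE = {
    "0201.10": "0201",
    "0201.20": "0201",
    "0204.10": "0204",
}


def aggregate_to_common_level(detailed_values, concordance):
    """Sum detailed-level values up to the broader level named in the concordance."""
    totals = defaultdict(float)
    for code, value in detailed_values.items():
        totals[concordance[code]] += value
    return dict(totals)


# Invented figures: exports held in detail, imports held only at the broad level.
exports_detailed = {"0201.10": 50.0, "0201.20": 30.0, "0204.10": 20.0}
imports_common = {"0201": 5.0, "0204": 1.0}

exports_common = aggregate_to_common_level(exports_detailed, CONCORDANCE)
net_trade = {
    code: exports_common[code] - imports_common.get(code, 0.0)
    for code in exports_common
}
```

Under this arrangement the detailed coding cost is incurred only where the detail is needed, while comparison takes place at the broader level both collections share.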

Bounds to Integration

16. The ability to integrate statistical data will be limited by a number of characteristics of the populations being measured. We may not be able to measure a concept with any precision. A good example would be that of stocks. Stocks may be measured at market value, replacement value, or historical cost, and there may be considerable uncertainty as to which of these is meant. The assessment of assets in New Zealand dollars may be very difficult to achieve where a portfolio of overseas assets is held, simply because of the difficulty in determining the appropriate exchange rate. The volatility and availability of statistical units for the management and operation of statistical surveys may significantly constrain the choice of survey statisticians as to the best approach to measuring particular phenomena. In some areas such as buildings, the activity unit is a very volatile one and transactions can be measured in a more stable manner through monitoring building permits. Another example may be in the balance of payments area, where it might be easier to measure the overseas owned shares in New Zealand companies by going to the share registers of major companies in New Zealand and looking at the residency of the owners of the shares, as an alternative to surveying the potential overseas companies or persons who might own shares in New Zealand. On the other hand, the loss of the overseas exchange transactions record resulted in the need to start surveying enterprises in New Zealand in order to determine their overseas transactions. The inability to track transactions has required a move back to surveying the activity unit.

17. Sometimes classifications and the enumeration elements cannot simply be separated or divided into elements that are useful. An important example of this is in the area of business statistics where we may have taken an accounting unit rather than an activity unit into a survey. This creates a higher level of volatility in the industrial classification that is used to determine the sector in which an industry operates. The variability in the quality of reporting is well known, and it may reflect full non-response where people are not prepared to participate in the survey, partial non-response, or erroneous information. Sometimes we don't know, and sometimes we make guesses. The ability of statisticians to pick up and detect new forms of activity may reduce the relevance at any time of statistical surveys. For example, the timing of the introduction of new supermarkets, or major participants in the balance of payments, into statistical surveys can significantly alter the magnitude of flows in a particular month. If these entities are detected late, then either significant revisions have to take place, or the cumulative year on year flows into the economy may well be significantly under-estimated.

18. Timing differences can occur in statistical data. Where tax data is a prime source of statistical information, it is significantly affected by the choices that people have in their tax year. Timing differences also occur because of lags in the economic processes of manufacturing, farming and forestry. Perhaps one of the most significant areas where timing differences can have a major impact is in oil exploration, where the benefits from significant exploration costs may take a long time to be offset by revenue streams.
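
A simple sketch of how a tax-year timing difference might be handled is given below. It is an illustration only: it assumes an even monthly flow within each accounting year (a strong assumption for seasonal industries), an April to March reference year chosen purely for the example, and an invented enterprise with a July to June tax year. The function names are hypothetical:

```python
def month_index(year, month):
    """Number months consecutively so that periods can be compared."""
    return year * 12 + (month - 1)


def apportion(value, period_start, period_end, ref_start, ref_end):
    """Share of `value` falling inside the reference period.

    Periods are (year, month) pairs, inclusive at both ends, and the value
    is assumed to accrue evenly over the months of its own period.
    """
    p0, p1 = month_index(*period_start), month_index(*period_end)
    r0, r1 = month_index(*ref_start), month_index(*ref_end)
    overlap = max(0, min(p1, r1) - max(p0, r0) + 1)
    return value * overlap / (p1 - p0 + 1)


# Assumed reference year: April 1994 to March 1995.
reference = ((1994, 4), (1995, 3))

# Invented enterprise reporting on a July to June tax year.
estimate = (
    apportion(1200.0, (1993, 7), (1994, 6), *reference)    # 3 of 12 months overlap
    + apportion(2400.0, (1994, 7), (1995, 6), *reference)  # 9 of 12 months overlap
)
```

The estimate for the reference year cannot be completed until the later tax year has been reported, which is itself a source of lag and revision.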

19. Many of our major statistics, such as the current account balance, are actually based on the measurement of differences between very large flows or aggregates. Quite minor errors in any one flow can have a significant impact on these critical statistical differences. Integration is therefore most beneficial. Sampling and non-sampling errors have a major impact on the bounds to integration, and aggregation bias, which is perhaps most important in the area of deflation, is also a significant concern.
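
The arithmetic behind this point can be sketched with invented figures (they are not New Zealand data). Assuming imports are measured exactly and only exports carry an error, a 1 per cent error in one flow is as large as the whole balance:

```python
def balance_with_error(exports, imports, relative_error_in_exports):
    """Return the balance, the absolute error carried into it, and that error
    as a proportion of the balance, when only exports are mismeasured."""
    balance = exports - imports
    absolute_error = exports * relative_error_in_exports
    return balance, absolute_error, absolute_error / abs(balance)


# Invented flows: the difference is small relative to either aggregate.
balance, absolute_error, error_relative_to_balance = balance_with_error(
    exports=20000.0, imports=19800.0, relative_error_in_exports=0.01
)
```

Here the balance is 200 and the error carried into it is also about 200, so a 1 per cent error in one flow becomes an error of roughly 100 per cent in the difference.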

Constraints on Integration

20. Even if we achieve the level of integration that we wish in our statistical processes, there are other things that we do in the organisation that can reduce our capacity to gain the benefits of the integration. The first must be the availability of statistical data. If we cannot get our surveys completed in time or they are cancelled, or they do not exist, then there will not be data available. We may well have loosely specified design objectives, or there may be an unclear analytical framework for the creation of statistical aggregates. The classification tools that we need may just not exist, or may not have been updated. Perhaps most importantly, if we have not organised our data, and we do not have the capability of presenting and managing it, then we will not gain the benefits of data integration. To this end, Statistics New Zealand has begun a major investment in systems for data management.

Strategies for the Future

21. What then determines the priorities for a national statistical office that wishes to significantly increase the benefits to its users, and where are those benefits obtained in the organisation? Clearly the first benefit is increased confidence in public statistics, and a greater use of statistical data for a wider range of purposes. This can only increase the value of the organisation and recognise the considerable benefits in a centralised statistical process, as opposed to a dispersed one. It must also significantly increase the revenue earning capability of the organisation. To achieve this, it is my view that Statistics New Zealand needs to take the following actions:

a) Data Management Initiative

Statistics New Zealand is developing systems for organising and referencing all of its statistical data holdings, and enabling the low cost implementation of standard classifications and standard descriptions for all future surveys.

b) Respondent Management Systems

Statistics New Zealand is creating a respondent management system which recognises the need to survey transactions by statistical units such as activity units, accounting units or enterprises, or individuals, families and households. That respondent management system would recognise the relationships between these statistical units and their most critical characteristics for survey taking. It would also link to statistical data holdings and the survey management registers for the panel surveys that we operate. When redesigned, the Business Directory will be able to draw on both survey data and that from tax and other administrative records.

c) Generalisable Survey Systems

The move to a generalised set of survey processing tools, using BLAISE as a core element, is a key priority. A critical element of this will be a commitment to data quality management, for example graphical editing.

d) Frameworks

We need a clearer specification of the data requirements of statistical frameworks and the development, for example through SNA, balance of payments, and inter-industry, of an economic statistics strategy. Demographic statistics will be part of a wider social statistics strategy which would identify the critical elements most important for us to focus on in the integration of statistical data.

e) Regional Statistics

We will develop a plan to meet user needs, so we can get a clearer understanding of the prime focus for regional statistics for the next decade.

f) Marketing

The Marketing Group in the department needs to identify the core cross-cutting variables in important revenue earning products in the department. This sort of feedback ought to be incorporated into regular business plans and taken into account in the survey operations and classifications systems of the department.

g) Classifications and Frameworks Group

This group was set up to bring together all the classification development work in the department, and it should play a critical role in the provision of consistent systems, definitions and descriptions across the organisation.

h) Common Collection Devices

Common collection devices need to be adopted, through the development of consistent organisational standards for data collection and for questionnaire design and testing, to ensure that significant unplanned changes in statistical processes do not arise.

Conclusion

22. The integration of statistics is one of the key driving forces behind a centralised national statistical office. Without integration, and without a high level of public competence in statistical survey taking by the organisation, the benefits to the state of centralised official statistics need to be questioned.

23. I believe we need to ensure that in New Zealand these benefits are seen, recognised and captured in the form of reliable, timely official statistics, and a valued responsive commercial information service. If they are not, then the revenue base and public support for Statistics New Zealand must diminish.