Modernization of official statistics

Introduction

modernstats by HLG-MOS

Official statistical producers operate in a rapidly changing landscape, where the pace of change accelerates annually. Increasingly complex societal issues demand more timely, disaggregated statistics and diverse data services. Adoption of new data sources like administrative and big data poses challenges in analysis methods, data access, ethics, and privacy. Amidst competition from other data providers, statistical organizations must enhance product communication and brand advocacy for trustworthiness. To tackle these challenges, statistical organizations must invest in modernization, staff capabilities, and technology. Despite limited resources, efficiency improvements are crucial to ensure adaptability and resilience in the dynamic data ecosystem.

The HLG-MOS provides a collaborative platform for experts in statistical organizations to develop modernization and innovation strategies and solutions in a flexible and agile way. It has made several important contributions to the modernisation of official statistics, such as the Generic Statistical Business Process Model (GSBPM) and the Generic Statistical Information Model (GSIM). The HLG-MOS projects in the areas of machine learning, synthetic data, big data and strategic communication spearheaded the implementation of new technologies, methods and other capabilities in statistical organisations.

In focus

The capabilities of AI have made a significant leap forward in the last few years with the advance of large language models (LLMs) and there is a growing recognition of the transformative potential of LLMs in the statistical community.
However, as often happens with new technology in its early stages, each organisation has only limited resources to explore the full potential of the technology on its own. This makes pooling experience and knowledge from different organisations invaluable for facilitating the adoption of the new technology.

In 2023, the international collaboration efforts of HLG-MOS on AI concentrated on large language models (LLMs), advanced AI systems that sparked substantial public interest with the release of ChatGPT in late 2022. Trained extensively on vast data sets, LLMs are capable of understanding and generating text that cannot always be easily distinguished from text produced by a human being. This exceptional capability in natural language processing tasks offers immense potential for statistical organizations to enhance the quality of their services to society: the provision of statistics and data services that are foundational for policy-makers, businesses and citizens alike.

To establish a common understanding of the potential of LLMs within the statistical community, a recently published white paper provides examples of implementation from various statistical organizations (e.g., code translation, report generation, natural language interfaces for databases), which showcase the capabilities, challenges, and risks of LLMs (e.g., privacy concerns, hallucination, ethical issues, governance, alignment with the Fundamental Principles of Official Statistics).

Highlights

Recent outputs

  • Cloud for official statistics: explores the opportunities and challenges presented by cloud technology, offering nuanced perspectives and practical guidelines drawn from diverse statistical organizations. With a focus on key themes, the publication aims to help organizations to navigate their unique cloud journey with confidence.
  • Large Language Models for Methodological Advice: summarizes the quick experiments we conducted on using LLMs for methodological advice and the lessons learned from the results. It underscores the importance of exercising caution when relying solely on insights generated by LLMs, particularly for methodological advice.
  • Data Governance Framework for Statistical Interoperability: explores the issue of interoperability in statistical organizations. It provides an analysis of the interoperability concept and of the sources of non-interoperability throughout the statistical production process, and highlights factors that help build a governance system to support interoperability in statistical organizations.
  • GSIM version 2.0: GSIM provides a set of standardized information classes that are used in the design and production of statistics. It can improve communication between business and IT experts as well as between different subject-matter domains, creating an environment for the re-use and sharing of methods, components and processes.
  • Organisational aspects of implementing ML-based data editing in statistical production: reflections on key issues, guidelines for overcoming barriers, and insights into MLOps implementation that are also transferable to accelerating ML adoption in other application areas.
  • Strategic Communication during the Inflation Crisis: following the main strategic communication elements outlined in the Strategic Communication Framework (2019) (https://lnkd.in/exbDnurm), the paper was developed based on strategies and concrete actions taken by statistical organisations.