Big Data and machine learning can help achieve the SDGs
The amount of digital information in the world is growing exponentially, doubling approximately every two years. With the Internet of Things, this will move even faster. In 2012, there were fewer than 9 billion devices connected to the internet. In 2020, some say this number will be more than 50 billion. In 2025, McKinsey expects this number to have increased to 1 trillion. All these devices will be connected, and all will be sending data to the cloud. This is mind-boggling. We need to use the power of this data tsunami to improve our understanding of the world we live in. This is why we need Big Data for official statistics.
The use of Big Data will be an important part of the data landscape as we move forward in the measurement of progress towards the Sustainable Development Goals (SDGs). Big Data will change the very way in which we produce data and develop statistics. We will see multi-source statistics where Big Data will play a significantly larger role – combined with data from statistical surveys and censuses, administrative registers and geospatial information systems. This can improve the quality of our data.
It can also pave the way for near-real-time data. In 2010, when world leaders were assessing the progress towards the Millennium Development Goals, most data was from 2005. This is not satisfactory. Hopefully, with multi-source statistics and Big Data, in 2030, when we meet to assess progress towards the SDGs, we will have data that is less than one year old, perhaps even data from the day before.
Moreover, there are huge possibilities in combining Big Data with impressive computer power and with machine learning, a kind of artificial intelligence. This can ensure better predictions and policies and can help to relieve the strain on scarce public budgets. Estimates in my own country, Denmark, show that one third of all health costs are due to 1% of the total population. If these citizens can be identified earlier, more targeted and efficient health interventions can be implemented to the benefit of both citizens and the government. There are many examples. The Economist recently wrote about how food reviews by customers can help identify dirty restaurants for inspection, how a district in London is developing an algorithm to help predict who might be homeless, how poor farmers in Africa can be guided on the best time to sow their seeds, and how Microsoft in India is helping schools to predict who may drop out.
This combination of Big Data and machine learning is promising but also somewhat scary. There are risks regarding privacy and confidentiality. The algorithms can lead to wrong decisions. They can wrongly label and stigmatize citizens and divide them into groups where they do not belong. Some argue that we will become less innovative because Big Data and algorithms will point us towards what we did yesterday and not help us to find solutions for the future. This is why we need common rules, high-quality Big Data, strong institutions and international cooperation. And we need transparency and access to information. I hope more countries will join our Aarhus Convention or will develop their own conventions and policies to improve access to information and to strengthen initiatives for open data and transparency.
We must do our part in the UN. The UN work on Big Data was initiated in UNECE under the auspices of the Conference of European Statisticians in 2012. UNECE is leading the way on modernizing statistical production and studying how best to integrate data from multiple sources, including statistical surveys, administrative records, geospatial information, and, of course, Big Data.
As the phenomena we are trying to understand become more complex, we have to become better at interpreting all relevant data to discover key trends. We also have to build new partnerships: with academia; with other authorities, from schools and hospitals to public utilities; and, not least, with the private sector, from electricity suppliers to Google, Amazon and Facebook. And finally, we need partnerships with citizens and communities in each and every country. Citizens will generate the data. Just as we see crowd-sourcing and crowd-funding, we need crowd-data.
We manage what we measure. More data and better data can help us change the world. It was only when data on maternal and child mortality improved that we managed to intervene more forcefully and reduce the numbers. Remember the power of the Human Development Reports and how new data brought political progress? Good data and good statistics can change the world. And Big Data transformed into good statistics and knowledge can change the world even faster.
If you wonder why the animal accompanying my blog is a cow, it is because cows are fascinating animals; because the cows of our host country, Switzerland, are famous for their quality; and because I am still a farmer and miss the cows I had in Denmark. Now I have one back.