The ECO Data Hub: new tools for access and interpretation

The Economics Observatory’s upgraded Data Hub addresses long-standing problems with accessing and interpreting public data. We’ve built a system that is open, verifiable and automated, allowing users to draw data directly from official sources and display the numbers with a simple and flexible tool.

Today, we are launching the first version of our data-sharing tool: the Economics Observatory (ECO) Data Hub.

Too many publicly funded datasets sit in complex silos that can only be navigated by those with economics or programming degrees. This has led to data work that can be repetitious, costly and prone to error. It also opens the door to manipulation and obfuscation. Our new data tool is open, verifiable and automated, helping with all these challenges.

A yawning gap

There has never been a time in history when data have been more important. Important to policy-making, where government decisions are taken on the basis of economic data. Important to businesses, where data analytics are at the core of strategy decisions. And important to the public, where households are charged with ever more responsibility for their long-term finances, as they manage their bank accounts and pensions via websites or phone apps.

The Observatory data team believe that data can help make the world a better place, and that the best data – and tools to investigate them – should be universally available. But we also know that data need to be handled with care. Our new tool does all this – try it out here. You can also find a user guide here.

Even in 2023, decades after the creation of the internet, access to and use of data remain a major challenge. There is still no single hub for public data from around the world. And there is no widely accepted standard for collecting or sharing data: some favour spreadsheets; some like CSVs. Within policy-making bodies, news outlets and firms alike, there is often a host of players, files, formats and storage technologies that make up a data chain. The current data system is:

Opaque. Publishing data as a picture is opaque. It is impossible to interrogate the provenance of the data. It can also be hard to spot trickery by visualisers who may be adept at shifting x and y axes in order to show their intended message.
Error prone. Each link in the data chain can, and does, break. Each step adds to the chance of human error.
Wasteful. There are many players, each adding delay or cost. The same analysis is also repeated time after time. This is slow, expensive and not a good use of analytical resources.
Slow. The use of different file types and formats across private and public sector organisations (and even within them) leads to compatibility problems. This generates delays and costs.
Carbon-intensive. The siloed nature of data efforts means that institutions end up housing large quantities of other institutions’ data. (Consider how many data from the Office for National Statistics, ONS, are stored on machines in UK civil service departments, for example.) This creates a data storage requirement with associated resource and environmental costs.

The ECO team wants to play a role in moving the UK in the opposite direction on all five of these measures. That is, the goal should be a data system that is:

Transparent
Accurate
Efficient
Fast
Resource-light

We’ve built the very first version to show how it can work and to gain support for our ideas. In tech-speak it’s the ‘Minimum Viable Product’ – the most basic version we could make to show potential partners our ideas.

Three steps to a beautiful chart

The tool is based on the typical path that anyone building a chart tends to follow. First, find the data; second, make the chart; and third, use it in a presentation or talk. We think huge improvements can be made in all these areas.

Step 1: Explore – an API of APIs

All data should be accessed ‘programmatically’. This is data-science lingo for grabbing the number you want using a line of code, rather than clicking around on a website. Getting your data via a line of code might take a little more time the very first time you do it, but after you’ve done it once, you never look back.

Thankfully, most data sources can be accessed this way, using an Application Programming Interface (API). The first contribution of our Hub is an easy-to-use API that will send data direct to your computer. The API takes the form https://api.econ.ac/{Country}/{Variable}. For example, the API for UK unemployment would be: https://api.econ.ac/gbr/unem
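
The URL pattern above can be sketched in a few lines of Python. This is a minimal illustration rather than official client code: the function names eco_url and fetch_raw are hypothetical, lowercasing the codes is an assumption based on the example URL, and because the response format is not described here, the sketch returns the raw bytes without parsing them.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://api.econ.ac"  # base address taken from the examples in the text


def eco_url(country: str, variable: str) -> str:
    """Build a query URL of the form {base}/{country}/{variable}."""
    # Assumption: codes are lowercase, as in https://api.econ.ac/gbr/unem
    parts = [quote(part.strip().lower(), safe="") for part in (country, variable)]
    if not all(parts):
        raise ValueError("country and variable must be non-empty")
    return "/".join([BASE_URL, *parts])


def fetch_raw(country: str, variable: str, timeout: float = 10.0) -> bytes:
    """Download the raw response body; parsing is left to the caller."""
    with urlopen(eco_url(country, variable), timeout=timeout) as response:
        return response.read()


print(eco_url("GBR", "unem"))
```

Keeping the URL builder separate from the download step means the address can be checked, logged or shared without touching the network.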

You don’t need to know three-letter country ISO codes or our series names by heart, since drop-down boxes will allow users to generate their own API queries.

If countries have APIs, what is the point of ours?

There are two problems in the world of APIs: consistency and compatibility.

Take consistency first. The problem is that countries have set up their APIs so that they all look very different. My bet is that, from our code above, you could guess how to get hold of data for the United States… yes, you just change the ‘gbr’ to ‘usa’. But let’s look at how three APIs – the UK, United States and Canada – work if you want to get hold of time series unemployment data. You’d need these three API calls:

This is the kind of thing that sends people back to the old, inefficient and unsustainable way of accessing data. Using our API, a user would need the following:

UK: https://api.econ.ac/gbr/unem
United States: https://api.econ.ac/usa/unem
Canada: https://api.econ.ac/can/unem
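
Because the three addresses differ only in the country code, they can be generated in a loop. The snippet below is a small sketch of that idea; the dictionary and variable names are illustrative, and only the URLs themselves come from the list above.

```python
# Country names mapped to the three-letter codes used in the URLs above.
COUNTRIES = {"UK": "gbr", "United States": "usa", "Canada": "can"}
VARIABLE = "unem"  # the unemployment series name used in the examples

# One consistent pattern covers every country.
urls = {
    name: f"https://api.econ.ac/{code}/{VARIABLE}"
    for name, code in COUNTRIES.items()
}

for name, url in urls.items():
    print(f"{name}: {url}")
```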

The second problem is a bit more technical: in short, countries’ own APIs often won’t do what you ask them to. The reason for this is a tricky problem known as Cross-Origin Resource Sharing (CORS). For an explanation of the problem, read here. Our API irons out the problem. Full details will be set out in our forthcoming API documentation by our Data Editor Dénes Csala.
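
For readers who want a feel for the CORS issue: a browser will only let a script on one website read a response from another site if that response carries a suitable Access-Control-Allow-Origin header. The sketch below shows one generic way an intermediary could add that header when relaying a response. It is an assumption for illustration only, not a description of how the Hub actually handles CORS, and the function name add_cors_header is hypothetical.

```python
from typing import Dict, Mapping

CORS_HEADER = "Access-Control-Allow-Origin"


def add_cors_header(
    upstream_headers: Mapping[str, str], allow_origin: str = "*"
) -> Dict[str, str]:
    """Return a copy of the headers with a CORS allow-origin header set."""
    # Drop any existing allow-origin header (header names are case-insensitive).
    relayed = {
        name: value
        for name, value in upstream_headers.items()
        if name.lower() != CORS_HEADER.lower()
    }
    # "*" lets pages on any origin read the response in the browser.
    relayed[CORS_HEADER] = allow_origin
    return relayed


# Illustrative upstream headers, not taken from any real statistics API.
headers = add_cors_header({"Content-Type": "application/json"})
print(headers)
```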

Don’t the OECD and the International Monetary Fund do this already?

No, they do something different. Think of them as best-in-class stores or tanks of data: they handle your query and send your data back themselves. Our system is an intermediary or matchmaker: what we are doing is sending your query back to the original publisher of the data.

For a visual metaphor, think of us as constructing a secure pipe from your computer back to the ONS in sunny Newport or Stats Canada in snowy Ottawa. Our system is the hub, and the data providers are the spokes. If you are interested in distributed systems like this, I recommend reading about Estonia’s X-Road.
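
The hub-and-spoke idea can be sketched as a simple lookup: the hub reads the country code in a query and works out which original publisher should answer it. Everything here is a hypothetical illustration. The SPOKES table and route function are invented names, only the two providers mentioned above are listed, and the real routing logic is not described in this article.

```python
from typing import Dict, Tuple

# Spokes: the original publishers named in the text, keyed by country code.
SPOKES: Dict[str, str] = {
    "gbr": "Office for National Statistics (Newport)",
    "can": "Statistics Canada (Ottawa)",
}


def route(path: str) -> Tuple[str, str]:
    """Map a hub path such as '/gbr/unem' to (publisher, variable)."""
    parts = [part for part in path.strip("/").split("/") if part]
    if len(parts) != 2:
        raise ValueError("expected a path of the form /{country}/{variable}")
    country, variable = (part.lower() for part in parts)
    if country not in SPOKES:
        raise LookupError(f"no provider registered for {country!r}")
    # The hub stores no data itself; it only decides where the query goes.
    return SPOKES[country], variable


print(route("/gbr/unem"))
```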

Step 2: Build – visualisations as data

Once you have your desired data in hand, the next thing to do is build something with it. The problem here is that there can be a steep learning curve when moving from point-and-click solutions like Excel to coded visualisations.

The latter though – charts written as a piece of code – are superior in every way. So, the second part of our tool helps to bridge the gap. The user simply points and clicks on drop-down boxes and their chart will appear. So far, you may reasonably say, it’s like Excel, only much, much worse.

But there are two big differences here. First, our tool encourages the user to click on ‘code’ to see the code that is underlying their chart. This helps to build a change in mindset: visualisations are themselves data. And data are what everyone loves – because they can be shared, copied, verified and reused.

Second, that data – the code that draws your visualisation – can be used in your own website, and that includes the pipe that we have created back to the original provider. This means, for example, that your chart of UK inflation will update automatically as new data come down the pipe.
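
To make the idea of a chart as data concrete, here is a minimal sketch of a chart written as a plain, shareable specification whose data source is a URL rather than a pasted table. It borrows the Vega-Lite grammar purely as an illustration, since this article does not say which charting grammar the Hub uses. The field names date and value are assumptions about the shape of the returned data, the function name chart_spec is hypothetical, and the unemployment series stands in for any other variable.

```python
import json


def chart_spec(country: str, variable: str, title: str) -> dict:
    """Describe a line chart as data, pointing at a live URL for its numbers."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "title": title,
        # The chart stores only the address, so it redraws with fresh data.
        "data": {"url": f"https://api.econ.ac/{country}/{variable}"},
        "mark": "line",
        "encoding": {
            "x": {"field": "date", "type": "temporal"},  # assumed field name
            "y": {"field": "value", "type": "quantitative"},  # assumed field name
        },
    }


spec = chart_spec("gbr", "unem", "UK unemployment")
print(json.dumps(spec, indent=2))
```

Because the specification is ordinary JSON, it can be copied, compared and checked like any other dataset.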

Our aim is that people with no coding knowledge should be able to integrate self-updating visualisations into their own websites by the end of the year. We are working on embedding guides for WordPress, Squarespace and Wix.

Step 3: Share – a new data community

The final step is to build a simple space where researchers, policy economists and the public can post datasets, data questions or their findings. Users can create profiles, saving the charts that they have made, and reducing repetition. (I guess the number of charts of UK GDP or inflation must be in the thousands.) If they like the chart, they can post it to a public stream, which is a little like Twitter or Instagram, but for data visualisations.

Coming soon: 20 countries and a new blog

Our plan is to test and improve the tool over the summer, before integrating it into our research, policy work and teaching in September. We would love it if you could create an account and post a chart, telling us what you like and don’t like. We’re also keen to hear about additional functionality that you think the Data Hub needs, especially if you are planning on using the tool for teaching or research. We have already started work to expand the tool so that we cover the 20 largest economies in the world by the end of June. A new daily blog that uses the tool will launch on 3 July.

Author: Richard Davies, London School of Economics & University of Bristol

Picture by Tony Studio on iStock