Visualising similarity

Posted on Tuesday 5 August 2014 by roblevy

A model of the global economy is, by its very nature, an unwieldy object to work with. There are 40 countries (we want more; that’s coming next) and the economy of each country is described by the economic activity of 35 sectors.

Each sector in each country interacts with each other sector in each other country creating close to two million interactions.
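For anyone who wants to check that figure, here’s a minimal sketch of the arithmetic. It assumes every country-sector pair can, in principle, trade with every country-sector pair (itself included); the variable names are mine, not the model’s.

```python
# Dimensions quoted in the post.
n_countries = 40
n_sectors = 35

# Each (country, sector) pair is one "node" in the model.
n_nodes = n_countries * n_sectors

# Assumption: any node can trade with any node (itself included),
# so the full flow table has n_nodes x n_nodes entries.
n_flows = n_nodes ** 2

print(n_nodes)  # 1400
print(n_flows)  # 1960000
```

If you leave out each sector’s flow to itself, you get 1400 × 1399 = 1,958,600 instead, which is still close to two million.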

This is great for wowing potential users of the model with the sheer scale of the thing, but it makes life pretty hard if you want to ask a question like “what effect has a certain change had on… well, everything?”

This is hard because “everything” here encompasses two million numbers, some of which will have gone up and others of which will have gone down.

If you don’t put any effort into visualisation, the output of the model looks absolutely horrible:

Needless to say, picking interesting information out of such a mass of numbers involves some careful thought. (For the interested, what you’re seeing here is dollar-valued commodity flows between sectors within the Australian economy, the sectors being numbered 1 to 35.)

The paper I’m writing at the moment asks an even trickier question than “what’s going on?”. I’m trying to work out how our model compares with other, more standard, ways of doing this kind of thing. This means making the same change in two models and comparing the results.

One way to boil down lots of information into a far smaller number of ‘things’ is to rank the numbers you’re analysing. This just means putting the numbers into order then saying which number is biggest, which is second-biggest etc.

So in our case, if we make a change to the global economy, instead of looking at a horrifying table of numbers we can just say “Australia was the country most affected by the change. Netherlands was second, Spain tenth, Bulgaria 39th…” and so on.
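Ranking is simple enough to sketch in a few lines. The impact numbers below are invented purely for illustration (they are not model output), and ranking by the absolute size of the change is my assumption about what “most affected” means.

```python
# Hypothetical impacts in arbitrary units, invented for illustration only.
impact = {
    "Australia": -9.1,
    "Netherlands": -7.4,
    "Spain": -2.0,
    "Bulgaria": -0.1,
}


def rank_by_impact(impact):
    """Return {country: rank}, where rank 1 is the largest absolute change."""
    # Assumption: "most affected" means largest change in either direction.
    ordered = sorted(impact, key=lambda c: abs(impact[c]), reverse=True)
    return {country: position for position, country in enumerate(ordered, start=1)}


ranks = rank_by_impact(impact)
print(ranks)
# {'Australia': 1, 'Netherlands': 2, 'Spain': 3, 'Bulgaria': 4}
```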

The advantage of this approach is that, when comparing the results of two models, you can just compare the ranks of the countries and see if they’re similar. If they are, you might be justified in concluding that the models are doing more-or-less the same thing.
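One common way to put a number on “are the ranks similar?” is Spearman’s rank correlation. That is my choice for this sketch, not necessarily the measure used in the paper, and the ranks below are made up. The shortcut formula used here assumes there are no tied ranks.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation between two {country: rank} dicts.

    Uses the shortcut formula 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    which is only valid when neither ranking contains ties.
    """
    countries = sorted(ranks_a)
    if sorted(ranks_b) != countries:
        raise ValueError("both rankings must cover the same countries")
    n = len(countries)
    if n < 2:
        raise ValueError("need at least two countries to compare")
    sum_d2 = sum((ranks_a[c] - ranks_b[c]) ** 2 for c in countries)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks from two models, invented for illustration only.
model_one = {"Australia": 1, "Netherlands": 2, "Spain": 3, "Bulgaria": 4}
model_two = {"Australia": 1, "Netherlands": 3, "Spain": 2, "Bulgaria": 4}

rho = spearman_rho(model_one, model_two)
print(round(rho, 3))  # 0.8
```

A value of 1 means the two models rank the countries identically, and -1 means one ranking is the exact reverse of the other.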

It also allows for some nice visualisation. If we write down all the countries in one column in the order of their rank (most-affected by some change we’ve made, to least-affected) using one model, and make a second column where the countries are ordered according to their rank using the other model, we can quickly see where the differences are, particularly if we draw nice lines between the countries to show how their position has changed.
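The two-column layout described above can be sketched using nothing but the standard library, by writing out a small SVG. This is an illustration of the idea rather than the code behind the real chart; the function name, the layout numbers and the country orderings are all my own inventions.

```python
from xml.sax.saxutils import escape


def slope_chart_svg(left_order, right_order, row_height=24, width=360):
    """Return an SVG string: two ranked columns of labels joined by lines."""
    if sorted(left_order) != sorted(right_order):
        raise ValueError("both columns must list the same countries")

    # Row index of each country in the right-hand column.
    right_row = {country: row for row, country in enumerate(right_order)}

    height = row_height * (len(left_order) + 1)
    x_left, x_right = 110, width - 110
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    ]
    for row, country in enumerate(left_order):
        y_left = row_height * (row + 1)
        y_right = row_height * (right_row[country] + 1)
        label = escape(country)
        # Label in the left column, right-aligned against the line start.
        parts.append(
            f'<text x="{x_left - 8}" y="{y_left + 4}" text-anchor="end">{label}</text>'
        )
        # Label in the right column.
        parts.append(f'<text x="{x_right + 8}" y="{y_right + 4}">{label}</text>')
        # Line joining the country's two positions.
        parts.append(
            f'<line x1="{x_left}" y1="{y_left}" x2="{x_right}" y2="{y_right}" stroke="grey"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Hypothetical orderings (most affected first), invented for illustration.
left = ["Australia", "Netherlands", "Spain", "Bulgaria"]
right = ["Netherlands", "Australia", "Spain", "Bulgaria"]

svg = slope_chart_svg(left, right)
```

Saving `svg` to a file ending in `.svg` and opening it in a browser should show the two columns. Flat lines mean the models agree on a country’s rank, and steep ones mark a disagreement.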

Here’s the outcome of such an experiment:

The design for this visualisation was inspired by a similar thing in the work of Hidalgo and Hausmann, see here on p4!

The chart shows the effect on the global economy in 2010 of reducing demand for Chinese vehicles by $1M. The left-hand column shows the results using a traditional model (for the interested: it’s called a Multi-Region Input-Output model, or MRIO). The most-affected countries are at the top and the least-affected at the bottom. The right-hand column is the same but for our model.

With the exception of Slovakia, the results look pretty good: the ranks are broadly similar, which is encouraging. We’re currently trying to find out what’s going on with Slovakia, and I’ll post here if we ever find out!

(Note that Taiwan is not in our model, because the UN doesn’t report trade data for it, as it deems it to be a part of China. I won’t be delving into this international controversy here!)
