Six Degrees of Twitter

Posted on Friday 20 July 2012 by Oliver O’Brien

This is my Twitter social graph. Click on the graphic to see a larger version.

Key

The font sizes for the names correspond to the number of followers, while the colour ramp (light grey to yellow to blue) is proportional to the number of listings per follower. That is, someone who has a small number of followers, but has been listed by many of those people (and others), will appear bright blue. This is designed to be a very simple measure of value and influence – you can have a small number of followers, but if many of those have considered you to be an authority in a subject (and are themselves switched on enough to know about Twitter listing) then you can be considered to be a more influential Twitterer. I bet most of the “celebrity” accounts will therefore score poorly here, while experts will be picked out. Bad luck BTTowerLondon.
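As a rough illustration of the colouring, here is a minimal Python sketch of a listings-per-follower score mapped onto a grey to yellow to blue ramp. The RGB values, the even split of the ramp at its midpoint and the function names are all assumptions made for the example, not the settings actually used in the graphic.

```python
# Hypothetical ramp colours (RGB); the real graphic's values may differ.
LIGHT_GREY = (211, 211, 211)
YELLOW = (255, 255, 0)
BLUE = (0, 0, 255)


def listings_per_follower(followers, listings):
    """Score used for the colour: listings divided by followers."""
    # Accounts with no followers are given a score of zero (an assumption).
    return listings / followers if followers else 0.0


def lerp(start, end, t):
    """Linearly interpolate between two RGB colours, t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def ramp_colour(score, max_score):
    """Map a score onto the light grey -> yellow -> blue ramp."""
    # Normalise against the highest score in the network, clamped to 1.
    t = min(score / max_score, 1.0) if max_score > 0 else 0.0
    if t < 0.5:
        return lerp(LIGHT_GREY, YELLOW, t * 2)
    return lerp(YELLOW, BLUE, (t - 0.5) * 2)
```

With this scheme an account with many followers but few listings stays close to grey, while a small but frequently listed account moves towards blue.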

How this Compares to other Social Graphs

To make the graph, I have taken the subset of people who both follow me and whom I follow back. I’ve then looked at connections between these people. Doing this in Twitter is a similar idea to what has been done in Facebook and LinkedIn before except that:

The groups that appear will be quite different from those that appear in Facebook. Facebook is a social network for friends, whereas Twitter is more of a social network for interests.
Twitter’s connections are asymmetric (you can follow people who won’t follow you back, and vice versa) which means you have to think about exactly what you are mapping.
It’s much more of a fiddle in Twitter because you have to query each person’s connections separately.
Twitter’s rate limits (for unauthenticated connections) are aggressive – a maximum of 150 requests an hour from a single IP. Luckily I have access to nine Linux machines which run my Python scripts nicely.
The lack of the equivalent of Facebook “apps” that do this kind of visualisation automatically means you have to do it yourself. I produced the visualisation in Gephi, which is powerful but tricky to get to grips with.

There is one great thing though:

You can build up these kinds of visualisations for anyone, not just yourself, as the raw information is accessible to anyone.
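The approach described above (take the mutual follows, then look for connections between them, within the rate limit) can be sketched in Python roughly as follows. This is not the script used for the graphic: the function names and the shape of the input data are hypothetical, the edges are kept directed as one possible answer to the asymmetry question, and the time estimate assumes a single request per person, which will not hold for accounts whose lists need several pages.

```python
import math


def mutual_contacts(my_following, my_followers):
    """People I follow who also follow me back."""
    return set(my_following) & set(my_followers)


def edges_between(mutuals, following_of):
    """Directed (follower, followed) pairs where both ends are mutuals.

    following_of is a hypothetical dict mapping each account name to the
    collection of accounts it follows, fetched one person at a time.
    """
    edges = set()
    for person in mutuals:
        for other in following_of.get(person, ()):
            if other in mutuals and other != person:
                edges.add((person, other))
    return edges


def hours_needed(requests, machines, per_hour=150):
    """Whole hours to make the requests at 150 per hour per IP address."""
    return math.ceil(requests / (per_hour * machines))
```

Under these assumptions nine machines give 1,350 requests an hour, so hours_needed(1351, 9) comes out at two hours. The resulting edge set can then be written out in whatever format the visualisation tool imports.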

Community Classification

My Twitter network is more homogeneous than I thought – a big blob of tech/geo, with the orienteers forming the main breakaway group, and some slender strands of friends on either side. Networks of friends which don’t share any connections with the other groups will not be connected at all and will float away.
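The groups that “float away” are what graph theory calls separate connected components. A minimal sketch of finding them, assuming the follow links are treated as undirected and that every edge refers to a name in the node list (both assumptions for the example):

```python
from collections import deque


def connected_components(nodes, edges):
    """Return the groups of mutually reachable nodes, largest first."""
    # Build an undirected adjacency map; a follow in either direction counts.
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search outwards from an unvisited node.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)
```

The first component returned would be the main blob; any later ones are the unlinked groups, and components of size one are people with no links to anyone else in the subset.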

Below is a hand-done, rough community classification. Again, please click for a larger, more readable version. If I pulled in more of the metadata (profile and qualitative/quantitative) from Twitter for each person, then this could probably be done automatically – enough people in the CASA cluster, for instance, will mention CASA on their profiles, for it to be detectable, showing such people as CASA-linked even if they don’t say so themselves.

A – The Neogeo (Geography+Technology) community
B – OpenStreetMappers in London and elsewhere
C – The Open Data movement
D – Data visualisation and data journalism
E – UCL CASA, UCL Geography and associates
F – London general
G – East London
H – Running
I – Orienteering
J – Non-techy friends
K – Techy friends
L – An unlinked group of non-techy friends. There are a couple of other such groups.
M – People unconnected to each other and to the others
N – Bike share operators

The last group is small – I follow a lot more of them, but generally these “official” accounts don’t follow back.
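The automatic classification suggested earlier could work along these lines: label people whose profile text mentions a keyword, then give unlabelled people the label held by most of their connections. The sketch below is only one possible reading of that idea; the function name, the data layout and the strict-majority rule are assumptions, and simple substring matching will produce false positives.

```python
from collections import Counter


def classify(profiles, neighbours, keywords):
    """Label accounts from profile keywords, then from their connections.

    profiles:   hypothetical dict of account name -> profile text
    neighbours: hypothetical dict of account name -> set of connected names
    keywords:   dict of community label -> list of words to look for
    """
    labels = {}
    # Pass 1: direct mentions in the profile text (case-insensitive).
    for name, text in profiles.items():
        lowered = text.lower()
        for label, words in keywords.items():
            if any(word.lower() in lowered for word in words):
                labels[name] = label
                break

    # Pass 2: unlabelled accounts take a label shared by a strict
    # majority of their connections, using only the pass 1 labels.
    inferred = {}
    for name in profiles:
        if name in labels:
            continue
        linked = neighbours.get(name, ())
        votes = Counter(labels[n] for n in linked if n in labels)
        if votes:
            label, count = votes.most_common(1)[0]
            if count * 2 > len(linked):
                inferred[name] = label
    labels.update(inferred)
    return labels
```

For example, with keywords of {"CASA": ["casa"]}, someone whose profile says nothing about CASA but whose connections mostly mention it would still be labelled CASA.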

Data Graphics