The Birmingham Mail has published my first post on their data blog. It’s a map of the 600 local eateries that fail to meet basic hygiene ratings.
The data was gathered using the API of the Food Standards Agency, and the article and map were made by Paul Bradshaw and me.
Behind the Numbers, a new datablog in Birmingham
© Birmingham Mail
This is probably the first regional data blog in the UK, and I have had the pleasure of contributing from the beginning.
The aim of this project is to “look at the numbers behind Birmingham and the stories in it”, so we will be releasing datasets that support the journalistic work done by professionals from the Mail and the Post. This is a good way to engage with the audience, because they can play with the numbers and check which figures back the stories.
We also want to provide different ways of visualising the data, so an important part of the blog is going to be about maps, infographics and charts. Thanks to that we can provide a deeper analysis than in the regular articles published by the paper.
If you want more information, have a look at this post by Alastair Reid on Journalism.co.uk, where he explains what we are doing.
I hope you enjoy Behind the Numbers!
How many European migrants are coming to the UK? We are getting there…
Photo by Horia Varlan
Immigration is used as one of the main reasons to support the withdrawal of the United Kingdom from the EU. But since most European workers don’t need a visa to live and work here, it’s not easy to measure the exact number of new migrants.
In my opinion, the best way to get an approximate idea of how many people are coming to the UK every year from the European Economic Area (EEA) is to check the number of new National Insurance Number registrations. This document is really easy to get if you are from one of these countries, and it is a better measure than the census if you want to count the most recent flows of population.
The Department for Work and Pensions publishes part of this data. The top ten nationalities are available for every year, so we can get some interesting information, like the undisputed lead of Poland during the last decade or the huge growth in Spanish migrants over the last two years.
But a lot of states are missing, so I’m not going to use this post to explain the existing data. Instead, I can announce that I have sent an FOI request to the DWP asking for all the data for every single country in the European Economic Area. Hopefully they will provide me with interesting information to write an article (and maybe make a visualisation) about the flow of Euro-migrants.
You can follow the progress of the request on the website WhatDoTheyKnow.
How do Brits behave abroad? A quick look at the consular data
Photo by Phil’s 1stPix
The British Foreign & Commonwealth Office (FCO) regularly publishes reports about what it calls “British behaviour abroad”, based on the consular assistance enquiries made by tourists and permanent residents overseas.
This data shows some common problems that citizens from the UK can face when they are abroad. Rapes, arrests, hospitalisations and deaths are some of the categories used by consulates to classify the enquiries. By analysing this data we can understand which kinds of incidents are more likely to happen in each country.
I brought together statistics from different years and ranked the countries to discover which ones have the highest rate of incidents involving British citizens. You can have a look at the Top 10 in the following table, but bear in mind that the FCO only provides information about the states that receive the largest numbers of Brits each year. This means that for some countries there is no data for the whole period.
I have also created a bar chart of the total assistance enquiries from 2008 until 2012, to give an overall view of which consulates are busiest and how the aggregated figures have evolved.
Greece is the country with the highest rape rate of British citizens.
Since 2008, 84 Brits have been raped in the Hellenic country. Spain has reached the same number, but as it receives a huge number of tourists from the United Kingdom, the rate is much lower.
In 2012 there were 35,000 British citizens living in Greece and approximately 1,795,000 tourists visited the Mediterranean country in the same year.
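To make the idea of a rate concrete, here is a minimal sketch in Python of the arithmetic: incidents divided by the number of Brits exposed (residents plus visitors), scaled per 100,000 people. The function name and the per-100,000 scale are assumptions for illustration, not necessarily the exact method behind the table. The example also mixes the 2008-2012 count with the 2012 population figures, purely to show the calculation.

```python
def rate_per_100k(incidents, residents, visitors):
    """Return incidents per 100,000 British residents and visitors."""
    exposed = residents + visitors
    if exposed <= 0:
        raise ValueError("residents + visitors must be positive")
    return incidents / exposed * 100_000


# Figures quoted above for Greece: 84 cases since 2008,
# 35,000 residents and 1,795,000 tourists in 2012.
# Mixing periods like this is only an illustration of the arithmetic.
greece_rate = rate_per_100k(84, 35_000, 1_795_000)
print(round(greece_rate, 2))
```

A country with the same number of cases but many more visitors, like Spain, gets a lower figure from the same function, which is why the ranking uses rates rather than raw counts.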
If you want to dig deeper, it would be interesting to compare this data with the local rape rate or analyse the same statistics for other countries.
Thailand and Turkey hold the next two positions in the ranking and, as you can see in the table, the former is one of the countries with the most registered incidents.
The Philippines is in the top 3 for deaths and hospitalisations
As not many British tourists visit this country, there are no figures for every year, but based on the data we have, we can say that the Philippines is the place with the highest death rate. Given that, it’s reasonable to think that the number of hospitalisations is also considerable. The Philippines is just below Thailand in the percentage of Brits who went to the consulate to ask for assistance getting to a hospital.
According to the FCO, there are “significant numbers of elderly British expats” living in the Philippines, so this is one of the main reasons for the deaths and hospitalisations. But in further research it would be interesting to have a look at crime statistics. Just keep it in mind.
Balcony accidents are very common among British tourists in Spain
This is not something you can see in the numbers, but in the report called “British behaviour abroad 2012” the FCO highlights what they call “balcony incidents” as one of the most common causes of hospitalisation.
This is an interesting piece of information, but I would really like to know the exact number of Brits who end up in a Spanish hospital because of a “balcony incident”. I should probably send an FOI request, so that some educational work can be done in the UK to teach potential tourists about the dangers of such an amazing and complex technology: a balcony.
Even though I have not done proper research on this topic, I am quite sure that this kind of accident happens because teenage holidaymakers think it is fun to do something that most of the world considers just madness: throwing themselves from balconies into the pool. If you don’t believe me, have a look at this video made by Sky News.
Get the data
Most of this data was in PDF format, so I had to convert it into a spreadsheet to be able to work with it. If you want to use the numbers to create your own stories, you can now download them from Google Docs.
Data Roundup, December 14
This post was originally published on the School of Data
Photo by Robert Banh
This week we want to use a few lines to talk about tools that you can use in your daily work. The School of Data offers a very useful compilation of online resources including tutorials, books and tools about scraping, data analysis, visualisation, etc.
But if you can’t find what you need, there are more options. Tony Hirst has just published a short post with some links to tools and applications that are useful for visualisations.
For those who want to learn how to manage digital data, there is an online training programme called Mantra that could be very helpful. If you are a ‘maps person’ you might find this new tool released by the LA Times useful: it converts GIS shapefiles into SVGs.
But if you are more into statistics, don’t miss this FAQ guide about basic data concepts. Are you passionate about code? Then visit this compilation of apps used by National Public Radio.
Sometimes news about the economy is a little bit hard to understand, particularly if it is about big numbers. The Guardian tried to solve this problem by using visualisations to explain the Autumn Statement.
But data is not just about what is happening now. We can use it to understand the future, as David McCandless tries to do with this infographic about CO2 emissions. We can also have a look at the past, as in this interactive map, Bomb Sight, about London during World War II.
And if you are really passionate about a book or a TV show you can also start collecting your own data and make some visualisations like this “statistical look back” at The Walking Dead series or this interactive graph about Game of Thrones (the books).
The web is flooded with ‘infographics’ made by PR and marketing agencies that don’t pay much attention to the accuracy of the message.
But, as they have become so popular, many people think they are the original ‘information graphics’. Alberto Cairo is reclaiming the word infographic.
If you are really interested in maps, I’m sure you will want to read this blog post from ProPublica, where they explain all the choices and decisions they made to create an interactive map about the migration of African Americans from the countryside to the cities.
Also on ProPublica, you can read how they used a Creative Commons license to spread their content and how useful it was.
Data Roundup, December 6
This post was originally published on the School of Data
Photo by Elkit
Visualising Catalan election: an alternative to Vilaweb’s map
Last week Catalonia, a Spanish autonomous community, held its general election in the middle of a heated debate about independence.
Parliament is divided between political parties that support a self-determination referendum and those who are against the plebiscite.
As we can see, the map is divided by local authority and uses two shades of orange to represent who holds the majority in each council: the pro-referendum bloc or the unionists. If you move your cursor over the different boundaries, you can read more details about the distribution of the vote.
I’m sure that Vilaweb’s team made a big effort to create this visualisation. They developed it entirely by themselves and it was updated in real time during the vote count. But, in my opinion, it was a complete failure from the point of view of functionality.
Ask the right question
When I first saw this map I asked myself ‘what are they trying to tell me?’. Do they want to highlight that Catalans vote for nationalist parties? Well, we already knew that. They have held the majority for decades and you don’t need to spend hours writing code for an interactive map to say this.
So, maybe they want to show where the pro-referendum organisations are stronger. Not really, because they only use a single shade of orange to show which bloc got the most votes in each council, not by how much.
I think that the biggest mistake that Vilaweb’s designers made was not asking the right question from the beginning. It seems that they knew they wanted a map, but not why.
To me, the right question would be ‘where are the pro-referendum parties stronger?’. And if you choose to visualise this, you will need to use more than just two colours.
Let’s remember what Ben Fry wrote about the importance of questions (bold is mine):
“One of the most important (and least technical) skills in understanding data is asking good questions. An appropriate question shares an interest you have in the data, tries to convey it to others, and is curiosity-oriented rather than math-oriented. Visualizing data is just like any other type of communication: success is defined by your audience’s ability to pick up on, and be excited about, your insight.”
From his book Visualizing Data (2007)
Interaction should be easy
I don’t think this is a functional map. Let’s do a short test: try to find Barcelona. Did you get there? Well, maybe that means that you know a little bit about Catalan geography, so let’s make it a little bit more difficult. Try to find Tírvia. Complicated, isn’t it?
It’s reasonable to think that most of your readers will not be able to locate every single local authority in Catalonia on a map (there are almost 950). So, why not help them by creating a simple search box?
When you are designing a visualisation you need to think about how your audience will interact with it. The average reader is not going to search in Google Maps for where a certain council is and then go back to your application to get the data. You need to make the user’s life easier.
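The logic behind such a search box can be very small. Here is a minimal sketch in Python, assuming the council names are available as a plain list. The function names and the sample list are hypothetical, and this is not how Vilaweb’s map or any particular mapping tool works. It ignores case and accents, so a reader who types ‘tirvia’ still finds Tírvia.

```python
import unicodedata


def normalise(name):
    """Lower-case a name and strip accents, so 'Tírvia' becomes 'tirvia'."""
    decomposed = unicodedata.normalize("NFD", name)
    without_accents = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return without_accents.casefold().strip()


def search_councils(query, councils):
    """Return the council names that contain the query, ignoring case and accents."""
    needle = normalise(query)
    if not needle:
        return []
    return [name for name in councils if needle in normalise(name)]


# Hypothetical sample; the real list has almost 950 councils.
councils = ["Barcelona", "Tírvia", "Girona", "Lleida"]
print(search_councils("tirvia", councils))
```

Matching on a substring rather than the full name means the reader does not need to know the exact spelling to get a result.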
My proposal based on Google Fusion Tables
I want to show you how I would visualise this data using free tools and without writing any code. Obviously it would look better with the contribution of a developer, but I think that, even though it is not the most beautiful design, it is more useful and clearer than Vilaweb’s map.
I’ve used Google Fusion Tables to generate a map. Sadly, Tumblr is not the best place to embed it, so you will need to click here to see it. Red means more pro-referendum votes and blue means more support for the unionist bloc. It’s also interactive so, if you click on a council, more information will be displayed.
These are the main improvements I made:
- Using several colours to show the different percentages of support of the different blocs.
- Creating a search box so you can look for the council you would like to display.
- Giving information about percentages and not just about number of votes.
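I built the map without writing code, but for anyone who prefers to script it, the idea behind the first and third points can be sketched in a few lines of Python: turn vote counts into a percentage, then assign each council to one of several colour classes. The function names, the five classes and the thresholds are all hypothetical choices for illustration, not the settings of my map.

```python
def pro_referendum_share(pro_votes, unionist_votes):
    """Return the pro-referendum share (0-100) of the votes cast for both blocs."""
    total = pro_votes + unionist_votes
    if total <= 0:
        raise ValueError("no votes recorded for either bloc")
    return pro_votes / total * 100


def colour_for_share(share):
    """Map a pro-referendum percentage to one of five colour classes."""
    if not 0 <= share <= 100:
        raise ValueError("share must be between 0 and 100")
    # Hypothetical thresholds: (upper bound, colour), from blue to red.
    buckets = [
        (20, "dark blue"),
        (40, "light blue"),
        (60, "neutral"),
        (80, "light red"),
        (100, "dark red"),
    ]
    for upper, colour in buckets:
        if share <= upper:
            return colour


# Made-up counts for a single council.
share = pro_referendum_share(300, 100)
print(share, colour_for_share(share))
```

With more classes than the two used by Vilaweb, a council where one bloc barely wins no longer looks the same as one where it wins by a landslide.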
Do you agree with my criticisms and my solution? Comment and join the conversation!
Data Roundup, November 28
This post was originally published on the OKFN School of Data
Photo by chsh/ii
TOOLS, COURSES AND EVENTS
Datawrapper 1.0, an open source tool for creating graphs and charts easily, has just been released. For the last few months it has been running as a beta version. It was designed by Mirko Lorenz and it allows you to create embeddable data visualisations in a very simple way.
The OKFN has created a new chapter in France as a way of connecting local initiatives with the international open knowledge community. You can follow the French group on Twitter.
If you are interested in data visualisation, Alberto Cairo is organizing the second edition of a free course called ‘introduction to infographics and data visualisation’, hosted by the Knight Center for Journalism in the Americas at the University of Texas. For the first edition more than 2000 people enrolled in just a few days. This time there will be more places, but more than 1100 students have joined the course so far.
On the 1st and 2nd of December, eight Latin American countries will hold an international hackathon called ‘Desarrollando América Latina’ (Developing Latin America). The aim of this event is to develop apps to ‘solve cross curricular social problems’ of the region. It will take place in Argentina, Bolivia, Brazil, Costa Rica, Chile, México, Perú, and Uruguay.
Data can measure (almost) everything, even happiness. The ONS has launched a report about well-being in the UK and they even designed a wheel to interact with the data.
And even though we don’t know if happiness is just about the money, The Guardian Datablog made the wages map of Britain.
Emil Johansson also likes maps, but he prefers Middle-earth to Great Britain. This Swedish student is the author of the Lord of the Rings Project, a data-driven work about Tolkien’s universe. It was launched a few months ago, but it has become so successful that its owner had to ask for money to pay for bigger hosting. He collected more than 700 dollars in just 3 days.
Earlier this month Margaret Hodge, chair of the Public Accounts Committee, stated that data should play an important role in government decisions but that some coordination problems were preventing this from happening. This week, the public sector took a big step towards the use of big data with the announcement of a new code of practice to protect privacy in government datasets.
And on the other side of the Atlantic, O’Reilly’s data blog Strata talks about how public sector efficiency could be improved with the right use of big data.
Registration open for data visualisation course led by Alberto Cairo
Today I enrolled in the second edition of a course called ‘Introduction to Infographics and Data visualisation’.
More than 2000 people participated in the first edition of the course, which started a few weeks ago. Due to its huge success, those who missed it the first time have another chance to attend.
The course will start on January 12th and will last 6 weeks. During this time, Cairo will try to explain to us the basics of data visualisation from a journalistic point of view. A good thing is that you don’t need any design knowledge, although it won’t hurt if you already know how to use some tools.
If you want to enroll, you have to sign up here and follow the instructions. Luckily, it’s completely free.
I’m pretty excited about this course. I will use this blog to upload my work and impressions.