BigDataFr recommends: Data APIs, design, and visual storytelling
‘Over the past five years, international agencies such as the World Bank, OECD, and UNESCO have created portals to make their data available for everyone to explore. Many non-profits are also visualizing masses of data in the hope that it will give policymakers, funders, and the general public a better understanding of the issues they are trying to solve.
Data visualization plays a key role in telling the stories behind the data. For most audiences, data sets are hard to use and interpret — the average user will need a technical guide just to navigate through the complicated hierarchies of categories let alone interpret the information. But data visualizations trigger interest and insight because they are immediate, clear, and tangible.
At FFunction, we visualize a lot of data. Most of the time our clients send us Excel spreadsheets or CSV files, so we were happily surprised when we started to work with the UNESCO Institute for Statistics on two fascinating education-related projects (Out-of-School Children and Left Behind) and realized that they had been working on a data API. As we began to work through the data ourselves, we uncovered several reasons why using an API helps immeasurably with data visualization.
Why APIs are the best way to share data
If you look at the data available in most data portals, you’re likely to find XML, CSV files, Excel spreadsheets (for those who haven’t heard of Open Document format), or PDF documents (god forbid!). Although many organizations do have an API, more should.
If you’ve worked a little bit with data, whether internal corporate data or open data, you’ll probably know from experience that data very rarely comes in an ideal format. Very often, you’ll find that a single data set mixes several formats (dates are a common offender, where you’ll find YYYY-MM-DD along with DD/MM/YYYY and, for good measure, MM/DD/YYYY).
As a result, you almost always need to normalize your data to ensure that all the fields are in the same format. Normalization is not the only chore, though. If you want to keep your data up to date, you’ll also need to make sure that the format of each new delivery hasn’t changed since the last time (fields shuffled, added, removed, etc.), which means you’ll need to write a program to validate your new data. And this requires an automated way to retrieve the data. Only once you have valid, normalized data can you process it into a format that is suitable for a visualization.’
By Sébastien Pierre
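To make the normalization and validation chores described in the excerpt concrete, here is a minimal Python sketch. It is not taken from the original article: the field names in EXPECTED_FIELDS, the function names, and the order in which date formats are tried are all assumptions made for illustration. Note in particular that a value such as 03/04/2015 is ambiguous, and this sketch simply reads it as day/month.

```python
from datetime import datetime

# Date formats mentioned in the excerpt, tried in this (assumed) order.
# Ambiguous values like 03/04/2015 are therefore read as day/month.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y")

# Hypothetical schema, used only for illustration.
EXPECTED_FIELDS = ["country", "date", "value"]


def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or raise ValueError if no format fits."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")


def validate_header(header):
    """Raise ValueError if fields were shuffled, added, or removed."""
    if list(header) != EXPECTED_FIELDS:
        raise ValueError(f"unexpected fields: {list(header)}")


def normalize_rows(header, rows):
    """Validate the header, then rewrite every date column value."""
    validate_header(header)
    date_index = header.index("date")
    result = []
    for row in rows:
        row = list(row)
        row[date_index] = normalize_date(row[date_index])
        result.append(row)
    return result
```

For example, normalize_rows(["country", "date", "value"], [["FR", "25/12/2014", "1"]]) rewrites the date as 2014-12-25, while a header with a missing or reordered field raises an error before any row is processed. A data API can remove much of this work by serving fields in one documented format.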