Getting started in data journalism
Looking for data on a particular topic or issue?
Not sure what exists or where to find it?
Don’t know where to start?
In this section we look at how to get started with finding public data sources on the web.
There are many publicly-available sources that can be accessed by the data journalist.
Image courtesy of S Falkow and released under Creative Commons
Streamlining Your Search
While they may not always be easy to find, many databases on the web are indexed by search engines, whether the publisher intended this or not. Here are a few tips:
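One widely used way to surface indexed data files is to combine a topic with search operators such as `filetype:` and `site:`, which the major search engines support. The sketch below only assembles the query string and a search URL. The function names, the example topic and the `example.gov` domain are illustrative assumptions, not part of the original handbook.

```python
from urllib.parse import urlencode


def build_query(topic, filetype=None, site=None):
    """Combine a topic with optional filetype: and site: search operators."""
    parts = [topic]
    if filetype:
        parts.append(f"filetype:{filetype}")  # e.g. xls, csv, pdf
    if site:
        parts.append(f"site:{site}")  # restrict results to one domain
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Return a URL that opens the query in a search engine (base is an assumption)."""
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    # Hypothetical example: spreadsheets about school enrollment on one domain.
    query = build_query("school enrollment", filetype="xls", site="example.gov")
    print(query)
    print(search_url(query))
```

Swapping the `filetype` value (for example `csv` or `pdf`) or dropping the `site` restriction is a quick way to widen or narrow the search.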
Browse data sites and services
Over the last few years a number of dedicated data portals, data hubs and other data sites have appeared on the web. These are a good place to get acquainted with the kinds of data that are out there. For starters you might like to take a look at:
Ask a Forum
Search for existing answers or ask a question at Get The Data or on Quora. Get The Data is a Q&A site where you can ask data-related questions, such as where to find data on a particular issue, how to query or retrieve a particular data source, what tools to use to explore a data set visually, and how to cleanse data or get it into a format you can work with.
Ask a Mailing List
Both of these lists are filled with data journalists and Computer Assisted Reporting (CAR) geeks, who work on all kinds of projects.
Chances are that someone may have done a story like yours, and may have an idea of where to start, if not a link to the data itself.
You could also try Project Wombat (“a discussion list for difficult reference questions”), the Open Knowledge Foundation’s many mailing lists, or the mailing lists at theInfo, or search for mailing lists on the topic or region that you are interested in.
Hacks/Hackers is a rapidly expanding international grassroots journalism organization with dozens of chapters and thousands of members across four continents.
Its mission is to create a network of journalists ("hacks") and technologists ("hackers") who rethink the future of news and information.
With such a broad network, you stand a strong chance that someone knows where to look for the thing you seek.
Ask an Expert
Professors, public servants and industry folks often know where to look. Call them. Email them. Accost them at events.
Show up at their office. Ask nicely. “I’m doing a story on X. Where would I find this? Do you know who has this?”
Learn About Government IT
Understanding the technical and administrative context in which governments maintain their information is often helpful when trying to access data.
Whether it’s CORDIS, COINS or THOMAS — big-acronym databases often become most useful once you understand a bit about their intended purpose.
Find government organizational charts and look for departments/units with a cross-cutting function (e.g. reporting, IT services), then explore their web sites.
A lot of data is held by more than one department: one may treat a particular database as its crown jewels, while another may give it to you freely.
Look out for dynamic infographics on government sites.
These are often powered by structured data sources or APIs that can be used independently (e.g. flight tracking applets, weather forecast Java apps).
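If an infographic turns out to load its figures from a JSON endpoint (something you can often spot in the browser's network inspector), the same data can usually be saved as a spreadsheet-friendly file. The following is a minimal sketch under that assumption: the endpoint URL, the field names and the sample payload are all hypothetical, and a real source may nest its records differently.

```python
import csv
import io
import json
from urllib.request import urlopen


def fetch_json_text(url):
    """Download the raw JSON text from an endpoint (the URL is supplied by you)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def records_to_csv(json_text, fields):
    """Convert a JSON array of flat objects into CSV text with the given columns."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        extrasaction="ignore",  # skip keys we did not ask for
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)  # missing keys become empty cells
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical payload standing in for what a flight tracker might return.
    sample = (
        '[{"flight": "AB123", "status": "landed", "gate": "4"},'
        ' {"flight": "CD456", "status": "delayed"}]'
    )
    print(records_to_csv(sample, ["flight", "status"]))
```

With a real endpoint you would pass the result of `fetch_json_text(...)` to `records_to_csv(...)`. Check the site's terms of use before collecting data this way.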
Try using phrases and improbable sets of words you’ve spotted since the last time you searched.
When you know more about what you are looking for, you may have a bit more luck with search engines!
Write an FOI Request
If you believe that a government body has the data you need, a Freedom of Information request may be your best tool. See below for more information on how to file one.
Contributing authors include: Brian Boyer (Chicago Tribune), John Keefe (WNYC), Friedrich Lindenberg (Open Knowledge Foundation), Jane Park (Creative Commons), Chrys Wu (Hacks/Hackers)
This piece is part of the Data Journalism Handbook which is released under Creative Commons CC-BY-SA. Selected chapters of the handbook are being republished in the Media Helping Media Data Journalism section. In each case the author or authors of the piece are mentioned.