
22/12/2010 The growing importance of data journ…



The growing importance of data journalism

Parsing the progress of open government data requires new tools and reliable information sources.

by Alex Howard | @digiphile | Comments: 2 | 21 December 2010


One of the themes from News Foo that continues to resonate with me is the importance of data
journalism. That skillset has received renewed attention this winter after Tim Berners-Lee called
analyzing data the future of journalism.

When you look at data journalism and the big picture, as USA Today's Anthony DeBarros did on
his blog in November, it's clear the recent suite of technologies is part of a continuum of
technologically enhanced storytelling that traces back to computer-assisted reporting (CAR).

As DeBarros pointed out, the message of CAR "was about finding stories and using simple tools to
do it: spreadsheets, databases, maps, stats," like Microsoft Access, Excel, SPSS, and SQL Server.
That's just as true today, even if data journalists now have powerful new options: scraping data
from the web with tools like ScraperWiki and Needlebase, scripting with Perl, Ruby, or Python,
and building on MySQL and Django.

Understanding the history of computer-assisted reporting is key to putting new tools in the proper
context. "We use these tools to find and tell stories," DeBarros wrote. "We use them like we use a
telephone. The story is still the thing."

The data journalism session at News Foo took place on the same day civic developers were
participating in a global open data hackathon and the New York Times hosted its Times Open
Hack Day. Many developers at contests like these are interested in working with open data, but
the conversation at News Foo showed how much further government entities need to go to deliver
on the promise open data holds for the future of journalism.

The issues that came up are significant. Government data is often "dirty," with missing metadata
or incorrect fields. Journalists have to validate and clean up datasets with tools like Google Refine.
ProPublica's Recovery Tracker for stimulus data and projects is one of the best examples of the
practice in action.
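
As a rough illustration of what that validation step can involve, here is a minimal Python sketch
that flags rows with missing or malformed fields. It is not how Google Refine or ProPublica's
Recovery Tracker work; the column names, the sample rows, and the find_dirty_rows helper are all
hypothetical and exist only for this example.

```python
import csv
import io

# Hypothetical column names; a real dataset will have its own schema.
REQUIRED_FIELDS = ["recipient", "state", "amount"]


def find_dirty_rows(csv_text):
    """Return (line_number, problem) pairs for rows that need attention."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Data rows start on line 2 because line 1 holds the header.
    for line_number, row in enumerate(reader, start=2):
        # Flag any required field that is absent or blank.
        for field in REQUIRED_FIELDS:
            value = (row.get(field) or "").strip()
            if not value:
                problems.append((line_number, f"missing {field}"))
        # Flag amounts that are present but cannot be read as a number.
        amount = (row.get("amount") or "").strip().replace(",", "").lstrip("$")
        if amount:
            try:
                float(amount)
            except ValueError:
                problems.append((line_number, f"non-numeric amount: {amount!r}"))
    return problems


# Made-up sample data, not taken from any government source.
SAMPLE = """recipient,state,amount
Acme Paving,OH,"$1,200,000"
,TX,45000
River Works,,n/a
"""

for line_number, problem in find_dirty_rows(SAMPLE):
    print(line_number, problem)
```

A report like this only points to the rows that need a closer look. Deciding whether a blank state
or an "n/a" amount is an error or a meaningful gap still takes reporting.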

A recent gold standard for data journalism is the Pulitzer Prize-winning Toxic Waters project from
the New York Times. The scale of that project makes it a difficult act to follow, though Times
developers are working hard with nifty projects like Inside Congress.

You can see a visualization of the Toxic Waters project and other examples of data journalism in
this Ignite presentation from News Foo.

Making open government data sing

At ProPublica, the data journalism team is conscious of deep linking into news applications, with
the perspective that the visualizations produced from such apps are themselves a form of
narrative journalism. With great data visualizations, readers can find their own way and interrogate
the data themselves. Moreover, distinctions between a news "story" and a news "app" are
dissolving as readers increasingly consume media on mobile devices and tablets.

One approach to providing useful context is the "Ion" format at ProPublica.org, where a project like
"Eye on the Stimulus" is a hybrid between a blog and an application. On one side of the web page,
there's a news river. On the other, there are entry points into the data itself. The challenge to this
approach is that a media outlet needs alignment between staff and story. A reporter has to be
filing every day on a running story that's data-sensitive.

Upgrading Data.gov

The data journalism News Foo session featured a virtual component, bringing CityCamp founder
Kevin Curry, Data.gov evangelist Jeanne Holm, and Reynolds fellow David Herzog together with
News Foo participants to talk about the value propositions for open government data and data
journalism.

As the recent open data report showed, developers are not finding the government data they
need or want. If other entrepreneurs are to follow the lead of BrightScope, open government
datasets will need to be more relevant to business. The feedback for Data.gov and other
government data repositories was clear: more data, better data, and cleaner data, please.

Improving media access to data at the county or state level of government faces structural barriers
because of growing budget crises in statehouses around the United States. As Jeanne Holm
observed during the News Foo session, open government initiatives will likely have to be carried
out in a zero-sum budget environment in 2011. Officials have to make them sustainable and affordable.

There are some areas where the federal government can help. Holm said Data.gov has created
cloud hosting that can be shared with state, local or tribal governments. Data.gov is also rolling out
a set of tools that will help with data conversion, optical character recognition and, down the road,
better handling of structured data.

Those resources could make government data more readily available and accessible to the
media. Kevin Curry said that data catalogs are popping up everywhere. He pointed to CivicApps in
Portland, Ore., where Max Ogden's work on coding the middleware for open government led to
translating government data into more useful forms for developers.

Data journalists also run into government's cultural challenges. It can be hard to find public
information officers willing or able to address substantive questions about data. Holm said
Data.gov may post more contact information online and create discussions around each dataset.
That kind of information is a good start for addressing data concerns at the federal level, but
fostering useful connections between journalists and data will still require improvement and effort.

Related:

3 News Foo themes that continue to resonate

tags: data.gov, database, gov 2.0, government 2.0, newsfoo


COMMENTS: 2
Stu [21 December 2010 10:18 AM]

Have you tried SmartOCR yet? It is a new software application which offers over 99.8 percent
accuracy and has a very nice interface. http://smartocr.com

Aaron [21 December 2010 08:53 PM]

You have got to read up on XBRL http://goo.gl/0ZpXq


©2010, O'Reilly Media, Inc.