As DeBarros pointed out, the message of CAR "was about finding stories and using simple tools to
do it: spreadsheets, databases, maps, stats," like Microsoft Access, Excel, SPSS, and SQL Server.
That's just as true today, even if data journalists now have powerful new tools: ScraperWiki and
Needlebase for scraping data from the web, and Perl, Ruby, Python, MySQL and Django for
scripting.
Understanding the history of computer-assisted reporting is key to putting new tools in the proper
context. "We use these tools to find and tell stories," DeBarros wrote. "We use them like we use a
telephone. The story is still the thing."
The data journalism session at News Foo took place on the same day civic developers were
participating in a global open data hackathon and the New York Times hosted its Times Open
Hack Day. Many developers at contests like these are interested in working with open data, but
the conversation at News Foo showed how much further government entities need to go to deliver
on the promise open data holds for the future of journalism.
The issues that came up are significant. Government data is often "dirty," with missing metadata
or incorrect fields. Journalists have to validate and clean up datasets with tools like Google Refine.
ProPublica's Recovery Tracker for stimulus data and projects is one of the best examples of the
practice in action.
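To make "dirty" concrete, here is a minimal Python sketch of the kind of validation pass a journalist might run before trusting a dataset. The column names (`recipient`, `state`, `amount`), the sample rows, and the helper names `find_problems` and `validate` are all hypothetical, invented for illustration. They do not come from any real government dataset, and this is not a description of how Google Refine or ProPublica's Recovery Tracker works.

```python
import csv
import io

# Hypothetical sample data: column names and rows are invented for illustration.
SAMPLE_CSV = """recipient,state,amount
Acme Paving,OR,"1,200.50"
,CA,300
Bridge Works,Oregon,abc
"""

REQUIRED_FIELDS = ("recipient", "state", "amount")


def find_problems(row):
    """Return a list of human-readable problems found in one row."""
    problems = []
    # Flag required fields that are absent or blank.
    for field in REQUIRED_FIELDS:
        if not (row.get(field) or "").strip():
            problems.append(f"missing {field}")
    # Assumption for this sketch: states should be two uppercase letters.
    state = (row.get("state") or "").strip()
    if state and not (len(state) == 2 and state.isalpha() and state.isupper()):
        problems.append(f"state is not a two-letter code: {state!r}")
    # Amounts may contain thousands separators; strip them before parsing.
    amount = (row.get("amount") or "").strip().replace(",", "")
    if amount:
        try:
            float(amount)
        except ValueError:
            problems.append(f"amount is not a number: {amount!r}")
    return problems


def validate(csv_text):
    """Split rows into clean ones and flagged ones (line number + problems)."""
    clean, flagged = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        problems = find_problems(row)
        if problems:
            flagged.append((line_no, problems))
        else:
            clean.append(row)
    return clean, flagged


clean_rows, flagged_rows = validate(SAMPLE_CSV)
for line_no, problems in flagged_rows:
    print(f"line {line_no}: {'; '.join(problems)}")
```

Flagging rows rather than silently dropping or "fixing" them keeps the reporter in the loop: each flagged line is a question to take back to the agency that published the data.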
A recent gold standard for data journalism is the Pulitzer Prize-winning Toxic Waters project from
the New York Times. The scale of that project makes it a difficult act to follow, though Times
developers are working hard on nifty projects like Inside Congress.
You can see a visualization of the Toxic Waters project and other examples of data journalism in
this Ignite presentation from News Foo.
At ProPublica, the data journalism team is conscious of deep linking into news applications, with
the perspective that the visualizations produced from such apps are themselves a form of
narrative journalism. With great data visualizations, readers can find their own way and interrogate
the data themselves. Moreover, distinctions between a news "story" and a news "app" are
dissolving as readers increasingly consume media on mobile devices and tablets.
One approach to providing useful context is the "Ion" format at ProPublica.org, where a project like
"Eye on the Stimulus" is a hybrid between a blog and an application. On one side of the web page,
…oreilly.com/…/data-journalism.html 1/3
22/12/2010 The growing importance of data journ…
there's a news river. On the other, there are entry points into the data itself. The challenge with this
approach is that a media outlet needs alignment between staff and story. A reporter has to be
filing every day on a running story that's data-sensitive.
Upgrading Data.gov
The data journalism News Foo session featured a virtual component, bringing City Camp founder
Kevin Curry, Data.gov evangelist Jeanne Holm, and Reynolds fellow David Herzog together with
News Foo participants to talk about the value propositions for open government data and data
journalism.
As the recent open data report showed, developers are not finding the government data they
need or want. If other entrepreneurs are to follow the lead of BrightScope, open government
datasets will need to be more relevant to business. The feedback for Data.gov and other
government data repositories was clear: more data, better data, and cleaner data, please.
Improving media access to data at the county or state level faces structural barriers
because of growing budget crises in statehouses around the United States. As Jeanne Holm
observed during the News Foo session, open government initiatives will likely be done in a zero-
sum budget environment in 2011. Officials have to make them sustainable and affordable.
There are some areas where the federal government can help. Holm said Data.gov has created
cloud hosting that can be shared with state, local or tribal governments. Data.gov is also rolling out
a set of tools that will help with data conversion, optical character recognition, and, down the road,
better tools for structured data.
Those resources could make government data more readily available and accessible to the
media. Kevin Curry said that data catalogs are popping up everywhere. He pointed to CivicApps in
Portland, Ore., where Max Ogden's work on coding the middleware for open government led to
translating government data into more useful forms for developers.
Data journalists also run into government's cultural challenges. It can be hard to find public
information officers willing or able to address substantive questions about data. Holm said
Data.gov may post more contact information online and create discussions around each dataset.
That kind of information is a good start for addressing data concerns at the federal level, but
fostering useful connections between journalists and data will still require improvement and effort.
COMMENTS: 2
Stu [21 December 2010 10:18 AM]
Have you tried SmartOCR yet? It is a new software application which offers over 99.8 percent
accuracy and has a very nice interface. http://smartocr.com