Frameworks for access Peter Brantley

A framework for enhanced digital access:
Draft conceptions of reform and legislation.

For the U.S. House of Representatives
Committee on the Judiciary

Respectfully submitted.

Peter Brantley
Internet Archive
San Francisco, CA

January 30, 2010

Introduction.

The opportunity for creating new frameworks for intellectual property utilization that
serve to encourage the useful and recreational arts while enhancing access to the vast
creative riches of the nation's artists, scientists, and citizens has never been as compelling
or as uniquely susceptible to leverage. The vertiginous construction of new digital media
infrastructures reflects not merely an extension of existing patterns of media creation and
use, but a new paradigm of iterative creativity coupled with transformative reductions in
the barriers to distribution, sharing, and collaboration.

Analysis.

Coordinated legislative and regulatory reforms in three key areas (digital deposit,
enhanced registries, and collective voluntary licensing) stand to combine to establish an
improved system of IP stewardship and use.

Digital deposit. The Library of Congress' (LoC) Copyright Office recently initiated a
project for demand-based deposit of born-digital serials (for Federal Register notice, see
http://edocket.access.gpo.gov/2010/pdf/2010-1202.pdf). The Copyright Office should
be supported in its efforts to broaden deposit on demand, which should encompass ebooks
and other increasingly born-digital products, including film, photography, books, and
music.

Coincident with these deposit mechanisms, legislation should clarify the ability of the
Library to add digital deposits to its own collection for archival preservation and service
delivery. Transmittal of digital media from the Copyright Office to the Library for these
purposes should never be construed as an act of reproduction, but rather as analogous to the
transmittal of print books to the Library, as provided in § 407.

Digital content is accumulating at unprecedented rates; the Internet
Archive, a not-for-profit digital library that counts among its partners the Library of
Congress and NASA (http://nasaimages.org), and conducts large scale web archiving for
many university and national libraries, is already maintaining petabyte-class storage
systems. On the commercial side, Yahoo's Flickr photography and video site hosted over
4 billion user- and institutionally-submitted images (including from the collections of the
Library of Congress and the Smithsonian in the Flickr Commons project) as of October
2009 (see http://royal.pingdom.com/2010/01/22/internet-2009-in-numbers/). Ebook
sales statistics from the International Digital Publishing Forum, which understate the
actual market, indicate a year-over-year increase from November 2008 to November 2009 of
200 percent (http://idpf.org/doc_library/industrystats.htm). Apple's announcement of
an ebook store and reading application (iBooks) for their iPad tablet computer will boost
these numbers further (http://www.apple.com/ipad/features/).

Acquiring, building, and serving digital collections that are made available in digital form
through mass digitization is a necessary modernization of the Library of Congress'
mission. The digitization of texts through initiatives such as the Internet Archive's Open
Content Alliance, or Google Book Search (GBS), does not merely produce a digital copy of
a print work. A digitized book represents, in many ways, an entirely new publication
engendering whole new uses in support of access for distinct populations, such as the print
disabled, as well as supporting knowledge mining to enhance information search and
retrieval. In other words, these compilations, in part because their scale engenders new
possibilities for creation, represent a new capacity. Because they are born from our
nation's accumulated cultural heritage, they possess a public component. The Library
should develop depositories of digital content that preserve our nation's digital cultural
heritage, and enable new services.

Because of the scale of this country's digital “uplift”, it is appropriate to consider a new
system of Federally regulated digital depositories modeled on the existing Government
Documents depository system that would be subject to technical, security, and policy
auditing, and be required to support specific levels of access. Notably, the government
documents depository program is beginning a shift to digital depository distribution
through the Federal Digital System (http://www.gpo.gov/fdsys/search/home.action).

Not-for-profit digital libraries and educational consortia developing cloud-based storage
architectures are well situated to fill essential parts of this role. In addition to the
Internet Archive (http://www.archive.org), HathiTrust (http://www.hathitrust.org/) and
the DuraSpace initiatives (http://duraspace.org/index.php) are rapidly maturing. HathiTrust
is starting an assessment with TRAC (Trustworthy Repositories Audit and Certification), a
compliance suite originally developed by OCLC and NARA and maintained by the Center
for Research Libraries (CRL) (see
http://www.crl.edu/archiving-preservation/digital-archives/metrics-assessing-and-certifying).
These efforts combine the oversight and
technical capacity to help meet the nation's digital needs.

Enhanced registries. The Copyright Office's registration system represents a complex set
of administrative, financial, and technical compromises matrixed against an increasingly
complex, lengthening set of copyright regulations, with differing historical precedents for
text, audio, and visual images. Regardless of the long-term fitness of the current copyright
code, the Copyright Office's registry system needs to be reconceived to fit the
requirements of a digital age.

Digital media are no longer “fixed” in any way that might have been understood by
someone in the 19th or early 20th centuries. Because media are now capable of nearly
infinite transformations and derivatives that most of the general population can trivially,
and sometimes unknowingly, produce, a registration system has to be able to record a
rights biographical package, not merely an initial recording. A modern registry system
should record not only the essential profile of the publisher, but also the rights
attributes of the work as they change over time. Whole rights are assigned; they are
transferred; they are birthed across boundaries of international territory and language.
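
One way to picture a "rights biographical package" is as an append-only history of dated rights events, from which the holder in a given territory on a given date can be derived. The following is a minimal sketch under stated assumptions: the class names, field names, and sample values are hypothetical illustrations, not part of any existing Copyright Office system.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class RightsEvent:
    """One dated change to a work's rights profile (hypothetical fields)."""
    effective: date
    kind: str        # e.g. "registration", "assignment", "transfer"
    holder: str      # party holding the right after this event
    territory: str   # e.g. "US", "EU"


@dataclass
class RightsBiography:
    """Append-only history of rights events for a single work."""
    work_id: str
    events: list = field(default_factory=list)

    def record(self, event: RightsEvent) -> None:
        # Events are added, never overwritten, and kept in date order.
        self.events.append(event)
        self.events.sort(key=lambda e: e.effective)

    def holder_on(self, when: date, territory: str):
        """Return the holder for a territory as of a date, or None."""
        current = None
        for e in self.events:
            if e.effective <= when and e.territory == territory:
                current = e.holder
        return current


# Illustrative data only.
bio = RightsBiography(work_id="example-work-001")
bio.record(RightsEvent(date(1995, 6, 15), "transfer", "Successor Press", "US"))
bio.record(RightsEvent(date(1972, 3, 1), "registration", "Original Publisher", "US"))
```

Because the history is never rewritten, a query such as bio.holder_on(date(1980, 1, 1), "US") reflects the state of the record at that time rather than only its latest state.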

Additionally, the premise that an artifact eligible for deposit has a straightforward rights
profile has never been more tenuous: a book (to take just one media type) is almost always
a composite of distinct items: inserts such as forewords and afterwords; narratives; charts,
illustrations, and photographs; and in cutting edge scientific, technical, medical (STM)
and legal publications, increasingly hyperlinked databases and interactive simulations.

My conversations with publishers indicate a desire to move toward a next generation
registry system far more flexible than the Book Rights Registry of the proposed Google Book
Search class action settlement, which is naively focused on rights assertions (“this is
in-copyright”) as opposed to more useful cataloging of rights attributes (“publication date
1972”) that leave more leeway for risk assessment by the prospective user. Publishers
seek a publicly searchable and queryable system that would permit rights owners or their
delegates to digitally “sign” transactions to verify their authenticity. The adoption of
standard identifiers for books, music, and other media, combined with identifiers for
authors and contributors, would permit the packaging of rights and descriptive
information alongside associated publisher, distributor, and author data.

For historical registration information, an optimal system could support enhancements
and corrections to non-enforced (or non-digitally signed) elements of a work by registered
users, in the same manner that libraries improve the cataloging of an accessioned work by
contributing unique enhancements to common aggregations of bibliographic records, such
as OCLC's WorldCat (http://www.worldcat.org/) and the Internet Archive's OpenLibrary
(http://openlibrary.org/). Even among R1-class research university libraries, there is
relatively little “original” cataloging; rather, most cataloging work represents incremental
improvement.

The rights of digital objects do not stop at national borders, nor do their expected uses.
Registry transaction services will ultimately occur in an international context, with
maximal transparency. The European Union is already engaged in work that would assist
EU rights registration through initiatives such as ARROW (http://www.arrow-net.eu/).
Indeed, an interoperable rights registry on a global level would ultimately be endorsed at
an international treaty level via an agency such as WIPO, obviating concern over possible
conflict with the restrictions on formalities embodied in the current Berne Convention and
TRIPS Agreement.
Agreement on a simple technical standard for sharing of rights attributes through a
common metadata schema is tractable; much could be achieved in early-stage draft
implementation as discussions to formulate international registry interoperation progress.
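
To suggest how small such a shared schema could be, the sketch below serializes a rights record to JSON and re-validates it on receipt, as two registries exchanging data might. It is only an illustration: the field names, the list of required fields, and the sample identifier are assumptions made for this example and are not drawn from ARROW or any existing standard.

```python
import json

# Hypothetical required elements; a real standard would define its own.
REQUIRED_FIELDS = (
    "identifier",
    "title",
    "publication_date",
    "rights_holder",
    "territory",
)


def export_record(attrs: dict) -> str:
    """Serialize rights attributes to JSON, rejecting incomplete records."""
    missing = [name for name in REQUIRED_FIELDS if not attrs.get(name)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return json.dumps(attrs, sort_keys=True)


def import_record(payload: str) -> dict:
    """Parse a record received from another registry and re-check it."""
    attrs = json.loads(payload)
    export_record(attrs)  # raises ValueError if a required field is absent
    return attrs


# Illustrative data only.
record = {
    "identifier": "urn:example:work-001",
    "title": "An Example Work",
    "publication_date": "1972",
    "rights_holder": "Example Publisher",
    "territory": "US",
}
round_tripped = import_record(export_record(record))
```

The design choice worth noting is that both sender and receiver apply the same check, so an incomplete record is rejected at either end of the exchange.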

Interoperable rights registries would spur the genesis of innovative services in private
sector registration and rights maintenance. A workable analogy for the enactment of a
new registry system is embodied in the internet's domain name system (DNS). DNS is
governed by ICANN, and mandates technical interoperability and a core set of required
data elements to ensure that networked computers can find each other across the world.
DNS is a redundant, hierarchical system with a limited number of “root” server nodes;
similarly, the Copyright Office should host the national registry of record with secure
duplication through a small number of assured peer nodes to prevent either inadvertent or
malicious data corruption or loss. As long as the required data elements are collected,
and the resulting data is shared across the internet using a pre-determined technical
protocol, market competition in registry services can emerge.
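
The idea of a registry of record with a small number of assured peer nodes can be sketched as follows: every write is copied to each peer, and a content fingerprint reveals any peer whose copy has diverged. This is a simplified in-memory illustration, not a description of how DNS itself replicates data; the class names and node names are hypothetical.

```python
import hashlib
import json


def digest(records: dict) -> str:
    """Stable SHA-256 fingerprint of a registry's full contents."""
    canonical = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


class RegistryNode:
    """A node holding a copy of the registry (hypothetical)."""

    def __init__(self, name: str):
        self.name = name
        self.records = {}  # work identifier -> rights attributes


class RegistryOfRecord(RegistryNode):
    """Root node that copies every write to its peer nodes."""

    def __init__(self, name: str, peers):
        super().__init__(name)
        self.peers = list(peers)

    def write(self, work_id: str, attrs: dict) -> None:
        self.records[work_id] = dict(attrs)
        for peer in self.peers:
            peer.records[work_id] = dict(attrs)

    def divergent_peers(self):
        """Names of peers whose contents no longer match the root."""
        expected = digest(self.records)
        return [p.name for p in self.peers if digest(p.records) != expected]


# Illustrative data only.
root = RegistryOfRecord("root", [RegistryNode("peer-a"), RegistryNode("peer-b")])
root.write("work-1", {"publication_date": "1972"})
clean = root.divergent_peers()

# Simulate corruption on one peer, then detect it.
root.peers[1].records["work-1"]["publication_date"] = "1927"
corrupted = root.divergent_peers()
```

After the simulated corruption, only the altered peer is reported, which is the property that lets the root restore it from a good copy.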

Increasingly, government agencies and other organizations are making as much of their
data available to the public as possible through machine-readable interfaces
called application programming interfaces (APIs). APIs permit one computer to query
another computer for data elements it is authorized to share (see http://www.data.gov/).
The Copyright Office could define a public API for its next generation registry service that
would permit search and query by the public; registry updates would be permitted by
validated, audited third parties. An API would help ensure the accuracy and timeliness of
distributed data, and permit private sector companies to build around a common core
registry, and distinguish themselves by their value-added capabilities. Registration
agencies could specialize, for example, in breadth (international coverage) or depth
(ability to manage complex media packages) or complexity (ability to encode rights for
interactive media).

As with DNS, a content registrant could approach any approved registrar service
to catalog the associated rights attributes. By using the Copyright Office API, the registrar
would then update the root government registries for all new and any applicable existing
media objects. Rights and public ownership information would then be available for the
public to search and query.
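
The division of labor described above, in which approved registrars write and the public reads, can be sketched with an in-memory stand-in. No such Copyright Office API exists in this form; the class, method names, and registrar names below are hypothetical and serve only to show the shape of the interaction.

```python
class RootRegistry:
    """In-memory stand-in for a root registry (hypothetical interface)."""

    def __init__(self, approved_registrars):
        self._approved = set(approved_registrars)
        self._records = {}
        self.audit_log = []  # (registrar, work_id, fields changed)

    def update(self, registrar: str, work_id: str, attrs: dict) -> None:
        """Accept a create-or-update call, but only from approved registrars."""
        if registrar not in self._approved:
            raise PermissionError(registrar + " is not an approved registrar")
        merged = {**self._records.get(work_id, {}), **attrs}
        self._records[work_id] = merged
        self.audit_log.append((registrar, work_id, sorted(attrs)))

    def lookup(self, work_id: str):
        """Public read of one record; returns a copy, or None if absent."""
        record = self._records.get(work_id)
        return dict(record) if record is not None else None

    def search(self, **criteria):
        """Public query: identifiers of works matching all given attributes."""
        return sorted(
            work_id
            for work_id, rec in self._records.items()
            if all(rec.get(k) == v for k, v in criteria.items())
        )


# Illustrative data only.
registry = RootRegistry({"registrar-a"})
registry.update("registrar-a", "work-1", {"title": "Example", "publication_date": "1972"})
registry.update("registrar-a", "work-2", {"title": "Other", "publication_date": "1985"})
registry.update("registrar-a", "work-1", {"rights_holder": "Example Publisher"})
matches = registry.search(publication_date="1972")
```

Every write is logged with the registrar that made it, which is one simple way to support the auditing of third parties that the proposal calls for.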

Collective licensing. The example of mass digitization forces attention toward the means
of ensuring access to digital content when case-by-case procedural rights status
determination is either cost-prohibitive or practically infeasible. Such grounds may be
established for large scale simple search and discovery uses that do not incorporate
display (for reading or printing); or for situations which preserve critical library functions
in a digital age, such as digital lending for works without ready commercial availability, or
where factors of time or distance impede non-digital library lending. For this class of
cases, collective licensing can serve as an adjunct to, or when suitable, a surrogate for,
orphan works legislation. Proper attention and credit to the negotiations enacted through
the discussions between Google and the parties representing authors and publishers as
presented in the settlement proposal may inform new draft legislation supporting either
voluntary licensing, or mandatory licensing with rightsholder opt-outs (a common Nordic
model of collective licensing), for works with a high per-clearance transaction cost.

The GBS Settlement proposal's articulation of different classes of works has heightened
sensitivity to the utility of multiple considerations for determining levels of access for
licensed use by various organizations, whether internet search firms, or libraries or
museums, or digital booksellers. For literary works, the complex gradations – between
books that are in-print and commercially available; books that are out of print with
identified and consensual rightsholders; books that are out of print, with unknown or
unresponsive rightsholders; and books that have entered the public domain – have
contributed to calls for legislative solutions that are not biased towards a narrow single-
party vision of commercial opportunity.
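
The four gradations listed above can be expressed as a simple classification, sketched here only to show that they are distinguishable from a few recorded attributes. The function, its parameter names, and the order in which conditions are checked are assumptions made for illustration; the settlement proposal and any future legislation would define their own tests.

```python
from enum import Enum


class WorkClass(Enum):
    PUBLIC_DOMAIN = "public domain"
    IN_PRINT = "in print and commercially available"
    OOP_CLAIMED = "out of print, rightsholder identified and consenting"
    OOP_UNCLAIMED = "out of print, rightsholder unknown or unresponsive"


def classify(public_domain: bool, in_print: bool, rightsholder_responsive: bool) -> WorkClass:
    """Assign one of the four gradations (check order is an assumption)."""
    if public_domain:
        return WorkClass.PUBLIC_DOMAIN
    if in_print:
        return WorkClass.IN_PRINT
    if rightsholder_responsive:
        return WorkClass.OOP_CLAIMED
    return WorkClass.OOP_UNCLAIMED


# Illustrative call: an out-of-print work whose rightsholder cannot be reached.
example = classify(public_domain=False, in_print=False, rightsholder_responsive=False)
```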

There are a number of ways of enacting cascading control over a collective licensing
paradigm. The potential for claims by forthcoming rightsholders against potentially
infringing uses of previously unclaimed out-of-print (OOP) works might be more constrained for
not-for-profit or independent filmmakers, for example, than for commercial firms. Uses that
are not structurally derivative, and digital library lending programs, might be permitted
for claimed OOP works with rightsholder permission. Works that are classed as orphan
and have no known viable registration might be classed as provisionally public domain.
Perhaps an attractive path forward is to commence discussions for literary works and
then progress to a wider range of media.

A revitalized next generation registry system, alongside endorsement of digital deposit
strategies, can work hand in hand with collective licensing schemes to significantly broaden our
access to digital content. New legislation can incentivize voluntary enhanced and updated
registrations through the creation of inurements provided against claims of infringement
or violations of registry use.

Conclusion.

It is a queer product of technical and industrial development that we bear witness to
newly emerging conflicts between firms and organizations never previously in direct
competition, every one attempting to assert its determination of the future of digital
creation and use, exploitation and benefit. And yet this time of strife and innovation
affords us a unique opportunity to reshape copyright, registration, and access with a
vision that provides new means for companies and citizens to continue our greatest
collective enterprise unabated – our relentless creativity in art and science, and an
American entrepreneurship that reshapes the world.
