
DIGITIZATION OF LIBRARY MATERIALS

Learning Outcomes
Understand the digitization process.
Identify the hardware and software used in digitization.
Describe the technical issues in digitization.
Identify the file formats used in digitization.

Definition

Digital documents may be born-digital, created using digital
publishing tools (e.g. Word), or created by converting from an
analogue format to a digital format, or converted from one digital
format to another to suit the requirements of a particular digital
library. The process of capturing and converting from analogue to
digital format is often called 'digitization' or 'digitalization'.

Digitization is the conversion of an analogue signal or code into a
digital signal or code. (Lee, 2001, 3)

Approaches to Digitization
Born digital.
E-journals
E-books

Information surrogates.
i.e. originally produced in another form but subsequently digitized

Formulating a Digitization Policy
The digitization policy contains guidelines:

For access to materials.
An analysis of the audience.
A plan for preserving original and digital files.
A prescription for handling ownership rights issues.
A commitment to support digital projects.
Selection criteria for choosing materials to digitize.

Digitization Process
Involves four basic activities/stages:

1. Select material
2. Convert normal text into electronic text
3. Format electronic text for the Internet
4. Create website for access and navigation

Stage 1 (Select material)


Selection of materials
Obtain copyright clearance
Preparation and conservation of print materials

Stage 2 (Convert normal text into electronic text)


Scan or re-key text
In-house or outsource
OCR
Proofread text
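
The proofreading step can be partly supported by software. The sketch below is only an illustration: it assumes a trusted reference transcription (for example, a re-keyed page) is available and uses Python's standard difflib module to list the words where the OCR output disagrees with it. The function name and the sample strings are hypothetical.

```python
import difflib


def ocr_differences(ocr_text, reference_text):
    """Return (ocr_words, reference_words) pairs where the two texts disagree."""
    ocr_words = ocr_text.split()
    ref_words = reference_text.split()
    # Compare word sequences; autojunk is disabled so no word is ignored.
    matcher = difflib.SequenceMatcher(a=ocr_words, b=ref_words, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (" ".join(ocr_words[i1:i2]), " ".join(ref_words[j1:j2]))
            )
    return differences


# Hypothetical sample: OCR has confused the letter "l" with the digit "1".
ocr_output = "Digitization is the conversion of an analogue signa1 into a digita1 code"
reference = "Digitization is the conversion of an analogue signal into a digital code"

for wrong, right in ocr_differences(ocr_output, reference):
    print(f"OCR read {wrong!r}, reference has {right!r}")
```

A human proofreader is still needed; a comparison like this only narrows down where to look.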

Stage 3 (Format electronic text for the Internet)


Design website and interface


Develop XML DTD
Mark up text in XML
Index text for searching

Stage 4 (Create website for access and navigation)


Prepare style sheets for XML to HTML


Present finished product online
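
Stages 3 and 4 can be sketched in a few lines. In practice the XML-to-HTML step would normally be done with an XSLT style sheet; Python's standard library has no XSLT processor, so this stand-in performs the same kind of transformation in code. The element names (document, title, para) are assumptions made for the example, not a prescribed DTD.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical marked-up text; the element names are assumed for illustration.
SAMPLE_XML = """<document>
  <title>Annual Report 1952</title>
  <para>First paragraph of the digitized text.</para>
  <para>Second paragraph, with an ampersand: A &amp; B.</para>
</document>"""


def xml_to_html(xml_source):
    """Turn the simple XML above into a minimal HTML page."""
    root = ET.fromstring(xml_source)
    title = escape(root.findtext("title", default="Untitled"))
    lines = [
        "<html>",
        f"<head><title>{title}</title></head>",
        "<body>",
        f"<h1>{title}</h1>",
    ]
    for para in root.findall("para"):
        # Escape the text so characters such as & stay valid in HTML.
        lines.append(f"<p>{escape(para.text or '')}</p>")
    lines += ["</body>", "</html>"]
    return "\n".join(lines)


html_page = xml_to_html(SAMPLE_XML)
print(html_page)
```

Keeping the text in XML and generating HTML from it means the presentation can be changed later without touching the marked-up source.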

Steps Involved in Digitization


Initiate the digitization project
Establish start-up costs and secure funding
Prepare a detailed project plan including
milestones and deliverables.
Assess and select materials for digitization.



Digitize materials (prepare source material,
digitize, check quality)
Post-process digitized material, OCR (if
necessary), store in appropriate file formats,
compress (if necessary), catalogue, index.
Deliver materials and/or make them
accessible to users.
Support (prepare for maintenance, archiving,
migration, and so on).

Items Selected for Digitization

A library, digital or otherwise, is always a highly selective subset
of available information objects, segregated and favored, to which
access is enhanced.

Printed materials
Drawings, photographs, paintings, manuscripts, rare book materials,
and museum objects or relics.

Different types of materials may require different digitization
approaches and also different hardware, software, technology and
skills.

How do you select?


What are the determining factors?

Factors to be Considered in Selection
Objective of the project
The uniqueness of the materials
The materials' physical fragility
The demand for them
The target users
Time
Certification
Standardization

Guidelines for Selection


Copyright clearance.
Get adequate information about the image to
ensure retrieval from a database.
Find out the various image modalities that are
available in the collection (size and format).
Determine whether it is technically feasible to
capture the information (framed images and
oversized materials).
Determine who will use the images and how.

Factors to be Considered When Digitizing Mounted Images

Can the item be removed easily and safely from its mount or mat for
capture?
If the item needs to be removed, who should be involved and at what
levels? Conservators? Framing personnel? Material preparers?
Should an original mount be treated as part of the item, and should
it be included in the scan?
What type of physical adjustments are to be made if the original
item needs to remain as it is, and what type of complications may be
involved for scanning or imaging?

Technical Issues in Digitization Works


What hardware to use?
What software to use?
What about image file compression?
It all depends on the type of source materials
to be digitized.

Hardware
Scanner

Scanners are the most commonly used devices for capturing digital
images. Four major considerations:
Optical resolution
Bit depth (related to optical density, or OD)
Scan area
Scan time
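
Resolution, bit depth and scan area together determine how large a scanned image will be. The arithmetic can be sketched as follows; the page size, resolution and bit depths in the example are illustrative assumptions, not recommendations.

```python
import math


def scan_pixels(width_in, height_in, dpi):
    """Pixel dimensions of a scan: physical size (inches) times resolution (dpi)."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed image size in bytes for a given bit depth."""
    width_px, height_px = scan_pixels(width_in, height_in, dpi)
    return math.ceil(width_px * height_px * bits_per_pixel / 8)


# Assumed example: a letter-size page (8.5 x 11 in) scanned at 300 dpi.
print(scan_pixels(8.5, 11, 300))             # (2550, 3300)
print(uncompressed_bytes(8.5, 11, 300, 24))  # 24-bit color: 25245000 bytes
print(uncompressed_bytes(8.5, 11, 300, 1))   # 1-bit black and white: 1051875 bytes
```

Doubling the resolution quadruples the pixel count, which is why resolution and bit depth should be chosen with storage and scan time in mind.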

Digital camera

Supports RAW image file format
TIFF (Tagged Image File Format) is nice to have in
addition to RAW but not essential
Macro lens
Third-party imaging software such as Adobe
Photoshop
Digital zoom with a 35 mm film equivalent of 140 mm
A memory card of at least 512 MB

Types of Scanner
Flatbed scanner

Most appropriate choice
Widely used and versatile
Allows a single sheet or bound materials to be
placed face down on the scan bed.
Produces superb color
Can be fitted with an adapter for scanning
negatives and slides.

Overhead scanner

Large and bulky
Suitable for fragile materials
Color scanning is possible but very expensive

Sheet-fed scanner

Not a good choice because sheets of paper have to be fed
through the scanner one by one, so it is not suitable for
bound materials
Has the potential to damage loose
manuscripts, papers, and photographs
Involves excess handling, which has the
potential to ruin the materials.

Film scanner

Suitable for photographs, slides and negatives
Limited by size (5 x 7 cm is about as large as you
can scan)

Example

Zeutschel OS 12000 Bookcopy

Software
After an image has been created, software
may be required to:

Edit or post-process the image, e.g. to adjust the
tone or color.
Create a text file from an image file containing
only text, or text with graphics.

Types of Software Required


HTML editor
XML editor, parser and XSLT processor
Text editor and/or word processor
Image editor
Scanning software
OCR software
FTP software
Page layout and design software
PDF software

Compression and File Formats


Compression is done to reduce file sizes, and the type of
compression used must not noticeably affect the image quality.

Lossless compression
Lossy compression
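
The difference between the two types can be shown with a small sketch. It uses Python's standard zlib module for the lossless case. For the lossy case it uses a simple stand-in (discarding the low 4 bits of each gray value before compressing) rather than a real image codec such as JPEG. The pixel data is synthetic and assumed only for illustration.

```python
import zlib

# Synthetic 64 x 64 block of 8-bit grayscale values (assumed test data).
original = bytes((x * 7 + y * 3) % 256 for y in range(64) for x in range(64))

# Lossless: decompressing returns exactly the bytes that were compressed.
packed = zlib.compress(original, 9)
restored = zlib.decompress(packed)
assert restored == original

# Lossy stand-in: reduce each pixel to 16 gray levels, then compress.
# The discarded detail cannot be recovered after decompression.
quantized = bytes(value & 0xF0 for value in original)
packed_lossy = zlib.compress(quantized, 9)
restored_lossy = zlib.decompress(packed_lossy)

max_error = max(abs(a - b) for a, b in zip(original, restored_lossy))
print("original size:", len(original))
print("lossless size:", len(packed))
print("lossy size:", len(packed_lossy))
print("largest pixel error after lossy round trip:", max_error)
```

For master copies, where fidelity to the original matters most, the lossless round trip is the property to look for; lossy compression trades some of that fidelity for smaller files.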

File Formats

Various file formats for texts and images are available.
Different factors affect the choice of format at each stage of
the digitization process:
Acquisition. The first and most important step. The aim is to
maintain the highest fidelity to the original.
Archival storage. There must be a standard format that will
be readable in future.
Editing. A proprietary format may be useful to support any
editing of information.
Delivery. Destination device (screen, printer) and its
capabilities, delivery method, file size and network
bandwidth, and format support at destination.


Cloud Computing
Some analysts and vendors define cloud computing narrowly as an
updated version of utility computing: basically virtual servers
available over the Internet. Others go very broad, arguing anything
you consume outside the firewall is "in the cloud," including
conventional outsourcing.
Please refer to:
http://www.infoworld.com/d/cloud-computing/what-cloud-computing-really-means-031

Why digitize?
Increase access
Increase usage
Global usage
Preservation of original resource
Short-term preservation
Virtual collection
Organizational profile

What can be digitized?


Any information sources, packages


Teaching/learning materials, reports,
historical documents, photographs,
manuscripts, maps, MoU, exam papers etc.
Consider the value of sources to the users,
organizations, and country
Unique sources
Prioritize the sources to be digitized
Study the copyright status

Access to digitized information


Two ways of accessing:
1. Searching
2. Browsing
Browsing: users can glance through a list or
hierarchy for the required digitized information
resource.
Searching: based on the words or content of the
digitized resources. This may be enabled by
indexing the full texts of the digitized materials or
indexing according to some metadata structure,
such as author, title or descriptor.
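
The two access modes can be sketched with a small inverted index. Everything here is hypothetical: the three records, their field names and the function names are assumptions made for the example. A production system would normally rely on a search engine or database rather than in-memory dictionaries.

```python
import re
from collections import defaultdict

# Hypothetical digitized records with metadata and full text.
RECORDS = [
    {"id": 1, "author": "A. Rahman", "title": "Rare Maps of the Coast",
     "text": "Hand-drawn maps showing the coastline and harbours."},
    {"id": 2, "author": "B. Tan", "title": "Harbour Photographs",
     "text": "Photographs of the harbour taken from the old lighthouse."},
    {"id": 3, "author": "A. Rahman", "title": "Manuscript Letters",
     "text": "Letters describing the lighthouse and the maps kept there."},
]


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records, fields):
    """Map each word found in the given fields to the ids of records using it."""
    index = defaultdict(set)
    for record in records:
        for field in fields:
            for word in tokenize(record[field]):
                index[word].add(record["id"])
    return index


def search(index, query):
    """Return ids of records that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def browse(records):
    """Browsing: an alphabetical list of titles to glance through."""
    return sorted(record["title"] for record in records)


full_text_index = build_index(RECORDS, ["text"])
metadata_index = build_index(RECORDS, ["author", "title"])

print(search(full_text_index, "lighthouse maps"))  # full-text search
print(search(metadata_index, "rahman"))            # metadata search
print(browse(RECORDS))
```

Full-text indexing finds words anywhere in the content, while the metadata index only finds what a cataloguer recorded, which is why the quality of metadata affects retrieval.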

Cost of Digitization
1. Preparation time cost. Relates to the time it takes
to get the originals ready for scanning.
2. Handling cost. Large items such as maps, and
fragile materials like glass or paintings, will require
more time and effort.
3. Automated processing cost. Conversion of data to
machine-readable form. The more human
intervention is required, the more expensive the
process will be.
4. Skills/experience cost. The creation of complex
metadata requires an experienced and
knowledgeable person.
5. Optimization cost. Activities required to improve
the quality of the scanned items.

Cost of Digitization (cont.)

6. Resource cost. The cost of equipment, set-up,
software etc.
7. Quality assurance (QA) cost. E.g. the cost of
QA for colour or photograph data will
generally be higher than for black and white.
8. File size cost. The larger the file, the higher
the cost of storage media, movement of data
and its management.

Determining access to digital information sources

Method of access: Internet, intranet
Display only? Download or print?
Restricted access? General users; registered
users
IP validation, user ID and password
Thumbnail display for large images
Small files to facilitate speed of access
Images need to be indexed to facilitate searching
and retrieval
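
One way these access decisions might be combined is sketched below, using Python's standard ipaddress module for IP validation. The network ranges, the user list and the three access levels are all hypothetical, and password checking is assumed to have been done by a separate login step.

```python
import ipaddress

# Hypothetical institutional networks (documentation-only address ranges).
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Hypothetical registered users who have already passed the login step.
REGISTERED_USERS = {"student01"}


def access_level(client_ip, user_id=None):
    """Decide what a visitor may do with a digitized item (assumed policy)."""
    address = ipaddress.ip_address(client_ip)
    if user_id in REGISTERED_USERS:
        return "download"   # registered users may download or print
    if any(address in network for network in ALLOWED_NETWORKS):
        return "display"    # IP-validated users may view the full image
    return "thumbnail"      # general users see thumbnails only


print(access_level("192.0.2.15"))
print(access_level("203.0.113.9"))
print(access_level("203.0.113.9", "student01"))
```

The actual mapping from user category to permitted action is a policy decision for the library; the code only shows where IP validation and registration fit in.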

Thank You
