Data
Data is/are the facts of the World. For example, take yourself. You may be 5ft tall, have brown hair and blue
eyes. All of this is “data”. You have brown hair whether this is written down somewhere or not.
In many ways, data can be thought of as a description of the World. We can perceive this data with our
senses, and then the brain can process this.
Human beings have used data as long as we’ve existed to form knowledge of the world.
Until we started using information, all we could use was data directly. If you wanted to know how tall I was,
you would have to come and look at me. Our knowledge was limited by our direct experiences.
Information
Information allows us to expand our knowledge beyond the range of our senses. We can capture data in
information, then move it about so that other people can access it at different times.
Here is a simple analogy for you.
If I take a picture of you, the photograph is information. But what you look like is data.
I can move the photo of you around, send it to other people via e-mail etc. However, I’m not actually moving
you around – or what you look like. I’m simply allowing other people who can’t directly see you from where
they are to know what you look like. If I lose or destroy the photo, this doesn’t change how you look.
So, in the case of the lost tax records, the CDs were information. The information was lost, but the data
wasn’t. Mrs Jones still lives at 14 Whitewater road, and she was still born on 15th August 1971.
The Infogineering Model (below) explains how these interact…
Why Big6™?
We all suffer from information overload. There’s just too much “stuff” out there, and it’s not easy to keep up.
At the same time, there’s an irony—yes, we are surrounded by information, but we can never seem to find
what we want, when we want it, and in a form we want it so that we can use it effectively.
One solution to the information problem—the one that seems to be most often adopted in schools (as well as
in business and society in general)—is to speed things up. We try to pack in more and more content, to work
faster to get more done. But, this is a losing proposition. Speeding things up can only work for so long.
Instead, we need to think about helping students to work smarter, not faster. There is an alternative to
speeding things up. It’s the smarter solution—one that helps students develop the skills and understandings
they need to find, process, and use information effectively. This smarter solution focuses on process as well
as content. Some people call this smarter solution information literacy or information skills instruction. We
call it the Big6.
In addition to considering the Big6 as a process, another useful way to view the Big6 is as a set of basic,
essential life skills. These skills can be applied across situations—to school, personal, and work settings. The
Big6 Skills are applicable to all subject areas across the full range of grade levels. Students use the Big6 Skills
whenever they need information to solve a problem, make a decision, or complete a task.
The Big6™
Developed by Mike Eisenberg and Bob Berkowitz, the Big6 is the most widely known and widely used
approach to teaching information and technology skills in the world. Used in thousands of K-12 schools,
higher education institutions, and corporate and adult training programs, the Big6 information problem-
solving model is applicable whenever people need and use information. The Big6 integrates information
search and use skills along with technology tools in a systematic process to find, use, apply, and evaluate
information for specific needs and tasks.
The circular nature of the model demonstrates that becoming information literate is not a linear process; a
person can be developing within several pillars simultaneously and independently, although in practice they
are often closely linked.
Each pillar is further described by a series of statements relating to a set of skills/competencies and a set of
attitudes/understandings. It is expected that as a person becomes more information literate they will
demonstrate more of the attributes in each pillar and so move towards the top of the pillar.
LESSON 1: TYPES OF INFORMATION AND INFORMATION NEEDS
Information needs
Information seeking theories often refer to the concept of information needs, a presumed cognitive state
wherein an individual’s need state triggers the search behavior characteristic of information seeking in a
given context. While terms such as these have migrated from a common theory to everyday colloquial use,
their use in design research should be questioned and evaluated as in any research. There are other lenses to
view behavior that focus on motive, goals, activity contexts, but not necessarily “need,” whether information
or other personal need.
Information need goes back to a definition from Taylor’s (1962) article “The Process of Asking Questions”,
which describes four types:
The actual, but unexpressed, need for information (the visceral need)
The conscious, within-brain description of the need (the conscious need)
The formal statement of the question (the formalized need)
The question as presented to the information system (the compromised need).
2.1 Definition of Research
re·search: NOUN: 1. a detailed study of a subject, especially in order to discover (new) information or reach a
(new) understanding.
1. Find the population of each country in Africa or the total (in dollars) of Japanese investment in the U.S. in 2002.
A search for individual facts or data. May be part of the search for a solution to a larger problem or
simply the answer to a friendly, or not so friendly, bar bet! Concerned with facts rather than knowledge
or analysis, and answers can normally be found in a single source.
2. Find out what is known generally about a fairly specific topic. "What is the history of the Internet?"
A report or review, not designed to create new information or insight but to collate and synthesize
existing information. A summary of the past. Answers can typically be found in a selection of books,
articles, and Web sites.
[Note: gathering this information may often include activities like #1 above.]
Using a Topic to Generate Questions
Research requires a question for which no ready answer is available. What do you want to know about a
topic? Asking a topic as a question (or series of related questions) has several advantages:
1. Questions require answers.
A topic is hard to cover completely because it typically encompasses too many related issues;
but a question has an answer, even if it is ambiguous or controversial.
TOPIC QUESTION
Drugs and Crime Could liberalization of drug laws reduce crime in the U.S.?
2. A clear open-ended question calls for real research and thinking.
Asking a question with no direct answer makes research and writing more meaningful. Treating
your research as something that may solve significant problems or expand the knowledge base of a
discipline involves you in the more meaningful activity of community and scholarship.
Developing a Question
Developing a question from a broad topic can be done in many ways. Two such effective ways are
brainstorming and concept mapping.
brain·storm·ing noun: 1. A method of shared problem solving in which all members of a group
spontaneously contribute ideas. 2. A similar process undertaken by a person to solve a problem by rapidly
generating a variety of possible solutions.
The American Heritage® Dictionary of the English Language: Fourth Edition. 2000
Brainstorming is a free-association technique of spontaneously listing all words, concepts, ideas, questions,
and knowledge about a topic. After making a lengthy list, sort the ideas into categories. This allows you to
inventory your current awareness of a topic, decide what perspectives are most interesting and/or relevant,
and decide in which direction to steer your research.
con·cept map·ping noun phrase: 1. A process, focused on a topic, in which group or individual brainstorming
produces a visual graphic that represents how the creator(s) thinks about a subject, topic, etc. It illustrates
how knowledge is organized for the group or individual.
You may create a concept map as a means of brainstorming; or, following your brainstorm, you may take the
content you have generated and create your map from it. Concept maps may be elaborate or simple and are
designed to help you organize your thinking about a topic, recognize where you have gaps in your
knowledge, and help to generate specific questions that may guide your research.
Combining brainstorming and concept mapping (brainmapping, if you will) can be a productive way to begin
your thinking about a topic area. Try to establish as your goal the drafting of a topic definition statement
which outlines the area you will be researching and about which you will present your findings.
Information Sources
Primary, secondary, and tertiary sources
When searching for information on a topic, it is important to understand the value of primary, secondary,
and tertiary sources.
Primary sources allow researchers to get as close as possible to original ideas, events, and empirical
research. Such sources may include creative works, first-hand or contemporary accounts of
events, and the publication of the results of empirical observations or research.
Secondary sources analyze, review, or summarize information in primary resources or other secondary
resources. Even sources presenting facts or descriptions about events are secondary unless they are based
on direct participation or observation. Moreover, secondary sources often rely on other secondary sources
and standard disciplinary methods to reach results, and they provide the principal sources of analysis about
primary sources.
Tertiary sources provide overviews of topics by synthesizing information gathered from other resources.
Tertiary resources often provide data in a convenient form or provide information with context by which to
interpret it.
The distinctions between primary, secondary, and tertiary sources can be ambiguous. An individual
document may be a primary source in one context and a secondary source in another. Encyclopedias are
typically considered tertiary sources, but a study of how encyclopedias have changed on the Internet would
use them as primary sources. Time is a defining element.
While these definitions are clear, the lines begin to blur in the different discipline areas.
In the humanities and social sciences
In the sciences
Humanities and Social Sciences 1-2-3 sources
In the humanities and social sciences, primary sources are the direct evidence or first-hand accounts of
events without secondary analysis or interpretation. A primary source is a work that was created or written
contemporary with the period or subject being studied. Secondary sources analyze or interpret historical
events or creative works.
Primary sources
Diaries
Interviews
Letters
Original works of art
Photographs
Speeches
Works of literature
A primary source is an original document containing firsthand information about a topic. Different fields of
study may use different types of primary sources.
Secondary sources
Biographies
Dissertations
Indexes, abstracts, bibliographies (used to locate a secondary source)
Journal articles
Monographs
A secondary source contains commentary on or discussion about a primary source. The most important
feature of secondary sources is that they offer an interpretation of information gathered from primary
sources.
Tertiary sources
Dictionaries
Encyclopedias
Handbooks
A tertiary source presents summaries or condensed versions of materials, usually with references back to
the primary and/or secondary sources. They can be a good place to look up facts or get a general overview of
a subject, but they rarely contain original material.
Examples
Subject | Primary | Secondary | Tertiary
Art | Painting | Critical review of the painting | Encyclopedia article on the artist
History | Civil War diary | Book on a Civil War Battle | List of battle sites
Literature | Novel or poem | Essay about themes in the work | Biography of the author
Political science | Geneva Convention | Article about prisoners of war | Chronology of treaties
Formats
Data, facts, information, intelligence, and knowledge can be organized, presented and retrieved in many
physical formats:
Format | Description
Printed | Materials referenced and collected from print resources (hardback and paperback books, periodicals, print-on-demand (POD) documents, manuscripts, correspondence, loose leaf materials, notes, brochures, etc.)
Microform | Materials that have been photographed and their images developed in reduced size onto 35mm or 16mm film rolls or 4” x 6” fiche cards, which are viewed on machines equipped with magnifying lenses. In the UI Library this includes back issues of state, national, and international newspapers; non-current issues of magazines; older ERIC documents; and Agricultural Experiment Station documents.
Domain Names
The domain name tells you the type of organization sponsoring a page. The relevant part is the top-level
domain, a short code (three letters for the most common ones) that is part of the URL and preceded by a
"dot." Here are the most common domains.
Domain Description
.edu educational institution
Even though a page comes from an educational institution, it does not mean the
institution endorses the views published by students or faculty members.
.com commercial entity
Companies advertise, sell products, and publish annual reports and other company
information on the Web. Many online newspapers or journals also have .com names.
.gov government
Federal and state government agencies use the Web to publish legislation, census
information, weather data, tax forms and many other documents.
.org non-profit organization
Nonprofit organizations use the Web to promote their causes. These pages are good
sources to use when comparing different sides of an issue.
.net internet service providers
.mil U.S. military
In addition, more top-level domain names were added in 2001.
Domain Description
.aero for the air transportation industry
.biz general use by businesses
.coop restricted use by cooperatives
.info for both commercial and non-commercial sites
.museum for museums
.name for use by individuals
.pro restricted to professionals and professional entities
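As a rough illustration of reading the domain from a URL, the sketch below takes the last dot-separated part of the host name and looks it up in the two tables above. The function name `describe_domain` and the example URLs are made up for this sketch, and the lookup is only as complete as those tables.

```python
from urllib.parse import urlparse

# Descriptions taken from the two domain tables above.
TLD_DESCRIPTIONS = {
    "edu": "educational institution",
    "com": "commercial entity",
    "gov": "government",
    "org": "non-profit organization",
    "net": "internet service providers",
    "mil": "U.S. military",
    "aero": "air transportation industry",
    "biz": "general use by businesses",
    "coop": "restricted use by cooperatives",
    "info": "commercial and non-commercial sites",
    "museum": "museums",
    "name": "use by individuals",
    "pro": "professionals and professional entities",
}


def describe_domain(url: str) -> str:
    """Return the table description for the last label of the URL's host name."""
    host = urlparse(url).hostname or ""      # e.g. "www.example.edu"
    tld = host.rsplit(".", 1)[-1].lower()    # e.g. "edu"
    return TLD_DESCRIPTIONS.get(tld, "not listed in the tables above")


print(describe_domain("https://www.example.edu/library/guide.html"))  # educational institution
print(describe_domain("http://example.org/about"))                    # non-profit organization
```

Remember that the code only names the kind of sponsor; as noted for .edu, it says nothing about whether the sponsor endorses the content.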
I need to understand the scope of my topic | Guías Temáticas (UC3M library) | Browse subject guides with descriptions of relevant sites
I need to see related topics | Google | Uncover buried sites using "related searches"
I need to refine and narrow my topic | SurfWax | Search for your topic, then click "Focus" (top) to show similar, broader, and narrower topics
I need to choose a controversial issue | Hot Topics (Google Custom Search) | Begin your search on selective hot topic sites
I need personal help from experts | Ask an ipl2 Librarian | Get answers from volunteers and grad students in a week (K-12)
I need sites ranked or tagged as valuable or relevant | Google | High PageRank means popular, relevant sites link to the page
I need sites ranked or tagged as valuable or relevant | Technorati | Browse or search user-identified subjects ("tags") for blog advice or opinions
I need primary sources | American Memory | Locate documents, sound recordings, images, maps, and other American primary sources
within the last hour | Google Real-Time | Choose "past hour" from the left column for the most recent news
today | Google News | View top news stories and refine by category or topic
recent (with analysis) | BBC Special Reports | In-depth topic coverage including news features, analysis, photos, audio and video
a particular time period (decade, century, era) | HistoryWorld | Enter year event to retrieve timeline, then click on icons for information or images
a place | CIA World Factbook | Select country for basic profile and transnational issues
I need news from other countries' perspectives | World Press Review | Get nonpartisan summaries of views outside U.S.
I want to compare news treatment | Newseum | Compare news reporting on U.S. front pages
I want to compare news treatment | PressDisplay | Compare news reporting from 55 countries
I need a specific type of media...
photographs and visual images | Google Image Search | Use advanced search to limit by size, coloration, file type
almanac data | Country at a Glance (U.N.) | Basic country profile, use InfoNation to compare data from 6 countries
books and other printed works | WorldCat | Search for books and reviews, options to refine results, check your local library's holdings
use a search engine outside the U.S. | Search Engine Colossus | Browse search engines and directories from countries and territories
find sites organized by the Dewey Decimal System or Library of Congress Classification | Virtual LRC | Select Dewey Decimal number before searching
locate resources by file type | Google | Limit search by file type (.pdf, .ps, .doc, .xls, .ppt, .rtf)
ipl2 | Easy-to-navigate, well-
Goal = find documents relevant to an information need from a large document set.
How to search:
• Search by keywords typed in a box.
• Sometimes we can also search by specific fields (advanced search).
There is always a query language.
When we use these operators, we get the documents that satisfy the given conditions.
Boolean operators: OR
We get all the documents that contain the first keyword OR the second one → documents containing either one.
Boolean operators: NOT (-)
We get the documents that do NOT contain the term.
We use this operator to filter documents out of a previous search. Ex.:
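One way to picture the two operators is with sets of document ids. The sketch below is only an illustration: the four-document collection and the helper `docs_with` are invented here, and real search engines match terms in more sophisticated ways than a plain substring test.

```python
# Hypothetical mini-collection: document id -> text.
documents = {
    1: "stem cells and tissue engineering",
    2: "biomechanics of stem cells",
    3: "history of the internet",
    4: "engineering ethics",
}


def docs_with(term: str) -> set:
    """Ids of the documents whose text contains the term (case-insensitive)."""
    return {doc_id for doc_id, text in documents.items() if term.lower() in text.lower()}


stem = docs_with("stem cells")         # {1, 2}
engineering = docs_with("engineering")  # {1, 4}

either = stem | engineering    # OR: documents having either keyword
filtered = stem - engineering  # NOT: remove documents that have the term

print(sorted(either))    # [1, 2, 4]
print(sorted(filtered))  # [2]
```

OR widens the result set, while NOT narrows a result set that an earlier search produced.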
Shorteners, wildcards
Search by fields
4 Databases
How is information processed and stored in a Database?
DB have a structure (fields) & language
Uniform criteria for selecting, processing and recording
Formal analysis & Content analysis
o Tries to infer at the same time the intentions of the author and of the searcher
o Multidimensional
Selection of resulting clues
Translation into the system’s language
o Words, phrases, codes, numbers, etc.
o Control of the vocabulary and the subjects expressed
o Rules, syntaxes, indexing systems, classification schemes
May include, in addition, full text / raw data
Translating search clues
Clues can be words, terms, expressions, formulas, phrases, dates, numbers, codes, etc. and the
relationships between them.
Translation is done in different ways depending on system characteristics:
o search equations / queries (a combination of search parameters (translated clues) and
search operators (boolean, others, truncation)) Eg.: ((“stem cells” AND biomechanic) NOT
engineering) PY=2012
o fill-in forms or query menus
o indexes or automated thesauri
o use of codes and classification schemes or taxonomies, etc.
o folksonomies
In “friendly” systems: auxiliary functions (interface guides the translation).
Command languages: more powerful, efficient and precise, but need training.
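To make the search equation above more concrete, here is a minimal sketch that applies (("stem cells" AND biomechanic) NOT engineering) PY=2012 to a few invented records. It assumes that PY stands for publication year and that a term matches when it appears anywhere in the title; the `Record` class and the sample titles are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    year: int  # assumed meaning of the PY field: publication year


def matches(record: Record) -> bool:
    """Apply (("stem cells" AND biomechanic) NOT engineering) PY=2012 to one record."""
    text = record.title.lower()
    return (
        "stem cells" in text
        and "biomechanic" in text        # also matches "biomechanics", "biomechanical"
        and "engineering" not in text
        and record.year == 2012
    )


records = [
    Record("Biomechanics of stem cells", 2012),
    Record("Stem cells in biomechanical engineering", 2012),
    Record("Biomechanics of stem cells in cartilage", 2010),
]

hits = [r.title for r in records if matches(r)]
print(hits)  # ['Biomechanics of stem cells']
```

Each clue becomes one condition, and the operators decide how the conditions are combined.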
4.1 Information structure: databases
Terminology
The first thing you have to learn is a little bit of database terminology and concepts. Don't worry, it isn't hard
or even very confusing (hopefully).
Ok from the big to the small. A "database" is a collection of related "tables". A "table" is a collection of related
"records". A "record" is a collection of related "fields". And a "field" is a collection of related pieces of
information (the stuff we are after when we work with databases). So:
Database terminology for whatever reason uses multiple names for the same things at times. A record is a
row, a column is a field (it is also sometimes called an attribute), and a cell is a piece of information
(occasionally called data though technically it isn't).
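The terms can be tried out with a small sketch using Python's built-in sqlite3 module. The table name `people`, its three fields, and the sample values are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                                   # the database
conn.execute("CREATE TABLE people (name TEXT, zip TEXT, job TEXT)")  # a table with three fields
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("John", "83844", "Librarian"), ("Mary", "28903", "Engineer")],  # two records
)

cursor = conn.execute("SELECT * FROM people")
fields = [column[0] for column in cursor.description]  # field (column, attribute) names
records = cursor.fetchall()                             # records (rows)

# One cell: the job field of John's record.
cell = conn.execute("SELECT job FROM people WHERE name = 'John'").fetchone()[0]

print(fields)        # ['name', 'zip', 'job']
print(len(records))  # 2
print(cell)          # Librarian
```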
Concepts
OK - now for a little more information and concepts about tables. Each table is required to have a way to
uniquely identify each record in it. This allows one record's information to be accessed amidst the thousands
of other records. The field (or combination of fields) that holds the unique identifier is called the primary key.
When considering what field to declare as a table's primary key, you must keep in mind if the field will EVER
possibly have 2 records with the same entry or if the field will ever change. A person's name typically does
not work since there are many John Browns in the world. An address may eventually be changed. A social
security number or generated account number may be a better option as a table's primary key.
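A short sketch of what a primary key does, again using sqlite3 with an invented `customers` table: two people may share a name, but the database refuses a second record with the same account number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (account_number INTEGER PRIMARY KEY, name TEXT)")

conn.execute("INSERT INTO customers VALUES (1001, 'John Brown')")
conn.execute("INSERT INTO customers VALUES (1002, 'John Brown')")  # same name, different key: accepted

try:
    conn.execute("INSERT INTO customers VALUES (1001, 'Jane Smith')")  # reuses key 1001
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row = conn.execute("SELECT name FROM customers WHERE account_number = 1002").fetchone()

print(duplicate_rejected)  # True
print(row[0])              # John Brown
```

This is why a name makes a poor primary key: the two John Browns can only be told apart by their account numbers.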
Next major concept - indexes. An index allows a user to quickly find the information they are looking for.
Think of it like a book index - look up something and find exactly where it is in the book. A primary key is an
index (usually made automatically by the database). A table can have as many indexes as you want - to help
you find the information you seek. Just remember that the more indexes a table has the more space that is
being used by those indexes.
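As a hedged illustration, the sketch below adds an extra index to an invented `people` table in SQLite and asks the database how it plans to run a lookup. The index name `idx_people_last_name` is an assumption, and the exact wording of the plan differs between SQLite versions, so no expected output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO people (first_name, last_name) VALUES (?, ?)",
    [("Luke", "Lennon"), ("John", "Brown"), ("Mary", "Jones")],
)

# An extra index on last_name, in addition to the primary key.
conn.execute("CREATE INDEX idx_people_last_name ON people (last_name)")

# Ask SQLite how it would run the lookup; the last column of each row describes the plan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM people WHERE last_name = ?", ("Brown",)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)

print(plan_text)  # should mention idx_people_last_name
```

If the plan names the index, the database can jump to the matching records instead of reading the whole table.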
One more concept - the information contained in a cell usually is as granular as possible. What does that
mean? Basically it means that you break down the data into its smallest pieces. It is easiest to show an
example rather than explain.
You have a person's name. You can save it in the database attribute NAME as "Lennon, Luke M." or even
"Luke M. Lennon". But what if for some reason you want to know something about the Lennon Family? Now
you will have to manipulate the strings to isolate the last name. A better way to do it... Instead of having 1
field called NAME we should have 3 fields named FIRST_NAME, MIDDLE_NAME, and LAST_NAME. This will
allow a person's name to be broken down into its smallest parts. Making sense?
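Here is a minimal sketch of the three-field version, using sqlite3. The table name `people` and the extra sample people are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (FIRST_NAME TEXT, MIDDLE_NAME TEXT, LAST_NAME TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Luke", "M.", "Lennon"), ("Anna", "J.", "Lennon"), ("John", "A.", "Brown")],
)

# No string manipulation needed: the last name is already its own field.
lennons = conn.execute(
    "SELECT FIRST_NAME FROM people WHERE LAST_NAME = ? ORDER BY FIRST_NAME", ("Lennon",)
).fetchall()

print([row[0] for row in lennons])  # ['Anna', 'Luke']
```

With a single NAME field, the same question would require pulling "Lennon" out of strings such as "Lennon, Luke M." first.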
The last basic concept that I believe is important to know is that data held in the database is kept in no
guaranteed order. This includes the ordering of the fields (AKA columns) and the order of the information
inserted or returned (rows). In a database there is no difference if the columns are output as "Name Zip Job"
or "Job Name Zip". To the database they are the same thing. The database also does not care if the results
(generated by a query) position the record containing "John" as first or 15th or last. We will later discuss a
way to guarantee an output's ordering by using the database's computer language.
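As a small preview, and assuming the database language in question is SQL, the sketch below uses sqlite3 to show both points: the query chooses the column order, and an ORDER BY clause fixes the row order. The table and sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, zip TEXT, job TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Mary", "28903", "Engineer"), ("John", "83844", "Librarian"), ("Alice", "10001", "Teacher")],
)

# Column order is whatever the query asks for ("Job Name Zip" here).
john = conn.execute("SELECT job, name, zip FROM people WHERE name = 'John'").fetchone()
print(john)  # ('Librarian', 'John', '83844')

# Row order is only guaranteed when the query says ORDER BY.
names = [row[0] for row in conn.execute("SELECT name FROM people ORDER BY name")]
print(names)  # ['Alice', 'John', 'Mary']
```

Without ORDER BY, the rows may come back in any order the database finds convenient.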
Scope: What is in the database?
What subject areas are being covered? What years are covered? What type of materials (journals, books,
book chapters, dissertations, etc.) are included? Can you find a list of journals or other materials that are
included in the database? Check for any links to “About this database” for the answers to these questions.
Can you combine searches or add more concepts to your original search?
One quick way to reduce your results and focus your search is to add one or more additional concepts to
your search. See if you can type more terms into your search box, or if you need to modify your search in
another way. Also, is there a “Search History” feature available? If so, you may be able to combine some of
your previous searches into a new one that should reduce your results and focus your search. Also try
focusing your search by using controlled vocabulary terms as described in the “How does it search?” section
above.
Eliminate concepts
The more concepts you combine in a search, the fewer results you are likely to retrieve. If you get few or no
results from your search, try eliminating some of your concepts, limits, or modifiers.
5 Deep Web
The Web is fast becoming a titanic, complex entity. By the year 2015, it’s estimated that one zettabyte of
content will be added to the web each and every year. Navigating this sea of information presents more and
more of a challenge -- particularly when much of that content is not easily accessed by traditional search
engines.
When most of us think of the Web, we think of the 'Surface Web', also known as the visible web - the
webpages we access directly, via links or via common search engines like Google. However, the Surface Web
makes up just 4 percent of all the content on the Internet.
The ‘Deep Web’ or ‘Invisible Web’ is several orders of magnitude larger than the Surface Web and represents
a staggering 96 percent of information on the Web. This content includes:
• Dynamic or scripted content
• Unlinked content - pages which are not linked to by other pages, which may prevent web crawling
programs from accessing the content.
• Private or password-protected websites
• Webpages with content varying for different access contexts (e.g., ranges of client IP addresses or
previous navigation sequence).
• Limited access content - sites that limit access to their pages in a technical way
• Non-HTML/text content - textual content encoded in multimedia (image or video) files or specific file
formats not handled by search engines.