Professional Documents
Culture Documents
We can consider the database on three levels of abstraction: external, conceptual, and internal.
(See Figure 1.)
The external level has the users' views of the database. Depending on their needs, different users
access different parts of the database. For example, a doctor performing drug tests should be able
to access the patients' medical data but not their hospital bills. However, a billing clerk should
have a very different view of the database.
The conceptual level describes the logical structure of an entire database, including descriptions
of the data and relationships among the data. For example, at this level we would describe a row
of the tableStudyA05 as containing the values for SSN and placebo. However, we would not give
the details of the physical storage of the fields and records.
The internal level gives the details of the physical storage of the database on the computer. This
level contains such details as the number of bytes for each data item, ordering of records, and
data compression techniques. For example, at this level we would describe the attribute SSN as
10 bytes to store the nine-digit social security number.
Definition The external level of a database has the users' views of the database.
The conceptual level describes the logical structure of the entire
database, including descriptions of the data and relationships among
the data. The internal level gives the details of the physical storage of
the database on the computer.
Conceptual Level
SS Dosa LastNa FirstNa …
N ge me me
Internal Level
struct {
string SSN;
double dosage;
string LastName;
string FirstName;
...
}
To make the description of the model more complete, we consider the example of a physics
department at a college that maintains a database of experimental results. Throughout a
laboratory, students collaborate and share their results and access data sets from other semesters
on a computer system. For example, in the laboratory session on "Freely Falling Objects with
Significant Drag," students determine the drag coefficient by dropping dust balls from different
heights and measuring the times they take to fall. Each team enters its results into the distributed
database, and the class analyzes the data. After a team enters data into the web-accessed
database, all students can obtain the measurements simultaneously. To simplify the analysis, we
assume that the database only stores results related to this experiment over a period of several
years.
Entity
An entity is a real-world item or concept that exists on its own. In our example, a particular
student (such as, "Emanuel Vagas"), team, lab section, or experiment is an entity. The set of all
possible values for an entity, such as all possible students, is the entity type. In an ER model, we
diagram an entity type as a rectangle containing the type name, such as student (see Figure 2).
Definition An entity is a real-world item or concept that exists on its own. The
set of all possible values for an entity is the entity type.
Attribute
Each entity has attributes, or particular properties that describe the entity. For example, student
Emanuel Vagas has properties of his own Student Identification number, name, and grade. A
particular value of an attribute, such as 93 for the grade, is a value of the attribute. Most of the
data in a database consists of values of attributes. The set of all possible values of an attribute,
such as integers from 0 to 100 for a grade, is the attribute domain. In an ER model, an attribute
name appears in an oval that has a line to the corresponding entity box, such as in Figure 3.
Definition An attribute of an entity is a particular property that describes the
entity. The set of all possible values of an attribute is the attribute
domain.
o
f
F
o
r
m
B
o
t
t
o
m
o
f
F
o
r
m
Sometimes the value of an attribute is unknown or missing, and sometimes a value is not
applicable. In such cases, the attribute can have the special value of null. For example, until the
professor grades a laboratory assignment, the team grade is missing or null. For a student who is
auditing a course but participating as a team member, it is not applicable for that student to have
an individual grade; the student's grade can have the value of null.
Definition Null is the special attribute value that indicates an unknown or
missing value.
An attribute can be simple or composite. A simple attribute, such as grade, is one component
that is atomic. If we consider the name in two parts, last name and first name, then the name
attribute is a composite. A composite attribute, such as "Emanuel Vagas", has multiple
components, such as "Emanuel" and "Vagas"; and each component is atomic or composite. We
illustrate this composite nature in the ER model by branching off the component attributes, such
as in Figure 4.
Definition A simple attribute is one component that is atomic. A composite
attribute has multiple components, each of which is atomic or
composite.
Definition For a particular entity, an entity attribute that holds exactly one value
is a single-valued attribute. A multi-valued attribute has more than
one value for a particular entity.
A derived attribute can be obtained from other attributes or related entities. For example, the
radius of a sphere can be determined from the circumference. We request the derived attribute
with a dotted oval and line, such as in Figure 6.
Figure 6. ER diagram notation for derived attribute, radius
An attribute or set of attributes that uniquely identifies a particular entity is a key. For example,
Emanuel Vagas' Student Identification Number uniquely identifies him. However, to determine
the class we need a composite key that consists of several attributes, such as catalogue number,
section, semester, and year. In the ER diagram of Figure 7, we underline the composite
key, class. (The figure shows another attribute (DragExpWeek) of LabSection that stores the
week of the semester (1-15) in which the drag experiment occurs.)
o
f
F
o
r
m
Quick Review Question 2 Give the ER diagram notation for relationship type runs, B
o
where each team conducts a drag experiment run (DragExpRun). t
t
o
m
o
f
F
o
r
m
The degree of a relationship type is the number of entity types that participate. Thus,
the LabSecMemberOf relationship type of Figure 9 has degree 2, which we call
a binary relationship type. To clarify the role that an entity plays in each relationship instance,
we can label a connecting edge with a role name that indicates the purpose of an entity in a
relationship. For example, we can have two binary relationship types associating
the student and team types, TeamMemberOf and LeaderOf. In the former case, a student entity is
a member of a team entity; in the later case, a student can be a leader of a team. We illustrate the
situation in Figure 10.
Definition The degree of a relationship type is the number of entity types that
participate. If two entity types participate, the relationship type
is binary. A role nameindicates the purpose of an entity in a
relationship.
Figure 10. ER diagram notation with roles member, leader, and lab team
As Figure 11 illustrates, a relationship type can also have attributes. The relationship
type order connects entities chemical and supplier. The relationship is many-to-many because
each chemical can be from several suppliers and each supplier has a number of chemicals. An
order has a purchase date, amount, and total cost as well as the chemical and supplier
information. Thus, order has attributes PurchaseDate,amount, and TotalCost that we cannot
appropriately associate with chemical or supplier.
Figure 11. Relationship type with attributes
Recursive Relationship
Another situation involving roles that can arise in an application is a recursive relationship.
Suppose one team is assigned the task of supervising the other teams. This supervisor team
makes sure that each team records its portion of the data. The supervision relationship is
a recursive relationship because the same entity, a particular team, participates more than once
in the relationship, as a supervisor and as a supervisee. To draw the situation, we label the two
lines between team and supervision with the two possible roles (see Figure 12).
Definition A recursive relationship is one in which the same entity participates
more than once in the relationship.
Figure 12. ER diagram notation for recursive relationship type, supervision
Weak and Strong Entity Types
In the physics laboratory ER model example, the entity type student is strong because its
existence does not depend on some other entity type. However, the team entity type is weak. The
existence of teamdepends on the existence of LabSection, and we call the in identifying
relationship. As (see Figure 13) illustrates, we draw double lines around the identifying
relationship, the team entity type, and the line connecting the two to indicate the weak entity
type.
Definition An entity type is strong if its existence does not depend on some other
entity type. Otherwise, the entity type is weak.
Figure 13. ER diagram notation for weak entity type, team, and identifying relationship, in
Cardinality Constraints
If each team can have at most one student leader and a student can be a leader of at most one
team, we have a 1:1 or one-to-one relationship. We can illustrate this ratio by writing ones on
the lines indicating the relationship, as in Figure 14. Several teams can be in each lab section,
students as members, so we write the cardinality ratio as N:1 in this many-to-one
relationship and draw the diagram, as in Figure 15. Similarly, we can have a one-to-many
relationship.
Figure 14. ER diagram notation for one-to-one relationship
Because a student can retake a course, a student could be a member of more than one team.
Thus, we would need different variables, say N and M, for the numbers of student and team
entities, respectively; and the ratio would be N:M in this many-to-many relationship. As Figure
16 diagrams, a team has any number of students, and a student can participate in several teams
during his or her college career. Ratios, such as 1:1, 1:N, N:1, and N:M, give a cardinality
constraint or numeric restriction on the possible relationships.
Top of Form
Bottom of Form
Participation Constraints
Every student must be a member of a team, or, in other words, a student entity is of interest only
if it participates in a MemberOf relationship. Thus, we can include in an ER diagram
a participation constraint in which participation of student in MemberOf is total. A double line
indicates the total participation constraint in an ER model (see Figure 17). However, the
participation of student in LeaderOfin Figure 10 is partial, because a student might be a team
leader.
Figure 17. ER diagram notation for total participation constraint
Using the above components, we have the ER model in Figure 18.
Figure 18. ER diagram notation for total participation constraint
Attribute
Key attribute
Derived attribute
Multivalued attribute
Composite attribute
Relationship type
Total participation
Many-to-one
relationship
Projects
For each of the projects, do the following:
a. Draw an ER diagram. Make sure each entity type has at least one key attribute. Document any
assumptions you make. Avoid diagrams with a single entity.
b. For each attribute, determine its domain of values, whether null is an acceptable value, and, if
acceptable, what null indicates.
c. If you think the description is incomplete, list other data the database should store.
d. Adjust your design to incorporate these additions.
1. A chemistry department wants to have a database of all chemicals in the stockroom. The
information includes the name, molecular formula, amount on hand, date purchased, supplier,
and supplier contact information.
2. A space agency wants to develop a database of all satellites that humans have launched into
space. Data includes the satellite identification, date of launch, destruction date, purpose,
maximum orbital altitude, launching location, launching agency, and contact information for
agency.
3. An environmental agency wants to catalogue all the plants in an area that is susceptible to acid
rain. Data should include genus, species, quantity, date, quadrant identification number, quadrant
location, average altitude of quadrant, and botanist.
4. A psychological study requires participants to answer a number of questions related to
personality. The database should store the multiple choice answers (A, B, C, D) to the questions
and information about each participant, such as participant id, age, and sex. The database should
compute a score based on the individual's answers. The score indicates one of 8 personality
categories. Each category has an identifying name and specifies that each of three qualities is
either true or false.