Dr. A. Brennan
NOTE:
Physical Design
What is it?
Why?
Data Types
CHAR fixed-length character
VARCHAR2 variable-length character (memo)
LONG variable-length character data up to 2 GB (legacy type)
NUMBER positive/negative number
DATE actual date
BLOB binary large object (good for graphics,
sound clips, etc.)
Goals
Data type
Integrity Controls
Physical Records
A Physical Record is a group of fields that are stored in adjacent secondary memory
locations and are retrieved and written together as a unit by a particular DBMS.
Scope:
Efficient use of secondary storage (influenced by both the size of
the record and the structure of the secondary storage)
Data processing speed.
Computer operating systems read data from secondary memory in units called pages.
A page is the amount of data read or written by an operating system in one operation.
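Since whole pages are read at a time, record size relative to page size determines how much of each page is wasted. The following is a minimal sketch of that arithmetic, assuming records are never split across pages; the function names and the 4096-byte page and 300-byte record are illustrative values, not taken from the slides.

```python
def records_per_page(page_size: int, record_size: int) -> int:
    """Whole records that fit in one page (assumes records are not split)."""
    if page_size <= 0 or record_size <= 0:
        raise ValueError("sizes must be positive")
    return page_size // record_size


def wasted_bytes_per_page(page_size: int, record_size: int) -> int:
    """Bytes left unused at the end of each page."""
    return page_size - records_per_page(page_size, record_size) * record_size


def pages_needed(n_records: int, page_size: int, record_size: int) -> int:
    """Pages required to store n_records, rounding up to a whole page."""
    per_page = records_per_page(page_size, record_size)
    if per_page == 0:
        raise ValueError("record is larger than a page")
    return -(-n_records // per_page)  # ceiling division


# Illustrative values only: 4096-byte pages, 300-byte records.
print(records_per_page(4096, 300))       # 13
print(wasted_bytes_per_page(4096, 300))  # 196
print(pages_needed(10_000, 4096, 300))   # 770
```

Shrinking the record or choosing a size that divides the page evenly reduces the wasted space and the number of page reads per scan.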
Normalization
What is Denormalization?
Denormalization is the process of transforming normalised relations into non-normalised physical record specifications.
Denormalization
In addition, the following factors have to be
considered:
Application specific;
Denormalization may speed up retrievals but it
slows down updates
Size of tables
Coding
Answer
Benefits:
Can improve performance (speed)
Problems (due to data duplication):
Wasted storage space
Data integrity/consistency threats
Denormalisation How?
Option one: Combine attributes from several logical
relations together into one physical record in order to avoid
doing joins (one to one, many to many, one to many)
Option two: Partition a logical relation into several
physical records (multiple tables);
Option three: Data replication; or a combination of the
two options above.
Denormalisation Option 1
1. Two entities with a
one-to-one relationship
Mapping
Try this!
[Figure: ER diagram. EMPLOYEE (Employee PPS, Name, Address) Manages MANAGER (Manager ID, Expertise). Attributes shown for the combined relation: Expertise, EmployeePPS, Name, Address.]
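One way to see Option 1 for a one-to-one relationship is to build both the normalised tables and a single combined table, then compare the queries. The sketch below uses SQLite through Python's standard `sqlite3` module and the EMPLOYEE and MANAGER entities named above; the column names, the sample rows and the `employee_denorm` table name are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalised design: two tables in a one-to-one relationship.
CREATE TABLE employee (pps TEXT PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE manager (
    manager_id   INTEGER PRIMARY KEY,
    expertise    TEXT,
    employee_pps TEXT UNIQUE REFERENCES employee(pps)
);
INSERT INTO employee VALUES ('1234567A', 'Ann', 'Galway'),
                            ('7654321B', 'Bob', 'Cork');
INSERT INTO manager VALUES (1, 'Finance', '1234567A');

-- Denormalised physical record: one table, so no join at query time.
-- Employees who are not managers carry NULLs in the manager columns.
CREATE TABLE employee_denorm AS
SELECT e.pps, e.name, e.address, m.manager_id, m.expertise
FROM employee e LEFT JOIN manager m ON m.employee_pps = e.pps;
""")

joined = conn.execute(
    "SELECT e.name, m.expertise FROM employee e "
    "JOIN manager m ON m.employee_pps = e.pps"
).fetchall()
single = conn.execute(
    "SELECT name, expertise FROM employee_denorm WHERE manager_id IS NOT NULL"
).fetchall()
null_rows = conn.execute(
    "SELECT COUNT(*) FROM employee_denorm WHERE manager_id IS NULL"
).fetchone()[0]

print(joined, single, null_rows)
```

Both queries return the same rows, but the combined table wastes space on NULL manager fields for every non-manager, which is the storage cost mentioned earlier.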
Denormalisation Option 1
2. Many-to-many relationship (associative entity) with non-key attributes
Denormalisation Option 1
Physical Model: Denormalised Relation
Denormalisation Option 1
3. One-to-many relationship
Denormalisation Option 2
Option 2: Partitioning of a logical relation into multiple tables
Horizontal partitioning - places different rows of a table into several physical files, based
on common column values.
Vertical partitioning - distributes the columns of a table into several separate files,
repeating the primary key in each one of them.
Vertical partitioning
CUSTOMER (CustID, FirstName, MiddleName, LastName, Address1, Address2, City, County, Country, Phone, CreditLimit, SalesTaxRate, Fax, Email)
CUSTOMERA (CustID, FirstName, MiddleName, LastName, Address1, Address2, City, County, Country, Phone, Fax)
CUSTOMERB (CustID, CreditLimit, SalesTaxRate)
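A minimal sketch of vertical partitioning, assuming SQLite via Python's `sqlite3` module and only a subset of the CUSTOMER columns. The table names `customer_a` and `customer_b`, the view name and the sample rows are illustrative assumptions. The primary key is repeated in both partitions so the original relation can be rebuilt with a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Contact columns in one partition, credit columns in the other.
CREATE TABLE customer_a (
    cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, city TEXT
);
CREATE TABLE customer_b (
    cust_id INTEGER PRIMARY KEY REFERENCES customer_a(cust_id),
    credit_limit INTEGER, sales_tax_rate REAL
);
INSERT INTO customer_a VALUES (1, 'Ann', 'Byrne', 'Galway'),
                              (2, 'Bob', 'Nolan', 'Cork');
INSERT INTO customer_b VALUES (1, 5000, 0.23), (2, 1500, 0.23);

-- The logical CUSTOMER relation is rebuilt by joining on the repeated key.
CREATE VIEW customer AS
SELECT a.cust_id, a.first_name, a.last_name, a.city,
       b.credit_limit, b.sales_tax_rate
FROM customer_a a JOIN customer_b b ON b.cust_id = a.cust_id;
""")

# A contact-only query touches just one partition.
contact_rows = conn.execute(
    "SELECT first_name, city FROM customer_a ORDER BY cust_id"
).fetchall()
# A query needing every column pays for the join.
full_rows = conn.execute("SELECT * FROM customer ORDER BY cust_id").fetchall()

print(contact_rows)
print(full_rows)
```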
Horizontal partitioning
CUSTOMER (CustID, FirstName, MiddleName, LastName, Address1, Address2, City, County, Country, Phone, CreditLimit, SalesTaxRate, Fax, Email)
CUSTOMERA-M (CustID, FirstName, MiddleName, LastName, Address1, Address2, City, County, Country, Phone, CreditLimit, SalesTaxRate, Fax)
CUSTOMERN-Z (CustID, FirstName, MiddleName, LastName, Address1, Address2, City, County, Country, Phone, CreditLimit, SalesTaxRate, Fax)
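A minimal sketch of horizontal partitioning, again assuming SQLite via `sqlite3`. The slides do not say which column drives the A-M / N-Z split, so splitting on the first letter of the last name is an assumption here, as are the table names, the `partition_for` helper and the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
columns = "(cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
conn.execute(f"CREATE TABLE customer_a_m {columns}")
conn.execute(f"CREATE TABLE customer_n_z {columns}")
# The logical CUSTOMER relation is the union of the row partitions.
conn.execute(
    "CREATE VIEW customer AS "
    "SELECT * FROM customer_a_m UNION ALL SELECT * FROM customer_n_z"
)


def partition_for(last_name: str) -> str:
    """Pick the partition by first letter of the last name (assumed rule)."""
    return "customer_a_m" if last_name[:1].upper() <= "M" else "customer_n_z"


def insert_customer(cust_id: int, first_name: str, last_name: str) -> None:
    table = partition_for(last_name)  # always one of the two fixed names
    conn.execute(
        f"INSERT INTO {table} VALUES (?, ?, ?)",
        (cust_id, first_name, last_name),
    )


for row in [(1, "Ann", "Byrne"), (2, "Bob", "Nolan"), (3, "Cara", "Murphy")]:
    insert_customer(*row)

a_m = conn.execute("SELECT last_name FROM customer_a_m ORDER BY cust_id").fetchall()
n_z = conn.execute("SELECT last_name FROM customer_n_z ORDER BY cust_id").fetchall()
all_ids = [r[0] for r in conn.execute("SELECT cust_id FROM customer ORDER BY cust_id")]

print(a_m, n_z, all_ids)
```

Queries restricted to one range read only one partition, while queries over all customers must combine both, which is where the extra retrieval cost comes from.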
Advantages of partitioning:
Efficiency
Local optimisation
Recovery
Disadvantages of partitioning:
Slow retrieval
Complexity
Extra space and time for updates
Denormalisation Option 3
Denormalisation Disadvantages
Whose responsibility?
DBMS
Database Designer
File Organisation
https://www.youtube.com/watch?v=zDzu6vka0rQ
Larger tables
Attributes which are referenced in ORDER BY or
GROUP BY clauses
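Assuming the two points above describe where an index is worth creating, the effect on a sorted query can be observed with SQLite's `EXPLAIN QUERY PLAN`. This is a sketch only: the `orders` table, the index name and the generated rows are illustrative, and the exact plan text varies between SQLite versions, so it is printed rather than quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [(f"cust{i % 50:02d}", float(i)) for i in range(1000)],
)


def plan(sql: str) -> str:
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


query = "SELECT customer, amount FROM orders ORDER BY customer"

before = plan(query)  # no index yet: the planner has to sort the rows itself
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
after = plan(query)   # the index already holds customers in sorted order

print("before:", before)
print("after: ", after)

customers = [r[0] for r in conn.execute(query)]
```

The index speeds up the sorted read but must be maintained on every insert and update, so it pays off mainly on larger, frequently queried tables.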
https://www.youtube.com/watch?v=h2d9b_nEzoA
DB Architecture
Note
Top management: strategic
Middle management: tactical
Operational management: support company operations
MIS
DSS
Database
TPS
Management data
Data/database administration
Database administration
operationally oriented
responsible for day-to-day monitoring and management of
active database
liaison and support during application development
Data administrator
Data coordination
Data standards
Database administrator
db activity
db service
planning
end-user support
organising
testing
of
monitoring
delivering
or passive
integrated or active
Metadata in Access
Data Dictionary
database
all data about entities are entered into the dictionary
requests for metadata information are run as reports
and queries as necessary
Table construction
Security
Physical residence
Impact of change
Responsibility
who
managerial:
cultural:
Data is a commodity
DATABASE SECURITY
Firewalls
Encryption
Plugging known security holes
using
delete
secret
Firewalls
Encryption
Encryption: encoding or scrambling data to make it unintelligible
to those without the key
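The idea of scrambling data with a key can be illustrated with a toy XOR cipher. This is a teaching sketch only, not something to protect real data with: production systems would use a vetted algorithm such as AES from a maintained library. The function name and the sample message are assumptions for illustration.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte.

    Applying the same key twice restores the original data, so this one
    function both scrambles and unscrambles.
    """
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"CreditLimit=5000"
key = secrets.token_bytes(len(message))  # random key, kept secret

ciphertext = xor_bytes(message, key)     # unintelligible without the key
recovered = xor_bytes(ciphertext, key)   # the key holder can reverse it

print(ciphertext.hex())
print(recovered)
```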
encryption
Redundancy
Virus protection
Disaster protection
Minimise error
Alert network managers to problems
Minor disruptions require on-going monitoring
validation
check digits
hash totals
cross checking
batch totals
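Check digits and batch or hash totals from the list above can be sketched in a few lines. The slides do not name a check-digit scheme, so the widely used Luhn algorithm is assumed here. The record layout (`account`, `amount_cents`) and the sample batch are likewise illustrative.

```python
def luhn_check_digit(payload: str) -> int:
    """Compute the Luhn check digit for a string of digits."""
    total = 0
    # Walk right to left, doubling every second digit starting with the
    # rightmost payload digit; doubled values above 9 have 9 subtracted.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def is_valid(number: str) -> bool:
    """True if the last digit matches the check digit of the rest."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


def batch_controls(records: list[dict]) -> dict:
    """Control figures sent with a batch and recomputed on receipt."""
    return {
        "record_count": len(records),
        # Batch total: sum of a meaningful amount field.
        "batch_total": sum(r["amount_cents"] for r in records),
        # Hash total: sum of a field with no arithmetic meaning.
        "hash_total": sum(r["account"] for r in records),
    }


batch = [
    {"account": 1001, "amount_cents": 2500},
    {"account": 1002, "amount_cents": 1000},
    {"account": 1003, "amount_cents": 750},
]

print(luhn_check_digit("7992739871"))  # 3
print(is_valid("79927398713"))         # True
print(batch_controls(batch))
```

A single mistyped digit changes the check digit, and a dropped or altered record changes at least one of the control totals, so both errors are caught before the data is accepted.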
Software Invasion
Cruise virus
Worm
Trapdoor
delivers payload
or bypasses normal
security procedures
Trojan horse
looks like something else
Stealth viruses
encrypts itself and hides its tracks
Logic bomb
event driven
Education!!
Distributed databases
Distributed database
Distributed processing/database
Distributed Processing
Shares data processing chores over
sites using communications network
Database resides at one site only
Distributed Database
Each site has a data fragment
which might be replicated at
other sites
Requires distributed processing
DDBMS
Advantages
Reflects organisational
structure
Faster data access and
processing
Improved communications in
org.
Reduced operating costs
Improved share-ability and
local autonomy
Less danger of single-point
failure
Modular growth easier
Disadvantages
Complexity of management
and control
Security
Integrity control more
difficult
Lack of standard comms.
protocols for dbs
Increased training costs
Database design more
complex
Characteristics DDBMS
DDBMS features
OR
THEN
Data allocation:
Data fragmentation
Data fragmentation
completeness: each data item in the relation should appear in at least
one fragment
reconstruction: should be able to define a relational
operation that will reconstruct the relation from its fragments
disjointness: a data item appearing in one fragment
should not appear in another
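The fragmentation rules above can be checked mechanically for a horizontal split. The sketch below also includes the usual companion check that every row lands in at least one fragment. The relation, the split on city and the helper names are assumptions for illustration, with plain Python lists standing in for tables.

```python
# A small relation: (cust_id, name, city). Sample data is illustrative.
customers = [
    (1, "Ann", "Galway"),
    (2, "Bob", "Cork"),
    (3, "Cara", "Galway"),
    (4, "Dan", "Dublin"),
]

# Horizontal fragments: each one holds the rows matching a predicate.
frag_galway = [row for row in customers if row[2] == "Galway"]
frag_other = [row for row in customers if row[2] != "Galway"]
fragments = [frag_galway, frag_other]


def is_complete(relation, frags) -> bool:
    """Every row of the relation appears in at least one fragment."""
    return all(any(row in frag for frag in frags) for row in relation)


def is_disjoint(frags) -> bool:
    """No row appears in more than one fragment."""
    seen = set()
    for frag in frags:
        for row in frag:
            if row in seen:
                return False
            seen.add(row)
    return True


def reconstruct(frags):
    """Rebuild the relation as the union of its horizontal fragments."""
    return sorted(row for frag in frags for row in frag)


print(is_complete(customers, fragments))          # True
print(is_disjoint(fragments))                     # True
print(reconstruct(fragments) == sorted(customers))  # True
```

For vertical fragments the reconstruction operation would be a join on the repeated key rather than a union.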
Horizontal fragmentation
Vertical fragmentation
division
Mixed fragmentation
combination
Data replication
Data replication
Unreplicated database:
stores each database fragment at a single site
no duplicate database fragments
Data allocation
Partitioned/ fragmented
Replicated
Big Data
Big data is the term for data sets so large and complex that it
becomes difficult to process them using on-hand database
management tools or traditional data processing applications
We are collecting more data than ever
Internet of Things
Ubiquitous Broadband
Reduction in connectivity
costs
2.
4.
5.
http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation
http://strata.oreilly.com/2013/08/cancer-and-clinical-trials-the-roleof-big-data-in-personalizing-the-health-experience.html
Huddersfield University
University of Derby
Loughborough University
Data mining
Analytics
Machine learning
Visualisation
Data mining
Who interprets?
Issues
profiling individuals
Over-reliance on technology
Need for skilled workers with deep analytics skills
www.internetofthings.eu
House Keeping
FINI