
Chapter 2

Database System Concepts and Architecture

Schemas versus Instances

Database Schema:

The description of a database. Includes descriptions of the database structure, data types, and the constraints on the database.

Schema Diagram:

An illustrative display of (most aspects of) a database schema.


Schema Construct:

A component of the schema or an object within the schema, e.g., STUDENT, COURSE.


Database State:

The actual data stored in a database at a particular moment in time. This includes the collection of all the data in the database. Also called database instance (or occurrence or snapshot).

The term instance is also applied to individual database components, e.g., record instance, table instance, entity instance.

Database Schema vs. Database State

Database State:

Refers to the content of a database at a moment in time.

Initial Database State:

Refers to the database state when it is initially loaded into the system.

Valid State:

A state that satisfies the structure and constraints of the database.
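As a minimal sketch of what a valid state means in practice, the Python snippet below uses the standard sqlite3 module to declare a constraint and then attempts an update that would violate it. The table name COURSE comes from the slides; the column names and sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# The schema declares the constraints that every valid state must satisfy.
# Column names here are hypothetical.
conn.execute(
    "CREATE TABLE course ("
    " course_number TEXT PRIMARY KEY,"
    " credit_hours INTEGER NOT NULL CHECK (credit_hours > 0))"
)

# This insert moves the database from one valid state to another.
conn.execute("INSERT INTO course VALUES ('CS1310', 4)")

# This insert would produce an invalid state, so the DBMS refuses it.
try:
    conn.execute("INSERT INTO course VALUES ('CS9999', -1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
```

The rejected insert leaves the stored data unchanged, so the database never leaves a valid state.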

Database Schema vs. Database State (continued)

Distinction

The database schema changes very infrequently. The database state changes every time the database is updated.

Schema is also called intension. State is also called extension.
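The distinction can be sketched with Python's standard sqlite3 module: the schema is read back from SQLite's catalog table, while the state is simply the current rows. The STUDENT table name comes from the slides; the columns, the sample rows, and the helper names schema_of and state_of are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (name TEXT, student_number INTEGER PRIMARY KEY)"
)

def schema_of(connection):
    """Return the stored CREATE statements: the schema (intension)."""
    rows = connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    )
    return [r[0] for r in rows]

def state_of(connection):
    """Return the current rows: the state (extension)."""
    return connection.execute(
        "SELECT name, student_number FROM student ORDER BY student_number"
    ).fetchall()

schema_before = schema_of(conn)
state_before = state_of(conn)  # empty state right after the schema is defined

# Each update produces a new database state ...
conn.execute("INSERT INTO student VALUES ('Smith', 17)")
conn.execute("INSERT INTO student VALUES ('Brown', 8)")

# ... while the schema stays exactly as it was.
schema_after = schema_of(conn)
state_after = state_of(conn)
```

Here schema_before and schema_after are identical, whereas the state grows from no rows to two.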

Example of a Database Schema

Example of a database state

Three-Schema Architecture

Proposed to support DBMS characteristics of:


Program-data independence.
Support of multiple views of the data.

Not explicitly used in commercial DBMS products, but has been useful in explaining database system organization.

Categories of Data Models

Conceptual (high-level, semantic) data models:

Provide concepts that are close to the way many users perceive data.

(Also called entity-based or object-based data models.)

Physical (low-level, internal) data models:

Provide concepts that describe details of how data is stored in the computer. These are usually specified in an ad hoc manner through DBMS design and administration manuals.

Implementation (representational) data models:

Provide concepts that fall between the above two, used by many commercial DBMS implementations (e.g. relational data models used in many commercial systems).

Logical Data Model

A logical data model describes the data in as much detail as possible, without regard to how they will be physically implemented in the database. Features of a logical data model include:

Includes all entities and relationships among them.
All attributes for each entity are specified.
The primary key for each entity is specified.
Foreign keys (keys identifying the relationship between different entities) are specified.
Normalization occurs at this level.

The steps for designing the logical data model are as follows:

Specify primary keys for all entities.
Find the relationships between different entities.
Find all attributes for each entity.
Resolve many-to-many relationships.
Normalization.

Conceptual Data Model

A conceptual data model identifies the highest-level relationships between the different entities. Features of a conceptual data model include:

Includes the important entities and the relationships among them.
No attribute is specified.
No primary key is specified.

The only information shown in a conceptual data model is the entities that describe the data and the relationships between those entities. No other information is shown.

The main differences between the logical and conceptual models

In a logical data model, primary keys are present, whereas in a conceptual data model, no primary key is present.
In a logical data model, all attributes are specified within an entity. No attributes are specified in a conceptual data model.
Relationships between entities are specified using primary keys and foreign keys in a logical data model. In a conceptual data model, the relationships are simply stated, not specified: we know that two entities are related, but we do not specify which attributes are used for the relationship.

Physical Data Model

A physical data model represents how the model will be built in the database. It shows all table structures, including column names, column data types, column constraints, primary keys, foreign keys, and relationships between tables. Features of a physical data model include:

Specification of all tables and columns.
Foreign keys are used to identify relationships between tables.
The physical data model will be different for different RDBMSs. For example, the data type for a column may differ between MySQL and SQL Server.

The steps for physical data model design are as follows:

Convert entities into tables.
Convert relationships into foreign keys.
Convert attributes into columns.
Modify the physical data model based on physical constraints / requirements.

Comparing the physical data model diagram with the logical data model diagram, we see the main differences between the two:

Entity names are now table names.
Attributes are now column names.
The data type for each column is specified. Data types can differ depending on the actual database being used.
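The conversion steps can be sketched with Python's standard sqlite3 module. The entities STUDENT and COURSE are named in the slides; the ENROLLMENT table used to resolve the many-to-many relationship, and every column name and data type, are assumptions chosen for illustration. The data types shown are SQLite's and would differ on another RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
-- Entities become tables; attributes become typed columns.
CREATE TABLE student (
    student_number INTEGER PRIMARY KEY,
    name           TEXT NOT NULL
);
CREATE TABLE course (
    course_number TEXT PRIMARY KEY,
    title         TEXT NOT NULL
);
-- The many-to-many relationship becomes a table of foreign keys.
CREATE TABLE enrollment (
    student_number INTEGER NOT NULL REFERENCES student(student_number),
    course_number  TEXT    NOT NULL REFERENCES course(course_number),
    PRIMARY KEY (student_number, course_number)
);
""")

conn.execute("INSERT INTO student VALUES (17, 'Smith')")
conn.execute("INSERT INTO course VALUES ('CS1310', 'Intro to Computer Science')")
conn.execute("INSERT INTO enrollment VALUES (17, 'CS1310')")

# An enrollment that points at a nonexistent student violates the foreign key.
try:
    conn.execute("INSERT INTO enrollment VALUES (99, 'CS1310')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

tables = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```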

DBMS ARCHITECTURE

The logical DBMS architecture

The logical architecture deals with the way data is stored and presented to users.

The physical DBMS architecture

The physical architecture is concerned with the software components that make up a DBMS.

Three Level Architecture of DBMS


A major purpose of a database system is to provide users with an abstract view of the data. That is, the system hides certain details of how the data is stored and maintained.

External or View Level
Conceptual Level
Internal or Physical Level

External or View Level


This level is closest to the users and is concerned with the way in which the data is viewed by individual users. Most users are not concerned with all the information contained in the database; they need only the part of the database relevant to them. The system provides many views of the same database.

Highest level of abstraction of the database.

Allows users to see only the data of interest to them.

Users: application programmers or end-users.

Any number of external views may exist, each described by an external schema.

Conceptual Level
This level of abstraction describes what data are actually stored in the database. It also describes the relationships existing among data. At this level, the database is described logically in terms of simple data structures. The users of this level are not concerned with how these logical data structures will be implemented at the physical level; they are concerned only with what information is to be kept in the database.

The conceptual view is defined by means of the conceptual schema, which includes the definition of each of the various types of conceptual records and the mapping between the conceptual schema and the internal schema.

Internal or Physical Level


Lowest level of abstraction.

Describes how the data are physically stored.

The internal view is defined by the internal schema, which not only defines the various types of stored records but also specifies what indexes exist, how files are represented, etc.

The internal level is closest to physical storage and is also termed the physical level. It describes how the data are actually stored on the storage medium. At this level, complex low-level data structures are described in detail.
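As a rough illustration of the three levels, the sketch below maps them onto SQL objects using Python's standard sqlite3 module: a base table stands in for the conceptual level, an index for the internal level, and a view for the external level. This mapping is only an approximation, and the EMPLOYEE table, its columns, and the DIRECTORY view are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: what data is stored.
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT,
    dept   TEXT,
    salary INTEGER
);
-- Internal level: an access path that affects storage, not meaning.
CREATE INDEX idx_employee_dept ON employee(dept);
-- External level: one user group's view, hiding emp_id and salary.
CREATE VIEW directory AS SELECT name, dept FROM employee;
""")
conn.execute("INSERT INTO employee VALUES (1, 'Ada', 'Research', 5000)")
conn.execute("INSERT INTO employee VALUES (2, 'Bob', 'Sales', 4000)")

# A user of the external view sees only the part of the database relevant to them.
directory_rows = conn.execute("SELECT * FROM directory ORDER BY name").fetchall()
directory_columns = [
    d[0] for d in conn.execute("SELECT * FROM directory").description
]
```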

Data Independence
The ability to modify a schema definition at one level without affecting a schema definition at the next higher level is called data independence. There are two kinds:

Physical Data Independence
Logical Data Independence

Physical Data Independence


It refers to the ability to modify the schema followed at the physical level without affecting the schema followed at the conceptual level. The application programs remain the same even though the schema at the physical level is modified. Modifications at the physical level are occasionally necessary in order to improve the performance of the system.
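A small sketch of physical data independence, using Python's standard sqlite3 module: an index is added to improve performance, and the application query is left untouched. The EMPLOYEE table, its columns, and the helper research_staff are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ada", "Research"), (2, "Bob", "Sales"), (3, "Cy", "Research")],
)

# The application program: it names tables and columns, never storage details.
QUERY = "SELECT name FROM employee WHERE dept = ? ORDER BY name"

def research_staff(connection):
    return [r[0] for r in connection.execute(QUERY, ("Research",))]

before = research_staff(conn)

# Physical-level change: add an access path on dept.
conn.execute("CREATE INDEX idx_employee_dept ON employee(dept)")

after = research_staff(conn)  # same program, same answer
```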

Logical Data Independence


It refers to the ability to modify the conceptual schema without causing any changes in the schemas followed at the view level. Logical data independence ensures that the application programs remain the same. Modifications at the conceptual level are necessary whenever the logical structure of the database has to be altered.
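One way to sketch logical data independence is to let the application read through a view, restructure the underlying tables, and then redefine the view so that the application query does not change. The example uses Python's standard sqlite3 module; all table, column, and view names are hypothetical, and splitting EMPLOYEE into two tables is just one possible conceptual-level change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT);
INSERT INTO employee VALUES (1, 'Ada', 'Research'), (2, 'Bob', 'Sales');
CREATE VIEW directory AS SELECT name, dept_name FROM employee;
""")

# The application program only ever refers to the external view.
APP_QUERY = "SELECT name, dept_name FROM directory ORDER BY name"
before = conn.execute(APP_QUERY).fetchall()

# Conceptual-level change: move department names into their own table,
# then redefine the view (the external/conceptual mapping) to match.
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
INSERT INTO department (dept_name) SELECT DISTINCT dept_name FROM employee;
CREATE TABLE employee_v2 (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT,
    dept_id INTEGER REFERENCES department(dept_id)
);
INSERT INTO employee_v2
    SELECT e.emp_id, e.name, d.dept_id
    FROM employee e JOIN department d ON d.dept_name = e.dept_name;
DROP VIEW directory;
DROP TABLE employee;
CREATE VIEW directory AS
    SELECT e.name, d.dept_name
    FROM employee_v2 e JOIN department d ON d.dept_id = e.dept_id;
""")

after = conn.execute(APP_QUERY).fetchall()  # unchanged query, unchanged result
```

Only the view definition had to change; a program that queried the EMPLOYEE table directly would have broken.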

Physical & Logical Data Independence


It is more difficult to achieve logical data independence than physical data independence, because application programs are heavily dependent on the logical structure of the database.

DBMS Languages

Data Definition Language (DDL)
Data Manipulation Language (DML)

High-Level or Non-procedural Languages:

These include the relational language SQL. May be used in a standalone way or may be embedded in a programming language.

Low-Level or Procedural Languages:

These must be embedded in a programming language.


Data Definition Language (DDL):

Used by the DBA and database designers to specify the conceptual schema of a database. In many DBMSs, the DDL is also used to define internal and external schemas (views). In some DBMSs, a separate storage definition language (SDL) and view definition language (VDL) are used to define internal and external schemas.

SDL is typically realized via DBMS commands provided to the DBA and database designers.


Data Manipulation Language (DML):


Used to specify database retrievals and updates.

DML commands (data sublanguage) can be embedded in a general-purpose programming language (host language), such as COBOL, C, C++, or Java.

A library of functions can also be provided to access the DBMS from a programming language.

Alternatively, stand-alone DML commands can be applied directly (called a query language).
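The sketch below shows DML issued through a library of functions, here Python's standard sqlite3 module acting as the host-language interface. It also contrasts a set-at-a-time SQL retrieval with a loop that mimics the record-at-a-time style of a procedural DML; the loop still fetches its rows with SQL, so it only approximates a truly low-level language. The EMPLOYEE table and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ada", 5000), (2, "Bob", 4000), (3, "Cy", 6000)],
)

# An update expressed as a DML command passed to a library function.
conn.execute("UPDATE employee SET salary = salary + 500 WHERE name = ?", ("Bob",))

# High-level (non-procedural): state WHAT is wanted; the DBMS decides how.
high_earners_declarative = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM employee WHERE salary >= 5000 ORDER BY name"
    )
]

# Record-at-a-time style: the host program visits each record and filters itself.
high_earners_procedural = []
for name, salary in conn.execute("SELECT name, salary FROM employee"):
    if salary >= 5000:
        high_earners_procedural.append(name)
high_earners_procedural.sort()
```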

Other Tools

Data dictionary / repository:

Used to store schema descriptions and other information such as design decisions, application program descriptions, user information, usage standards, etc. An active data dictionary is accessed by DBMS software and users/DBA. A passive data dictionary is accessed by users/DBA only.
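Many DBMSs expose their stored schema descriptions as a queryable catalog. As a small sketch, SQLite keeps them in a table named sqlite_master and reports column details through PRAGMA table_info, both reachable from Python's standard sqlite3 module. This covers only schema descriptions, not the wider design and usage information a full data dictionary may hold. The STUDENT table and STUDENT_NAMES view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("CREATE VIEW student_names AS SELECT name FROM student")

# The catalog lists every schema object the DBMS knows about.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# Column-level descriptions: each row is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]
```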

Centralized and Client-Server DBMS Architectures

Centralized DBMS:

Combines everything into a single system, including DBMS software, hardware, application programs, and user interface processing software. Users can still connect through a remote terminal; however, all processing is done at the centralized site.

Basic 2-tier Client-Server Architectures

Specialized Servers with Specialized functions

Print server
File server
DBMS server
Web server
Email server

Clients can access the specialized servers as needed.

Logical two-tier client server architecture

Classification of DBMSs

Based on the data model used:

Traditional: Relational, Network, Hierarchical.
Emerging: Object-oriented, Object-relational.

Other classifications:

Single-user (typically used with personal computers) vs. multi-user (most DBMSs).
Centralized (uses a single computer with one database) vs. distributed (uses multiple computers, multiple databases).
