
DB2 - IBM's Relational DBMS

CTS-PAC

Version 1.1

Session 1


Topics to be covered in this session

- Introduction to databases: covers their advantages and the types of databases (time: 30 min)
- Relational database concepts: covers terminology, the ER model, normalisation, an introduction to database objects, Codd's relational rules and an introduction to SQL


Introduction to Databases
What is Data?
A representation of facts or instructions in a form suitable for communication. - IBM Dictionary

What is a Database?
A repository for stored data. - C.J. Date

What is a database system?
An integrated and shared repository for stored data, or a collection of stored operational data used by the application systems of some particular enterprise.

Or

Nothing more than a computer-based record-keeping system.


Advantages of a DBMS over a File Management System

- Reduced data redundancy
- Multiple views
- Shared data
- Data independence (logical/physical)
- Data dictionary
- Search versatility
- Cost effective
- Security and control
- Recovery, restart and backup
- Concurrency

TYPES OF DATABASES (or Models)

- Hierarchical Model
- Network Model
- Relational Model
- Object-Oriented Model


HIERARCHICAL
- Top-down structure resembling an upside-down tree
- Parent-child relationship
- First logical database model
- Available in legacy systems on mainframe computers
- Example: IMS


NETWORK
- Does not distinguish between parent and child: any record type can be associated with any number of arbitrary record types
- Enhanced to overcome the limitations of the hierarchical model, but in reality there is minimal difference between the two because of frequent enhancements


RELATIONAL
- Data is stored in tables, in the form of rows and columns
- Examples: DB2, Oracle, Sybase, Ingres etc.

OBJECT-ORIENTED MODEL
- Data attributes and the methods that operate on those attributes are encapsulated in structures called objects


RELATIONAL DB CONCEPTS


Relational Properties

Why Relational? Relation is a mathematical term for a table; hence a relational database is perceived by its users as a set of tables.

- All data values are atomic
- Entries in a column are from the same domain
- Sequence of rows (top to bottom) is insignificant
- Each row is unique
- Sequence of columns (left to right) is insignificant


Relational Concepts (or Terminology)

- Relation: a table or file
- Tuple: a row, containing an entry for each attribute
- Attributes: columns, or the characteristics that define the entity
- Domain: a range (or pool) of values
- Entity: some object about which we wish to store information
- Null: represents an unknown value
- Atomic: the smallest unit of data; the individual data value

- Candidate key: some attribute (or set of attributes) that may uniquely identify each row (tuple) in the relation (table). A candidate key exists only for a short period of time, after which the primary and alternate keys take its place
- Primary key: the candidate key that is chosen to uniquely identify each row
- Alternate key: the remaining candidate keys that were not chosen as the primary key
- Foreign key: an attribute of one relation that is the primary key of another relation

Entity Relationship Model

- The E-R model is a logical representation of the data for a business area
- It is represented as entities, relationships between entities, and attributes of both relationships and entities
- E-R models are outputs of the analysis phase, i.e. they are conceptual data models expressed in the form of an E-R diagram


Normalisation (1NF - 5NF)

Normalisation is done to bring the design of a database to a standardized form.

- 1NF: All entities must have a unique identifier, or key, that can be composed of one or more attributes. All attributes must be atomic and non-repeating.
- 2NF: Partial functional dependencies removed - all attributes that are not part of the key must depend on the entire key for that entity.


- 3NF: Transitive dependencies removed - attributes that are not part of the key must not depend on any non-key attribute.
- 4NF: Multi-valued dependencies removed
- 5NF: Remaining anomalies removed


Types of Integrity

- Entity integrity: no column that is part of a primary key can have a null value
- Referential integrity: every foreign key value in the first table must either match a primary key value in the second table or be wholly null
- Domain integrity: the values allowed in a column must come from that column's domain


Example of a Relational Structure

CUSTOMER Places ORDERS
ORDERS Has PRODUCTS


The above relations can be interpreted as follows :

- A customer can place any number of orders (one-to-many)
- Each order relates to only one customer (one-to-one)
- One order can contain many products (one-to-many)
- One product can be a part of many orders (one-to-many)


- In the above example Customer, Order and Product are called ENTITIES
- An entity may transform into one or more tables
- The unique identity for information stored in an ENTITY is called a PRIMARY KEY, e.g. CustomerNo uniquely identifies each customer

A table essentially consists of:
- Attributes, which define the characteristics of the table
- A primary key, which uniquely identifies each row of data stored in the table
- Secondary and foreign keys/indexes

Table definition: Table Customer
- Attributes: Customer-No, Cust-name, Cust-location, Cust-Id, Order-no...
- Primary key: Customer-No
- Secondary key: Cust-Id
- Foreign key: Order-no


The relationships transform into foreign keys. For example, Customer is related to Orders through Order-No, which is the foreign key in Customer and the primary key in Orders. So basically the relationship Places is through Order-No.

As per the integrity rules, the primary key Order-No of the table Orders can never be null, while Order-No can be null in the table Customer.
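The integrity rules described above can be tried out with a small sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for DB2, and the table and column names (with underscores instead of hyphens) are illustrative only.

```python
import sqlite3

# SQLite (via Python's sqlite3) stands in for DB2 here; this is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Orders: order_no is the primary key, so it can never be null.
conn.execute("CREATE TABLE orders (order_no INT NOT NULL PRIMARY KEY)")
# Customer: order_no is a foreign key to Orders and is allowed to be null.
conn.execute(
    "CREATE TABLE customer ("
    " customer_no INT NOT NULL PRIMARY KEY,"
    " cust_name TEXT,"
    " order_no INT REFERENCES orders (order_no))"
)

conn.execute("INSERT INTO orders VALUES (501)")
conn.execute("INSERT INTO customer VALUES (1, 'Asha', 501)")   # matches a primary key
conn.execute("INSERT INTO customer VALUES (2, 'Ravi', NULL)")  # null foreign key is accepted


def rejected(sql):
    """Return True if the statement is refused on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Entity integrity: a null primary key is refused.
null_primary_key = rejected("INSERT INTO orders VALUES (NULL)")
# Referential integrity: a foreign key with no matching primary key is refused.
dangling_foreign_key = rejected("INSERT INTO customer VALUES (3, 'Mei', 999)")
customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Only the two valid customer rows remain in the table; both invalid inserts are rejected by the database rather than by application code.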


- Tables exist in tablespaces. A tablespace can contain one or more tables
- Apart from the primary key, a table can have many secondary keys/indexes, which exist in indexspaces
- These tablespaces and indexspaces together exist in a database


To do the transformations described above we need a tool that provides a way of creating the tables, manipulating the data present in them, and creating relationships, indexes, tablespaces, indexspaces and so on. DB2 provides SQL, which performs these functions. The next part briefly deals with SQL and its functions. A detailed explanation will be taken up later.


CODD'S RELATIONAL RULES

1. All information in a relational database is represented explicitly at the logical level and in exactly one way - by values in tables.

2. Each and every datum (atomic value) in a relational database is guaranteed to be logically accessible by resorting to a combination of table name, primary key value and column name.


3. Null values are supported for representing missing information in a systematic way, irrespective of the data type.

4. The database description is represented at the logical level in the same way as ordinary data, so that authorised users can apply the same relational language to its interrogation as they apply to the regular data.


5. A relational system may support several languages and various modes of terminal use. However, there must be one language whose statements can express all of the following items:
   (1) data definitions
   (2) view definitions
   (3) data manipulation (interactive and by program)
   (4) integrity constraints
   (5) authorisation
   (6) transaction boundaries (begin, commit, rollback)


6. All views that are theoretically updatable are also updatable by the system.

7. The capability of handling a base relation or a derived relation (view) as a single operand applies not only to the retrieval of data but also to the insertion, update and deletion of data.


8. Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representations or access methods.

9. Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit unimpairment are made to the base tables.


10. Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and storable in the catalog, not in the application programs.

11. The data manipulation sublanguage of a relational DBMS must enable application programs and inquiries to remain logically the same whether and whenever data are physically centralized or distributed.


12. If a relational system has a low-level (single-record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level (multiple-records-at-a-time) relational language.


An introduction to SQL
SQL, or Structured Query Language, is:
- A powerful language that performs the functions of data manipulation (DML), data definition (DDL) and data control or data authorization (DAL/DCL)
- A non-procedural language: it acts on a set of data, and the user does not need to know how to retrieve it. A single SQL statement can perform the work of more than one procedure
- Very flexible

SQL - Features
- You specify what you want and not how to get it
- Unlike COBOL or 4GLs, SQL is coded without data-navigational instructions. The optimal access paths are determined by the DBMS. This is advantageous because the DBMS knows better than the user how it has stored the data
- Set-level processing and multiple-row processing


The following are the operations that SQL can perform on database tables:

- Select
- Project
- Union
- Intersection
- Difference
- Cartesian Product
- Join
- Divide
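Most of the operations listed above map directly onto SQL constructs. The sketch below is a hedged illustration that assumes SQLite (through Python's sqlite3 module) in place of DB2; the tables `a` and `b` are made up for the example. Project is simply the column list of any SELECT, and Divide is left out because SQL has no single keyword for it.

```python
import sqlite3

# Two hypothetical one-column tables; SQLite stands in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (n INT);
    CREATE TABLE b (n INT);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")


def rows(sql):
    """Run a query and return its rows in sorted order."""
    return sorted(conn.execute(sql).fetchall())


select_rows = rows("SELECT n FROM a WHERE n > 1")                  # Select: a subset of rows
union = rows("SELECT n FROM a UNION SELECT n FROM b")              # rows in either table
intersection = rows("SELECT n FROM a INTERSECT SELECT n FROM b")   # rows in both tables
difference = rows("SELECT n FROM a EXCEPT SELECT n FROM b")        # rows in a but not in b
product = rows("SELECT a.n, b.n FROM a, b")                        # Cartesian product: every pairing
join = rows("SELECT a.n, b.n FROM a, b WHERE a.n = b.n")           # Join: product plus a match condition
```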


Session 2


Topics to be covered in this session

- SQL: dealt with here because the creation, manipulation and use of all other data objects involve SQL
- DB2 objects: creation and use of databases, tablespaces and indexspaces, and other terminology associated with databases


Topics dealt with, in SQL

- Definition and types
- Usage of SQL with examples, scalar and column functions
- Subqueries and multiple queries, DML
- Static and dynamic SQL


Structured Query Language - SQL

- Standard query language for RDBMSs
- Non-procedural language: the programmer specifies what data is needed but not how to retrieve it
- Also used to define data structures, control access to the data and delete occurrences of data
- Uses set-level processing


SQL - Types - based on the functionality

- Data Definition Language (DDL): CREATE, ALTER, DROP
- Data Manipulation Language (DML): DELETE, INSERT, SELECT, UPDATE
- Data Control Language (DCL): GRANT, REVOKE


SQL - Types

- Production SQL or ad-hoc SQL
- Embedded SQL or stand-alone SQL
- Static or dynamic SQL


SQL - Selection & Projection

- Selection retrieves a specified subset of rows from a table
- Projection retrieves a specified subset of columns (but all rows) from the table
  Eg: SELECT Cust_no, Cust_name FROM CUSTOMER;
- The WHERE clause defines the predicates for the SQL operation
- A WHERE clause can have multiple conditions, combined using AND and OR
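As a rough illustration of the difference between projection and selection, the following sketch assumes SQLite (via Python's sqlite3) in place of DB2; the sample rows are invented.

```python
import sqlite3

# Hypothetical CUSTOMER table; SQLite stands in for DB2 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INT, cust_name TEXT, cust_addr TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1000, "Asha", "Chennai"), (1500, "Ravi", "Pune"), (2500, "Mei", "Chennai")],
)

# Projection: a subset of the columns, all rows.
projection = conn.execute(
    "SELECT cust_no, cust_name FROM customer ORDER BY cust_no"
).fetchall()

# Selection: a subset of the rows, chosen by WHERE predicates combined with AND / OR.
selection = conn.execute(
    "SELECT cust_no FROM customer"
    " WHERE cust_addr = 'Chennai' AND (cust_no < 2000 OR cust_name = 'Mei')"
    " ORDER BY cust_no"
).fetchall()
```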


Select distinct, select in range :


    SELECT Cust_no, Cust_name, Cust_addr FROM CUSTOMER
        WHERE Cust_no BETWEEN 10000 AND 20000;

    SELECT Cust_no, Cust_name, Cust_addr FROM CUSTOMER
        WHERE Cust_no NOT BETWEEN 1000 AND 2000;

    SELECT Cust_no, Cust_name, Cust_addr FROM CUSTOMER
        WHERE Cust_no IN (1000, 2000);

    SELECT Cust_no, Cust_name, Cust_addr FROM CUSTOMER
        WHERE Cust_id LIKE '425%';      (or NOT LIKE)

Note:
- _ stands for a single character; % stands for a string of characters
- ESCAPE '\' defines \ as the escape character; when it precedes _ or % it overrides their special meaning
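The range, list and pattern predicates above can be compared side by side. This is a minimal sketch that assumes SQLite (via Python's sqlite3) as a stand-in for DB2; the helper `ids` and the sample data are hypothetical.

```python
import sqlite3

# Hypothetical data chosen so that each predicate picks a different subset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INT, cust_id TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(500, "425A"), (1000, "4251"), (1500, "42_X"), (2000, "42BX"), (3000, "999")],
)


def ids(where):
    """Return the cust_id values of the rows that satisfy the given predicate."""
    sql = "SELECT cust_id FROM customer WHERE " + where + " ORDER BY cust_no"
    return [row[0] for row in conn.execute(sql)]


in_range = ids("cust_no BETWEEN 1000 AND 2000")          # both end points are included
out_of_range = ids("cust_no NOT BETWEEN 1000 AND 2000")
in_list = ids("cust_no IN (1000, 2000)")
prefix = ids("cust_id LIKE '425%'")                      # % matches any string of characters
wildcard = ids("cust_id LIKE '42_X'")                    # _ matches any single character
literal = ids("cust_id LIKE '42\\_X' ESCAPE '\\'")       # escaped _ matches only a real underscore
```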

NULL: to check for nulls the syntax is IS NULL or IS NOT NULL.

    SELECT Cust_no, Cust_name, Order_no FROM CUSTOMER
        WHERE Order_no IS NULL;

However, if Order_no is null in a row, any other comparison on it (such as = or <>) always evaluates as not true for that row.
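The behaviour of nulls in predicates is easy to get wrong, so here is a small sketch of it. It assumes SQLite (via Python's sqlite3) in place of DB2, and the helper `cust_nos` and the rows are made up.

```python
import sqlite3

# One customer with an order and one without (null order_no); SQLite stands in for DB2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INT, cust_name TEXT, order_no INT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Asha", 501), (2, "Ravi", None)],
)


def cust_nos(where):
    """Return the cust_no values of the rows that satisfy the given predicate."""
    sql = "SELECT cust_no FROM customer WHERE " + where + " ORDER BY cust_no"
    return [row[0] for row in conn.execute(sql)]


is_null = cust_nos("order_no IS NULL")          # the only way to find the null row
is_not_null = cust_nos("order_no IS NOT NULL")
equals_null = cust_nos("order_no = NULL")       # never true, even for the null row
not_equal = cust_nos("order_no <> 501")         # the null row is not returned here either
```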

Order by and Group by clauses :

- ORDER BY sorts the retrieved data (the rows that satisfy the WHERE clause) in the specified order
- GROUP BY causes the table represented by the FROM clause to be rearranged into groups, such that within any one group all rows have the same value for the GROUP BY column (logically, not physically in the database)
- The SELECT clause is applied to the grouped data and not to the original table
- HAVING is used to eliminate groups, just as WHERE is used to eliminate rows

Example:

    SELECT Order_No, SUM(No_Prodts)
        FROM ORDERS
        GROUP BY Order_No
        HAVING AVG(No_Prodts) < 10
        ORDER BY Order_No;
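To see which groups HAVING removes, the example query can be run against a few invented rows. The sketch assumes SQLite (via Python's sqlite3) instead of DB2, and an ORDERS table that holds several rows per order number.

```python
import sqlite3

# Hypothetical ORDERS rows: several lines per order number.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no INT, no_prodts INT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 4), (1, 6), (2, 20), (2, 30), (3, 1)],
)

# Rows are grouped by order_no; HAVING then drops the groups whose average is 10 or more.
grouped = conn.execute(
    "SELECT order_no, SUM(no_prodts)"
    " FROM orders"
    " GROUP BY order_no"
    " HAVING AVG(no_prodts) < 10"
    " ORDER BY order_no"
).fetchall()
```

Order 2 has an average of 25 and is eliminated as a whole group; orders 1 and 3 remain, each reduced to a single summary row.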


Functions

There are two types:
- Column functions
- Scalar functions


Column Functions

- Compute an aggregate value for the specified column(s) from a group of rows
- AVG, COUNT, MAX, MIN, SUM
- Rules for column functions: refer handout


Scalar Functions

- Are applied to a column or expression and operate on a single value
- CHAR, DATE, DAY(S), DECIMAL, DIGITS, FLOAT, HEX, HOUR, INTEGER, LENGTH, MICROSECOND, MINUTE, MONTH, SECOND, SUBSTR, TIME, TIMESTAMP, VALUE, VARGRAPHIC, YEAR
- Rules for scalar functions: refer handout
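The contrast between the two kinds of function can be sketched as follows, assuming SQLite (via Python's sqlite3) as a stand-in for DB2. Only functions that SQLite also provides (SUBSTR and LENGTH) are used for the scalar case; the DB2-specific ones in the list above are not available there. The data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no INT, cust_name TEXT, no_prodts INT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(501, "Asha", 4), (502, "Ravi", 6), (503, "Mei", 20), (504, "Lee", 30)],
)

# Column functions: one aggregate value computed from a whole group of rows.
column_result = conn.execute(
    "SELECT COUNT(*), SUM(no_prodts), AVG(no_prodts), MIN(no_prodts), MAX(no_prodts)"
    " FROM orders"
).fetchone()

# Scalar functions: applied to a single value, so one result per row.
scalar_result = conn.execute(
    "SELECT SUBSTR(cust_name, 1, 2), LENGTH(cust_name) FROM orders ORDER BY order_no"
).fetchall()
```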


Complex SQLs

- A SQL statement is termed complex when the data to be retrieved comes from more than one table
- SQL provides two ways of coding a complex SQL statement: subqueries and joins


Subqueries

- Nested SELECT statements
- Specified using the IN (or NOT IN) predicate, an equality or non-equality predicate (= or <>) or a comparison operator (<, <=, >, >=)
- When using the equality, non-equality or comparison operators, the inner query should return only a single value


    SELECT Cust_No, Cust_Name
        FROM CUSTOMER
        WHERE Order_No IN
            (SELECT Order_No
                FROM ORDERS
                WHERE No_Prodts < 5);

    SELECT Cust_No, Cust_addr
        FROM CUSTOMER
        WHERE Order_No =
            (SELECT Order_No
                FROM ORDERS
                WHERE No_Prodts = 5);
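The two subquery forms above can be exercised with a few rows. This sketch assumes SQLite (via Python's sqlite3) in place of DB2 and uses invented data in which exactly one order has five products, so the equality form receives a single value from the inner query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_no INT, no_prodts INT);
    CREATE TABLE customer (cust_no INT, cust_name TEXT, order_no INT);
    INSERT INTO orders VALUES (501, 3), (502, 5), (503, 9);
    INSERT INTO customer VALUES (1, 'Asha', 501), (2, 'Ravi', 502), (3, 'Mei', 503);
""")

# IN: the inner query may return any number of order numbers.
in_result = conn.execute(
    "SELECT cust_no, cust_name FROM customer"
    " WHERE order_no IN (SELECT order_no FROM orders WHERE no_prodts < 5)"
).fetchall()

# =: the inner query is expected to return exactly one value.
eq_result = conn.execute(
    "SELECT cust_no, cust_name FROM customer"
    " WHERE order_no = (SELECT order_no FROM orders WHERE no_prodts = 5)"
).fetchall()
```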


- Nested SELECT statements give the user the flexibility to query multiple tables
- A specialized form is the correlated subquery: the nested SELECT statement refers back to columns in the outer SELECT statements. It works in a top-bottom-top fashion
- A non-correlated subquery works in a bottom-to-top fashion


Eg - Correlated Subquery..

    SELECT A.Cust_name, A.Cust_addr
        FROM CUSTOMER A
        WHERE A.Order_No IN
            (SELECT Order_No
                FROM CUSTOMER B
                WHERE A.Cust_id = B.Cust_id)
        ORDER BY A.Cust_id, A.Cust_no;


Correlated subquery using the EXISTS clause:

    SELECT Cust_No, Cust_name
        FROM CUSTOMER A
        WHERE EXISTS
            (SELECT * FROM ORDERS B
                WHERE B.Order_No = A.Order_No
                AND B.Order_No = 5);
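A correlated subquery is re-evaluated for each row of the outer query, which the following sketch tries to make visible. It assumes SQLite (via Python's sqlite3) as a stand-in for DB2, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_no INT, no_prodts INT);
    CREATE TABLE customer (cust_no INT, cust_name TEXT, order_no INT);
    INSERT INTO orders VALUES (501, 3), (502, 5), (503, 9);
    INSERT INTO customer VALUES
        (1, 'Asha', 501), (2, 'Ravi', 502), (3, 'Mei', 503), (4, 'Lee', NULL);
""")

# For each customer row A, the inner query looks for a matching order row B.
with_order_502 = conn.execute(
    "SELECT cust_no, cust_name FROM customer a"
    " WHERE EXISTS (SELECT * FROM orders b"
    "               WHERE b.order_no = a.order_no AND b.order_no = 502)"
).fetchall()

# NOT EXISTS finds the customers for whom the inner query returns no row at all.
without_order = conn.execute(
    "SELECT cust_no, cust_name FROM customer a"
    " WHERE NOT EXISTS (SELECT * FROM orders b WHERE b.order_no = a.order_no)"
).fetchall()
```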


Multiple levels of Subquery


    SELECT Cust_no, Cust_name, Cust_addr
        FROM CUSTOMER A
        WHERE Order_no IN
            (SELECT Order_no
                FROM ORDERS B
                WHERE Prod_id IN
                    (SELECT Prod_id
                        FROM PRODUCTS
                        WHERE Prod_name = 'NUTS'));


Joins
- OUTER JOIN: for one or more of the tables being joined, both matching and non-matching rows are returned. Duplicate columns may be eliminated. In the non-matching rows, the columns from the other table contain nulls.
- INNER JOIN: only matching rows are returned, so one or more rows from either or both of the tables being joined may not be included in the table that results from the join operation.
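The difference between the two join types shows up as soon as one row has no match. The sketch below assumes SQLite (via Python's sqlite3) in place of DB2, with invented rows; the explicit INNER JOIN / LEFT OUTER JOIN syntax is used for clarity and may differ from what a given DB2 release accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_no INT, no_prodts INT);
    CREATE TABLE customer (cust_no INT, cust_name TEXT, order_no INT);
    INSERT INTO orders VALUES (501, 3), (502, 5);
    INSERT INTO customer VALUES (1, 'Asha', 501), (2, 'Ravi', NULL);
""")

# Inner join: only customers that have a matching order appear.
inner = conn.execute(
    "SELECT c.cust_name, o.order_no"
    " FROM customer c INNER JOIN orders o ON c.order_no = o.order_no"
    " ORDER BY c.cust_no"
).fetchall()

# Left outer join: every customer appears; missing order columns come back as null.
outer = conn.execute(
    "SELECT c.cust_name, o.order_no"
    " FROM customer c LEFT OUTER JOIN orders o ON c.order_no = o.order_no"
    " ORDER BY c.cust_no"
).fetchall()
```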

DMLs
INSERT:

Eg:

    INSERT INTO Tablename (column1, column2, column3, ...)
        VALUES (value1, value2, value3, ...)

If a column is omitted in an INSERT statement and that column is defined as NOT NULL, the INSERT fails; if the column is nullable, it is set to null.

- If the omitted column is defined as NOT NULL WITH DEFAULT, it is set to its default value
- Omitting the list of columns is equivalent to specifying all the columns
- INSERT with SELECT:

    INSERT INTO TEMP (A#, B)
        SELECT A#, SUM(B)
        FROM TEMP1
        GROUP BY A#;
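The rules for omitted columns can be checked with a small sketch. It assumes SQLite (via Python's sqlite3) as a stand-in for DB2: `NOT NULL DEFAULT 'NEW'` plays the role of DB2's NOT NULL WITH DEFAULT, and the columns are named `a` and `b` because `A#` is not a plain identifier in SQLite. The `status` and `note` columns are hypothetical additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (a INT, b INT);
    INSERT INTO temp1 VALUES (1, 10), (1, 5), (2, 7);
    CREATE TABLE temp (
        a      INT  NOT NULL,                -- must always be supplied
        b      INT,                          -- nullable
        status TEXT NOT NULL DEFAULT 'NEW',  -- stands in for NOT NULL WITH DEFAULT
        note   TEXT                          -- nullable
    );
""")

# INSERT with SELECT: status and note are omitted from the column list.
conn.execute("INSERT INTO temp (a, b) SELECT a, SUM(b) FROM temp1 GROUP BY a")
loaded = conn.execute("SELECT a, b, status, note FROM temp ORDER BY a").fetchall()

# Omitting a NOT NULL column that has no default makes the INSERT fail.
try:
    conn.execute("INSERT INTO temp (b) VALUES (99)")
    insert_failed = False
except sqlite3.IntegrityError:
    insert_failed = True
```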

UPDATE:

Eg:

UPDATE tablename SET Columnname(s) = scalar expression WHERE [ condition ]

- Single or multiple row updates
- Update with a subquery

DELETE:

Eg:

DELETE FROM Tablename WHERE [condition ];

Single or multiple row delete or deletion of all rows
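The UPDATE and DELETE variants mentioned above (single row, multiple rows, with a subquery, all rows) can be sketched as follows. SQLite (via Python's sqlite3) is assumed in place of DB2, and the `cancelled` table is a hypothetical helper that feeds the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_no INT, no_prodts INT);
    CREATE TABLE cancelled (order_no INT);
    INSERT INTO orders VALUES (501, 3), (502, 5), (503, 9);
    INSERT INTO cancelled VALUES (502), (503);
""")

# Single-row update: the WHERE clause identifies exactly one row.
single = conn.execute(
    "UPDATE orders SET no_prodts = no_prodts + 1 WHERE order_no = 501"
).rowcount

# Multiple-row update driven by a subquery.
multiple = conn.execute(
    "UPDATE orders SET no_prodts = 0"
    " WHERE order_no IN (SELECT order_no FROM cancelled)"
).rowcount

# Multiple-row delete.
deleted = conn.execute("DELETE FROM orders WHERE no_prodts = 0").rowcount
remaining = conn.execute("SELECT order_no, no_prodts FROM orders").fetchall()

# Without a WHERE clause, DELETE removes every row of the table.
conn.execute("DELETE FROM orders")
final_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```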


Static SQL

- Hard-coded into an application program
- Cannot be modified during the program's execution, except for changes to the values assigned to the host variables
- Cursors are used to access set-level data
- The general form is:
    EXEC SQL
        [SQL statements]
    END-EXEC.


Dynamic SQL

- Statements can change throughout the program's execution
- When the SQL is bound, the application plan or package that is created does not contain the same information as that for a static SQL program
- The access paths cannot be determined before execution


SQL Guidelines :
- Refer handout - Mullins, chapter 2


Topics dealt with, in DB2 objects

- Databases, stogroups
- Tablespaces (types, creation and modification)
- Indexspaces (creation and modification)
- Some more terms associated with tablespaces


DB2 Objects

- Databases: user and system (catalog)
- A database is a collection of logically related objects, such as tablespaces, indexspaces, tables etc.
- It is not a physical kind of object, and may occupy more than one disk space
- A STOGROUP and a BUFFERPOOL must be defined for each database
- Stogroup and user-defined VSAM are the two storage allocation methods for a DB2 dataset definition

Stogroup

- A stogroup is a collection of direct access volumes, all of the same device type
- The option is defined as a part of the tablespace definition
- When a given space needs to be extended, storage is acquired from the appropriate stogroup


- In a given database, all the spaces need not have the same stogroup
- Stogroups are, in a sense, the most physical of the various storage objects in DB2
- More than one volume can be defined in a stogroup. DB2 keeps track of which volume was defined first and uses that volume


VCAT Option

- User-defined VSAM datasets have to be defined explicitly using the AMS utility IDCAMS
- Two types of VSAM datasets are used: ESDS and LDS. A linear dataset (LDS) is used more efficiently by DB2
- VSAM datasets defined here are different from plain VSAM datasets: they can be accessed only through the VSAM Media Manager


Tablespaces

- A logical address space on secondary storage that holds one or more tables
- A space is basically an extendable collection of pages, with each page 4K or 32K bytes in size
- It is the storage unit for recovery and reorganization purposes
- Three types of tablespaces: simple, partitioned and segmented


Simple Tablespaces

- Can contain more than one stored table
- Depending on the application, storing more than one table might enable faster retrieval for joins using these tables
- Usually only one table is preferred, because a single page can contain rows from all the tables defined in the tablespace
- LOAD with the REPLACE option deletes all data in the tablespace


Segmented Tablespaces

- Can contain more than one stored table, but in a segmented space
- A segment consists of a logically contiguous set of n pages
- No segment is allowed to contain records for more than one table
- Sequential access to a particular table is more efficient


- Mass delete is much more efficient than in any other type of tablespace
- Reorganizing the tablespace restores every table to its clustered order
- LOCK TABLE on a table locks only that table, not the entire tablespace
- If a table is dropped, the space for that table can be reclaimed with minimal reorganization


Partitioned Tablespaces

- Only one table in a partitioned tablespace (TS); 1 to 64 partitions per TS
- It is partitioned in accordance with value ranges for a single column or a combination of columns. Hence these column(s) cannot be updated


- Individual partitions can be independently recovered and reorganized
- Different partitions can be stored on different storage groups for efficient access


Tablespace parameters to be specified for TS creation

LOCKSIZE - indicates the type of locking DB2 performs for the given TS:
- PAGE
- TABLE
- TABLESPACE
- ANY - DB2 decides, starting with page locks


- USING - method of storage allocation: stogroup or VCAT
- PCTFREE - percentage of space kept free for future inserts
- FREEPAGE - number of pages after which an empty page is left
- BUFFERPOOL - BP0, BP1, BP2 or BP32K
- CLOSE - Yes/No - whether the underlying VSAM datasets are to be closed each time the table is used. The maximum number of datasets that can be open in DB2 at a time is 10,000

- ERASE - Yes/No - whether the physical DASD where the TS resides is to be overwritten with binary zeros when the TS is dropped
- NUMPARTS - for partitioned tablespaces
- SEGSIZE - for segmented tablespaces


Table Parameters for Creation

Column definition format:

    CREATE TABLE tablename
        (column definitions)
        PRIMARY KEY (columns) / FOREIGN KEY (referential constraint)
        UNIQUE (colname)


1. LIKE table name / view name
2. IN database.tablespace name

- FOREIGN KEY ... REFERENCES dbname.table, with a condition (rule) for delete
- Table1 references Table2 (the target): Table2's primary key is the foreign key defined in Table1


- The conditions are CASCADE, RESTRICT and SET NULL (the referential constraint for the foreign key definition)
- With RESTRICT, deleting (or updating the key of) a row in the target is allowed only if no rows in the referencing table refer to it
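The three delete rules behave quite differently, which the following sketch tries to show. It assumes SQLite (via Python's sqlite3) as a stand-in for DB2; the three referencing tables are hypothetical and exist only so that each rule can be seen in isolation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE orders (order_no INT NOT NULL PRIMARY KEY);
    CREATE TABLE cust_cascade  (cust_no INT,
        order_no INT REFERENCES orders (order_no) ON DELETE CASCADE);
    CREATE TABLE cust_set_null (cust_no INT,
        order_no INT REFERENCES orders (order_no) ON DELETE SET NULL);
    CREATE TABLE cust_restrict (cust_no INT,
        order_no INT REFERENCES orders (order_no) ON DELETE RESTRICT);
    INSERT INTO orders VALUES (1), (2);
    INSERT INTO cust_cascade  VALUES (10, 1);
    INSERT INTO cust_set_null VALUES (20, 1);
    INSERT INTO cust_restrict VALUES (30, 2);
""")

# Deleting order 1 affects the two tables that refer to it.
conn.execute("DELETE FROM orders WHERE order_no = 1")
cascade_rows = conn.execute("SELECT COUNT(*) FROM cust_cascade").fetchone()[0]    # row deleted too
set_null_rows = conn.execute("SELECT cust_no, order_no FROM cust_set_null").fetchall()  # key nulled

# Deleting order 2 is refused because a RESTRICT row still refers to it.
try:
    conn.execute("DELETE FROM orders WHERE order_no = 2")
    restrict_blocked = False
except sqlite3.IntegrityError:
    restrict_blocked = True
```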


Alter & Drop stmts

ALTER:

    ALTER TABLE <Tablename>
        ADD column data-type [NOT NULL WITH DEFAULT]

- ALTER allows primary and foreign key specifications to be changed
- It does not support changing the width or data type of a column, or dropping a column


DROP:

    DROP TABLE <Tablename>

Similar statements exist for INDEX.


Some general rules for RI & Table Parameters

- Avoid nulls in columns participating in arithmetic logic or comparisons
- Primary key columns cannot be null
- Limit referential structures to no more than three levels in a direction


- Use DB2's inherent RI features rather than program-coded RI
- Do not use RI on tables built from another RI system
- Consider using fieldprocs, editprocs or validprocs


Index Parameters for Creation

    CREATE INDEX indexname ON tablename (colnames ASC/DESC)
        CLUSTER
        SUBPAGES
        USING STOGROUP/VCAT (the corresponding name)
        PRIQTY / SECQTY
        ERASE Yes/No
        BUFFERPOOL
        CLOSE Yes/No
        FREEPAGE
        PCTFREE

Index Guidelines - What to do ?


1. Consider indexing on columns used in UNION, DISTINCT, GROUP BY, ORDER BY and WHERE clauses
2. Limit the indexing of frequently updated columns
3. Explicitly create a clustering index
4. Create a unique index on the primary key, and indexes on foreign keys

5. Consider overloading an index when the row length of the table to be accessed is short
6. At least one index must be defined for a table with more than 100 pages
7. Use a multicolumn index rather than multiple indexes (application dependent); however, the latter requires more DASD

8. Create indexes before loading the table
9. Clustering reduces I/O; the DB2 optimizer usually tries to use an index on a clustered column before using the other indexes
10. Optimize the SUBPAGES parameter
11. Specify the same free space for an indexspace as for its tablespace

12. Use the DEFER option while creating the index. The RECOVER INDEX utility can then be used to populate the index; the utility populates index entries faster
13. Use different STOGROUPs for tablespaces and indexspaces
14. Create critical indexes in a different bufferpool from the tablespaces


Index Guidelines - What Not to do ?


1. Avoid indexing on variable-length columns
2. Limit the number of indexes on a partitioned TS
3. Avoid indexes if the table is very small (< 10 pages), or it has heavy inserts and deletes and is relatively small (< 20 pages), or it is accessed with a scan
4. Avoid defining redundant indexes
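As a rough illustration of the guideline to put a unique index on the primary key and an ordinary index on a foreign key, the sketch below assumes SQLite (via Python's sqlite3) in place of DB2. The index names are hypothetical, DB2-specific options such as CLUSTER or SUBPAGES have no SQLite equivalent and are omitted, and EXPLAIN QUERY PLAN is SQLite's own facility, not the DB2 EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INT NOT NULL, cust_name TEXT, order_no INT)")

# Unique index on the primary key column, ordinary index on the foreign key column.
conn.execute("CREATE UNIQUE INDEX ix_customer_pk ON customer (cust_no ASC)")
conn.execute("CREATE INDEX ix_customer_order ON customer (order_no)")

conn.execute("INSERT INTO customer VALUES (1, 'Asha', 501)")

# The unique index refuses a second row with the same key value.
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Ravi', 502)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Ask SQLite how it would run a lookup on the foreign key column.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT cust_name FROM customer WHERE order_no = 501"
).fetchall()
uses_fk_index = any("ix_customer_order" in row[-1] for row in plan)
```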


Some more terms & concepts associated with Tables


VIEWS:
- A view is a logical derivation of a table from one or more other tables. A view does not exist in its own right
- Views provide a certain amount of logical independence
- They allow the same data to be seen by different users in different ways
- In DB2, a view that is to accept an update must be derived from a single base table



Aliases and synonyms:
- Both mean another name for a table; the difference is that a synonym is private to the user who created it
- Aliases are used basically for accessing remote tables (in distributed data processing), whose names carry a location prefix. Using an alias gives a shorter name



Format:

    CREATE VIEW <Viewname> (<columns>) AS subquery
        (subquery - a SELECT from other table(s))

    CREATE ALIAS <Aliasname> FOR <Tablename>

    CREATE SYNONYM <Synonymname> FOR <Tablename>
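That a view does not exist in its own right can be seen by changing the base table after the view is created. The sketch assumes SQLite (via Python's sqlite3) as a stand-in for DB2; the view name and data are invented, and ALIAS and SYNONYM are not shown because SQLite has no equivalent statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INT, cust_name TEXT, cust_addr TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Asha", "Chennai"), (2, "Ravi", "Pune")],
)

# The view stores only its defining subquery, not a copy of the data.
conn.execute(
    "CREATE VIEW chennai_customers AS"
    " SELECT cust_no, cust_name FROM customer WHERE cust_addr = 'Chennai'"
)
before = conn.execute("SELECT * FROM chennai_customers ORDER BY cust_no").fetchall()

# A later change to the base table is visible through the view immediately.
conn.execute("INSERT INTO customer VALUES (3, 'Mei', 'Chennai')")
after = conn.execute("SELECT * FROM chennai_customers ORDER BY cust_no").fetchall()
```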


Session 3


Topics to be covered in this session

The following topics will be covered in this session:
- Application programming using DB2 - 1 day
- Data Control Language, SPUFI, QMF, application programming guidelines - 0.5 days


Application programming using DB2

- Application environments supporting DB2: IMS (batch/online), CICS, TSO (batch/online), CAF (Call Attach Facility)
- All DB2 application types can execute concurrently
- Host language support: COBOL, PL/1, C, Fortran or Assembler


Steps involved in creating a DB2 application

- Coding the application
  - using host variables
  - using embedded SQL
  - using cursors
  - issuing the DCLGEN command


- Precompile the program
- Compile and link-edit the program
- Bind


Host Variables

- These are variables (or rather areas of storage) defined in the host language for use with the columns and predicates of a DB2 table. They are referenced in the SQL statement
- A means of moving data to and from DB2 tables
- DCLGEN produces host variables that match the columns of the table

Can be used:
- In the INTO clause of SELECT and FETCH statements
- As input to the SET clause of UPDATE statements
- As input to the VALUES clause of an INSERT statement
- In the WHERE clause of SELECT, INSERT, UPDATE and DELETE statements
- As literals in the select list of a SELECT statement

Example

    SELECT Cust_No, Cust_name, Cust_addr
        INTO :H-Cust-No, :H-Cust-name, :H-Cust-addr
        FROM CUSTOMER
        WHERE Cust_No = :H-Cust-No;
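For readers without a COBOL and DB2 environment, the role of host variables can be approximated in Python. This is only an analogy and not embedded SQL: it assumes SQLite (via Python's sqlite3), where a named parameter marker plays the part of the input host variable and ordinary Python variables receive the selected values. All names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INT, cust_name TEXT, cust_addr TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1000, "Asha", "Chennai"), (1500, "Ravi", "Pune")],
)

# Input variable: plays the role of :H-Cust-No in the WHERE clause.
h_cust_no = 1500

row = conn.execute(
    "SELECT cust_no, cust_name, cust_addr FROM customer WHERE cust_no = :h_cust_no",
    {"h_cust_no": h_cust_no},
).fetchone()

# Output variables: play the role of the host variables named in the INTO clause.
h_out_no, h_cust_name, h_cust_addr = row
```

The SQL text itself never changes; only the value bound to the marker does, which is the same property that static SQL with host variables has.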


Embedded SQL statements

- It is like file I/O
- Normally the embedded SQL statements contain the host variables, coded in the INTO clause of the SELECT statement as shown above
- They are preceded by EXEC SQL
- SELECT, INSERT, UPDATE and DELETE statements can be coded inline


Using Cursors

- Can be likened to a pointer
- Used when a large number of rows are to be selected
- Can be used for modifying data, using a FOR UPDATE OF clause


Cursors

- DECLARE: assigns a name to a particular SQL statement
- OPEN: readies the cursor for row retrieval; sometimes builds the result table. However, it does not assign values to the host variables
- FETCH: returns data from the result table one row at a time and assigns the values to the specified host variables
- CLOSE: releases all resources used by the cursor
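The DECLARE, OPEN, FETCH and CLOSE cycle has a loose counterpart in most database APIs. The sketch below is an analogy only, assuming SQLite (via Python's sqlite3) rather than embedded SQL in COBOL: executing the query roughly corresponds to DECLARE plus OPEN, and the end of the result set is signalled by `None` where DB2 would return SQLCODE +100.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INT, cust_name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1000, "Asha"), (1500, "Ravi"), (2500, "Mei")],
)

cursor = conn.cursor()
# Roughly DECLARE + OPEN: the statement is tied to the cursor and made ready for retrieval.
cursor.execute("SELECT cust_no, cust_name FROM customer ORDER BY cust_no")

fetched = []
while True:
    row = cursor.fetchone()      # FETCH: one row at a time
    if row is None:              # no more rows in the result table
        break
    cust_no, cust_name = row     # values land in program variables
    fetched.append((cust_no, cust_name))

cursor.close()                   # CLOSE: release the cursor's resources
```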


DCLGEN

- Issued for a single table
- Prepares the structure of the table in a COBOL copybook
- The copybook contains a SQL DECLARE TABLE statement along with a working-storage host variable definition for the table


Precompile

- Searches for all the SQL statements and DB2-related INCLUDE members, and comments out every SQL statement in the program
- The SQL statements are replaced by a CALL to the DB2 runtime interface module, along with parameters
- All SQL statements are extracted and put in a Database Request Module (DBRM)


- Places a timestamp in the modified source and in the DBRM so that the two are tied together. If the timestamps do not match, a runtime error of -818 (timestamp mismatch) results
- All DB2-related INCLUDE statements must be placed between the EXEC SQL and END-EXEC keywords for the precompiler to recognize them


Compile & Link

- The modified COBOL source output by the precompiler is compiled
- The compiled source is link-edited into an executable load module
- The appropriate DB2 host language interface module (e.g. DSNALI) should also be included in the link-edit step


Bind

- A type of compiler for SQL statements
- It reads the SQL statements from the DBRM and produces a mechanism to access data (in an efficient manner) as directed by the SQL statements being bound
- Checks syntax, checks the correctness of table and column definitions against the catalog information, and performs authorization validation


Bind Types

- BIND PLAN: accepts as input one or more DBRMs and outputs an application plan containing executable logic representing optimized access paths to DB2 data
- BIND PACKAGE: accepts as input a single DBRM and produces a single package containing the optimized access paths. The plan in this case contains a reference to the physical location of the package(s)


What is a Package ?

- It is a single bound DBRM with optimized access paths
- It also contains a location identifier, a collection identifier and a package identifier
- A package can have multiple versions, each with its own version identifier


Advantages of Package

- Reduced bind time
- Bind options can be specified at the programmer level
- Versioning
- Provides for remote data access (in DB2 V2.3 or higher)


Data Control language

- GRANT and REVOKE
- GRANT: grants table privileges, plan and package privileges, collection privileges, database privileges, use privileges and system privileges
- A user with the SYSADM privilege is responsible for overall control of the system


Format of GRANT:

    GRANT SELECT, UPDATE(NAME, NO) ON TABLE EMPL TO A, B, C;    (or TO PUBLIC)
    GRANT ALL ON EMPL TO PUBLIC;
    GRANT EXECUTE ON PLAN PLANA TO USER;


- The table privileges allowed are SELECT, UPDATE, DELETE and INSERT (for both base tables and views), and ALTER (table) and (create) INDEX (for base tables only)
- There are no specific DROP privileges; a table can be dropped by its owner or by a SYSADM


A user who has the authority to grant a privilege to another user also has the authority to grant that privilege with the GRANT option, which lets the recipient grant it onward.


- REVOKE: this statement revokes privileges given to a user. The user who granted the privileges also has the authority to revoke them
- It is not possible to be column-specific when revoking an UPDATE privilege

    REVOKE SELECT ON TABLE EMPL FROM USERA;


For the following, refer to the handout

List of common SQL return codes and solutions.
JCL for binding and compiling a DB2 program.


Application development guidelines

Code modular DB2 programs and make them as small as possible.
Use unqualified SQL statements; this enables movement from one environment to another (test to production).
Never use SELECT * in an embedded SQL program.
Use joins rather than subqueries.
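
The guidelines above can be illustrated with a short sketch. The columns EMPNO, NAME and DEPTNO on EMPL, and the DEPT table with DEPTNO and LOCATION, are hypothetical; the two queries return the same rows only if DEPTNO is unique in DEPT.

```sql
-- Subquery form: works, but the guideline prefers the join below.
SELECT E.EMPNO, E.NAME
  FROM EMPL E
 WHERE E.DEPTNO IN (SELECT D.DEPTNO
                      FROM DEPT D
                     WHERE D.LOCATION = 'CHENNAI');

-- Join form: named columns instead of SELECT *, and table names
-- left unqualified so the qualifier can be supplied at bind time.
SELECT E.EMPNO, E.NAME
  FROM EMPL E, DEPT D
 WHERE E.DEPTNO = D.DEPTNO
   AND D.LOCATION = 'CHENNAI';
```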


contd...

Use a WHERE clause to filter out unwanted data.
Use cursors when fetching multiple rows, though they add overhead.
Use the FOR UPDATE OF clause for UPDATE or DELETE with a cursor; this ensures data integrity.
Use INSERTs minimally; use the LOAD utility instead of INSERT if the inserts are not application dependent.
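
A sketch of the cursor guideline, showing only the SQL statements. In a COBOL program each statement would be coded between EXEC SQL and END-EXEC rather than ended with a semicolon. The cursor name, the SALARY and DEPTNO columns and the host variables (prefixed with a colon) are hypothetical.

```sql
-- Declare a cursor that names the column it intends to change.
DECLARE EMPCUR CURSOR FOR
  SELECT EMPNO, NAME, SALARY
    FROM EMPL
   WHERE DEPTNO = :HV-DEPTNO
     FOR UPDATE OF SALARY;

OPEN EMPCUR;

-- Repeat the FETCH until SQLCODE = +100 (no more rows).
FETCH EMPCUR INTO :HV-EMPNO, :HV-NAME, :HV-SALARY;

-- Positioned update: changes only the row the cursor is currently on.
UPDATE EMPL
   SET SALARY = :HV-NEW-SALARY
 WHERE CURRENT OF EMPCUR;

CLOSE EMPCUR;
```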


QMF - Query Management Facility

It is an MVS- and VM-based query tool.
It allows end users to enter SQL queries and produce a variety of reports and graphs from the results.
QMF queries can be formulated in several ways: by direct SQL statements, by means of the relational prompted query interface, or by query-by-example (QBE).
QBE is similar to SQL in some ways but more user friendly.


SPUFI

Supports the online execution of SQL statements from a TSO terminal.
Used by developers to check SQL statements or view table details.
The SPUFI menu contains the input file in which the SQL statements are coded, options for default settings and editing, and the output file.


Session 4


Topics to be covered in this Session


The duration of this session is 0.5 days.
DB2 Utilities
DB2 Security
DB2 Catalog & Optimizer
Performance tuning


DB2 System administration

DB2 UTILITIES
CHECK
COPY, MERGECOPY
RECOVER
LOAD
REORG, RUNSTATS
EXPLAIN


Check

Checks the integrity of DB2 data structures.
Checks the referential integrity between two tables and also checks DB2 indexes for consistency.


contd...

Can delete invalid rows and copy them to an exception table.
Use CHECK DATA after loading a table without specifying the ENFORCE CONSTRAINTS option, or after the partial recovery of tablespaces in a referential set.


Copy

Used to create an image copy of a complete tablespace or of a single partition of a tablespace, as either a full image copy or an incremental image copy.
Every successful execution of the COPY utility places at least one row in the table SYSIBM.SYSCOPY that indicates the status of the image copy.


Mergecopy

The MERGECOPY utility combines multiple incremental image copy data sets into a new full or incremental image copy data set.


Recover

The standard unit of recovery is a tablespace.
Restores DB2 tablespaces and indexes to a specific point in time.
Data can be recovered for single pages, pages that contain I/O errors, a single partition or an entire tablespace.
Indexes are always recovered from the actual table data, not from image copy and log data as in the case of tablespace recovery.

Load

Used to accomplish bulk inserts into DB2 tables.
Can replace the current data or append to it, i.e. LOAD DATA REPLACE or LOAD DATA RESUME YES.
If a job terminates in any phase of LOAD REPLACE, the utility has to be terminated and rerun.


contd...

If a job terminates in any phase other than UTILINIT (which sets up and initializes the LOAD utility), the tablespace must first be restored using a full RECOVER if the LOG NO option of LOAD was specified. After the tablespace is restored, the error is to be corrected, the utility terminated and the job rerun.


Reorg

Used to reorganize DB2 tables and indexes, thereby improving their efficiency of access.
Reclusters data, resets free space to the amount specified in the CREATE DDL statement, and deletes and redefines the underlying VSAM data sets for STOGROUP-defined objects.


Runstats

Collects statistical information for DB2 tables, tablespaces, partitions, indexes and columns.
It can place this information in the catalog tables as DB2 optimizer statistics, as DBA monitoring statistics, or as all statistics that have been gathered.
It can be used on specific SQL queries without updating the current usable statistics.


Reorg Job stream

The total reorg schedule should include:
A RUNSTATS job or step, to record current tablespace and index statistics in the DB2 catalog.
Two COPY steps for each tablespace being reorganized, so that the data is recoverable. The second copy job is required after the REORG if it was performed with the LOG NO option.


contd...

After a REORG is run with the LOG NO option, DB2 turns on the copy pending status flag for the tablespaces specified in the REORG. When the LOG NO parameter is specified, it is better to take an image copy of the tablespace being reorganized immediately after the REORG.
The reorg schedule should also include a REBIND job for all plans that use tables in any of the tablespaces being reorganized.


Explain

This feature details the access paths chosen by the DB2 optimizer for SQL statements.
Used for performance monitoring.
When EXPLAIN is requested, the access paths that DB2 chooses are put in coded format into the table PLAN_TABLE, which is created in the default database.


contd...

To EXPLAIN a single SQL statement, precede that statement with the EXPLAIN command:
EXPLAIN ALL SET QUERYNO = integer FOR SQL statement
The other method is to specify EXPLAIN(YES) with the BIND command.
PLAN_TABLE is then queried to get the required information.
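
A minimal sketch of the first method. The query number 100 is arbitrary, and the EMPL columns and the DEPTNO value are hypothetical. A PLAN_TABLE must already exist for the user issuing the statement.

```sql
-- 100 is an arbitrary number used to tag this statement's rows in PLAN_TABLE.
EXPLAIN ALL SET QUERYNO = 100 FOR
  SELECT EMPNO, NAME
    FROM EMPL
   WHERE DEPTNO = 'D01';
```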


contd...

The information provided includes the type of access to particular tables used in the SQL statement, package or plan, the order in which the tables are joined in a join, whether a sort is required, and so on.
Since the EXPLAIN results depend on the DB2 catalog, it is better to run RUNSTATS before running an EXPLAIN.
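
As a sketch, the rows for one explained statement could be read back as follows. The query number 100 is a hypothetical value set when the statement was explained, and only a few of the PLAN_TABLE columns are selected.

```sql
-- One row per step of the access path, in the order DB2 performs them.
SELECT QBLOCKNO, PLANNO, METHOD, TNAME,
       ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY,
       SORTN_JOIN, SORTC_ORDERBY
  FROM PLAN_TABLE
 WHERE QUERYNO = 100
 ORDER BY QBLOCKNO, PLANNO;
```

In the result, an ACCESSTYPE of I indicates index access and R a tablespace scan. METHOD values 1 and 2 indicate a nested loop join and a merge scan join. A Y in one of the SORT columns indicates that a sort is required.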


DB2 Security

LOCKING SERVICES:
These are provided by an MVS subsystem called the IMS Resource Lock Manager (IRLM). It is used to control concurrent access to DB2 data, regardless of whether IMS is present in the system or not.


contd...

The above is based on transaction processing. The system component that provides this is a transaction manager.
COMMIT and ROLLBACK are the key methods of implementing this.
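
A sketch of a unit of work ended by COMMIT or ROLLBACK. The ACCOUNT table, its columns and the values are hypothetical. Under a transaction manager such as CICS or IMS, the commit is normally requested through that manager rather than with these SQL statements.

```sql
-- Two changes that must succeed or fail together.
UPDATE ACCOUNT SET BALANCE = BALANCE - 500 WHERE ACCTNO = 'A100';
UPDATE ACCOUNT SET BALANCE = BALANCE + 500 WHERE ACCTNO = 'B200';

-- Make both changes permanent.
COMMIT;

-- Alternative, if either UPDATE had failed: undo all changes
-- made since the last commit point.
-- ROLLBACK;
```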


Explicit locking facilities

The SQL statement LOCK TABLE.
The ISOLATION parameter on the BIND PACKAGE command; the two possible values are RR (Repeatable Read) and CS (Cursor Stability).
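
A sketch of the two forms of LOCK TABLE, using the EMPL table from the earlier examples. A program would issue one or the other; the comments describe the general effect only.

```sql
-- Other programs may read EMPL but not change it while the lock is held.
LOCK TABLE EMPL IN SHARE MODE;

-- Other programs are generally kept from both reading and changing EMPL.
LOCK TABLE EMPL IN EXCLUSIVE MODE;
```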


contd...

The tablespace LOCKSIZE parameter: physically, DB2 locks data in terms of pages, tables or tablespaces. This parameter is specified with the LOCKSIZE option of CREATE or ALTER TABLESPACE. The options are TABLESPACE, TABLE, PAGE or ANY.
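
A sketch of setting and changing the lock size. The tablespace EMPTS, the database EMPDB and the storage group EMPSG are hypothetical names.

```sql
-- Create a tablespace whose data is locked a page at a time.
CREATE TABLESPACE EMPTS
    IN EMPDB
    USING STOGROUP EMPSG
    LOCKSIZE PAGE;

-- Later, let DB2 choose the lock size itself.
ALTER TABLESPACE EMPDB.EMPTS
    LOCKSIZE ANY;
```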


contd...

The ACQUIRE/RELEASE parameters on the BIND PLAN command specify when table locks (which are implicitly acquired by DB2) are to be acquired and released.
Types: ACQUIRE(USE) and ACQUIRE(ALLOCATE); RELEASE(COMMIT) and RELEASE(DEALLOCATE).


Session 5


Topics to be covered in this Session


The duration of this session is 0.5 days.
DB2 Catalog & Directory


Catalog Tables & the DB2 directory

Repository for all DB2 objects; contains 43 tables.
Each table maintains data about an aspect of the DB2 environment.
The data covers tablespaces, tables, indexes, privileges, utilities run on DB2 and so on.
E.g. SYSIBM.SYSTABLES, SYSIBM.SYSINDEXES, SYSIBM.SYSCOLUMNS ...
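
Catalog tables can be read with ordinary SELECT statements. As a sketch, the query below lists the tables created by one user; the creator ID USERA is hypothetical.

```sql
-- TYPE = 'T' restricts the result to tables (views are excluded).
SELECT NAME, DBNAME, TSNAME, COLCOUNT
  FROM SYSIBM.SYSTABLES
 WHERE CREATOR = 'USERA'
   AND TYPE = 'T'
 ORDER BY NAME;
```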


contd...

When standard DB2 SQL is used, the DB2 catalog is either accessed or updated. E.g. when a CREATE TABLE statement is issued, the catalog tables SYSIBM.SYSTABLES, SYSIBM.SYSCOLUMNS and SYSIBM.SYSFIELDS are updated.
However, the DB2 catalog is only semi-active. This is because statistics such as the number of rows and the physical order of the rows for a set of keys are updated only after running the RUNSTATS utility.
The DB2 catalog is integrated: the DB2 catalog and the DB2 DBMS are inherently bound together.
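
A sketch of what "semi-active" means in practice, assuming a table EMPL created by the hypothetical user USERA. The CARD column name is an assumption that matches older DB2 releases; later releases hold the row count in CARDF.

```sql
-- Column definitions: recorded as soon as CREATE TABLE is issued.
SELECT NAME, COLNO, COLTYPE, LENGTH, NULLS
  FROM SYSIBM.SYSCOLUMNS
 WHERE TBNAME = 'EMPL'
   AND TBCREATOR = 'USERA'
 ORDER BY COLNO;

-- Row-count statistic: changes only when RUNSTATS is run,
-- however many rows have been inserted since.
SELECT NAME, CARD
  FROM SYSIBM.SYSTABLES
 WHERE NAME = 'EMPL'
   AND CREATOR = 'USERA';
```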

contd...

It is nonsubvertible: the DB2 catalog cannot be updated behind DB2's back. For example, if a table of 10 columns is created, it is not possible to change the number of columns to 15 directly in the catalog. It has to be done using the standard SQL statements for dropping and recreating the table.


DB2 Optimizer

Analyzes SQL statements and determines the most efficient way to access data; this gives physical data independence.
It evaluates the following factors: CPU cost, I/O cost, DB2 catalog statistics and the SQL statement.
It estimates the CPU time and the cost involved in applying predicates, traversing pages and sorting.


contd...

It estimates the cost of physically retrieving and writing the data.
Information about the state of the tables that will be accessed by the SQL statements is provided by the catalog.


Performance Tuning

The performance of an application can be monitored and enhanced at the application level as well as at the database level.
On the application side, SQL statements can be tuned to make them more efficient and to avoid redundancy.
It is better to structure SQL statements so that they perform only the necessary operations.


contd...

On the database side, the major enhancements can be made to the definitions of tables and indexes and to the distribution of tablespaces and indexspaces.
The application run statistics are obtained from EXPLAIN or from a DB2PM monitor report.


Thank U
