CTS-PAC
Version 1.1
Session 1
Introduction to Databases
What is data? A representation of facts or instructions in a form suitable for communication (IBM Dictionary).
What is a database system? An integrated and shared repository for stored data, or a collection of stored operational data used by the application systems of some particular enterprise.
Data redundancy
Multiple views
Shared data
Data independence (logical/physical)
Data dictionary
Search versatility
Cost effective
Security & control
Recovery, restart & backup
Concurrency
RELATIONAL MODEL : Data is stored in tables, in the form of rows and columns. Examples - DB2, Oracle, Sybase, Ingres etc.
OBJECT-ORIENTED MODEL : Data attributes and the methods that operate on those attributes are encapsulated in structures called objects.
RELATIONAL DB CONCEPTS
Relational Properties
Relation : A table or file
Tuple : A row; contains an entry for each attribute
Attributes : Columns, or the characteristics that define the entity
Domain : A range of values (or pool)
Entity : Some object about which we wish to store information
Null : Represents an unknown value
Atomic : Smallest unit of data; the individual data value
business area
Represented as entities, relationships between entities, and attributes of both relationships and entities
E-R models are outputs of the analysis phase, i.e. they are conceptual data models expressed in the form of an E-R diagram
standardized mode
1NF : All entities must have a unique identifier, or key, that can be composed of one or more attributes. All attributes must be atomic and non-repeating.
2NF : Partial functional dependencies are removed; all attributes that are not part of the key must depend on the entire key for that entity.
Types of Integrity
Entity Integrity : No part of a primary key can have a null value
Referential Integrity : Every foreign key in the first table must either match a primary key value in the second table or be wholly null
Domain Integrity : Integrity of the information allowed in a column
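The referential integrity rule can be tried out in a few lines. The sketch below is an illustration only: it uses SQLite through Python's sqlite3 module as a stand-in for DB2, and the table and column names (customer, orders, cust_no, order_no) are assumptions made for the example.

```python
import sqlite3

# SQLite stands in for DB2 here; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " order_no INTEGER PRIMARY KEY,"
    " cust_no INTEGER REFERENCES customer(cust_no))"  # nullable foreign key
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha')")

conn.execute("INSERT INTO orders VALUES (100, 1)")     # matches a primary key: accepted
conn.execute("INSERT INTO orders VALUES (101, NULL)")  # wholly null foreign key: accepted

try:
    conn.execute("INSERT INTO orders VALUES (102, 99)")  # no customer 99: rejected
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Only the two valid orders are stored; the order that points at a missing customer is refused by the database itself rather than by application code.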
many)
Each order relates to only one customer (one-to-one)
One order can contain many products (one-to-many)
One product can be part of many orders (one-to-many)
called ENTITIES.
An entity may transform into one or more tables.
The unique identity for information stored in an ENTITY is called a PRIMARY KEY, e.g. CustomerNo uniquely identifies each customer.
A table essentially consists of :
Attributes, which define the characteristics of the table
A primary key, which uniquely identifies each row of data stored in the table
Secondary & foreign keys/indexes
Table Definition : Table Customer Attributes - Customer-No, Cust-name, Cust-location, Cust-Id, Order-no...
one or more tables
Apart from the primary key, a table can have many secondary keys/indexes, which exist in indexspaces.
These tablespaces and indexspaces together exist in a database.
12. If a relational system has a low-level (single-record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level (multiple-records-at-a-time) relational language.
An introduction to SQL
SQL, or Structured Query Language, is :
A powerful language that performs the functions of data manipulation (DML), data definition (DDL) and data control or data authorization (DAL/DCL).
A non-procedural language : it acts on a set of data, and the user does not need to know how the data is retrieved.
A single SQL statement can perform the functions of more than one procedure.
Very flexible.
SQL - Features
What you want, not how to get it
Unlike COBOL or 4GLs, SQL is coded without data-navigational instructions. The optimal access paths are determined by the DBMS. This is advantageous because the DBMS knows better than the user how it has stored the data.
Set-level processing & multiple-row processing
The following operations can be performed by SQL on the database tables :
Session 2
objects manipulation, creation and use, involve SQLs. DB2 objects - Database, Tablespaces & Indexspaces creation & use, and other terminologies associated with databases.
Definition and Types
usage of SQLs with examples, scalar and column functions
Subqueries and Multiple queries, DMLs
Static & Dynamic SQLs
Standard query language for RDBMS
Non-procedural language : the programmer specifies what data is needed but not how to retrieve it
Also used to define data structures, control access to the data and delete occurrences of data
Uses set-level processing
DROP
Data Manipulation Language (DML) - DELETE, INSERT, SELECT, UPDATE
Data Control Language (DCL) - GRANT, REVOKE
SQL - Types
Production SQL or Ad-Hoc SQL
Embedded SQL or Stand-alone SQL
Static or Dynamic SQL
Select operation retrieves a specified subset of rows from a table
Projection operation retrieves a specified subset of columns (but all rows) from the table
Eg : Select Cust-no, Cust-name from Customer;
The WHERE clause defines the predicates for the SQL operation.
A WHERE clause can have multiple conditions combined using AND & OR.
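The difference between projection and selection can be sketched as follows. This is an illustration under stated assumptions: SQLite (via Python's sqlite3) stands in for DB2, the column names use underscores because hyphens are not valid in unquoted SQL identifiers, and the sample rows are invented.

```python
import sqlite3

# SQLite stands in for DB2; the customer table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, cust_name TEXT, cust_location TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Asha", "Chennai"), (2, "Ravi", "Pune"), (3, "Meena", "Chennai")],
)

# Projection: a subset of columns, every row.
projection = conn.execute(
    "SELECT cust_no, cust_name FROM customer ORDER BY cust_no"
).fetchall()

# Selection: a subset of rows, chosen by predicates combined with AND / OR.
selection = conn.execute(
    "SELECT cust_no, cust_name FROM customer"
    " WHERE cust_location = 'Chennai' AND (cust_no = 1 OR cust_no = 2)"
    " ORDER BY cust_no"
).fetchall()
```

The first query returns all three customers with two columns each; the second returns only the rows that satisfy the WHERE predicates.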
Select Cust-no, Cust-name, Cust-addr from Customer where Cust-id LIKE (or NOT LIKE) '425%';
Note : _ matches a single character; % matches a string of characters
ESCAPE '\' defines \ as the escape character; when it precedes _ or %, it overrides their special meaning
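A small sketch of the LIKE wildcards and the ESCAPE clause, again using SQLite through Python's sqlite3 as a stand-in for DB2. The cust_id values and the helper function ids are made up for the illustration.

```python
import sqlite3

# SQLite stands in for DB2; the cust_id values are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?)",
    [("42501",), ("4259",), ("525",), ("425%OFF",), ("425XOFF",)],
)

def ids(where):
    """Return the cust_id values matching a WHERE predicate, sorted."""
    sql = "SELECT cust_id FROM customer WHERE " + where
    return sorted(row[0] for row in conn.execute(sql))

starts_with_425 = ids("cust_id LIKE '425%'")      # % matches any string of characters
not_425 = ids("cust_id NOT LIKE '425%'")          # everything else
four_chars = ids("cust_id LIKE '425_'")           # _ matches exactly one character
# With ESCAPE, the first % is a literal percent sign and the second is a wildcard.
literal_percent = ids("cust_id LIKE '425\\%%' ESCAPE '\\'")
```

Without the ESCAPE clause there would be no way to search for an identifier that actually contains a percent sign.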
NULL : To check for null, the syntax is IS NULL or IS NOT NULL.
Select Cust-no, Cust-name, order-no from Customer where order-no IS NULL;
However, if order-no contains null values, any other comparison against them (=, <>, <, > and so on) always evaluates as a not-true condition in a query.
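The behaviour of nulls in predicates can be checked with a short sketch. SQLite (via Python's sqlite3) is used as a stand-in for DB2, and the table contents are hypothetical.

```python
import sqlite3

# SQLite stands in for DB2; customer 2 has no order number (NULL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, order_no INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, 500), (2, None), (3, 501)],
)

def cust_nos(where):
    """Return the cust_no values matching a WHERE predicate, in order."""
    sql = "SELECT cust_no FROM customer WHERE " + where + " ORDER BY cust_no"
    return [row[0] for row in conn.execute(sql)]

is_null = cust_nos("order_no IS NULL")   # the correct test for null
eq_null = cust_nos("order_no = NULL")    # never true, so no rows qualify
not_500 = cust_nos("order_no <> 500")    # the null row is not true here either
```

Note that customer 2 is missing from not_500 even though its order number is certainly not 500; a comparison with null is never true.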
uses the WHERE clause
The GROUP BY operator causes the table represented by the FROM clause to be rearranged into groups, such that within one group all rows have the same value for the GROUP BY column (logically, not physically in the database).
The SELECT clause is applied to the grouped data and not to the original table.
HAVING is used to eliminate groups, just as WHERE is used to eliminate rows.
Example :
Select Order-No, SUM(No-Prodts)
From ORDER
Group by Order-No
Having AVG(No-Prodts) < 10
Order by Order-No ;
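The example above can be run against sample data to see where WHERE and HAVING apply. This is a sketch under assumptions: SQLite (via Python's sqlite3) stands in for DB2, the table is named order_line because ORDER is a reserved word, and the rows are invented.

```python
import sqlite3

# SQLite stands in for DB2; order_line and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_line (order_no INTEGER, no_prodts INTEGER)")
conn.executemany(
    "INSERT INTO order_line VALUES (?, ?)",
    [(1, 4), (1, 6), (2, 20), (2, 30), (3, 9)],
)

# HAVING drops whole groups: order 2 has an average of 25, so it is eliminated.
grouped = conn.execute(
    "SELECT order_no, SUM(no_prodts) FROM order_line"
    " GROUP BY order_no HAVING AVG(no_prodts) < 10 ORDER BY order_no"
).fetchall()

# WHERE drops rows before grouping: the (1, 4) row never reaches the SUM.
filtered_then_grouped = conn.execute(
    "SELECT order_no, SUM(no_prodts) FROM order_line"
    " WHERE no_prodts >= 5"
    " GROUP BY order_no HAVING AVG(no_prodts) < 10 ORDER BY order_no"
).fetchall()
```

In the second query the total for order 1 falls from 10 to 6, because the WHERE clause removed one of its rows before the groups were formed.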
Functions
Column Functions
specified column(s)
AVG, COUNT, MAX, MIN, SUM
Rules for column functions - Refer handout
Scalar Functions
single value.
CHAR, DATE, DAY(S), DECIMAL, DIGITS, FLOAT, HEX, HOUR, INTEGER, LENGTH, MICROSECOND, MINUTE, MONTH, SECOND, SUBSTR, TIME, TIMESTAMP, VALUE, VARGRAPHIC, YEAR
Rules for scalar functions - Refer handout
Complex SQLs
Subqueries
Nested select statements
Specified using the IN (or NOT IN) predicate, the equality or non-equality predicate (= or <>) or a comparative operator (<, <=, >, >=)
When using the equality, non-equality or comparative operators, the inner query should return only a single value
for querying multiple tables
A specialized form is the correlated subquery : the nested SELECT statement refers back to columns in the previous (outer) SELECT statements
It works in Top-Bottom-Top fashion
A noncorrelated subquery works in Bottom-to-Top fashion
Eg - Correlated Subquery..
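A minimal sketch of the two kinds of subquery, assuming a hypothetical orders table and using SQLite (via Python's sqlite3) as a stand-in for DB2. It is not the example from the course handout; the names and amounts are invented.

```python
import sqlite3

# SQLite stands in for DB2; the orders table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no INTEGER, cust_no INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 30), (3, 2, 50), (4, 2, 50), (5, 2, 80)],
)

# Correlated: the inner SELECT refers to o.cust_no from the outer row, so it is
# re-evaluated per outer row (top, then bottom, then back to the top).
above_own_customer_avg = [row[0] for row in conn.execute(
    "SELECT o.order_no FROM orders o"
    " WHERE o.amount > (SELECT AVG(i.amount) FROM orders i WHERE i.cust_no = o.cust_no)"
    " ORDER BY o.order_no"
)]

# Noncorrelated: the inner SELECT stands alone, is evaluated once, and its single
# value is then used by the outer query (bottom to top).
above_overall_avg = [row[0] for row in conn.execute(
    "SELECT order_no FROM orders"
    " WHERE amount > (SELECT AVG(amount) FROM orders)"
    " ORDER BY order_no"
)]
```

Customer 1 averages 20 and customer 2 averages 60, so the correlated query keeps orders 2 and 5; the overall average is 44, so the noncorrelated query keeps orders 3, 4 and 5.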
Joins
OUTER JOIN : For one or more of the tables being joined, both matching and nonmatching rows are returned. Duplicate columns may be eliminated. For nonmatching rows, the columns from the other table contain nulls.
INNER JOIN : Only matching rows are returned, so one or more rows from either or both of the tables being joined may not be included in the table that results from the join operation.
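The two join types can be compared on a pair of small tables. This sketch uses SQLite through Python's sqlite3 as a stand-in for DB2; the customer and orders tables are hypothetical, and only a left outer join is shown.

```python
import sqlite3

# SQLite stands in for DB2; Ravi (customer 2) has placed no orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, cust_name TEXT)")
conn.execute("CREATE TABLE orders (order_no INTEGER, cust_no INTEGER)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.execute("INSERT INTO orders VALUES (100, 1)")

# Inner join: only customers with a matching order appear.
inner_rows = conn.execute(
    "SELECT c.cust_name, o.order_no FROM customer c"
    " INNER JOIN orders o ON o.cust_no = c.cust_no"
    " ORDER BY c.cust_no"
).fetchall()

# Left outer join: every customer appears; missing order columns are null.
outer_rows = conn.execute(
    "SELECT c.cust_name, o.order_no FROM customer c"
    " LEFT OUTER JOIN orders o ON o.cust_no = c.cust_no"
    " ORDER BY c.cust_no"
).fetchall()
```

The customer without an order is dropped by the inner join and kept, with a null order number, by the outer join.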
DMLs
INSERT :
Eg : INSERT INTO Tablename (column1, column2, column3, ...) VALUES (value1, value2, value3, ...)
If a column is omitted from an INSERT stmt and that column is defined NOT NULL, the INSERT fails; if the column is nullable, it is set to null
DEFAULT, it is set to that default value
Omitting the list of columns is equivalent to specifying all columns
SELECT - INSERT
INSERT INTO TEMP (A#, B) SELECT A#, SUM(B) FROM TEMP1 GROUP BY A# ;
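The rules for omitted columns and the SELECT - INSERT form can be sketched as below. Assumptions: SQLite (via Python's sqlite3) stands in for DB2, so the default is declared with SQLite's NOT NULL DEFAULT syntax, the item table is hypothetical, and the column A# is renamed a because # is not valid in an unquoted SQLite identifier.

```python
import sqlite3

# SQLite stands in for DB2; the item, temp1 and temp tables are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item ("
    " id INTEGER NOT NULL,"             # no default: must be supplied
    " qty INTEGER NOT NULL DEFAULT 1,"  # omitted -> default value
    " remark TEXT)"                     # nullable: omitted -> null
)

conn.execute("INSERT INTO item (id) VALUES (7)")
defaults_row = conn.execute("SELECT id, qty, remark FROM item").fetchone()

try:
    conn.execute("INSERT INTO item (qty) VALUES (5)")  # id omitted, NOT NULL, no default
    missing_not_null_failed = False
except sqlite3.IntegrityError:
    missing_not_null_failed = True

# SELECT - INSERT: the rows to insert come from a query instead of a VALUES list.
conn.execute("CREATE TABLE temp1 (a INTEGER, b INTEGER)")
conn.execute("CREATE TABLE temp (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO temp1 VALUES (?, ?)", [(1, 10), (1, 5), (2, 7)])
conn.execute("INSERT INTO temp (a, b) SELECT a, SUM(b) FROM temp1 GROUP BY a")
summary = conn.execute("SELECT a, b FROM temp ORDER BY a").fetchall()
```

The first insert succeeds with qty filled from its default and remark left null, while the second is rejected because id has neither a value nor a default.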
UPDATE:
Eg:
DELETE:
Eg:
Static SQL
Hard-coded into an application program
Cannot be modified during the program's execution, except for changes to the values assigned to the host variables
Cursors are used to access set-level data
The general form is EXEC SQL [SQL stmts] END-EXEC.
Dynamic SQL
Stmts can change throughout the program's execution
When the SQL is bound, the application plan or package that is created does not contain the same info as that for a static SQL program
The access paths cannot be determined before execution
SQL Guidelines :
- Refer handout - Mullins, chapter 2
DB2 Objects
Stogroup
same device type
The option is defined as part of the tablespace definition
When a given space needs to be extended, storage is acquired from the appropriate stogroup
same stogroup
These are, in a sense, the most physical of the various storage objects in DB2
More than one volume can be defined in a stogroup. DB2 keeps track of which volume was defined first & uses that volume.
VCAT Option
explicitly by the AMS utility IDCAMS
Two types of VSAM datasets are used - ESDS & LDS. The Linear Data Set is used more efficiently by DB2
VSAM datasets defined here are different from plain VSAM datasets - they can be accessed only through the VSAM Media Manager
Tablespaces
or more tables
A SPACE is basically an extendable collection of pages, with each page 4K or 32K bytes in size.
It is the storage unit for recovery and reorganization purposes
Three types of tablespaces - Simple, Partitioned & Segmented
Simple Tablespaces
Can contain more than one stored table
Depending on the appln, storing more than one table might enable faster retrieval for joins using these tables
Usually only one table is preferred. This is because a single page can contain rows from all the tables defined in the tablespace.
LOAD with the REPLACE option deletes all data
Segmented Tablespaces
segmented space
A segment consists of a logically contiguous set of n pages
No segment is allowed to contain records for more than one table
Sequential access to a particular table is more efficient
Tablespace
Reorganizing the tablespace will restore every table to its clustered order
LOCK TABLE on a table locks only the table, not the entire tablespace
If a table is dropped, the space for that table can be reclaimed with minimum reorg
Partitioned Tablespaces
partitions/TS
It is partitioned in accordance with value ranges for a single column or a combination of columns. Hence these column(s) cannot be updated
and reorganized
Different partitions can be stored on different storage groups for efficient access.
for the given TS
Page
Table
Tablespace
ANY - DB2 decides the starting page
Vcat
PCTFREE - % of space available for future inserts
FREEPAGE - no. of pages after which an empty page is available
Bufferpool - BP0, BP1, BP2 & BP32K
CLOSE - Yes/No - whether the underlying VSAM datasets should be closed each time the table is used. Max no. of datasets that can be open in DB2 at a time is 10,000
1. LIKE Table name / View name
2. IN Database Tablespace Name
Foreign Key references dbname.table on relation condition for delete
Table1 references table2 (target) - Table2's primary key is the foreign key defined in Table1
DROP : DROP TABLE <Tablename> Similar stmts are there for INDEX.
RIs.
Do not use RIs on tables built from another RI system
Consider using Fieldprocs or Editprocs or Validprocs
5. overloading of index when row length of a table to be accessed is short
6. At least one index must be defined for a table with more than 100 pages
7. Use a multicolumn index rather than a multi-index (appln dependent); however the latter requires more DASD.
8. Create indexes before loading the table.
9. Clustering reduces I/O; the DB2 optimizer usually tries to use an index on a clustered column before using the other indexes.
10. Optimize the Subpages parameter
11. Specify indexspace freespace the same as tablespace freespace
12. Use the DEFER option while creating the index. The RECOVER INDEX utility can then be used to populate the index. The RECOVER utility populates index entries faster.
13. Use different STOGROUPs for tablespaces & indexspaces
14. Create critical indexes in a different bufferpool than the tablespaces.
Session 3
The following topics will be covered in this session
Application programming using DB2 - 1 day
Data Control Language, SPUFI, QMF, Appln pgming Guidelines - 0.5 days
Application environments supporting DB2 : IMS (Batch/Online), CICS, TSO (Batch/Online)
CAF - Call Attach Facility
All DB2 application types can execute concurrently
Host language support - COBOL, PL/1, C, Fortran or Assembly language
Coding the application
using Host variables
using Embedded SQL
using Cursors
issue DCLGEN command
Precompile the program
Compile & link edit the program
Bind
Host Variables
Can be used
IN INTO CLAUSE OF SELECT & FETCH STATEMENTS
AS INPUT OF SET CLAUSE OF UPDATE STMTS
AS INPUT FOR THE VALUES CLAUSE OF INSERT STATEMENT
IN WHERE CLAUSE OF SELECT, INSERT, UPDATE & DELETE
AS LITERALS IN SELECT LIST OF A SELECT STATEMENT
Example
It is like file I/O
Normally the embedded SQL statements contain the host variables coded in the INTO or SELECT .... as shown above
They are preceded by EXEC SQL
SELECT, INSERT, UPDATE & DELETE stmts can be coded inline
Using Cursors
can be likened to a pointer
used when a large number of rows are to be selected
can be used for modifying data using a FOR UPDATE OF clause
Cursors
DECLARE : name assigned for a particular SQL stmt
OPEN : readies the cursor for row retrieval; sometimes builds the result table. However, it does not assign values to the host variables
FETCH : returns data from the result table one row at a time and assigns the values to the specified host variables
CLOSE : releases all resources used by the cursor
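The same four-step lifecycle can be sketched outside COBOL. The example below is only an analogy: it uses the cursor object of Python's sqlite3 module against SQLite rather than an embedded SQL cursor in DB2, the customer table is hypothetical, and ordinary Python variables play the part of host variables.

```python
import sqlite3

# SQLite stands in for DB2; the customer table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no INTEGER, cust_name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi"), (3, "Meena")],
)

cursor = conn.cursor()  # roughly DECLARE: a handle for one statement
# Roughly OPEN: the statement is run, but no row has been handed to the program yet.
cursor.execute("SELECT cust_no, cust_name FROM customer ORDER BY cust_no")

fetched = []
while True:
    row = cursor.fetchone()   # FETCH: one row at a time
    if row is None:           # end of the result table (DB2 signals this with SQLCODE +100)
        break
    cust_no, cust_name = row  # the values land in program variables, like host variables
    fetched.append(f"{cust_no}:{cust_name}")

cursor.close()  # CLOSE: release the resources held by the cursor
```

The loop processes the set-level result one row at a time, which is the reason cursors exist in a record-oriented host language.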
DCLGEN
issued for a single table
prepares the structure of the table in a COBOL copybook
The copybook contains a SQL DECLARE TABLE stmt along with a working storage host variable defn for the table
Precompile
modified precompiler COBOL output is compiled
compiled source is link edited to an executable load module
appropriate DB2 host language interface module should also be included in the link edit step (e.g. DSNALI)
Bind
A type of compiler for SQL statements
It reads the SQL statements from the DBRM and produces a mechanism to access data (in an efficient manner) as directed by the SQL statements being bound
Checks syntax, checks for correctness of table & column definitions against the catalog info & performs authorization validation
Bind Types
What is a Package ?
It is a single bound DBRM with optimized access paths
It also contains a location identifier, a collection identifier and a package identifier
A package can have multiple versions, each with its own version identifier
Advantages of Package
Reduced bind time
can specify bind options at the programmer level
versioning
provides for remote data access (in DB2 V2.3 or higher)
GRANT & REVOKE
GRANT : grants table privileges, plan & package privileges, collection privileges, database privileges, use privileges and system privileges
A user with the SYSADM privilege will be responsible for overall control of the system
Format of GRANT :
GRANT SELECT, UPDATE(NAME,NO) ON TABLE EMPL TO A, B, C (or PUBLIC);
GRANT ALL ON EMPL TO PUBLIC;
GRANT EXECUTE ON PLAN PLANA TO USER;
List of common SQL return codes and solutions
JCLs for bind, compile of DB2 program
as possible
use unqualified SQL stmts; this enables movement from one environment to another (test to prodn)
Never use SELECT * in an embedded SQL program
use joins rather than subqueries
use a WHERE clause to filter out data
use cursors when fetching multiple rows, though they add overheads
use the FOR UPDATE OF clause for UPDATE or DELETE with a cursor - this ensures data integrity.
use INSERTs minimally; use the LOAD utility instead of INSERT, if the inserts are not application dependent
It is an MVS- and VM-based query tool
allows end users to enter SQL queries to produce a variety of reports and graphs as a result of this query
QMF queries can be formulated in several ways : by direct SQL stmts, by means of a relational prompted query interface or by query-by-example (QBE).
QBE is similar to SQL in some ways but more user friendly
SPUFI
TSO terminal
used by developers to check SQL statements or view table details
The SPUFI menu contains the input file in which the SQL statements are coded, options for default settings and editing, and the output file.
Session 4
DB2 UTILITIES
CHECK
COPY, MERGECOPY
RECOVER
LOAD
REORG, RUNSTATS
EXPLAIN
Check
checks the integrity of DB2 data structures
checks the referential integrity between two tables and also checks DB2 indexes for consistency
table
Use CHECK DATA when loading a table without specifying the ENFORCE CONSTRAINTS option or after the partial recovery of tablespaces in a referential set
Copy
Mergecopy
Recover
Standard unit of recovery is a tablespace
restores DB2 tablespaces and indexes to a specific instance
data can be recovered for single pages, pages that contain I/O errors, a single partition or an entire tablespace
indexes are always recovered from the actual table data, not from image copy and log data, as in the case of tablespace recovery
Load
to accomplish bulk inserts into a DB2 table
can replace the current data or append to it, i.e. LOAD DATA REPLACE or LOAD DATA RESUME(S)
if a job terminates in any phase of LOAD REPLACE, the utility has to be terminated and rerun
Reorg
improving their efficiency of access
reclusters data, resets free space to the amount specified in the CREATE DDL statement, and deletes and redefines the underlying VSAM datasets for stogroup-defined objects
Runstats
tablespaces, partitions, indexes, and columns.
it can place this info in the catalog tables with DB2 optimizer statistics or DBA monitoring statistics or with all statistics that have been gathered
it can be used on specific SQL queries without updating the current usable statistics
the total reorg schedule should include
a RUNSTATS job or step : to record current tablespace and index statistics to the DB2 catalog
two copy steps for each tablespace being reorganized : so that data is recoverable. The second copy job is required after the REORG if it was performed with the LOG NO option
Explain
DB2 optimizer for SQL statements
used for performance monitoring
When EXPLAIN is requested, the access paths that DB2 chooses are put in coded format into the table PLAN_TABLE, which is created in the default database
DB2 Security
LOCKING SERVICES :
These are provided by an MVS subsystem called the IMS Resource Lock Manager (IRLM). It is used to control concurrent access to DB2 data, regardless of whether IMS is present in the system or not.
the SQL statement LOCK TABLE
the ISOLATION parameter on the BIND PACKAGE command - the two possible values are RR (Repeatable Read) & CS (Cursor Stability)
Session 5
Repository for all DB2 objects - contains 43 tables
Each table maintains data about an aspect of the DB2 environment
The data refers to info about tablespaces, tables, indexes, privileges, utilities run on DB2 and so on
eg : SYSIBM.SYSTABLES, SYSINDEXES/SYSCOLUMNS ......
DB2 Optimizer
the data
The information pertaining to the state of the tables that will be accessed by the SQL statements is provided by the Catalog
Performance Tuning
Thank U