
Chapter 2: Database Objects and Programming Methods

Overview
Thirteen percent (13%) of the DB2 UDB V8.1 Family Application Development exam (Exam 703) is designed to evaluate your understanding of the different objects that are available with DB2 Universal Database, test your knowledge of the authorities and privileges that are needed to interact with those objects, and determine how much you know about the various interfaces that can be used to develop DB2 UDB applications. The questions that make up this portion of the exam are intended to evaluate the following:

- Your ability to identify the different database objects that are available with DB2 UDB.
- Your ability to identify the naming conventions used for DB2 UDB objects.
- Your knowledge of the constraints available and your ability to identify when and how NOT NULL, default, check, unique, and referential integrity constraints should be used.
- Your ability to identify how operations performed on the parent table of a referential integrity constraint are reflected in the child table of the constraint.
- Your ability to identify the more common privileges used when developing or running DB2 UDB applications.
- Your knowledge of the various special registers available and your ability to obtain the current value of any special register using SQL.
- Your ability to identify the similarities and differences between static embedded SQL and dynamic embedded SQL.
- Your ability to identify the difference between CLI/ODBC, JDBC, and SQLJ.

Database administrators do not need to know the basics of application development to do their jobs effectively, but such knowledge can be beneficial. Application developers, on the other hand, must have a basic understanding of DB2 UDB database architecture before they can develop applications that will interact with databases and database objects.
This chapter is designed to introduce you to many of the database objects that are available with DB2 UDB and to provide you with an overview of the various privileges that are required to perform specific operations against those objects. This chapter is also designed to introduce you to the programming interfaces that can be used to construct applications that interact with both DB2 UDB databases and with data stored in database objects.

Terms you will learn:

- Database objects
- Tables
- Base tables
- Result tables
- Materialized query tables
- Declared temporary tables
- Typed tables
- Indexes
- Views
- Schemas
- Qualifier
- Aliases
- Alias chain
- Trigger
- Subject table
- Trigger event
- Trigger activation time
- Set of affected rows
- Trigger granularity
- Triggered action
- User-defined data types
- User-defined functions
- Sequences
- CREATE TABLE
- Data type
- NOT NULL constraint
- Default constraint
- Check constraint
- Unique constraint
- Referential integrity constraint
- Unique key
- Primary key
- Foreign key
- Parent key
- Parent table
- Parent row
- Dependent/child table
- Dependent/child row
- Descendant table
- Referential cycle
- Self-referencing table
- Self-referencing row
- ON UPDATE NO ACTION
- ON UPDATE RESTRICT
- ON DELETE CASCADE
- ON DELETE SET NULL
- ON DELETE NO ACTION
- ON DELETE RESTRICT
- Identity column
- CREATE VIEW
- WITH CHECK OPTION
- Authentication
- Authorities
- Privileges
- Object privileges
- Special registers
- VALUES
- External Stored Procedure
- SQL Stored Procedure
- CREATE PROCEDURE
- CALL
- Embedded SQL
- Host language
- Host application
- SQL precompiler
- Static SQL
- Dynamic SQL
- Call Level Interface (CLI)
- Open Database Connectivity (ODBC)
- Java Database Connectivity (JDBC)
- SQLJ
- SQLJ translator
- DB2 SQLJ Profile Customizer
- Administrative Application Programming Interface (API) functions
- Microsoft Data Access Objects (DAO)
- RDO
- ADO
- OLE DB

Techniques you will master:

- Recognizing the types of objects that are available and understanding when each is to be used.
- Understanding how DB2 UDB objects are named.
- Understanding how NOT NULL constraints, default constraints, check constraints, unique constraints, and referential constraints are defined.
- Understanding what NOT NULL constraints, default constraints, check constraints, unique constraints, and referential constraints are used for.
- Recognizing how operations performed on the parent table of a referential integrity constraint are reflected in the child table of the constraint.
- Understanding how DB2 Universal Database controls data access through a wide variety of authorities and privileges.
- Recognizing the more common types of privileges available and knowing what each privilege allows a user to do.
- Recognizing the special registers that are available and understanding how their values can be obtained.
- Understanding the difference between static embedded SQL and dynamic embedded SQL.
- Recognizing the different interfaces that can be used to construct DB2 UDB applications.

DB2 UDB Objects


Every DB2 UDB database consists of both a logical and a physical storage model that is comprised of several different, yet related, objects. Four types of objects exist:

- System objects
- Recovery objects
- Storage objects
- Database (or data) objects

You do not have to be familiar with system objects, recovery objects, or storage objects to develop database applications. However, you must be familiar with each of the database objects available before you can begin to develop a database application. That's because most database applications work directly with one or more database objects.

Database (or Data) Objects


Database objects (otherwise known as data objects) are used to logically store and manipulate data, as well as to control how all user data (and some system data) is organized. Database objects include:

- Tables
- Indexes
- Views
- Schemas
- Aliases
- Triggers
- User-defined data types
- User-defined functions
- Sequences

Tables
A table is a logical database object that acts as the main repository in a database. Tables present data as a collection of unordered rows with a fixed number of columns. Each column contains values of the same data type or one of its subtypes and each row contains a set of values for every column available. (Columns can contain both data values and null values.) Usually, the columns in a table are logically related, and additional relationships can be defined between two or more tables. The storage representation of a row is called a record, the storage representation of a column is called a field, and each intersection of a row and column is called a value. Figure 2-1 shows the structure of a simple database table.

Figure 2-1: A simple database table.

Tables are created by executing the CREATE TABLE SQL statement (we'll take a closer look at this statement a little later), and five types of tables are available:

- Base tables. User-defined tables designed to hold persistent user data.
- Result tables. DB2 Database Manager-defined tables populated with rows retrieved from one or more base tables in response to a query.
- Materialized query tables (MQTs). User-defined tables whose column definitions are based on the results of a query and whose data is in the form of precomputed results taken from one or more tables upon which the materialized query table definition is based. (Prior to Version 8.1, DB2 UDB supported summary tables, also known as automatic summary tables (ASTs). Summary tables are now considered a special type of MQT whose fullselect contains a GROUP BY clause that summarizes data from the tables referenced in the fullselect. MQTs currently are not supported by DB2 UDB for iSeries (AS/400).)
- Declared temporary tables. User-defined tables used to hold non-persistent data temporarily on behalf of a single application. Declared temporary tables are explicitly created by an application when they are needed and implicitly destroyed when the application that created them terminates its last database connection.
- Typed tables. User-defined tables whose column definitions are based on the attributes of a user-defined structured data type. (Typed tables and structured data types are not supported by DB2 UDB for iSeries.)

Data associated with base tables, materialized query tables, and typed tables is physically stored in tablespaces; the actual tablespace used is specified during the table creation process. And because tables are the basic data objects used to store information, many are often created for a single database.
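As a sketch of how an application might use a declared temporary table (the table name and columns here are hypothetical, and a user temporary tablespace is assumed to exist in the database):

```sql
-- Declared temporary tables are always qualified with the SESSION schema;
-- the table and its data vanish when the application's last connection ends.
DECLARE GLOBAL TEMPORARY TABLE SESSION.TEMP_EMPLOYEES
    (EMPNO  INTEGER,
     LNAME  CHAR(30))
    ON COMMIT PRESERVE ROWS
    NOT LOGGED
```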

Indexes
An index is an object that contains an ordered set of pointers that refer to rows in a base table. Each index is based on one or more columns in the base table it refers to, yet indexes are stored as separate entities. Figure 2-2 shows the structure of a simple index and its relationship to a base table.

Figure 2-2: A simple index.

Indexes are used primarily to enforce record uniqueness and to help the DB2 Database Manager quickly locate records in response to a query. Indexes can also provide greater concurrency in multi-user environments; because records can be located faster, acquired locks do not have to be held as long. However, there is a price for these benefits: Additional storage space is needed whenever indexes are used, and performance can actually decrease when new data is added to a base table or when existing data is modified. In both cases, the operations performed must be applied to both the base table and any corresponding indexes. Indexes can be created by executing the CREATE INDEX SQL statement. The basic syntax for this statement is:

CREATE <UNIQUE> INDEX [IndexName]
    ON [TableName] ( [PriColumnName] <ASC | DESC> ,... )

where:

IndexName        Identifies the name to be assigned to the index to be created.
TableName        Identifies the name assigned to the base table with which the index to be created is to be associated.
PriColumnName    Identifies one or more primary columns that are to be part of the index's key. (The combined values of each primary column specified will be used to enforce data uniqueness in the associated base table.)

If the UNIQUE clause is specified when the CREATE INDEX statement is executed, rows in the table associated with the index to be created must not have two or more occurrences of the same values in the set of columns that make up the index key. If the base table that the index is to be created for contains data, this uniqueness is checked when the DB2 Database Manager attempts to create the index specified; once the index has been created, this uniqueness is enforced each time an insert or update operation is performed against the table. In both cases, if the uniqueness of the index key is compromised, the index creation, insert, or update operation will fail, and an error will be generated. Thus, if you wanted to create a unique index for a base table named EMPLOYEES that has the following characteristics:

Column Name     Data Type
EMPNO           INTEGER
FNAME           CHAR(20)
LNAME           CHAR(30)
TITLE           CHAR(10)
DEPARTMENT      CHAR(20)
SALARY          DECIMAL(6,2)

such that the index key consists of the column named EMPNO, you could do so by executing a CREATE INDEX statement that looks something like this:

CREATE UNIQUE INDEX EMPNO_INDX ON EMPLOYEES (EMPNO)

Views
Views are used to provide a different way of looking at the data stored in one or more base tables. Essentially, a view is a named specification of a result table populated whenever the view is referenced in an SQL statement. Like base tables, views can be thought of as having columns and rows. And in most cases, data can be retrieved from a view the same way it can be retrieved from a table. However, whether or not a view can be used in insert, update, and delete operations depends upon how it was defined; views can be defined as being insertable, updatable, deletable, and read-only. Views are created by executing the CREATE VIEW SQL statement (which we'll take a closer look at a little later). Although views look similar to base tables, they do not contain real data. Instead, views refer to data stored in other base tables or views. Only the view definition itself is stored in the database. (In fact, when changes are made to the data presented in a view, the changes are actually made to the data stored in the underlying base table(s) the view references.) Figure 2-3 shows the structure of a simple view and its relationship to two base tables.

Figure 2-3: A simple view that references two base tables. Because views allow different users to see different presentations of the same data, they are often used to control data access. For example, suppose you had a table that contained information about all employees who worked for a particular company. Managers could be given access to this table using a view that allows them to see only information about the employees who work in their departments. Members of the payroll department, on the other hand, could be given access to the table using a view that allows them to see only the information needed to generate employee paychecks. Both sets of users are given access to the same table; however, because each user works with a different view, it appears that they are working with their own tables.
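Although the CREATE VIEW statement is covered in detail later, a minimal sketch of the kind of view a manager might be given is shown below (the view name and the department filter value are hypothetical):

```sql
-- A view that exposes only the rows and columns a sales manager should
-- see; EMPLOYEES is the base table described earlier in this chapter.
CREATE VIEW SALES_STAFF AS
    SELECT EMPNO, FNAME, LNAME, TITLE
    FROM EMPLOYEES
    WHERE DEPARTMENT = 'SALES'
```

A user who queries SALES_STAFF works with what appears to be a table of their own, yet no employee data is duplicated; the view definition alone is stored in the database.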

Schemas
Schemas are objects that are used to logically classify and group other objects in the database. Because schemas are objects themselves, they have privileges associated with them that allow the schema owner to control which users can create, alter, and drop objects within them. Most objects in a DB2 UDB database are named using a two-part naming convention. The first (leftmost) part of the name is called the schema name or qualifier, and the second (rightmost) part is called the object name. Syntactically, these two parts are concatenated and delimited with a period (for example, HR.EMPLOYEE). Each time an object that can be qualified by a schema name is created, it is assigned to the schema that is provided with its name. (If no schema name is provided, the object is assigned to the default schema, which is usually the user ID of the individual who created the object.) Figure 2-4 illustrates how a table is explicitly assigned to a schema during the table creation process.

Figure 2-4: How a table object is assigned to a schema during the table creation process.

It is important to note that schema names beginning with the characters "SYS" (such as the schema names SYSIBM, SYSCAT, SYSSTAT, SYSPROC, and SYSFUN that are found with DB2 UDB for Linux, UNIX, and Windows) are implicitly created when a database is created, are reserved, and cannot be used. Schemas are usually created implicitly when other objects are created. However, schemas can be created explicitly, as well.

Note: Because most objects in a DB2 UDB database are named using a two-part naming convention, applications that interact with DB2 UDB databases should always reference objects by their two-part names. If no schema name is provided when an object is referenced, the current value of the USER special register (we'll look at special registers shortly) will be used as the default, and an attempt will be made to locate the object in this default schema. (If the object cannot be found, an error will be generated; if the object can be found, the operation may be performed against that object, even though the object found might not be the object desired.)

Schemas can be explicitly created by executing the CREATE SCHEMA SQL statement. The basic syntax for this statement is:

CREATE SCHEMA [SchemaName] <SQLStatement ,...>

or

CREATE SCHEMA AUTHORIZATION [AuthorizationName] <SQLStatement ,...>

or

CREATE SCHEMA [SchemaName] AUTHORIZATION [AuthorizationName] <SQLStatement ,...>

where:

SchemaName           Identifies the name to be assigned to the schema to be created.
AuthorizationName    Identifies the user to be given ownership of the schema to be created.
SQLStatement         Specifies one or more SQL statements that are to be executed together with the CREATE SCHEMA statement. (Only the following SQL statements are valid: CREATE TABLE, CREATE VIEW, CREATE INDEX, COMMENT ON, and GRANT.)

If a schema name is specified but no authorization name is provided, the authorization ID of the user who issued the CREATE SCHEMA statement is given ownership of the new schema when it is created; if an authorization name is specified but no schema name is provided, the new schema is assigned the same name as the authorization name used. So if you wanted to explicitly create a schema named INVENTORY and a table named PARTS, which is associated with the schema named INVENTORY, you could do so by executing a CREATE SCHEMA SQL statement that looks something like this:

CREATE SCHEMA INVENTORY
    CREATE TABLE PARTS
        (PARTNO      INTEGER NOT NULL,
         DESCRIPTION VARCHAR(50),
         QUANTITY    SMALLINT)

Aliases
An alias is an alternate name for a table or view. (Aliases can also be created for nicknames that refer to data tables or views located on federated systems, as well as for other aliases.) Aliases can be used to reference any table or view that can be referenced by its primary name. However, an alias cannot be used in every context that a primary table or view name can. For example, an alias cannot be used in the check condition of a check constraint, nor can it be used to reference a user-defined temporary table. Like tables and views, an alias can be created, dropped, and have comments associated with it. However, unlike tables (but similar to views), aliases can refer to other aliases, creating a process known as chaining. Figure 2-5 illustrates a simple alias chain.

Figure 2-5: A simple alias chain. Aliases are publicly-referenced names, so no special authority or privilege is required to use them. However, access to the table or view that is referred to by an alias still has the authorization requirements associated with these types of objects.

So why would you want to use an alias instead of the actual table/view name? Suppose you needed to develop an application that interacts with a table named EMPLOYEES, which resides in your company's payroll database. During the development process, you would like to run the application against a test EMPLOYEES table; then when development is complete, the application will need to run against the production EMPLOYEES table. By using an alias instead of a base table name in all table references made by the application, you can quickly change the application so that it works with the production EMPLOYEES table instead of the test EMPLOYEES table by changing the table name the alias refers to. Aliases can be created by executing the CREATE ALIAS SQL statement. The basic syntax for this statement is:

CREATE ALIAS [AliasName]
    FOR [TableName | ViewName | ExistingAlias]

where:

AliasName        Identifies the name to be assigned to the alias to be created.
TableName        Identifies the name assigned to the table the alias to be created is to reference.
ViewName         Identifies the name assigned to the view the alias to be created is to reference.
ExistingAlias    Identifies the name assigned to the alias the alias to be created is to reference.

Thus, if you wanted to create an alias that references a table named EMPLOYEES and you wanted to assign it the name EMPINFO, you could do so by executing a CREATE ALIAS SQL statement that looks something like this: CREATE ALIAS EMPINFO FOR EMPLOYEES

Triggers
A trigger is used to define a set of actions that are to be executed whenever an insert, update, or delete operation is performed on a specified table. Triggers can be used, along with referential constraints and check constraints, to enforce data integrity and business rules. (A data integrity rule might be that whenever the record for an employee is deleted from the table that holds employee information, the corresponding record will be deleted from the table that holds payroll information. A business rule might be that an employee's salary cannot be increased by more than 10 percent.) Triggers can also be used to update other tables, automatically generate or transform values for inserted and/or updated rows, or invoke functions to perform special tasks. By using triggers, the logic needed to enforce such business rules can be placed directly in the database, and applications that work with the database can concentrate solely on data storage, data management, and data retrieval. And by storing the logic needed to enforce data integrity rules and business rules directly in the database, users can modify the logic as data integrity rules and business rules change without requiring applications to be recoded and recompiled.

Before a trigger can be created, several criteria must be identified:

- Subject table. The table that the trigger is to interact with.
- Trigger event. An SQL operation that causes the trigger to be activated whenever it is performed against the subject table. This operation can be an insert operation, an update operation, or a delete operation.
- Trigger activation time. Indicates whether the trigger should be activated before or after the trigger event occurs. A before trigger will be activated before the trigger event occurs; therefore, it will be able to see new data values before they are inserted into the subject table. An after trigger will be activated after the trigger event occurs; therefore, it can only see data values that have already been inserted into the subject table.
- Set of affected rows. The rows of the subject table that are being inserted, updated, or deleted.
- Trigger granularity. Specifies whether the actions the trigger will perform are to be performed once for the entire insert, update, or delete operation or once for every row affected by the insert, update, or delete operation.
- Triggered action. An optional search condition and a set of SQL statements that are to be executed whenever the trigger is activated. (If a search condition is specified, the SQL statements will only be executed if the search condition evaluates to true.) If the trigger is a before trigger, the triggered action can include statements that retrieve data, set transition variables, or signal SQL states. If the trigger is an after trigger, the triggered action can include statements that retrieve data, insert records, update records, delete records, or signal SQL states.

Triggered actions can refer to the values in the set of affected rows using what are known as transition variables. Transition variables use the names of the columns in the subject table, qualified by a specified name that indicates whether the reference is to the original value (before the insert, update, or delete operation is performed) or the new value (after the insert, update, or delete operation is performed). Another means of referring to values in the set of affected rows is through the use of transition tables. Transition tables also use the names of the columns in the subject table, but they allow the complete set of affected rows to be treated as a table. Unfortunately, transition tables can only be used in after triggers. Once the appropriate trigger components have been identified, a trigger can be created by executing the CREATE TRIGGER SQL statement.
The basic syntax for this statement is:

CREATE TRIGGER [TriggerName]
    [NO CASCADE BEFORE | AFTER]
    [INSERT | UPDATE | DELETE <OF [ColumnName] ,...>]
    ON [TableName]
    <REFERENCING [Reference]>
    [FOR EACH ROW | FOR EACH STATEMENT] MODE DB2SQL
    <WHEN ( [SearchCondition] )>
    [TriggeredAction]

where:

TriggerName        Identifies the name to be assigned to the trigger to be created.

ColumnName         Identifies one or more columns in the subject table of the trigger whose values must be updated before the trigger's triggered action (TriggeredAction) will be executed.

Reference          Identifies one or more transition variables and/or transition tables that are to be used by the trigger's triggered action (TriggeredAction). The syntax used to create transition variables and/or transition tables that are to be used by the trigger's triggered action is:

                   <OLD <AS> [CorrelationName]>
                   <NEW <AS> [CorrelationName]>
                   <OLD TABLE <AS> [Identifier]>
                   <NEW TABLE <AS> [Identifier]>

                   where:

                   CorrelationName    Identifies a name to be used to identify a specific row in the subject table of the trigger, either before it was modified by the trigger's triggered action (OLD <AS>) or after it has been modified by the trigger's triggered action (NEW <AS>).

                   Identifier         Identifies a name that is to be used to identify a temporary table that contains a set of rows found in the subject table of the trigger, either before they were modified by the trigger's triggered action (OLD TABLE <AS>) or after they have been modified by the trigger's triggered action (NEW TABLE <AS>).

                   Each column affected by an activation event (insert, update, or delete operation) can be made available to the trigger's triggered action by qualifying the column's name with the appropriate correlation name or table identifier.

SearchCondition    Specifies a search condition that, when evaluated, will return either "TRUE", "FALSE", or "Unknown". This condition is used to determine whether or not the trigger's triggered action (TriggeredAction) is to be performed.

TriggeredAction    Identifies the action to be performed when the trigger is activated. The triggered action must consist of a single SQL statement or a compound SQL statement (i.e., two or more SQL statements enclosed with the keywords BEGIN ATOMIC and END). Every statement used in a compound SQL statement must be terminated with a semicolon (;).

Thus, if you wanted to create a trigger for a base table named EMPLOYEES that has the following characteristics:

Column Name     Data Type
EMPNO           INTEGER
FNAME           CHAR(20)
LNAME           CHAR(30)
TITLE           CHAR(10)
DEPARTMENT      CHAR(20)
SALARY          DECIMAL(6,2)

that will cause the value for the column named EMPNO to be incremented each time a row is added to the table, you could do so by executing a CREATE TRIGGER statement that looks something like this:

CREATE TRIGGER EMPNO_INC
    AFTER INSERT ON EMPLOYEES
    FOR EACH ROW MODE DB2SQL
    UPDATE EMPLOYEES SET EMPNO = EMPNO + 1

Note: The activation of one trigger may cause the activation of other triggers, or even the reactivation of the same trigger. This event is known as trigger cascading, and because trigger cascading can occur, a significant change can be made to a database as the result of a single INSERT, UPDATE, or DELETE statement.
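The transition variables described earlier let a trigger work with the individual row being modified. As a sketch (the trigger name and correlation name here are hypothetical), a before trigger could use a NEW correlation name to transform a value in each incoming row before it reaches the EMPLOYEES table:

```sql
-- A before trigger that upper-cases the TITLE value of each row as it is
-- inserted; N is a transition variable referring to the new (incoming) row.
CREATE TRIGGER TITLE_UPPER
    NO CASCADE BEFORE INSERT ON EMPLOYEES
    REFERENCING NEW AS N
    FOR EACH ROW MODE DB2SQL
    SET N.TITLE = UPPER(N.TITLE)
```

Because this is a before trigger, it can set transition variables but cannot insert, update, or delete records, in keeping with the triggered-action rules listed above.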

User-Defined Data Types


As the name implies, user-defined data types (UDTs) are data types that are created (and named) by a database user. A user-defined data type can be a distinct data type that shares a common representation with one of the built-in data types provided with DB2 UDB, or it can be a structured type that consists of a sequence of named attributes, each of which has its own data type. Structured data types can also be created as subtypes of other structured types, thereby defining a type hierarchy. (Structured data types are not supported by DB2 UDB for iSeries.) User-defined data types support strong data typing, which means that even though they may share the same representation as other built-in or user-defined data types, the value of one user-defined data type is only compatible with values of that same type (or of other user-defined data types within the same data type hierarchy). As a result, user-defined data types cannot be used as arguments for most of the built-in functions available. Instead, user-defined functions (or methods) that provide similar functionality must be developed whenever that kind of capability is needed.
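As a sketch of how a distinct type might be created (the type name and the PAY column mentioned in the comment are hypothetical):

```sql
-- Create a distinct type that shares DECIMAL's representation; the
-- WITH COMPARISONS clause generates comparison operators for the type.
CREATE DISTINCT TYPE SALARY_T AS DECIMAL(9,2) WITH COMPARISONS

-- Strong typing means a column of type SALARY_T cannot be compared
-- directly with a plain numeric literal; the literal must be cast first:
--   ... WHERE PAY > SALARY_T(50000.00)
```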

User-Defined Functions (or Methods)


User-defined functions (UDFs) are special objects used to extend and enhance the support provided by the built-in functions available with DB2 UDB. Like user-defined data types, user-defined functions (or methods) are created and named by a database user. A user-defined function can be an external function written in a high-level programming language or a sourced function whose implementation is inherited from some other function that already exists. Like built-in functions, user-defined functions are classified as being scalar, column (or aggregate), table, or row in nature. Scalar functions return a single value and can be specified in an SQL statement wherever a regular expression can be used. (The built-in function SUBSTR() is an example of a scalar function.) Column functions return a single-valued answer from a set of like values (a column) and can also be specified in an SQL statement wherever a regular expression can be used. (The built-in function AVG() is an example of a column function.) Table functions return a table, and row functions return a row, to the SQL statement that references them; table functions can only be specified in the FROM clause of a SELECT statement. (We will look at the SELECT statement and its clauses shortly.) Table functions are used to work with data that does not reside in a DB2 UDB database and/or to convert such data into a format that resembles that of a DB2 table. (The built-in function SNAPSHOT_TABLE() is an example of a table function.)
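As a sketch of how a simple scalar UDF might be defined with an SQL body (the function name and logic here are hypothetical), using the FNAME and LNAME columns of the EMPLOYEES table shown earlier:

```sql
-- A scalar UDF that builds an employee's full name; once created it can
-- be referenced wherever a scalar expression is allowed, for example:
--   SELECT FULLNAME(FNAME, LNAME) FROM EMPLOYEES
CREATE FUNCTION FULLNAME (FIRST CHAR(20), LAST CHAR(30))
    RETURNS VARCHAR(51)
    LANGUAGE SQL
    RETURN RTRIM(FIRST) || ' ' || RTRIM(LAST)
```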

Sequences
A sequence is an object that is used to automatically generate data values. Unlike an identity column, which is used to generate data values for a specific column in a table, a sequence is not tied to any specific column or any specific table. Instead, a sequence behaves like a unique counter that resides outside the database, with the exception that it does not present the same concurrency and performance problems that can occur when external counters are used. (Sequences are not supported by DB2 UDB for iSeries.) All sequences have the following characteristics:

- Values generated can be any exact numeric data type that has a scale of zero (SMALLINT, BIGINT, INTEGER, and DECIMAL).
- Consecutive values can differ by any specified increment value. The default increment value is 1.
- Counter values are recoverable. Counter values are reconstructed from logs when recovery is required.
- Values generated can be cached to improve performance.

In addition, sequences generate values in one of three ways:

- Increment or decrement by a specified amount, without bounds.
- Increment or decrement by a specified amount to a user-defined limit and stop.
- Increment or decrement by a specified amount to a user-defined limit, then cycle back to the beginning and start again.

To facilitate the use of sequences in SQL operations, two expressions are available: PREVVAL and NEXTVAL. The PREVVAL expression returns the most recently generated value for the specified sequence, and the NEXTVAL expression returns the next value for the specified sequence.
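As a sketch of how a sequence might be created and referenced (the sequence name and numeric choices here are hypothetical):

```sql
-- A sequence that starts at 1000 and counts up by 10, caching a block
-- of preallocated values in memory to improve performance.
CREATE SEQUENCE EMPNO_SEQ
    START WITH 1000
    INCREMENT BY 10
    NO MAXVALUE
    NO CYCLE
    CACHE 24

-- NEXTVAL generates the next value; PREVVAL re-reads the value most
-- recently generated for the sequence within the session, for example:
--   INSERT INTO EMPLOYEES (EMPNO) VALUES (NEXTVAL FOR EMPNO_SEQ)
```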

A Closer Look at Tables


Earlier, we saw that a table is a logical structure used to present data as a collection of unordered rows with a fixed number of columns. Each column contains a set of values of the same data type, and each row contains the actual table data. Because tables are the data objects used to store information, many are often created within a single database. More importantly, of all the database objects available, applications typically interact with tables (and views, which you may recall provide another way to look at one or more tables) the most. Therefore, in order to develop sound database applications, you must have a basic understanding of the various ways in which a table can be constructed.

Creating Tables
Like many of the other database objects available, tables can be created by using a GUI tool that is accessible from the Control Center. Tables can also be created by executing the CREATE TABLE SQL statement. In its simplest form, the syntax for this statement is:

CREATE TABLE [TableName] ([ColumnName] [DataType], ...)

where:

TableName    Identifies the name to be assigned to the table to be created. (A table name must be unique within the schema the table is to be defined in.)

ColumnName   Identifies the unique name (within the table definition) to be assigned to the column that is to be created.

DataType     Identifies the data type (built-in or user-defined) to be assigned to the column to be created; the data type specified determines the kind of data values that can be stored in the column. (Table 2-1 contains a list of data type definitions that are valid.)

Table 2-1: Data Type Definitions That Can Be Used with the CREATE TABLE Statement

Small integer:
    SMALLINT

Integer:
    INTEGER
    INT

Big integer:
    BIGINT

Decimal:
    DECIMAL(Precision, Scale)
    DEC(Precision, Scale)
    NUMERIC(Precision, Scale)
    NUM(Precision, Scale)
    where Precision is any number between 1 and 31; Scale is any number between 0 and Precision

Single-precision floating point:
    REAL
    FLOAT(Precision)
    where Precision is any number between 1 and 24

Double-precision floating point:
    DOUBLE
    FLOAT(Precision)
    where Precision is any number between 25 and 53

Fixed-length character string:
    CHARACTER(Length) <FOR BIT DATA>[*]
    CHAR(Length) <FOR BIT DATA>[*]
    where Length is any number between 1 and 254

Varying-length character string:
    CHARACTER VARYING(MaxLength) <FOR BIT DATA>[*]
    CHAR VARYING(MaxLength) <FOR BIT DATA>[*]
    VARCHAR(MaxLength) <FOR BIT DATA>[*]
    where MaxLength is any number between 1 and 32,672

Long varying-length character string:
    LONG VARCHAR

Fixed-length double-byte character string:
    GRAPHIC(Length)
    where Length is any number between 1 and 127

Varying-length double-byte character string:
    VARGRAPHIC(MaxLength)
    where MaxLength is any number between 1 and 16,336

Long varying-length double-byte character string:
    LONG VARGRAPHIC

Date:
    DATE

Time:
    TIME

Timestamp:
    TIMESTAMP

Binary large object:
    BINARY LARGE OBJECT(Size <K | M | G>)
    BLOB(Size <K | M | G>)
    where Size is any number between 1 and 2,147,483,647; if K (for kilobyte) is specified, Size is any number between 1 and 2,097,152; if M (for megabyte) is specified, Size is any number between 1 and 2,048; if G (for gigabyte) is specified, Size is any number between 1 and 2

Character large object:
    CHARACTER LARGE OBJECT(Size <K | M | G>)
    CHAR LARGE OBJECT(Size <K | M | G>)
    CLOB(Size <K | M | G>)
    where Size is any number between 1 and 2,147,483,647; if K (for kilobyte) is specified, Size is any number between 1 and 2,097,152; if M (for megabyte) is specified, Size is any number between 1 and 2,048; if G (for gigabyte) is specified, Size is any number between 1 and 2

Double-byte character large object:
    DBCLOB(Size <K | M | G>)
    where Size is any number between 1 and 1,073,741,823; if K (for kilobyte) is specified, Size is any number between 1 and 1,048,576; if M (for megabyte) is specified, Size is any number between 1 and 1,024; if G (for gigabyte) is specified, Size must be 1
[*] If the FOR BIT DATA option is used with any character string data type definition, the contents of the column the data type is assigned to are treated as binary data.

Thus, if you wanted to create a table that had three columns in it, two of which are used to store numeric values and one of which is used to store character string values, you could do so by executing a CREATE TABLE SQL statement that looks something like this:

CREATE TABLE EMPLOYEES
    (EMPID INTEGER,
     NAME CHAR(50),
     DEPT INTEGER)

It is important to note that this is an example of a relatively simple table. Table definitions can be quite complex, and as a result, the CREATE TABLE statement has several different permutations. (In fact, the CREATE TABLE statement is probably the most complex SQL statement available; more than 60 pages of the DB2 UDB SQL reference manual are devoted to this statement alone.) Fortunately, you do not have to know all the nuances of the CREATE TABLE statement to pass the DB2 UDB V8.1 Family Application Development exam (Exam 703). However, you do need to be aware of the various constraints and rules that can be incorporated into a table's definition, because they can have a significant impact on how data can be manipulated. Note: To view the complete syntax for the CREATE TABLE SQL statement, refer to the IBM DB2 Universal Database, Version 8 SQL Reference Volume 2 product documentation.

Table Constraints
Within most businesses, data often must adhere to a certain set of rules and restrictions. For example, companies typically have a specific format and numbering sequence they use when generating purchase orders. With DB2 UDB, table constraints (or simply, constraints) can be used to enforce data integrity and to ensure that data that is added to a database table adheres to one or more business rules. Essentially, constraints are rules that govern how data values can be added to a table, as well as how those values can be modified once they have been added. The following types of constraints are available:

NOT NULL constraints
Default constraints
Check constraints
Unique constraints
Referential integrity constraints

Constraints are usually defined during table creation; however, constraints can also be added to existing tables using the ALTER TABLE SQL statement.

NOT NULL Constraints


With DB2 UDB, null values (not to be confused with empty strings) are used to represent missing or unknown data and/or states. And by default, every column in a table will accept a null value. This allows you to add records to a table when not all of the values that pertain to the record are known. However, there may be times when this behavior is unacceptable (for example, a tax identification number might be required for every employee who works for a company). In these cases, the NOT NULL constraint can be used to ensure that a particular column in a base table is never assigned a null value; once the NOT NULL constraint has been defined for a column, any operation that attempts to place a null value in that column will fail. Figure 2-6 illustrates how the NOT NULL constraint is used.

Figure 2-6: How the NOT NULL constraint prevents null values. Because NOT NULL constraints are associated with a specific column in a base table, they are usually defined during the table creation process.
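The behavior just described can be sketched with a few statements like the following (the table and column names here are hypothetical, chosen only for illustration):

```sql
-- Hypothetical table in which a tax ID is required for every employee
CREATE TABLE EMPLOYEES
    (EMPID INTEGER NOT NULL,
     TAXID CHAR(11) NOT NULL,
     HOBBY CHAR(30))

-- Succeeds: both NOT NULL columns receive values (HOBBY may remain null)
INSERT INTO EMPLOYEES (EMPID, TAXID) VALUES (1, '555-12-3456')

-- Fails: the NOT NULL constraint on TAXID rejects the missing value
INSERT INTO EMPLOYEES (EMPID, HOBBY) VALUES (2, 'Golf')
```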

Default Constraints
Just as there are times when it is inappropriate for the system to accept null values, there may be times when it is desirable to have the system provide a specific value for you (for example, you might want to automatically assign the current date to a particular column whenever a new record is added to a table). In these situations, the default constraint can be used to ensure that a particular column in a base table is assigned a predefined value (unless that value is overridden) each time a record is added to the table. The predefined value provided could be null (unless the NOT NULL constraint has been defined for the column), a user-supplied value compatible with the column's data type, or a value furnished by the DB2 Database Manager. Table 2-2 shows the default values that can be provided by the DB2 Database Manager for the various DB2 UDB data types available.

Table 2-2: DB2 Database Manager-Supplied Default Values

Column Data Type (Definition): Default Value Provided

Small integer (SMALLINT): 0
Integer (INTEGER or INT): 0
Decimal (DECIMAL, DEC, NUMERIC, or NUM): 0
Single-precision floating point (REAL or FLOAT): 0
Double-precision floating point (DOUBLE, DOUBLE PRECISION, or FLOAT): 0
Fixed-length character string (CHARACTER or CHAR): A string of blank characters
Varying-length character string (CHARACTER VARYING, CHAR VARYING, or VARCHAR): A zero-length string
Long varying-length character string (LONG VARCHAR): A zero-length string
Fixed-length double-byte character string (GRAPHIC): A string of blank characters
Varying-length double-byte character string (VARGRAPHIC): A zero-length string
Long varying-length double-byte character string (LONG VARGRAPHIC): A zero-length string
Date (DATE): The system date at the time the record is added to the table. (When a date column is added to an existing table, existing rows are assigned the date January 01, 0001.)
Time (TIME): The system time at the time the record is added to the table. (When a time column is added to an existing table, existing rows are assigned the time 00:00:00.)
Timestamp (TIMESTAMP): The system date and time, including microseconds, at the time the record is added to the table. (When a timestamp column is added to an existing table, existing rows are assigned a timestamp that corresponds to January 01, 0001-00:00:00.000000.)
Binary large object (BLOB): A zero-length string
Character large object (CLOB): A zero-length string
Double-byte character large object (DBCLOB): A zero-length string
Any distinct user-defined data type: The default value provided for the built-in data type the distinct user-defined data type is based on (typecast to the distinct user-defined data type).

Adapted from Table 2 on Page 51 of the DB2 SQL Reference, Volume 2 manual.

Figure 2-7 illustrates how the default constraint is used.

Figure 2-7: How the default constraint is used to provide data values. Like NOT NULL constraints, default constraints are associated with a specific column in a base table and are usually defined during the table creation process.
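A sketch of both flavors of the default constraint, using a hypothetical ORDERS table (the names and the default value 'NEW' are illustrative only):

```sql
-- Hypothetical table illustrating both forms of the default constraint
CREATE TABLE ORDERS
    (ORDERID   INTEGER NOT NULL,
     STATUS    CHAR(10) WITH DEFAULT 'NEW',    -- user-supplied default value
     ORDERDATE DATE WITH DEFAULT CURRENT DATE) -- value furnished by the DB2 Database Manager

-- Because no STATUS or ORDERDATE value is supplied, the defaults are used
INSERT INTO ORDERS (ORDERID) VALUES (1000)
```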

Check Constraints
Sometimes it is desirable to control what values will be accepted for a particular item and what values will not (for example, a company might decide that all nonexempt employees must be paid, at a minimum, the federal minimum wage). When this is the case, the logic needed to determine whether a value is acceptable can be incorporated directly into the data entry program being used to collect the data. A better way to achieve the same objective is by defining a check constraint for the column in the base table that is to receive the data value. A check constraint (also known as a table check constraint) can be used to ensure that a particular column in a base table is never assigned an unacceptable value; once a check constraint has been defined for a column, any operation that attempts to place a value in that column that does not meet specific criteria will fail. Check constraints are composed of one or more predicates (which are connected by the keywords AND or OR) that collectively are known as the check condition. This check condition is then compared with the data value provided, and the result of this comparison is returned as the value "TRUE", "FALSE", or "Unknown". If the check constraint returns the value "TRUE", the value is acceptable and can be added to the database. If, on the other hand, the check constraint returns the value "FALSE" or "Unknown", the operation attempting to place the value in the database fails, and all changes made by that operation are backed out of the database. However, it is important to note that when the results of a particular operation are rolled back because of a check constraint violation, the transaction that invoked that operation is not terminated, and other operations within that transaction are unaffected. Figure 2-8 illustrates how a simple check constraint is used.

Figure 2-8: How a check constraint is used to control what values are accepted by a column. Like NOT NULL constraints and default constraints, check constraints are associated with a specific column in a base table and are usually defined during the table creation process. And like NOT NULL constraints and default constraints, check constraints are used to enforce business rules.
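A check constraint along the lines of the minimum-wage example might be defined as follows (a sketch; the table, columns, and wage threshold are illustrative):

```sql
-- Hypothetical table with a check constraint enforcing a minimum wage
CREATE TABLE EMPLOYEES
    (EMPID INTEGER,
     NAME  CHAR(50),
     WAGE  DECIMAL(7,2) CHECK (WAGE >= 5.15))

-- Succeeds: the check condition evaluates to TRUE
INSERT INTO EMPLOYEES VALUES (1, 'JAGGER, BILL', 12.50)

-- Fails: the check condition evaluates to FALSE, and the change is backed out
INSERT INTO EMPLOYEES VALUES (2, 'SMITH, JOHN', 4.00)
```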

Unique Constraints
By default, records that are added to a base table can have the same values assigned to any of the columns available any number of times. As long as the records stored in the table do not contain information that should not be duplicated, this kind of behavior is acceptable. However, there are times when certain pieces of information that make up a record should be unique (for example, if an employee identification number is assigned to each individual who works for a particular company, each number used should be unique; two employees should never be assigned the same employee identification number). In these situations, the unique constraint can be used to ensure that the value(s) assigned to one or more columns when a record is added to a base table are always unique; once a unique constraint has been defined for one or more columns, any operation that attempts to place duplicate values in those columns will fail. Figure 2-9 illustrates how the unique constraint is used.

Figure 2-9: How the unique constraint prevents the duplication of data values. Unlike NOT NULL constraints, default constraints, and check constraints, which are only associated with single columns in a base table, unique constraints can be associated with an individual column or with a group of columns. Like the other constraints, however, unique constraints are usually defined during the table creation process.
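A single-column unique constraint could be sketched like this (hypothetical names; note that the column must also be defined NOT NULL):

```sql
-- Hypothetical table in which employee identification numbers must be unique
CREATE TABLE EMPLOYEES
    (EMPID INTEGER NOT NULL UNIQUE,
     NAME  CHAR(50))

-- Succeeds: no EMPID value of 1 exists yet
INSERT INTO EMPLOYEES VALUES (1, 'JAGGER, BILL')

-- Fails: an EMPID value of 1 already exists in the table
INSERT INTO EMPLOYEES VALUES (1, 'SMITH, JOHN')
```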

Regardless of when a unique constraint is defined, the DB2 Database Manager checks to see whether an index already exists for the columns the unique constraint refers to. If so, that index is marked as being unique and system-required. If not, an appropriate index is created and marked as being unique and system-required. This index is then used to enforce uniqueness whenever new records are added to the column(s) the unique constraint was defined for. Note: Although a unique, system-required index is used to enforce a unique constraint, there is a distinction between defining a unique constraint and creating a unique index. Even though both enforce uniqueness, a unique index can be created on nullable columns (the value "NULL" means a field's value is undefined and distinct from any other value, including other NULL values), whereas the columns of a unique constraint cannot accept nulls; in addition, a unique constraint can serve as the parent key of a referential constraint, while a unique index alone generally cannot. A primary key is a special form of unique constraint. Only one primary key is allowed per table, and every column that is used to define a primary key must be assigned the NOT NULL constraint. In addition to ensuring that every record added to a table has some unique characteristic, primary keys allow tables to participate in referential constraints. A table can have any number of unique constraints; however, a table cannot have more than one unique constraint defined on the same set of columns. And because unique constraints are enforced by indexes, all the limitations that apply to indexes (for example, a maximum number of columns with a combined length of a specific number of bytes is allowed; none of the columns used can have a large object or long character string data type; etc.) also apply to unique constraints.
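A primary key and an additional, named unique constraint might be declared together as follows (a sketch; the constraint name UNIQUE_TAXID and the column names are hypothetical):

```sql
-- Sketch: a primary key plus a separate, named unique constraint
CREATE TABLE EMPLOYEES
    (EMPID INTEGER  NOT NULL PRIMARY KEY,
     TAXID CHAR(11) NOT NULL,
     NAME  CHAR(50),
     CONSTRAINT UNIQUE_TAXID UNIQUE (TAXID))
```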

Referential Integrity Constraints


If you've had the opportunity to design a database in the past, you are probably aware that data normalization is a technique used to ensure that there is only one way to get to a fact stored in a database. Data normalization is possible because two or more individual base tables can have some type of relationship with one another, and information stored in related base tables can be combined, if necessary, using a join operation. Data normalization is also where referential integrity constraints come into play; referential integrity constraints (also known as referential constraints and foreign key constraints) are used to define required relationships between two base tables.

To understand how referential constraints work, it helps to look at an example. Suppose you own a small auto parts store and use a database to keep track of the inventory you have on hand. Many of the parts you stock will only work with a particular "make" and "model" of an automobile; therefore, your database has one table named MAKE to hold make information and another table named MODEL to hold model information. Because these two tables are related (every model must belong to a make), a referential constraint can be used to ensure that every record that is stored in the MODEL table has a corresponding record in the MAKE table; the relationship between these two tables is established by comparing values that are to be added to the MAKE column of the MODEL table (known as the foreign key of the child table) with the values that currently exist for the set of columns that make up the primary key of the MAKE table (known as the parent key of the parent table).

To create the referential constraint just described, you would define a primary key, using a column in the MAKE table, and you would define a foreign key for a corresponding column in the MODEL table that references the MAKE table's primary key.
Assuming a column named MAKEID is used to create the primary key for the MAKE table and a column also named MAKEID is used to create the foreign key for the MODEL table, the referential constraint just described would look something like the one shown in Figure 2-10.

Figure 2-10: How a referential constraint is used to define a relationship between two tables.

In this example, a single column is used to define the parent key and the foreign key of the referential constraint. However, as with unique constraints, multiple columns can be used to define the parent key and the foreign key of a referential constraint. Note: The name of the column(s) used to create the foreign key of a referential constraint does not have to be the same as the name of the column(s) used to create the primary key of the constraint (as was the case in the previous example). However, the data types used for the column(s) that make up the primary key and the foreign key of a referential constraint must be identical.

As you can see, referential constraints are more complex than NOT NULL constraints, default constraints, check constraints, and unique constraints. In fact, they are so complex that a set of special terms is used to identify the individual components that can make up a single referential constraint. You may already be familiar with some of them; the complete list of terms used can be seen in Table 2-3.

Table 2-3: DB2 UDB Referential Integrity Constraint Terminology

Unique key: A column or set of columns in which every row of values is different from the values of all other rows.
Primary key: A special unique key that does not accept null values.
Foreign key: A column or set of columns in a child table whose values must match those of a parent key in a parent table.
Parent key: A primary key or unique key in a parent table that is referenced by a foreign key in a child table.
Parent table: A table that contains a parent key of a referential constraint. (A table can be both a parent table and a dependent table of any number of referential constraints.)
Parent row: A row in a parent table that has at least one matching row in a dependent table.
Dependent or child table: A table that contains at least one foreign key that references a parent key in a referential constraint. (A table can be both a dependent table and a parent table of any number of referential constraints.)
Dependent or child row: A row in a dependent table that has at least one matching row in a parent table.
Descendent table: A dependent table or a descendent of a dependent table.
Descendent row: A dependent row or a descendent of a dependent row.
Referential cycle: A set of referential constraints defined in such a way that each table in the set is a descendent of itself.
Self-referencing table: A table that is both a parent table and a dependent table in the same referential constraint. (The constraint is known as a self-referencing constraint.)
Self-referencing row: A row that is a parent of itself.
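The MAKE/MODEL relationship described earlier might be created with statements along these lines (a sketch; the column names and lengths are illustrative):

```sql
-- Parent table: MAKEID serves as the primary key (the parent key)
CREATE TABLE MAKE
    (MAKEID INTEGER NOT NULL PRIMARY KEY,
     MAKE   CHAR(25))

-- Child table: the MAKEID column is a foreign key referencing MAKE
CREATE TABLE MODEL
    (MODELID INTEGER,
     MODEL   CHAR(25),
     MAKEID  INTEGER REFERENCES MAKE (MAKEID))
```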

Controlling Data Manipulation with Referential Constraint Rules


The primary reason referential constraints are defined is to guarantee that data integrity is maintained whenever one table object references another. As long as a referential constraint is in effect, the DB2 Database Manager guarantees that, for every row in a child table that has a value in any column that is part of a foreign key, there is a corresponding row in the parent table. So what happens when an SQL operation attempts to manipulate data in a way that would violate a referential constraint? To answer this question, let's look at what could compromise data integrity if the checks and balances provided by a referential constraint were not in place:

An insert operation could add a row of data to a child table that does not have a matching value in the corresponding parent table. (For example, using our MAKE/MODEL scenario, a record could be added to the MODEL table without having a corresponding value in the MAKE table.)

An update operation could change an existing value in a child table such that it no longer has a matching value in the corresponding parent table. (For example, a record could be changed in the MODEL table so that it no longer has a corresponding value in the MAKE table.)

An update operation could change an existing value in a parent table, leaving rows in a child table with values that no longer match those in the parent table. (For example, a record could be changed in the MAKE table, leaving records in the MODEL table without a corresponding MAKE value.)

A delete operation could remove a value from a parent table, leaving rows in a child table with values that no longer match those in the parent table. (For example, a record could be removed from the MAKE table, leaving records in the MODEL table with no corresponding MAKE value.)
The DB2 Database Manager can either prohibit (restrict) these types of operations from being performed on tables that are part of a referential constraint, or it can attempt to carry out these actions in a way that will safeguard data integrity. In either case, DB2 UDB uses a set of rules to control the operation's behavior. Each referential constraint has its own set of rules (which consist of an Insert Rule, an Update Rule, and a Delete Rule), and the way a particular rule will function can be specified as part of the referential constraint creation process.

The Insert Rule for Referential Constraints


The Insert Rule guarantees that a value can never be inserted into the foreign key of a child table unless a matching value can be found in the corresponding parent key of the associated parent table. Any attempt to insert records into a child table that violates this rule will result in an error, and the insert operation will fail. In contrast, no checking is performed when records are added to the parent key of the parent table. The Insert Rule for a referential constraint is implicitly created when the referential constraint itself is created. Figure 2-11 illustrates how a row that conforms to the Insert Rule for a referential constraint is successfully added to a child table; Figure 2-12 illustrates how a row that violates the Insert Rule causes an insert operation to fail.

Figure 2-11: An insert operation that conforms to the Insert Rule of a referential constraint.

Figure 2-12: An insert operation that violates the Insert Rule of a referential constraint. It is important to note that because the Insert Rule exists, records must be inserted in the parent key of the parent table before corresponding records can be inserted into the child table. (Going back to our MAKE/MODEL example, this means that a record for a new MAKE must be added to the MAKE table before a record that references the new MAKE can be added to the MODEL table.)
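Using the hypothetical MAKE/MODEL tables from the earlier scenario, the Insert Rule plays out roughly like this:

```sql
-- Parent rows must exist before matching child rows can be inserted
INSERT INTO MAKE VALUES (1, 'FORD')           -- parent row; succeeds

INSERT INTO MODEL VALUES (100, 'MUSTANG', 1)  -- MAKEID 1 exists; succeeds

-- Fails: no row with MAKEID 2 exists in the MAKE (parent) table
INSERT INTO MODEL VALUES (101, 'CAMRY', 2)
```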

The Update Rule for Referential Constraints


The Update Rule controls how update operations performed against either table (child or parent) participating in a referential constraint are to be processed. The following two types of behaviors are possible, depending on how the Update Rule is defined:

ON UPDATE NO ACTION. This definition ensures that whenever an update operation is performed on either table in a referential constraint, the value for the foreign key of each row in the child table will have a matching value in the parent key of the corresponding parent table; however, the value may not be the same as it was before the update operation occurred.

ON UPDATE RESTRICT. This definition ensures that whenever an update operation is performed on the parent table of a referential constraint, the value for the foreign key of each row in the child table will have the same matching value in the parent key of the parent table it had before the update operation was performed.

Figure 2-13 illustrates how the Update Rule is enforced when the ON UPDATE NO ACTION definition is used; Figure 2-14 illustrates how the Update Rule is enforced when the ON UPDATE RESTRICT definition is used.

Figure 2-13: How the ON UPDATE NO ACTION Update Rule of a referential constraint is enforced.

Figure 2-14: How the ON UPDATE RESTRICT Update Rule of a referential constraint is enforced.

Like the Insert Rule, the Update Rule for a referential constraint is implicitly created when the referential constraint itself is created. If no Update Rule definition is provided when the referential constraint is defined, the ON UPDATE NO ACTION definition is used as the default. Regardless of which form of the Update Rule is used, if the condition of the rule is not met, the update operation will fail, an error message will be displayed, and any changes made to the data in either table participating in the referential constraint will be backed out (rolled back).
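An Update Rule can be specified explicitly when the foreign key is defined, roughly as follows (a sketch; the constraint name FK_MAKE is hypothetical, and the MAKE parent table is assumed to exist already):

```sql
-- Sketch: specifying the Update Rule as part of the foreign key definition
CREATE TABLE MODEL
    (MODELID INTEGER,
     MODEL   CHAR(25),
     MAKEID  INTEGER,
     CONSTRAINT FK_MAKE FOREIGN KEY (MAKEID)
         REFERENCES MAKE (MAKEID)
         ON UPDATE RESTRICT)
```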

The Delete Rule for Referential Constraints


The Delete Rule controls how delete operations performed against the parent table of a referential constraint are to be processed. The following four types of behaviors are possible, depending on how the Delete Rule is defined:

ON DELETE CASCADE. This definition ensures that when a parent row is deleted from the parent table of a referential constraint, all dependent rows in the child table that have matching primary key values in their foreign key are deleted as well.

ON DELETE SET NULL. This definition ensures that when a parent row is deleted from the parent table of a referential constraint, all dependent rows in the child table that have matching primary key values in their foreign key will have their foreign key values changed to null. Other values for the dependent row are not affected.

ON DELETE NO ACTION. This definition ensures that when a delete operation is performed on the parent table of a referential constraint, the value for the foreign key of each row in the child table will have a matching value in the parent key of the parent table (after all other referential constraints have been applied).

ON DELETE RESTRICT. This definition ensures that whenever a delete operation is performed on the parent table of a referential constraint, the value for the foreign key of each row in the child table will have a matching value in the parent key of the parent table (before any other referential constraints are applied).

Figure 2-15 illustrates how the Delete Rule is enforced when the ON DELETE CASCADE definition is used; Figure 2-16 illustrates how the Delete Rule is enforced when the ON DELETE SET NULL definition is used; Figure 2-17 illustrates how the Delete Rule is enforced when the ON DELETE NO ACTION definition is used; and Figure 2-18 illustrates how the Delete Rule is enforced when the ON DELETE RESTRICT definition is used.

Figure 2-15: How the ON DELETE CASCADE Delete Rule of a referential constraint is enforced.

Figure 2-16: How the ON DELETE SET NULL Delete Rule of a referential constraint is enforced.

Figure 2-17: How the ON DELETE NO ACTION Delete Rule of a referential constraint is enforced.

Figure 2-18: How the ON DELETE RESTRICT Delete Rule of a referential constraint is enforced. Like the Insert Rule and the Update Rule, the Delete Rule for a referential constraint is implicitly created when the referential constraint is created. If no Delete Rule definition is provided when the referential constraint is defined, the ON DELETE NO ACTION definition is used as the default. No matter which form of the Delete Rule is used, if the condition of the rule is not met, an error message will be displayed, and the delete operation will fail. If the ON DELETE CASCADE Delete Rule is used and the deletion of a parent row in a parent table causes one or more dependent rows to be deleted from the corresponding child table, the delete operation is said to have been propagated to the child table. In such a situation, the child table is said to be delete-connected to the parent table. Because a delete-connected child table can also be the parent table in another referential constraint, a delete operation that is propagated to one child table can, in turn, be propagated to another child table, and so on. Thus, the deletion of one parent row from a single parent table can result in the deletion of several hundred rows from any number of tables, depending upon how tables are delete-connected. Therefore, the ON DELETE CASCADE Delete Rule should be used with extreme caution when a hierarchy of referential constraints permeates a database.
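A Delete Rule is declared the same way as an Update Rule; for example, a cascading rule for the hypothetical MAKE/MODEL tables might look like this (the constraint name FK_MAKE is illustrative, and the MAKE parent table is assumed to exist already):

```sql
-- Sketch: a Delete Rule that cascades deletions to the child table
CREATE TABLE MODEL
    (MODELID INTEGER,
     MODEL   CHAR(25),
     MAKEID  INTEGER,
     CONSTRAINT FK_MAKE FOREIGN KEY (MAKEID)
         REFERENCES MAKE (MAKEID)
         ON DELETE CASCADE)

-- With ON DELETE CASCADE in effect, deleting a make also deletes its models
DELETE FROM MAKE WHERE MAKEID = 1
```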

Identity Columns
Often, base tables are designed such that a single column will be used to store a unique identifier that represents an individual record (or row). More often than not, this identifier is a number that is sequentially incremented each time a new record is added to the table. Numbers for such columns can be automatically generated using a before trigger, or the DB2 Database Manager can generate a sequence of numbers (or other values) for such a column and assign a value to that column, using the sequence generated, as new records are added. Before the DB2 Database Manager can generate a sequential set of values for a column, that column must be defined as an identity column. Identity columns are created by specifying the GENERATED AS IDENTITY clause along with one or more of the identity column attributes available as part of the column definition. The syntax used to create an identity column is:

[ColumnName] [DataType]
    GENERATED <ALWAYS | BY DEFAULT> AS IDENTITY
    <( <START WITH [1 | StartingValue]>
       <INCREMENT BY [1 | IncrementValue]>
       <NO MINVALUE | MINVALUE [MinValue]>
       <NO MAXVALUE | MAXVALUE [MaxValue]>
       <NO CYCLE | CYCLE>
       <NO CACHE | CACHE 20 | CACHE [CacheSize]>
       <NO ORDER | ORDER> )>

or

[ColumnName] [DataType] GENERATED <ALWAYS | BY DEFAULT> AS (Expression)

where:

ColumnName      Identifies the unique name to be assigned to the identity column to be created.

DataType        Identifies the data type (built-in or user-defined) to be assigned to the identity column to be created. (Table 2-1 contains a list of the built-in data type definitions that are valid.)

StartingValue   Identifies the first value that is to be assigned to the identity column created.

IncrementValue  Identifies the interval that is to be used to calculate the next consecutive value that is to be assigned to the identity column created.

MinValue        Identifies the smallest value that can be assigned to the identity column created.

MaxValue        Identifies the largest value that can be assigned to the identity column created.

CacheSize       Identifies the number of values of the identity sequence that are to be preallocated and kept in memory.

Expression      Identifies an expression or user-defined external function that is to be used to generate values for the identity column created.

If the CYCLE clause is specified as part of the identity column's definition, values will continue to be generated for the column after any minimum or maximum value specified has been reached. (After an ascending identity column reaches the maximum value allowed, a minimum value will be generated and the cycle will begin again. Likewise, after a descending identity column reaches the minimum value allowed, a maximum value will be generated and the cycle will repeat itself.) On the other hand, if the NO CYCLE clause is specified, or if neither clause is specified, values will not be generated for the column after any minimum or maximum value specified has been reached, and any attempt to insert new data into the table will fail. So, if you wanted to create a table that had a simple identity column in it, you could do so by executing a CREATE TABLE SQL statement that looks something like this:

CREATE TABLE EMPLOYEES
    (EMPID INTEGER GENERATED BY DEFAULT AS IDENTITY,
     NAME CHAR(50),
     DEPT INTEGER)

It is important to note that a table can have only one identity column and that the data type used by an identity column must be a numeric data type with a scale of 0 (SMALLINT, INTEGER, BIGINT, or DECIMAL) or a user-defined data type that is based on such a data type. All identity columns are implicitly assigned a NOT NULL constraint; identity columns cannot have a default constraint. Note: Values cannot be explicitly assigned to an identity column defined as GENERATED ALWAYS when DB2 for Linux, UNIX, and Windows is used. Therefore, any insert or update operation that attempts to modify the value assigned to such an identity column will fail. (The OVERRIDING SYSTEM VALUE clause of the INSERT and UPDATE statements allows you to provide a user-supplied value when DB2 UDB for iSeries is used.)

A Word about Declared Temporary Tables


Along with base tables, another table commonly used by applications is a special type of table known as a declared global temporary table. Unlike base tables, whose descriptions and constraints are stored in the system catalog of the database to which they belong, declared temporary tables are not persistent and can only be used by the application that creates them, and only for the life of that application. When the application that creates a declared temporary table terminates, the rows of the table are deleted and the description of the table is dropped. Declared global temporary tables are useful for storing intermediate result data sets and non-persistent data, and they can be used with stored procedures, thereby reducing the amount of data that has to be sent over the network. Whereas base tables are created with the CREATE TABLE SQL statement, declared temporary tables are created with the DECLARE GLOBAL TEMPORARY TABLE statement.
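A declared temporary table might be created and populated like this (a sketch; the table and column names are hypothetical, a user temporary table space must already exist, and the SESSION qualifier is required when referencing a declared temporary table):

```sql
DECLARE GLOBAL TEMPORARY TABLE TEMP_RESULTS
  (EMPID INTEGER,
   TOTAL DECIMAL(12,2))
  ON COMMIT PRESERVE ROWS
  NOT LOGGED;

INSERT INTO SESSION.TEMP_RESULTS
  SELECT EMPID, SUM(AMOUNT) FROM SESSION.WORK_ITEMS GROUP BY EMPID;
```

The ON COMMIT PRESERVE ROWS clause keeps the intermediate rows across commits for the life of the connection; without it, the rows would be deleted at each commit.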

A Closer Look at Views


Earlier, we saw that views are used to provide a different way of looking at the data stored in one or more base tables. Essentially, a view is a named specification of a result table that is populated whenever the view is referenced in an SQL statement. Like base tables, views can be thought of as having columns and rows, and in most cases, data can be retrieved from a view the same way it can be retrieved from a table. However, whether a view can be used in insert, update, and delete operations depends on how it was defined: views can be defined as being insertable, updatable, deletable, or read-only. (In general, a view is insertable, updatable, or deletable if each row in the view can be uniquely mapped onto a single row of a base table.)

Although views look (and often behave) like base tables, they do not have their own physical storage (unlike indexes, which are also based on base tables); therefore, they do not contain data. Instead, views refer to data that is physically stored in other base tables. And because a view can reference the data stored in any number of columns found in the base tables it refers to, views can be used, together with view privileges, to control what data a user can and cannot see. As we saw earlier, if a company has a database that contains a table populated with information about each of its employees, managers might work with a view that only allows them to see information about the employees they manage, users in Corporate Communications might work with a view that only allows them to see contact information for each employee, and users in Payroll might work with a view that allows them to see both contact information and salary information. By creating views and coupling them with the view privileges available, a database administrator can exercise greater control over how individual users access specific pieces of data.

Views are created by executing the CREATE VIEW SQL statement.
The basic syntax for this statement is:

CREATE VIEW [ViewName] <( [ColumnName] ,... )>
  AS [SELECTStatement]
  <WITH CHECK OPTION>

where:

ViewName
Identifies the name to be assigned to the view to be created.

ColumnName
Identifies the name of one or more columns that are to be included in the view to be created. If a list of column names is specified, the number of column names provided must match the number of columns that will be returned by the SELECT statement used to create the view. (If a list of column names is not provided, the columns of the view will inherit the names assigned to the columns returned by the SELECT statement used to create the view.)

SELECTStatement
Identifies a SELECT SQL statement that, when executed, will produce data that can be seen using the view to be created.

Thus, if you wanted to create a view that references all data stored in a table named DEPARTMENT and assign it the name DEPT_VIEW, you could do so by executing a CREATE VIEW SQL statement that looks something like this:

CREATE VIEW DEPT_VIEW AS
  SELECT * FROM DEPARTMENT

On the other hand, if you wanted to create a view that references specific data values stored in the DEPARTMENT table and assign it the name ADV_DEPT_VIEW, you could do so by executing a CREATE VIEW SQL statement that looks something like this:

CREATE VIEW ADV_DEPT_VIEW AS
  SELECT DEPT_NO, DEPT_NAME, DEPT_SIZE
  FROM DEPARTMENT
  WHERE DEPT_SIZE > 25

The view created by this statement would contain only department number, department name, and department size information for each department with more than 25 employees.

If the WITH CHECK OPTION clause of the CREATE VIEW SQL statement is specified, insert and update operations performed against the view are validated to ensure that all rows being inserted into or updated in the base table the view refers to conform to the view's definition; otherwise, the insert/update operation will fail. What does this mean? Suppose a view was created using the following CREATE VIEW statement:

CREATE VIEW PRIORITY_ORDERS AS
  SELECT * FROM ORDERS
  WHERE RESPONSE_TIME < 4
  WITH CHECK OPTION

Now, suppose a user tries to insert a record through this view that has a RESPONSE_TIME value of 6. The insert operation will fail because the record violates the view's definition. Had the view not been created with the WITH CHECK OPTION clause, the insert operation would have succeeded, even though the new record would not be visible through the view used to add it. Figure 2-19 illustrates how the WITH CHECK OPTION clause works.

Figure 2-19: How the WITH CHECK OPTION clause is used to ensure that insert and update operations conform to a view's definition.

Controlling Database Access


Every database management system must be able to protect data against unauthorized access and/or modification, and DB2 UDB is no exception. DB2 UDB uses a combination of external security services and internal access control mechanisms to perform this vital task. In most cases, three levels of security are employed: the first controls access to the instance under which a database was created (using a process known as authentication); the second controls access to the database itself; and the third controls access to the data and data objects that reside within the database (by evaluating the authorities and privileges that have been granted to each user).

Authorities convey a set of privileges and/or the right to perform high-level administrative and maintenance/utility operations against an instance or a database. Privileges, on the other hand, convey the rights to perform certain actions against specific database objects (such as tables and views). Users can work only with those objects for which they have been given the appropriate authorization, that is, the required authority or privilege. Figure 2-20 provides a hierarchical view of the authorities and privileges that are recognized by DB2 UDB.

Figure 2-20: Hierarchy of the authorities and privileges available with DB2 UDB for Linux, UNIX, and Windows.

As you can see in Figure 2-20, DB2 UDB uses five levels of authorities and eleven sets of privileges (one set of database privileges and ten sets of object privileges) to control how users interact with instances, databases, and database objects. How (and to whom) these authorities and privileges are granted typically falls under a system or database administrator's control. However, from an application developer's viewpoint, it is important to know that certain privileges are required in order to develop and/or run database applications. Typically, these privileges include, but are not limited to:

Schema privileges
Table privileges
View privileges
Package privileges
Routine privileges

Schema Privileges
Schema privileges control what users can and cannot do with a particular schema. (A schema is an object that is used to logically classify and group other objects in the database; most objects are named using a naming convention that consists of a schema name, followed by a period, followed by the object name.) Figure 2-21 shows the different types of schema privileges available with DB2 UDB for Linux, UNIX, and Windows. (DB2 UDB for iSeries uses operating system level privileges for schema objects.)

Figure 2-21: Schema privileges available with DB2 UDB for Linux, UNIX, and Windows.

As you can see in Figure 2-21, three different schema privileges exist. They are:

CREATEIN. Allows a user to create objects within the schema.

ALTERIN. Allows a user to change the comment associated with any object in the schema or to alter any object that resides within the schema.

DROPIN. Allows a user to remove (drop) any object within the schema.

Objects that can be manipulated within a schema include tables, views, indexes, packages, user-defined data types, user-defined functions, triggers, stored procedures, and aliases. The owner of a schema (usually the individual who created the schema) automatically receives these privileges, along with the right to grant any combination of them to other users and groups.
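Schema privileges are given and taken away with the GRANT and REVOKE SQL statements. The following sketch (the schema name PAYROLL and the user ID DB2DEV are hypothetical) illustrates the syntax:

```sql
GRANT CREATEIN, ALTERIN ON SCHEMA PAYROLL TO USER DB2DEV;
REVOKE DROPIN ON SCHEMA PAYROLL FROM USER DB2DEV;
```

After these statements execute, DB2DEV can create and alter objects in the PAYROLL schema but cannot drop objects from it.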

Table Privileges
Table privileges control what users can and cannot do with a particular table in a database. (Earlier, we saw that a table is a logical structure used to present data as a collection of unordered rows with a fixed number of columns.) Figure 2-22 shows the different types of table privileges available.

Figure 2-22: Table privileges available with DB2 UDB.

As you can see in Figure 2-22, eight different table privileges exist. They are:

CONTROL. Provides a user with every table privilege available, allows the user to remove (drop) the table from the database, and gives the user the ability to grant to or revoke from other users and groups any available table privileges (except the CONTROL privilege).

ALTER. Allows a user to execute the ALTER TABLE SQL statement against the table. In other words, this privilege allows a user to add columns to the table, add or change comments associated with the table and/or any of its columns, create a primary key for the table, create a unique constraint for the table, create or drop a check constraint for the table, and create triggers for the table (provided the user holds the appropriate privileges for every object referenced by the trigger).

SELECT. Allows a user to execute a SELECT SQL statement against the table. In other words, this privilege allows a user to retrieve data from the table, create a view that references the table, and run the EXPORT utility against the table.

INSERT. Allows a user to execute the INSERT SQL statement against the table. In other words, this privilege allows a user to add data to the table and run the IMPORT utility against the table.

UPDATE. Allows a user to execute the UPDATE SQL statement against the table. In other words, this privilege allows a user to modify data in the table. (This privilege can be granted for the entire table or limited to one or more columns within the table.)

DELETE. Allows a user to execute the DELETE SQL statement against the table. In other words, this privilege allows a user to remove rows of data from the table.

INDEX. Allows a user to create an index for the table.

REFERENCES. Allows a user to create and drop foreign key constraints that reference the table in a parent relationship. (This privilege can be granted for the entire table or limited to one or more columns within the table, in which case only those columns can participate as a parent key in a referential constraint.)

The owner of a table (usually the individual who created the table) automatically receives CONTROL privilege, along with all other table privileges available for that table. If the CONTROL privilege is later revoked from the table owner, all other privileges that were automatically granted to the owner for that particular table are not automatically revoked. Instead, they must be explicitly revoked in one or more separate operations.
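Table privileges, like schema privileges, are conveyed with the GRANT statement. A sketch, using the EMPLOYEES table from the earlier example (the user ID and group name are hypothetical), showing both a table-wide grant and a column-limited UPDATE grant:

```sql
GRANT SELECT, INSERT ON TABLE EMPLOYEES TO USER DB2DEV;
GRANT UPDATE (NAME, DEPT) ON TABLE EMPLOYEES TO GROUP HR_STAFF;
```

The second statement limits the UPDATE privilege to the NAME and DEPT columns, so members of HR_STAFF cannot modify EMPID values.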

View Privileges
View privileges control what users can and cannot do with a particular view. (Earlier, we saw that a view is a virtual table, residing in memory, that provides an alternative way of working with data that resides in one or more base tables.) Figure 2-23 shows the different types of view privileges available.

Figure 2-23: View privileges available with DB2 UDB.

As you can see in Figure 2-23, five different view privileges exist. They are:

CONTROL. Provides a user with every view privilege available, allows the user to remove (drop) the view from the database, and gives the user the ability to grant to or revoke from other users and groups any available view privileges (except the CONTROL privilege).

SELECT. Allows a user to retrieve data from the view, create a second view that references the view, and run the EXPORT utility against the view.

INSERT. Allows a user to add data to the view.

UPDATE. Allows a user to modify data in the view. (This privilege can be granted for the entire view or limited to one or more columns within the view.)

DELETE. Allows a user to remove rows of data from the view.

In order to create a view, a user must hold appropriate privileges on each base table that the view references. Once a view is created, the owner of that view (usually the individual who created the view) automatically receives all available view privileges, with the exception of the CONTROL privilege, for that view. A view owner will only receive CONTROL privilege for the view if he or she also holds CONTROL privilege for every base table the view references.
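View privileges are granted the same way table privileges are. For example, the ADV_DEPT_VIEW view created earlier could be opened up for reading while write access is restricted (the user ID DB2DEV is hypothetical):

```sql
GRANT SELECT ON ADV_DEPT_VIEW TO PUBLIC;
GRANT INSERT, UPDATE, DELETE ON ADV_DEPT_VIEW TO USER DB2DEV;
```

Granting SELECT to PUBLIC lets every user query the view, while only DB2DEV can change data through it.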

Package Privileges
Package privileges control what users can and cannot do with a particular package. (A package is an object that contains the information needed by the DB2 Database Manager to process SQL statements in the most efficient way possible on behalf of an embedded SQL application.) Figure 2-24 shows the different types of package privileges available.

Figure 2-24: Package privileges available with DB2 UDB.

As you can see in Figure 2-24, three different package privileges exist. They are:

CONTROL. Provides a user with every package privilege available, allows the user to remove (drop) the package from the database, and gives the user the ability to grant to or revoke from other users and groups any available package privileges (except the CONTROL privilege).

BIND. Allows a user to rebind or add new package versions to a package that has already been bound to a database. (In addition to the BIND package privilege, a user must hold the privileges needed to execute the SQL statements that make up the package before the package can be successfully rebound.) It is important to note that, in order to create a new package in a database, a user must have BINDADD privilege for the database the package is to be created for.

EXECUTE. Allows a user to execute the package. (A user who has EXECUTE privilege for a particular package can execute that package even if the user does not have the privileges needed to execute the SQL statements stored in the package, because any privileges needed to execute those statements are implicitly granted to the package user. It is important to note that for privileges to be implicitly granted, the creator of the package must hold privileges as an individual user or as a member of the group PUBLIC, not as a member of another named group.)

The owner of a package (usually the individual who created the package) automatically receives CONTROL privilege, along with all other package privileges available for that package. If the CONTROL privilege is later revoked from the package owner, all other privileges that were automatically granted to the owner for that particular package are not automatically revoked. Instead, they must be explicitly revoked in one or more separate operations.
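Package privileges follow the same GRANT syntax. A sketch (the package name PAYROLL.PKG01 and the user ID DB2DEV are hypothetical):

```sql
GRANT BIND ON PACKAGE PAYROLL.PKG01 TO USER DB2DEV;
GRANT EXECUTE ON PACKAGE PAYROLL.PKG01 TO PUBLIC;
```

Here DB2DEV can rebind the package after its embedded SQL changes, and any user can run it, even users who hold no privileges on the tables the package's static SQL statements reference.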

Routine Privileges
Routine privileges control what users can and cannot do with a particular routine. (A routine can be a user-defined function, a stored procedure, or a method that can be invoked by several users.) Figure 2-25 shows the different types of routine privileges available.

Figure 2-25: Routine privileges available with DB2 UDB.

As you can see in Figure 2-25, two different routine privileges exist. They are:

CONTROL. Provides a user with every routine privilege available, allows the user to remove (drop) the routine from the database, and gives the user the ability to grant to or revoke from other users and groups any available routine privileges (except the CONTROL privilege).

EXECUTE. Allows a user to invoke the routine, create a function that is sourced from the routine (provided the routine is a function), and reference the routine in a DDL statement when creating a constraint.

The owner of a routine (usually the individual who created the routine) automatically receives CONTROL and EXECUTE privileges for that routine. If the CONTROL privilege is later revoked from the owner, the EXECUTE privilege is retained and must be explicitly revoked in a separate operation.
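The EXECUTE privilege is granted per routine type. A sketch (the routine names and grantees are hypothetical):

```sql
GRANT EXECUTE ON PROCEDURE PAYROLL.CALC_BONUS TO USER DB2DEV;
GRANT EXECUTE ON FUNCTION PAYROLL.NET_PAY TO GROUP HR_STAFF;
```

Note that the keyword after ON (PROCEDURE, FUNCTION, or METHOD) must match the kind of routine being granted.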

How Privileges Are Granted


So just how does a user receive any privileges needed? There are three ways in which users (and groups) can obtain authorities and privileges. They are:

Implicitly. When a user creates a database, he or she implicitly receives DBADM authority, along with several database privileges, for that database. Likewise, when a user creates a database object, he or she implicitly receives all privileges available for that object, along with the ability to grant any combination of those privileges (with the exception of the CONTROL privilege) to other users and groups. Privileges can also be given implicitly whenever a higher-level privilege is explicitly granted to a user (for example, if a user is explicitly given CONTROL privilege for a table space, the user implicitly receives the USE privilege for that table space as well). Keep in mind that such implicitly assigned privileges are not automatically revoked when the higher-level privilege that caused them to be granted is revoked.

Indirectly. Indirectly assigned privileges are usually associated with packages. When a user executes a package that requires privileges the user does not have (for example, a package that deletes a row of data from a table requires the DELETE privilege on that table), the user is indirectly given those privileges for the express purpose of executing the package. Indirectly granted privileges are temporary and do not exist outside the scope in which they are granted. (It is important to note that indirect privileges are only assigned when a package contains access plans for static SQL statements; packages that contain dynamic SQL statements do not indirectly assign the privileges needed to execute those statements.)

Explicitly. Database-level authorities, database privileges, and object privileges can be explicitly given to or taken from an individual user or a group of users by any user who has the authority to do so.
To explicitly grant privileges on most database objects, a user must have SYSADM authority, DBADM authority, or CONTROL privilege on that object. Alternately, a user can explicitly grant any privilege that he or she was assigned with the WITH GRANT OPTION specified. To grant CONTROL privilege for any object, a user must have SYSADM or DBADM authority; to grant DBADM authority, a user must have SYSADM authority.
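The WITH GRANT OPTION clause is what allows a grantee to pass a privilege along. A sketch, using the EMPLOYEES table from the earlier example (the user ID TEAMLEAD is hypothetical):

```sql
GRANT SELECT ON TABLE EMPLOYEES TO USER TEAMLEAD WITH GRANT OPTION;
```

TEAMLEAD can now both query EMPLOYEES and grant SELECT on EMPLOYEES to other users, without holding SYSADM, DBADM, or CONTROL.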

DB2 UDB's Special Registers


Although database applications interact primarily with data objects, they can also obtain information from (and in some cases assign values to) several predefined storage areas known as special registers. Special registers are used to store specific information that describes and/or controls the environment in which SQL statements are executed. The special registers available with DB2 UDB for Linux, UNIX, and Windows can be seen in Table 2-4.

Table 2-4: Special Registers Used by DB2 UDB for Linux, UNIX, and Windows

CLIENT ACCTNG (or CLIENT_ACCTNG)
Data type: VARCHAR(255). Updatable: Yes.
Contains the client accounting string provided when the current connection was established (if an accounting string was supplied).

CLIENT APPLNAME (or CLIENT_APPLNAME)
Data type: VARCHAR(255). Updatable: Yes.
Contains the client application name associated with the current connection.

CLIENT USERID (or CLIENT_USERID)
Data type: VARCHAR(255). Updatable: Yes.
Contains the client user ID associated with the current connection.

CLIENT WRKSTNNAME (or CLIENT_WRKSTNNAME)
Data type: VARCHAR(255). Updatable: Yes.
Contains the name of the client workstation that the current connection was established from.

CURRENT DATE (or CURRENT_DATE)
Data type: DATE. Updatable: No.
Contains a date value that is obtained by reading the system clock when an SQL statement is executed at the application server. If this special register is used more than once within a single SQL statement, or is used with the CURRENT TIME or CURRENT TIMESTAMP special register within a single statement, all values obtained are based on a single system clock reading.

CURRENT DBPARTITIONNUM (or CURRENT_DBPARTITIONNUM)
Data type: INTEGER. Updatable: No.
Contains a value that identifies the partition node number of the coordinator node used to process a particular SQL statement. For statements issued from an application, the coordinator node is the partition that the application is connected to; for statements issued from a routine or stored procedure, the coordinator node is the partition from which the routine/stored procedure is invoked. (This special register is set to 0 if the database instance has not been defined to support partitioning, in other words, if no db2nodes.cfg file is used.)

CURRENT DEFAULT TRANSFORM GROUP (or CURRENT_DEFAULT_TRANSFORM_GROUP)
Data type: VARCHAR(18). Updatable: Yes.
Contains the name of the transform group that is used by dynamic SQL statements for exchanging user-defined structured data type values with host programs. (This special register does not identify the transform groups used in static SQL statements or in the exchange of parameters and results with external functions or methods.)

CURRENT DEGREE (or CURRENT_DEGREE)
Data type: CHAR(5); valid values are the word ANY or the string representation of a number between 1 and 32,767, inclusive. Updatable: Yes.
Specifies the degree of intra-partition parallelism that is to be used for the execution of dynamic SQL statements. If the value of this special register is 1 when an SQL statement is dynamically prepared, the statement will not use intra-partition parallelism when it is executed; if the value is greater than 1 and less than or equal to 32,767 when an SQL statement is dynamically prepared, the execution of that statement can involve intra-partition parallelism with the degree specified. (If the value is ANY when an SQL statement is dynamically prepared, the execution of that statement can involve intra-partition parallelism using a degree determined by the DB2 Database Manager.)

CURRENT EXPLAIN MODE (or CURRENT_EXPLAIN_MODE)
Data type: VARCHAR(254); valid values are YES, NO, EXPLAIN, RECOMMEND INDEXES, and EVALUATE INDEXES. Updatable: Yes.
Contains a value that controls the behavior of the Explain facility with respect to eligible dynamic SQL statements. The Explain facility generates access plan information for dynamic SQL statements and inserts that information into the appropriate Explain tables; this information does not include Explain Snapshot information. (This special register, along with the CURRENT EXPLAIN SNAPSHOT special register, interacts with the Explain facility when it is invoked; this special register also interacts with the EXPLAIN bind option.)

CURRENT EXPLAIN SNAPSHOT (or CURRENT_EXPLAIN_SNAPSHOT)
Data type: CHAR(8); valid values are YES, NO, and EXPLAIN. Updatable: Yes.
Contains a value that controls the behavior of the Explain Snapshot facility. The Explain Snapshot facility generates compressed information (including access plan information, operator costs, and bind-time statistics) about SELECT, SELECT INTO, INSERT, UPDATE, DELETE, VALUES, or VALUES INTO SQL statements and inserts that information into the appropriate Explain tables. By examining this information, bottlenecks can be found and changes can be made to either the database or the SQL statement to improve overall application performance. (This special register, along with the CURRENT EXPLAIN MODE special register, interacts with the Explain facility when it is invoked; this special register also interacts with the EXPLSNAP bind option.)

CURRENT MAINTAINED TABLE TYPES FOR OPTIMIZATION (or CURRENT_MAINTAINED_TABLE_TYPES_FOR_OPTIMIZATION)
Data type: VARCHAR(254); valid values are ALL, NONE, SYSTEM, and USER. Updatable: Yes.
Contains a value that identifies the types of tables that are to be considered when optimizing the processing of dynamic SQL queries.

CURRENT PATH (or CURRENT_PATH)
Data type: VARCHAR(254). Updatable: Yes.
Contains a value that identifies the SQL path that the DB2 Database Manager is to use to resolve function references and data type references used in dynamically prepared SQL statements; it is also used to resolve stored procedure references in CALL SQL statements. The CURRENT PATH special register contains a list of one or more schema names, where the schema names are enclosed in double quotes and separated by commas (quotes within the string are repeated, as they are in any delimited identifier). A function or data type reference that is not qualified with a schema name will be implicitly qualified with the name of the first schema in the SQL path that contains a function or data type with the same unqualified name.

CURRENT QUERY OPTIMIZATION (or CURRENT_QUERY_OPTIMIZATION)
Data type: INTEGER; valid values are any number between 0 and 9, inclusive. Updatable: Yes.
Contains a value that controls the class (level) of query optimization that is to be performed by the optimizer when binding dynamic SQL statements. (The QUERYOPT bind option controls the class of query optimization for static SQL statements.) If query optimization is set to the minimal class, the value of this special register is 0; if it is set to the highest class available, the value is 9. Generally, the higher classes cause the optimizer to use more time and memory when selecting optimal access plans, which potentially results in better access plans and improved run-time performance.

CURRENT REFRESH AGE (or CURRENT_REFRESH_AGE)
Data type: DECIMAL(20,6). Updatable: Yes.
Contains a timestamp value that identifies the maximum duration since a timestamped event occurred to a cached data object, such that the cached data object can be used to optimize the processing of a query. If this special register has a value of 99999999999999 (ANY) and the CURRENT QUERY OPTIMIZATION special register contains a class 5 or higher value, the types of tables specified in the CURRENT MAINTAINED TABLE TYPES FOR OPTIMIZATION special register are considered when optimizing the processing of dynamic SQL queries.

CURRENT SCHEMA (or CURRENT_SCHEMA)
Data type: VARCHAR(128). Updatable: Yes.
Contains a value that identifies the schema name that is to be used to qualify references to unqualified database objects in dynamically prepared SQL statements. (The QUALIFIER bind option controls the schema name used to qualify database object references, where applicable, for static SQL statements.) The initial value of this special register is the authorization ID of the current user.

CURRENT SERVER (or CURRENT_SERVER)
Data type: VARCHAR(18). Updatable: No.
Contains a value that identifies the current application server. (The actual name of the application server, not an alias, is stored in this special register.)

CURRENT TIME (or CURRENT_TIME)
Data type: TIME. Updatable: No.
Contains a time value that is obtained by reading the system clock when an SQL statement is executed at the application server. If this special register is used more than once within a single SQL statement, or in conjunction with the CURRENT DATE or CURRENT TIMESTAMP special registers in a single statement, all values obtained are based on a single clock reading.

CURRENT TIMESTAMP (or CURRENT_TIMESTAMP)
Data type: TIMESTAMP. Updatable: No.
Contains a timestamp value that is obtained by reading the system clock when an SQL statement is executed at the application server. If this special register is used more than once within a single SQL statement, or in conjunction with the CURRENT DATE or CURRENT TIME special registers in a single statement, all values obtained are based on a single clock reading.

CURRENT TIMEZONE (or CURRENT_TIMEZONE)
Data type: DECIMAL(6,0). Updatable: No.
Contains the difference between Coordinated Universal Time (formerly known as Greenwich Mean Time) and the local time at the application server. (The difference is represented by a decimal number in which the first two digits are the number of hours [0-24, exclusive], the next two digits are the number of minutes, and the last two digits are the number of seconds.) The value stored in this special register is calculated (by reading the system clock) at the exact moment an SQL statement is executed at the application server. Subtracting the value of this special register from a local time value converts that local time value to Coordinated Universal Time.

USER
Data type: VARCHAR(128). Updatable: No.
Contains the run-time authorization ID (of the current user) that is passed to the DB2 Database Manager when an application connects to a database. (This is the ID used for authorization checking for dynamic SQL statements; static SQL statements use the authorization ID of the user who bound the corresponding package to the database.)

The value assigned to a special register can be obtained in one of two ways:

By executing the VALUES SQL statement, either from the Command Line Processor or within a query/subquery. In this case, the syntax for this statement is VALUES [SpecialRegister], where SpecialRegister is the name of one of the special registers available. (Alternately, the VALUES INTO SQL statement can be used to copy the value of a special register to an application host variable.)

By querying the SYSIBM.SYSDUMMY1 system catalog table. The syntax for such a query is SELECT [SpecialRegister] FROM SYSIBM.SYSDUMMY1, where SpecialRegister is the name of one of the special registers available.

Thus, if you wanted to obtain the value of the CURRENT DATE special register, you could do so by executing a VALUES SQL statement that looks like this:

VALUES CURRENT DATE

Alternately, you could obtain this value by executing a query against the SYSIBM.SYSDUMMY1 system catalog table that looks like this:

SELECT CURRENT DATE FROM SYSIBM.SYSDUMMY1
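Registers marked as updatable in Table 2-4 can also be assigned new values with the SET statement. A sketch (the schema name PAYROLL is hypothetical):

```sql
SET CURRENT SCHEMA = 'PAYROLL';
SET CURRENT DEGREE = 'ANY';
SET CURRENT EXPLAIN MODE = EXPLAIN;
VALUES (CURRENT SCHEMA, CURRENT DEGREE);
```

The closing VALUES statement returns both register values in a single row, confirming the assignments took effect for the current connection.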

A Word about SQL Procedures

When you set up a remote DB2 UDB database server and access it from one or more DB2 UDB client workstations, you have, in essence, established a basic DB2 UDB client/server environment. In this environment, each time an SQL statement is executed against the database on the remote server, the statement itself is sent through a network from the client workstation to the database server. The database server then processes the statement, and the results are sent back, again through the network, to the client workstation. (This means that two messages must go through the network for every SQL statement that is executed.)

However, if you have an application that contains one or more transactions that perform a relatively large amount of database activity with little or no user interaction, each transaction can be stored on the database server as what is known as a stored procedure. By using a stored procedure, all database processing done by the transaction can be performed directly at the server. And because a stored procedure is invoked by a single SQL statement, fewer messages have to be transmitted across the network; only the data that is actually needed at the client workstation has to be sent across.

This architecture allows the code that interacts directly with a database to reside on a high-performance PC or minicomputer (the database server), where computing power and centralized control can be used to provide quick, coordinated data access. At the same time, the application logic can reside on one or more smaller (client) workstations, where it can make effective use of all the resources the client workstation has to offer. Thus, the resources of both the client workstation and the server workstation are utilized to their fullest potential.

Creating a Stored Procedure


Two types of stored procedures can be created and used with a DB2 UDB database:

External. The body of the stored procedure is written using a high-level programming language (C, C++, Java, or COBOL).

SQL. The body of the stored procedure is written in SQL.

Regardless of whether a stored procedure is an external procedure or an SQL procedure, all procedures must be structured such that they perform three distinct tasks: First, they must accept input parameter values, if any, from the client application. Next, they must perform whatever processing is appropriate. Finally, they must return output data, if any, to the client application. At the very least, a stored procedure should always return a value that indicates its success or failure.

Before the source code for an external procedure becomes a usable stored procedure, the source code must be precompiled with the DB2 SQL Precompiler (unless the procedure was written with JDBC or DB2 CLI), then the precompiled source code must be compiled and linked to produce a library (program object on DB2 UDB for iSeries), and finally, the library (program object) must be bound to the database. Typically, this binding takes place when the source code is precompiled. Once a library containing the stored procedure has been created, that library must be physically stored on the server. (By default, the DB2 Database Manager looks for stored procedures in the \sqllib\function and \sqllib\function\unfenced subdirectories.) Additionally, the system permissions for the library file containing the stored procedure must be modified so that all users can execute it. For example, in a UNIX environment, the chmod command is used to make a file executable; in a Windows environment, the attrib command is used to make a file executable.

Both external procedures and SQL procedures must be registered with the database they are designed to interact with. This is done by executing the appropriate form of the CREATE PROCEDURE SQL statement.
The basic syntax for this statement is:

CREATE PROCEDURE [ProcedureName] ( [ParamType] [ParamName] [DataType] ,...)
    <SPECIFIC [SpecificName]>
    <DYNAMIC RESULT SETS [NumResultSets]>
    <NO SQL | CONTAINS SQL | READS SQL DATA>
    <DETERMINISTIC | NOT DETERMINISTIC>
    <CALLED ON NULL INPUT>
    <LANGUAGE SQL>
    [SQLStatement]

or

CREATE PROCEDURE [ProcedureName] ( [ParamType] [ParamName] [DataType] ,...)
    <SPECIFIC [SpecificName]>
    <DYNAMIC RESULT SETS [NumResultSets]>
    <NO SQL | CONTAINS SQL | READS SQL DATA>
    <DETERMINISTIC | NOT DETERMINISTIC>
    <CALLED ON NULL INPUT>
    LANGUAGE [C | JAVA | COBOL | OLE]
    EXTERNAL <NAME [ExternalName] | [Identifier]>
    <FENCED <THREADSAFE | NOT THREADSAFE> | NOT FENCED <THREADSAFE>>
    PARAMETER STYLE [DB2GENERAL | DB2SQL | GENERAL | GENERAL WITH NULLS | JAVA | SQL]
    <PROGRAM TYPE [SUB | MAIN]>
    <DBINFO | NO DBINFO>

where:

ProcedureName
    Identifies the name to be assigned to the procedure to be created.
ParamType
    Indicates whether the parameter identified by ParamName is an input parameter (IN), an output parameter (OUT), or both an input and an output parameter (INOUT). (Valid values include IN, OUT, and INOUT.)
ParamName
    Identifies the name to be assigned to a procedure parameter.
DataType
    Identifies the type of data the procedure expects to receive/send for the parameter identified by ParamName.
SpecificName
    Identifies the specific name to be assigned to the stored procedure. This name can be used later to comment on the stored procedure or to drop the stored procedure; however, it cannot be used to invoke the stored procedure.
NumResultSets
    Identifies whether or not the stored procedure being registered returns result data sets, and, if so, how many.
SQLStatement
    Identifies a single SQL statement or a compound SQL statement (i.e., two or more SQL statements enclosed with the keywords BEGIN ATOMIC and END and terminated with a semicolon) that is to be executed when the stored procedure is invoked.
ExternalName
    Identifies the name of the library, along with the name of the function in the library that contains the executable code of the stored procedure being registered.
Identifier
    Identifies the name of the library that contains the executable code of the stored procedure being registered, but only if the procedure was written using C or C++. The DB2 Database Manager will look for a function that has the same name as the library name specified.

Thus, a simple SQL procedure could be created by executing a CREATE PROCEDURE statement that looks something like this:

CREATE PROCEDURE GET_SALES
    (IN QUOTA INTEGER, OUT RETCODE CHAR(5))
    DYNAMIC RESULT SETS 1
    LANGUAGE SQL
BEGIN
    DECLARE SQLSTATE CHAR(5);
    DECLARE SALES_RESULTS CURSOR WITH RETURN FOR
        SELECT SALES_PERSON, SUM(SALES) AS TOTAL_SALES
        FROM SALES
        GROUP BY SALES_PERSON
        HAVING SUM(SALES) > QUOTA;
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
        SET RETCODE = SQLSTATE;
    OPEN SALES_RESULTS;
    SET RETCODE = SQLSTATE;
END

The resulting SQL stored procedure, called GET_SALES, accepts an integer input value (in an input parameter called QUOTA) and returns a character value (in an output parameter called RETCODE) that reports the procedure's success or failure. The procedure body consists of a compound SQL statement that returns a result data set (i.e., an open cursor) containing the name and total sales figures for each salesperson whose total sales exceed the specified quota. This is done by:

1. Indicating that the SQL procedure is to return a result data set by specifying the DYNAMIC RESULT SETS clause of the CREATE PROCEDURE statement and assigning it the value 1.
2. Declaring a cursor within the procedure body (using the WITH RETURN FOR clause) for the result data set that is to be returned. (Earlier, we saw that a cursor is a named control structure that points to a specific row within a set of rows and is used by an application program to retrieve values from this set of rows.)
3. Opening the cursor, which produces the result data set that is to be returned.
4. Leaving the cursor open when the SQL procedure ends. (It is up to the calling application to close the open cursor when it is no longer needed.)
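The "return an open cursor" convention can be hard to visualize from the SQL alone. The following sketch mimics the pattern in Python with SQLite (an analogy only, not DB2 code; SQLite has no SQL procedures): a routine opens a cursor over the qualifying rows and hands the still-open cursor back to the caller, who fetches from it and is responsible for closing it.

```python
import sqlite3

# Analogy to the GET_SALES procedure above: open a cursor over the
# qualifying rows and return it still open, as WITH RETURN FOR does.
def get_sales(conn, quota):
    cur = conn.cursor()
    cur.execute(
        """SELECT sales_person, SUM(sales) AS total_sales
             FROM sales
            GROUP BY sales_person
           HAVING SUM(sales) > ?""",
        (quota,),
    )
    return cur  # left open; the caller fetches from it

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sales_person TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Jagger", 10), ("Jagger", 20), ("Richards", 5)],
)

cursor = get_sales(conn, 25)
print(cursor.fetchall())  # [('Jagger', 30)]
cursor.close()  # just as the calling application closes SALES_RESULTS
conn.close()
```

The division of labor matches step 4 above: the routine produces the result set, and the caller decides when it is no longer needed.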

Calling a Stored Procedure


Once a stored procedure has been registered with a database (by executing the CREATE PROCEDURE SQL statement), that procedure can be invoked, either interactively, using a utility such as the Command Line Processor, or from a client application. Registered stored procedures are invoked by executing the CALL SQL statement. The basic syntax for this statement is:

CALL [ProcedureName] ( <[InputParameter] | [OutputParameter] | NULL> ,...)

where:

ProcedureName
    Identifies the name assigned to the procedure to be invoked. (Remember, the procedure name, not the specific name, must be used to invoke the procedure.)
InputParameter
    Identifies one or more parameter values that are to be passed to the procedure being invoked.
OutputParameter
    Identifies one or more parameter markers or host variables that are to receive return values from the procedure being invoked.

Thus, the SQL procedure named GET_SALES that we created earlier could be invoked by connecting to the appropriate database and executing a CALL statement from the Command Line Processor that looks something like this:

CALL GET_SALES (25, ?)

The same procedure could be invoked from an embedded SQL application by executing a CALL statement that looks something like this:

CALL GET_SALES (25, :RetCode)

where RetCode is the name of a valid host variable. (We'll take a closer look at host variables in Chapter 4, "Embedded SQL Programming.") When this CALL statement is executed, the value 25 is passed to the input parameter named QUOTA, and a question mark (?) or a host variable named RetCode is used as a placeholder for the value that will be returned in the output parameter RETCODE.

Developing DB2 UDB Applications


So far, we have looked at some of the components that make up a DB2 UDB database, but we have not looked at how applications that interact with these components are constructed. If you look closely at any application you use on a routine basis, you will discover that every application is designed around five basic elements:

Input
Logic (decision control)
Memory (data storage and retrieval)
Arithmetic or processing (calculation)
Output

Input is defined as the way an application receives the information it needs to produce solutions for the problems it has been designed to solve. Once input is received, logic takes over and determines what information should be placed in or taken out of memory (data storage) and what arithmetic operations should be performed on that data. Non-database applications typically use functions supplied by the operating system to store data in (and retrieve data from) simple, byte-oriented files. And finally, when the application has generated a solution to the problem it was designed to solve, it provides appropriate output in the form of either an answer or a specific action.

Most DB2 UDB applications contain these same five elements; in fact, the only real difference between non-DB2 UDB applications and DB2 UDB applications is the way in which data is stored and retrieved and the way decision control is exercised. With DB2 UDB applications, operating system file input/output (I/O) is replaced with DB2 database I/O (which provides much more than just data storage and retrieval), and because business logic can be placed directly into the database (in the form of constraints and triggers), DB2 UDB applications can require less decision control (logic).

DB2 UDB Programming Interfaces


Now that we have looked at the basic elements that most DB2 UDB applications are built around, let's turn our attention to the interfaces that can be used to construct a DB2 UDB application. With DB2 UDB Version 8.1, support is provided for the following interfaces:

Embedded SQL
Call Level Interface (CLI)/Open Database Connectivity (ODBC)
Java Database Connectivity (JDBC) and Embedded SQL for Java (SQLJ)
Administrative Application Programming Interface (API) functions (DB2 UDB for Linux, UNIX, and Windows only)
Microsoft Data Access Objects (DAO), Remote Data Objects (RDO), ActiveX Data Objects (ADO), and Object Linking and Embedding for Databases (OLE DB)

Most DB2 UDB applications are built by combining the functionality provided with one or more of these interfaces with a high-level programming language such as C, Java, or COBOL. (A high-level programming language provides the framework within which SQL statements, CLI/ODBC function calls, API function calls, etc. are contained. This framework allows you to control the sequence in which an application's tasks are performed and provides a way for applications to receive user input and generate appropriate output. This framework also enables you to combine operating system calls with DB2 functionality in the same application program.) With the right combination of interfaces, almost every task that can be performed from the DB2 Command Line Processor can be conducted from a properly-coded application program.

Embedded SQL
Structured Query Language (SQL) is a standardized language used to work with database objects and the data they contain. SQL is comprised of several different statements that are used to define, alter, and destroy database objects as well as add, update, delete, and retrieve data values. However, because SQL is nonprocedural by design, it is not an actual programming language (SQL statements are executed by DB2, not by the operating system). Therefore, most applications that rely on the data storage, manipulation, and retrieval capabilities of SQL are constructed by embedding the SQL statements needed in the application's source code file(s). (The high-level programming language used to construct the application's source code files is often referred to as the host language; the application itself is known as the host application.)

Because high-level programming language compilers cannot interpret SQL statements directly, source code files containing embedded SQL statements must first be processed by an SQL precompiler before they can be compiled. Likewise, the DB2 Database Manager cannot work directly with high-level programming language variables. Instead, it must work with special host variables that have been defined in the source code file. The SQL precompiler is responsible for translating all SQL statements embedded in a source code file into appropriate host language function calls and for evaluating the data types of declared host variables and determining which data conversion methods are needed to move data to and from the database. Additionally, the SQL precompiler performs error checking on each SQL statement used.

Two types of SQL statements can be embedded in a host application: static SQL statements and dynamic SQL statements. And as you might imagine, each has its advantages and disadvantages.

Static SQL
A static SQL statement is an SQL statement that can be hard-coded in an application program at development time because information about its structure and the objects (i.e., tables, columns, and data types) that it is intended to interact with are known in advance. Because the details of a static SQL statement are known at development time, the work of analyzing the statement and selecting the optimum data access plan to use to execute the statement is done by the DB2 optimizer during the development process. Thus, static SQL statements execute quickly because their operational form already exists in the database and does not have to be generated at application run time. The downside to this is that all static SQL statements must be prepared (in other words, their access plans must be generated and stored as a package in the database) before they can be executed; the statements themselves cannot be altered at run time, and each application that uses static SQL must "bind" its operational package(s) to every database the application is to interact with.

Note: Because static SQL applications require prior knowledge of database objects, changes made to these objects after the application is developed can produce undesirable results.

Generally, static SQL statements are well suited for high-performance applications that execute predefined operations against a known set of database objects.

Dynamic SQL
A dynamic SQL statement is an SQL statement that must be constructed at application run time because information about its structure and the objects (i.e., tables, columns, and data types) that it is intended to interact with are not known at development time. Because a dynamic SQL statement does not have a precoded, fixed format, the data object(s) the statement is designed to interact with can change each time the statement is executed. Dynamic SQL statements also enable the SQL optimizer to see the real values of arguments, so host variables are not needed. Because dynamic SQL statements are constructed at application run time, rather than during the development process, they are generally more powerful than static SQL statements. Unfortunately, they are also more complicated to incorporate into an application, and because the work of analyzing the statement and selecting the optimum data access plan to use when executing the statement is done at application run time, dynamic SQL statements can also take longer to execute than their equivalent static SQL counterparts. (Dynamic SQL statements can take advantage of the database statistics available at application run time, so there are some cases in which a dynamic SQL statement will execute faster than an equivalent static SQL statement, but those are the exceptions and not the norm.) Generally, dynamic SQL statements are well suited for applications that interact with a rapidly-changing database or that allow users to define and execute ad-hoc queries.
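To make the contrast concrete, here is the dynamic pattern sketched in Python, with SQLite standing in for DB2 (an illustration of the technique, not DB2-specific code): the statement text is assembled at run time from identifiers that were not known at development time, while the data value is supplied through a parameter marker.

```python
import sqlite3

# Sketch of the dynamic-SQL pattern: the table and column names are
# not known until run time, so the statement text is built on the fly;
# a "?" parameter marker carries the user-supplied data value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("Jagger", "E01"), ("Watts", "E02")])

# Pretend these arrive from user input (an ad-hoc query)
table, column = "employees", "dept"   # identifiers: inlined into the text
value = "E01"                         # data: always bound via a marker

stmt = f"SELECT name FROM {table} WHERE {column} = ?"  # built at run time
rows = conn.execute(stmt, (value,)).fetchall()
print(rows)  # [('Jagger',)]
conn.close()
```

Because the statement is prepared only when it is executed, the database can choose an access plan using whatever objects and statistics exist at that moment, which is exactly the trade-off the paragraph above describes.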

Call Level Interface (CLI)/Open Database Connectivity (ODBC)


DB2 UDB's Call Level Interface (CLI) is a collection of API function calls that are designed to facilitate database access without having to use embedded SQL (and all that its use entails). To understand CLI, it helps to understand the basis of its development and to see how it compares with existing, callable, SQL interfaces.

In the early 1990s, the X/Open Company and the SQL Access Group (SAG), now a part of X/Open, jointly developed a standard specification for a callable SQL interface known as the X/Open Call-Level Interface, or X/Open CLI. The goal of the X/Open CLI was to increase the portability of database applications by allowing them to become independent of any one database management system's programming interface. Most of the X/Open CLI specifications were later accepted as part of a new ISO CLI international standard, and DB2 UDB's CLI is based on this ISO CLI standard interface specification.

In 1992, Microsoft Corporation developed a callable SQL interface called Open Database Connectivity (ODBC) for the Microsoft Windows operating system. ODBC is based on the X/Open CLI standards specification but provides an extended set of APIs that, in turn, provides additional functionality. The ODBC specification also defines an operating environment in which database-specific ODBC drivers are dynamically loaded (based on the database name provided with the connection request) at application run time by an ODBC Driver Manager. This Driver Manager provides a central point of control for each data source-specific library (driver) used. (Each library is responsible for implementing the ODBC APIs that interact with a specific database management system.) By using drivers, an application can be linked directly to a single ODBC driver library rather than to each DBMS itself. As the application runs, the ODBC Driver Manager examines every function call made and ensures that each is routed to the appropriate driver for processing.
Applications that incorporate DB2 CLI are linked directly to the DB2 CLI load library, which can be loaded as an ODBC driver by any ODBC Driver Manager or used independently by the DB2 Database Manager. This load library provides support for all ODBC 3.x Core functions (except SQLDrivers()), all ODBC Level 1 functions, all ODBC Level 2 functions, some X/Open CLI-specific functions that are not supported by ODBC, and some DB2 UDB-specific functions that are not recognized by X/Open CLI or ODBC.

Many differences exist between applications written using embedded SQL and those written using DB2 CLI. For one thing, because SQL statements are issued through API function calls, CLI applications do not have to be precompiled. Additionally, CLI applications use common access plans (packages) that are provided with DB2; hence, CLI applications are not required to "bind" their operational package(s) to every database the application intends to interact with. And because CLI applications do not have to be precompiled, they can be executed on a variety of database systems without undergoing any type of alteration.
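The call sequence a CLI/ODBC application follows (connect, allocate a statement handle, execute SQL passed as plain text, fetch, free the handles) can be sketched with any call-level interface. The snippet below uses Python's DB-API with SQLite purely to show the shape of that sequence; the comments name the real ODBC/CLI functions (SQLConnect, SQLExecDirect, and so on) that a C application would call instead.

```python
import sqlite3

# Shape of the CLI/ODBC call sequence, sketched with Python's DB-API.
# No precompile/bind step: the SQL travels to the database as plain
# text through API calls.
conn = sqlite3.connect(":memory:")      # cf. SQLConnect
stmt = conn.cursor()                    # cf. SQLAllocHandle(SQL_HANDLE_STMT)
stmt.execute("CREATE TABLE t (c INTEGER)")
stmt.execute("INSERT INTO t VALUES (1)")
stmt.execute("SELECT c FROM t")         # cf. SQLExecDirect
row = stmt.fetchone()                   # cf. SQLFetch
print(row)  # (1,)
stmt.close()                            # cf. SQLFreeHandle
conn.close()                            # cf. SQLDisconnect
```

Note that nothing here is prepared at development time; every statement is handed to the database engine at run time, which is why CLI applications port across databases without rebinding.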

Java Database Connectivity (JDBC) and SQLJ


So far, we have focused primarily on the DB2 UDB application development interfaces that are designed to be used with compiled high-level programming languages such as C, C++, and COBOL. However, DB2 UDB also provides interfaces that can be used with most types of Java programs (which are written using an interpreted language), including applications, applets, and servlets. Java programs that interact with DB2 UDB databases have the option of using Java Database Connectivity (JDBC) or Embedded SQL for Java (SQLJ). Regardless of which interface is used, interaction between Java programs and DB2 UDB databases is performed using standard Java classes and methods.

JDBC is comprised of a set of API functions that are very similar in nature to the functions provided by CLI/ODBC. Like CLI/ODBC, when JDBC is used, dynamic SQL statements are passed to a special driver that, in turn, forwards all SQL statements encountered in a Java program to the DB2 Database Manager for processing. Thus, JDBC applications do not have to be precompiled and can be executed on a wide variety of database systems without having to undergo any type of modification.

SQLJ, on the other hand, allows static SQL statements to be embedded in a Java program in the same way they can be embedded in any other high-level programming language source code file; essentially, SQLJ extends embedded SQL programming to the Java environment. As a result, an application that was coded using SQLJ must be translated with an SQLJ translator to produce native Java source code before it can be executed (similar to the way an embedded SQL application written in a high-level programming language must be precompiled before it can be compiled and linked to produce an executable program). Additionally, once an SQLJ program has been translated, packages for the application must be created and bound to the appropriate database using a tool known as the DB2 SQLJ Profile Customizer.
(It is important to note that some mechanisms contained in SQLJ rely on JDBC to provide basic functionality; an example is the mechanism used to establish a database connection.)

Administrative Application Programming Interface (API) Functions


If you have ever had the opportunity to administer a DB2 UDB for Linux, UNIX, and Windows database, you are probably aware that a wide variety of commands exist solely for the purpose of performing administrative tasks. In addition to providing a rich set of administrative commands that can be executed from either a script or the Command Line Processor, DB2 UDB also provides a set of Administrative Application Programming Interface (API) functions that provide the same functionality as their command counterparts. Essentially, the Administrative APIs are a collection of DB2 product-specific functions that provide services other than the data storage, manipulation, and retrieval services that are provided by SQL statements, CLI/ODBC functions, and JDBC. The Administrative APIs are available in many high-level programming languages including C, C++, and COBOL. Every API has a call and return interface, and data is exchanged between applications using Administrative APIs and the DB2 Database Manager via one or more special data structures. Like embedded static SQL statements, Administrative APIs are hard-coded directly in the source code file(s) used to build the application. However, unlike when embedded static SQL statements are used, source code files containing Administrative APIs do not have to be precompiled and bound to one or more databases as part of the development process (unless of course those files contain embedded SQL as well). (The Administrative APIs are not supported by DB2 UDB for iSeries.)

Microsoft Data Access Objects


Just as you can develop applications that use embedded SQL, CLI/ODBC, JDBC, and SQLJ to perform a wide variety of operations on a database or the data under a database's control, you can develop Microsoft Visual Basic and Microsoft Visual C++ applications that provide similar functionality using Data Access Object (DAO) and Remote Data Object (RDO) technology. (Applications using either of these specifications interface with DB2 using the DB2 CLI/ODBC driver.) RDO provides a set of objects that make it easy to connect to a database, execute queries/stored procedures, manipulate data, and commit changes to data; it is designed specifically to access remote ODBC data sources and allows applications to take advantage of ODBC without requiring complex application code. In fact, RDO is the primary method used to access a relational database that is exposed via an ODBC driver.

DB2 UDB also supports applications that are developed using the ActiveX Data Object (ADO) specification. (Applications using this specification interact with DB2 using either the OLE:ODBC bridge or the native OLE DB (Object Linking and Embedding for Databases) driver that is provided with DB2 UDB.) ADO allows applications to access and/or manipulate data using an OLE DB provider, which provides access to a much broader set of data sources than those available with ODBC. In addition to providing access to a wider variety of data sources, ADO also executes faster than ODBC, is easy to use, has low memory requirements, and has a small disk footprint.

Practice Questions
Question 1
The following commands are issued against a database containing a table named PAYROLL.EMPLOYEES:

CREATE ALIAS payroll.emp FOR payroll.employees
CREATE ALIAS hr.emp FOR temp.emp
CREATE ALIAS user1.workers FOR hr.emp

If user USER1 issues the following statement:

SELECT * FROM emp

For which of the following objects will access be attempted?

A. PAYROLL.EMP
B. HR.EMP
C. TEMP.EMP
D. USER1.EMP

Question 2
An embedded SQL application contains static SQL statements that connect to a remote database named SAMPLE and insert data into a table named PAYROLL.EMPLOYEES. USER1 needs to be able to use this application, and all other non-administrative users should be denied access to PAYROLL.EMPLOYEES. Which of the following privileges must be granted and/or revoked to accomplish this?

A. USER1 requires EXECUTE privilege on the package; EXECUTE privilege on the package must be revoked from PUBLIC
B. USER1 requires REFERENCES privilege on the package; REFERENCES privilege on the package must be revoked from PUBLIC
C. USER1 requires EXECUTE privilege on the package and INSERT privilege on PAYROLL.EMPLOYEES; EXECUTE privilege on the package must be revoked from PUBLIC
D. USER1 requires REFERENCES privilege on the package and INSERT privilege on PAYROLL.EMPLOYEES; REFERENCES privilege on the package must be revoked from PUBLIC

Question 3
Given the following CREATE TABLE statement:

CREATE TABLE tab1 (empid INTEGER GENERATED ALWAYS AS IDENTITY,
                   name  CHAR(50),
                   dept  CHAR(3))

Which of the operations will cause an error to be generated?

A. INSERT INTO tab1 VALUES (DEFAULT, 'Jagger', 'E01')
B. INSERT INTO tab1 VALUES (1, 'Jagger', 'E01')
C. INSERT INTO tab1 (name) VALUES ('Jagger')
D. INSERT INTO tab1 (name) VALUES (NULL)

Question 4
Which of the following can be used to retrieve the value of the CURRENT SCHEMA special register?

A. ? CURRENT SCHEMA
B. SHOW CURRENT SCHEMA
C. SELECT CURRENT SCHEMA FROM SYSIBM.SYSDUMMY1
D. SELECT CURRENT SCHEMA FROM SYSIBM.SPECIALREGISTERS

Question 5
Which two of the following will produce sequentially increasing values that can be used to provide values for a primary key?

A. ROWID data type
B. Generated IDENTITY column
C. The GENERATE_UNIQUE( ) built-in function
D. A sequence
E. The GENKEY( ) built-in function

Question 6
Given the following tables:

EMPLOYEES
EMPID  LASTNAME  DEPT
1      Jagger    1
2      Richards  1
3      Watts     2
4      Wood      1

DEPARTMENT
DEPTID  DEPTNAME
1       Planning
2       Support

If the following statements are executed:

ALTER TABLE employees ADD FOREIGN KEY(dept)
    REFERENCES department (deptid)
    ON DELETE SET NULL
DELETE FROM department WHERE deptid = 1

How many rows will be deleted?

A. 1
B. 2
C. 3
D. 4

Question 7
Given a table created using the following statement:

CREATE TABLE hr.depts (deptid INTEGER, deptname CHAR(20))

USER1 needs to be able to access data stored in table HR.DEPTS using an implicit schema. Assuming the necessary privileges have been granted, which two of the following statements can database administrator DBA1 execute to meet this requirement?

A. CREATE ALIAS depts FOR hr.depts
B. CREATE ALIAS hr.depts FOR user1.depts
C. CREATE ALIAS user1.depts FOR hr.depts
D. CREATE VIEW hr.depts FOR user1.depts
E. CREATE VIEW user1.depts AS SELECT * FROM hr.depts

Question 8
An embedded SQL application named INVENTORY uses dynamic SQL to retrieve a value from a table named TAB1 and inserts the value retrieved into a table named TAB2. Assuming the user can connect to the database, which of the following privileges must a user be granted before they can use the application?

A. EXECUTE privilege for INVENTORY
B. SELECT privilege on TAB1, INSERT privilege on TAB2
C. SELECT privilege on TAB1, INSERT privilege on TAB2, EXECUTE privilege on INVENTORY
D. CONTROL privilege on TAB1 and TAB2, RUN privilege on INVENTORY

Question 9
Given the following table definition:

PARTS
PARTID  NAME     PRICE
1       Printer  175.00
2       Monitor  499.99
3       Scanner  129.99

and the following trigger definition:

CREATE TRIGGER partnum
    NO CASCADE BEFORE INSERT ON parts
    REFERENCING NEW AS n
    FOR EACH ROW MODE DB2SQL
BEGIN ATOMIC
    SET n.partid = (SELECT MAX(partid) FROM parts);
    SET n.partid = VALUE(n.partid + 1, 1);
END

If the following statement is executed:

INSERT INTO parts (name, price) VALUES ('Keyboard', 100.00)

What part number will be assigned to the keyboard when the insert operation is complete?

A. null
B. 0
C. 4
D. The INSERT statement will fail and no record will be added

Question 10
Given the following table definition:

TAB1
COL_1  COL_2
A      10
B      20
C      30
D      40

and the following stored procedure definition:

CREATE PROCEDURE myproc (OUT value INTEGER)
    DYNAMIC RESULT SETS 0
    LANGUAGE SQL
BEGIN
    DECLARE tempval INTEGER;
    DECLARE cursor CURSOR FOR
        SELECT col_2 FROM tab1 ORDER BY col_2 DESC;
    OPEN cursor;
    FETCH cursor INTO tempval;
    SET value = tempval;
    CLOSE cursor;
END

If the statement CALL myproc(:maxval) is coded in an embedded SQL application, what is the value assigned to the host variable MAXVAL after the statement is executed?

A. 10
B. 20
C. 30
D. 40

Answers

Question 1
The correct answer is D. Because no schema name was specified when EMP was referenced in the SQL statement "SELECT * FROM emp", the contents of the USER special register is used as the schema name by default. In this case, because the authorization ID of the current user is USER1, the USER special register contains the value USER1, and the object USER1.EMP is the object that is accessed.

Question 2
The correct answer is A. USER1 must have EXECUTE privilege on the package in order to use the application, and EXECUTE privilege must be revoked from the group PUBLIC to prevent all nonadministrative users from accessing the table PAYROLL.EMPLOYEES. Because the application uses static SQL statements to add data to the table PAYROLL.EMPLOYEES, the INSERT privilege needed to perform the necessary insert operations is indirectly granted to USER1. Therefore, the INSERT privilege does not have to be explicitly granted. (If dynamic SQL statements were used to add data to the table, the opposite would be true: USER1 would need both EXECUTE privilege on the package and INSERT privilege on table PAYROLL.EMPLOYEES.)

Question 3
The correct answer is B. Values cannot be explicitly assigned to identity columns, and the SQL statement "INSERT INTO tab1 VALUES (1, 'Jagger', 'E01')" attempts to do just that. Therefore, when this statement is executed, an error will be produced and the record will not be added to the table.

Question 4
The correct answer is C. Because special registers are designed to be accessed with SQL instead of a high-level programming language, there are two ways in which the value assigned to a special register can be obtained: by executing the VALUES SQL statement, either from the Command Line Processor or within a query/subquery (the syntax for this statement is VALUES [SpecialRegister], where SpecialRegister is the name of one of the special registers available), or by querying the system catalog table SYSIBM.SYSDUMMY1 (the syntax for such a query is SELECT [SpecialRegister] FROM SYSIBM.SYSDUMMY1).

Question 5
The correct answers are B and D. Both an identity column and a sequence can be used to produce unique numbers, which can in turn be used as primary key values. (An identity column can be included in a primary key, as can a column that relies on a sequence for its values.)

Question 6
The correct answer is A. The first row in the table named DEPARTMENT will be deleted, and the three rows in the table named EMPLOYEES that referenced the deleted row (in the DEPARTMENT table) will have a null value assigned to their DEPT column.

Question 7
The correct answers are C and E. If the statement shown for answer A is executed, an alias named DBA1.DEPTS will be created. If the statement shown for answer B is executed, an alias named HR.DEPTS would be created for a table named USER1.DEPTS, which is just the opposite of what we're trying to accomplish. If the statement shown for answer D is executed, an error will be generated because that is not the proper syntax for the CREATE VIEW statement. If the statement shown for answer C is executed, an alias named USER1.DEPTS will be created, and USER1 will be able to reference this alias without providing a schema name because the schema name USER1 will be used implicitly. If the statement shown for answer E is executed, a view named USER1.DEPTS, which references the table named HR.DEPTS, will be created, and again, this view can be referenced by USER1 without specifying a schema name.

Question 8
The correct answer is C. In order to use the INVENTORY application, a user must have EXECUTE privilege for the package associated with the application. Because the INVENTORY application uses dynamic SQL, a user must have SELECT privilege on TAB1 (where data is retrieved from) and INSERT privilege on TAB2 (where data is inserted). (If static SQL had been used instead of dynamic SQL, only the EXECUTE privilege would have been required; all other privileges needed would have been indirectly assigned to the user running the application.)

Question 9
The correct answer is C. When an insert operation is performed on the PARTS table, the PARTNUM trigger will locate the largest PARTID used and increment that value by one to produce the PARTID that is to be inserted with the rest of the record's data. In this case, the largest PARTID number used was 3, so the trigger incremented that value and assigned the PARTID column the value 4 when the record for the keyboard was inserted into the PARTS table.

Question 10
The correct answer is D. The MYPROC stored procedure retrieved all values found in column COL_2 of table TAB1, sorted them in descending order, selected the first record found in the sorted list, and returned that record's value to the calling application. In this case, that value was 40.

Chapter 3: Data Manipulation


Overview
Twenty-six percent (26%) of the DB2 UDB V8.1 Family Application Development exam (Exam 703) is designed to test your knowledge of the Structured Query Language (SQL) statements used to manipulate data and to evaluate your understanding of how transactions are used to define points of consistency as data is being manipulated. The questions that make up this portion of the exam are intended to evaluate the following:

Your ability to identify the Data Manipulation Language (DML) statements that are available with DB2 UDB.
Your ability to perform insert, update, and delete operations against a database.
Your ability to retrieve and format data using various forms of the SELECT SQL statement.
Your ability to use DB2 SQL functions.
Your ability to use common table expressions.
Your knowledge of cursors, including the types of cursors available and the scope in which a cursor can be used once it has been created.
Your ability to identify when cursors should be used in an application.
Your ability to initiate, terminate, and manage transactions.

This chapter is designed to introduce you to the SQL statements that are used to manipulate data and to show you the various ways in which queries can be constructed. This chapter is also designed to show you how data stored in a result data set (produced in response to a query) can be retrieved within an application program and to provide you with information on how transactions are used to define points of consistency as data is manipulated.

Terms you will learn: Structured Query Language (SQL), Data Control Language (DCL) Statements, Data Definition Language (DDL) Statements, Data Manipulation Language (DML) Statements, Transaction Management Statements, INSERT, Subquery, UPDATE, Cursor, DELETE, SELECT, Query, The DISTINCT Clause, The FROM Clause, The WHERE Clause, Relational Predicates, The BETWEEN Predicate, The LIKE Predicate, Wild Card Characters, The IN Predicate, The EXISTS Predicate, The NULL Predicate, The GROUP BY Clause, GROUP BY ROLLUP, GROUP BY CUBE, The HAVING Clause, The ORDER BY Clause, The FETCH FIRST Clause, Cartesian Product, Inner Join, Left Outer Join, Right Outer Join, Set Operator, The UNION Set Operator, The UNION ALL Set Operator, The EXCEPT Set Operator, The EXCEPT ALL Set Operator, The INTERSECT Set Operator, The INTERSECT ALL Set Operator, SQL Functions, Cursors, DECLARE CURSOR, OPEN, FETCH, CLOSE, SELECT INTO, VALUES INTO, Transactions, COMMIT, ROLLBACK, Savepoints

Techniques you will master:

Recognizing the various Data Manipulation Language (DML) statements available and understanding how each is used.
Understanding how to add data to a table using the INSERT SQL statement.
Understanding how to modify data stored in a table using the UPDATE SQL statement.
Understanding how to remove data from a table using the DELETE SQL statement.
Knowing how to construct simple and complex queries using the SELECT SQL statement and its clauses.
Knowing how to construct and use common table expressions.
Knowing how to retrieve the results of a query within an application program.
Recognizing the types of cursors available and knowing when each type of cursor should be used.
Understanding how transactions are initiated and terminated and knowing how transaction savepoints are created and used.

Structured Query Language (SQL) Revisited


Earlier, we saw that Structured Query Language (SQL) is a standardized language used to work with database objects and the data they contain. Using SQL, you can define, alter, and remove database objects as well as add, update, delete, and retrieve data values. One of the strengths of SQL is that it can be used in a variety of ways: SQL statements can be executed interactively using tools such as the Command Center and the Command Line Processor (CLP), placed in UNIX shell scripts or Windows batch files for submission to the CLP or some other process, embedded in high-level programming language source code files that are precompiled/compiled to create a database application, or submitted dynamically from applications that do not require precompilation. Like other languages, SQL has a defined syntax and a set of language elements. Most SQL statements can be categorized according to the functions they have been designed to perform; SQL statements typically fall under one of the following categories:

Data Control Language (DCL) Statements. SQL statements used to grant and revoke authorities and privileges.
Data Definition Language (DDL) Statements. SQL statements used to create, alter, and delete database objects.
Data Manipulation Language (DML) Statements. SQL statements used to store data in and retrieve or remove data from database objects.
Transaction Management Statements. SQL statements used to establish and terminate active transactions.

Typically, database administrators use DDL and DCL statements to construct database objects and to control access to those objects once they have been created. Application developers, on the other hand, use DML statements almost exclusively. (When developing embedded SQL applications, the Transaction Management statements are used as well; when developing CLI/ODBC, JDBC, and SQLJ applications, functions and methods that provide the same functionality are used instead.) With that in mind, this chapter focuses on introducing you to the DML statements available with DB2 UDB.

Note: Although basic syntax is presented for most of the SQL statements covered in this chapter, the actual syntax supported may be much more complex. To view the complete syntax for a specific SQL statement or to obtain more information about a particular statement, refer to the IBM DB2 Universal Database, Version 8 SQL Reference Volume 2 product documentation.
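The four statement categories can be illustrated with a short, hedged sketch. Python's built-in sqlite3 module is used here purely as a convenient stand-in for a DB2 UDB connection; the table and column names are invented for the example, and SQLite has no users, so the DCL category can only be shown as a comment:

```python
import sqlite3

# Hypothetical illustration only: SQLite stands in for DB2 UDB here.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL statement: create a database object
cur.execute("CREATE TABLE department (deptno INTEGER, deptname CHAR(20))")

# DML statements: store and retrieve data
cur.execute("INSERT INTO department VALUES (1, 'SALES')")
cur.execute("SELECT deptname FROM department WHERE deptno = 1")
dept = cur.fetchone()[0]
print(dept)  # SALES

# Transaction management statement: make the changes permanent
conn.commit()

# DCL statements (GRANT/REVOKE) control access; SQLite has no users, so
# the DB2 form is shown as a comment only:
#   GRANT SELECT ON TABLE department TO USER user1
```

In a real DB2 application the same categories apply, but the statements would be issued through embedded SQL, CLI/ODBC, JDBC, or SQLJ rather than through sqlite3.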

Data Manipulation Language (DML) Statements


The majority of the work performed by most database applications focuses on data manipulation. Data is often collected from a user via a custom interface and stored in a database; over time, data that has been stored in a database may need to be modified or deleted; and eventually, the need to retrieve specific pieces of data stored in a database arises. To perform these types of operations, database applications rely on Data Manipulation Language (DML) statements. With DB2 UDB (as with most other relational database management systems), four DML statements are available:

INSERT
UPDATE
DELETE
SELECT

Note: Technically, SELECT is not an actual SQL statement. Instead, it is an informality that is used to define a query (which the ISO SQL standard defines as "an operation on zero or more tables that produces a derived table as a result"). That's why the syntax for the SELECT statement is covered in Chapter 4, "Queries," of the IBM DB2 Universal Database, Version 8 SQL Reference rather than in Chapter 5, "Statements." (The DB2 UDB documentation also uses the terms "fullselect" and "subselect" to refer to queries.)

The INSERT Statement

When a table is first created, it is empty. However, once created, a table can be populated in a variety of ways: it can be bulk-loaded using the LOAD utility, it can be bulk-loaded using the IMPORT utility, or one or more rows can be added to it by executing the INSERT SQL statement. Of the three methods available, the INSERT statement is the one most commonly used, and it can work directly with the table to be populated or with an updatable view that references the table to be populated. The basic syntax for the INSERT statement is:

INSERT INTO [TableName | ViewName] < ( [ColumnName] ,... ) >
VALUES ( [Value] ,... )

or

INSERT INTO [TableName | ViewName] < ( [ColumnName] ,... ) >
[SELECTStatement]

where:

TableName        Identifies the name assigned to the table to which data is to be added.
ViewName         Identifies the name assigned to the updatable view to which data is to be added.
ColumnName       Identifies the name of one or more columns to which the data values being added to the table/view are to be assigned. Each name provided must identify an existing column in the table or updatable view specified.
Value            Identifies one or more data values that are to be added to the column(s), table, or updatable view specified.
SELECTStatement  Identifies a SELECT SQL statement that, when executed, will produce the data values to be added to the column(s), table, or updatable view specified (by retrieving data from other tables and/or views).

So, if you wanted to add a record to a base table named DEPARTMENT that has the following characteristics:

Column Name   Data Type
DEPTNO        INTEGER
DEPTNAME      CHAR(20)
MGRID         INTEGER

you could do so by executing an INSERT statement that looks something like this:

INSERT INTO DEPARTMENT (DEPTNO, DEPTNAME, MGRID) VALUES (001, 'SALES', 1001)

It is important to note that the number of values provided in the VALUES clause must be equal to the number of column names provided in the column name list. Furthermore, the values provided will be assigned to the columns specified based on the order in which they appear; in other words, the first value provided will be assigned to the first column identified in the column name list, the second value provided will be assigned to the second column identified, and so on. Each value provided must also be compatible with the data type of the column that the value is to be assigned to. If values are provided for every column found in the table (in the VALUES clause), the column name list can be omitted. In this case, the first value provided will be assigned to the first column found in the table, the second value provided will be assigned to the second column found, and so on. Thus, the row of data that was added to the DEPARTMENT table in the previous example could just as well have been added by executing the following INSERT statement:

INSERT INTO DEPARTMENT VALUES (001, 'SALES', 1001)
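The INSERT variations just described (explicit column list, omitted column list, and partial records) can be exercised with a short, hedged sketch. Python's built-in sqlite3 module stands in for DB2 UDB here; the table mirrors the DEPARTMENT example, with an invented DEFAULT on DEPTNAME (not part of the book's table) added so the partial-record case has something to fall back on:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# DEPTNAME is given a default here (an assumption for this sketch) so
# that a partial insert has a value to fall back on.
cur.execute("""CREATE TABLE department
               (deptno INTEGER,
                deptname CHAR(20) DEFAULT 'UNKNOWN',
                mgrid INTEGER)""")

# Column name list provided: values are assigned to the named columns in order
cur.execute("INSERT INTO department (deptno, deptname, mgrid) "
            "VALUES (1, 'SALES', 1001)")

# Column name list omitted: values map to the table's columns in order
cur.execute("INSERT INTO department VALUES (2, 'MARKETING', 1002)")

# Partial record: omitted columns receive their default value or NULL
cur.execute("INSERT INTO department (deptno) VALUES (3)")

rows = cur.execute("SELECT * FROM department ORDER BY deptno").fetchall()
print(rows)
# [(1, 'SALES', 1001), (2, 'MARKETING', 1002), (3, 'UNKNOWN', None)]
```

Note how the partial insert succeeds only because the omitted columns either have a default (DEPTNAME) or accept nulls (MGRID), exactly as the rule above requires.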

Along with literal values, two keywords can be used to designate values that are to be assigned to base table columns. The first of these is the DEFAULT keyword, which is used to assign a system- or user-supplied default value to a column defined with the WITH DEFAULT constraint. The second is the NULL keyword, which is used to assign a NULL value to any column that was not defined with the NOT NULL constraint. (Both of these constraints were covered in detail in Chapter 2, "Database Objects and Programming Methods.") Thus, you could add a record that contains a NULL value for the MGRID column to the DEPARTMENT table we looked at earlier by executing an INSERT statement that looks something like this:

INSERT INTO DEPARTMENT VALUES (001, 'SALES', NULL)

By using a special form of the INSERT SQL statement, the results of a query can also be used to provide values for one or more columns in a base table. With this form of the INSERT statement, a SELECT statement (known as a subquery) is provided in place of the VALUES clause (we'll look at the SELECT statement shortly), and the results of the SELECT statement are assigned to the appropriate columns. (This form of the INSERT statement creates a type of "cut and paste" action in which values are retrieved from one base table or view and inserted into another.) As you might imagine, the number of values returned by the subquery must match the number of columns provided in the column name list (or the number of columns found in the table if no column name list is provided), and the order of assignment is the same as that used when literal values are provided in a VALUES clause.
Therefore, using the results of a query, you could add a record to the DEPARTMENT table we looked at earlier by executing an INSERT statement that looks something like this:

INSERT INTO DEPARTMENT (DEPTNO, DEPTNAME) SELECT DEPTNO, DEPTNAME FROM OLD_DEPARTMENT

You may have noticed that the INSERT statement used in the last example did not provide values for every column found in the DEPARTMENT table. Just as there are times you may want to insert complete records into a table, there may be times when you wish to insert partial records. Such operations can be performed by listing only the columns you have data values for in the column name list and providing the corresponding values using either the VALUES clause or a subquery. However, for such an INSERT statement to execute correctly, every column in the target table that does not appear in the column name list provided must either accept null values or have a default value constraint defined. Otherwise, the INSERT statement will fail.

Note: Subqueries usually appear within the search condition of a WHERE clause or a HAVING clause (although subqueries can also be used with insert, update, and delete operations). A subquery may include search conditions of its own, and these search conditions may in turn include their own subqueries. When such "nested" subqueries are processed, the DB2 Database Manager executes the innermost query first and uses the results to execute the next outer query, and so on until all nested queries have been processed.

The UPDATE Statement

Data stored in a database is rarely static; over time, the need to modify (or even remove) one or more values residing in a database can arise. In such situations, specific data values can be changed by executing the UPDATE SQL statement. The basic syntax for this statement is:

UPDATE [TableName | ViewName]
SET [ColumnName] = [Value | NULL | DEFAULT] ,...
<WHERE [Condition]>

or

UPDATE [TableName | ViewName]
SET ( [ColumnName] ,... ) = ( [Value | NULL | DEFAULT] ,... )
<WHERE [Condition]>

or

UPDATE [TableName | ViewName]
SET ( [ColumnName] ,... ) = ( [SELECTStatement] )
<WHERE [Condition]>

where:

TableName        Identifies the name assigned to the table that contains the data to be modified.
ViewName         Identifies the name assigned to the updatable view that contains the data to be modified.
ColumnName       Identifies the name of one or more columns that contain data values to be modified. Each name provided must identify an existing column in the table or updatable view specified.
Value            Identifies one or more data values that are to be used to replace existing value(s) found in the column(s) specified.
SELECTStatement  Identifies a SELECT SQL statement that, when executed, will produce the data values to be used to replace existing values found in the columns specified (by retrieving data from other tables and/or views).
Condition        Identifies the search criterion that is to be used to locate one or more specific rows whose data values are to be modified. (This condition is coded like the WHERE clause that can be used with a SELECT SQL statement; we will look at the WHERE clause and its predicates later.) If no condition is provided, every row found in the table or updatable view specified will be updated.

Thus, if you wanted to modify the records stored in a base table named EMPLOYEE that has the following characteristics:

Column Name   Data Type
EMPNO         INTEGER
FNAME         CHAR(20)
LNAME         CHAR(30)
TITLE         CHAR(10)
DEPARTMENT    CHAR(20)
SALARY        DECIMAL(6,2)

such that the salary of every employee who has the title of DBA is increased by 10%, you could do so by executing an UPDATE statement that looks something like this:

UPDATE EMPLOYEE SET SALARY = SALARY * 1.10 WHERE TITLE = 'DBA'

The UPDATE statement can also be used to remove values from nullable columns. This is done by changing the column's current value to NULL. Thus, the value assigned to the DEPARTMENT column of the EMPLOYEE table shown in the previous example could be removed by executing the following UPDATE statement:

UPDATE EMPLOYEE SET DEPARTMENT = NULL

Like the INSERT statement, the UPDATE statement can work either directly with the table that contains the values to be modified or with an updatable view that references that table. Similarly, the results of a query can be used to provide values for one or more columns identified in the column name list provided. As you might imagine, the number of values returned by the query must match the number of columns provided in the column name list specified. Thus, using the results of a query, you could change the value assigned to the DEPARTMENT column of each record found in the EMPLOYEE table we looked at earlier by executing an UPDATE statement that looks something like this:

UPDATE EMPLOYEE SET (DEPARTMENT) = (SELECT DEPTNAME FROM DEPARTMENT WHERE DEPTNO = 1)

It is important to note that updates can be conducted by performing either a searched update or a positioned update operation. So far, all of the examples we have looked at have been searched update operations. To perform a positioned update, a cursor must first be created, opened, and positioned on the row that is to be updated. Then, the UPDATE statement that is to be used to modify one or more data values must contain a WHERE CURRENT OF [CursorName] clause (CursorName identifies the cursor being used; we'll look at cursors shortly). Because positioned update operations can be conducted only when a valid cursor exists, they can only be performed from within embedded SQL applications.

Note: It is very important that you provide a proper WHERE clause when the UPDATE statement is used. Failure to do so will cause an update operation to be performed on every row found in the specified table or updatable view.

The DELETE Statement

Although the UPDATE statement can be used to delete individual values from a base table (by setting those values to NULL), it cannot be used to remove entire rows. When one or more rows of data need to be removed from a base table, the DELETE SQL statement must be used instead. As with the INSERT and UPDATE statements, the DELETE statement can work either directly with the table that rows are to be removed from or with an updatable view that references that table. The basic syntax for the DELETE statement is:

DELETE FROM [TableName | ViewName]
<WHERE [Condition]>

where:

TableName   Identifies the name assigned to the table from which data is to be removed.
ViewName    Identifies the name assigned to the deletable view from which data is to be removed.
Condition   Identifies the search criterion to be used to locate one or more specific rows that are to be removed. (This condition is coded like the WHERE clause used with a SELECT SQL statement; we will look at the WHERE clause and its predicates later.) If no condition is provided, every row found in the table or deletable view specified will be deleted.
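The behavior of a searched delete, including the effect of omitting the WHERE clause, can be sketched as follows. Python's sqlite3 module again stands in for DB2 UDB, and the SALES rows are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE sales
               (ponumber CHAR(10), company CHAR(20),
                purchasedate DATE, salesperson INTEGER)""")
cur.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)",
                [('PO-1', 'XYZ', '2003-01-15', 7),
                 ('PO-2', 'ABC', '2003-02-20', 8),
                 ('PO-3', 'XYZ', '2003-03-05', 7)])

# Searched delete: only rows satisfying the WHERE clause are removed
cur.execute("DELETE FROM sales WHERE company = 'XYZ'")
remaining = cur.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(remaining)  # 1

# With no WHERE clause, every remaining row is deleted
cur.execute("DELETE FROM sales")
after_all = cur.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(after_all)  # 0
```

The second DELETE illustrates why the text cautions against omitting the WHERE clause unless you explicitly want to erase all data stored in a table.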

Therefore, if you wanted to remove every record for company XYZ from a base table named SALES that has the following characteristics:

Column Name     Data Type
PONUMBER        CHAR(10)
COMPANY         CHAR(20)
PURCHASEDATE    DATE
SALESPERSON     INTEGER

you could do so by executing a DELETE statement that looks something like this:

DELETE FROM SALES WHERE COMPANY = 'XYZ'

Like update operations, delete operations can be conducted in one of two ways: as a searched delete or as a positioned delete. To perform a positioned delete, a cursor must first be created, opened, and positioned on the row to be deleted. Then, the DELETE statement to be used to remove the row must contain a WHERE CURRENT OF [CursorName] clause (CursorName identifies the cursor being used). Because positioned delete operations can be conducted only when a valid cursor exists, they can only be performed from within embedded SQL applications.

Caution: Because omitting the WHERE clause in a DELETE SQL statement causes the delete operation to be applied to all rows in the table or view specified, it is important that you always provide a WHERE clause with a DELETE statement unless you explicitly want to erase all data stored in a table.

The SELECT Statement

Although the primary function of a database is to act as a data repository, sooner or later almost all database users and/or applications have the need to retrieve specific pieces of information (data) from the database they are interacting with. The operation used to retrieve data from a database is called a query (because it searches, or queries, the database to find the answer to some question), and the results returned by a query are typically expressed in one of two forms: as a single row of data values or as a set of rows of data values, otherwise known as a result data set (or result set). (If no data values that correspond to the query specification provided can be found in the database, an empty result data set will be returned.) All queries are created using the SELECT SQL statement, an extremely powerful statement that can be used to construct a wide variety of queries containing an infinite number of variations (using a finite set of rules). And because the SELECT statement is recursive, a single SELECT statement can derive its output from a successive number of nested SELECT statements (which are known as subqueries). (We have already seen how SELECT statements can be used to provide input to INSERT and UPDATE statements; SELECT statements can also be used to provide input to other SELECT statements in a similar manner.)

In its simplest form, the syntax for the SELECT statement is:

SELECT * FROM [TableName | ViewName]

where:

TableName   Identifies the name assigned to the table from which data is to be retrieved.
ViewName    Identifies the name assigned to the view from which data is to be retrieved.

Consequently, if you wanted to retrieve all values stored in a table or view named DEPARTMENT, you could do so by executing a SELECT statement that looks something like this:

SELECT * FROM DEPARTMENT

The SELECT Statement and Its Clauses

We just saw that if you wanted to retrieve all values stored in a base table, you could do so by executing a SELECT statement that looks something like this:

SELECT * FROM [TableName]

But what if you only wanted to see the values stored in two columns of a table? Or what if you wanted the data retrieved to be ordered alphabetically in ascending order? (Data is stored in a table in no particular order, and unless otherwise specified, a query only returns data in the order in which it is found.) How do you construct a query using a SELECT SQL statement that retrieves only certain data values and returns those values in a very specific format? You do so by using a more advanced form of the SELECT SQL statement to construct your query. The syntax used to construct more advanced forms of the SELECT SQL statement is:

SELECT <DISTINCT> [* | [Expression] <<AS> [NewColumnName]> ,...]
FROM [[TableName] | [ViewName] <<AS> [CorrelationName]> ,...]
<WhereClause>
<GroupByClause>
<HavingClause>
<OrderByClause>
<FetchFirstClause>

where:

Expression        Identifies one or more columns for which values are to be returned when the SELECT statement is executed. The value specified for this option can be any valid SQL language element; however, column names that correspond to the table or view specified in the FROM clause are commonly used.
NewColumnName     Identifies a new column name to be used in place of the corresponding table or view column name in the result data set returned by the SELECT statement.
TableName         Identifies the name(s) assigned to one or more tables from which data is to be retrieved.
ViewName          Identifies the name(s) assigned to one or more views from which data is to be retrieved.
CorrelationName   Identifies a shorthand name that can be used when referencing the table or view that the correlation name is associated with in any of the SELECT statement clauses.
WhereClause       Identifies a WHERE clause that is to be used with the SELECT statement.
GroupByClause     Identifies a GROUP BY clause that is to be used with the SELECT statement.
HavingClause      Identifies a HAVING clause that is to be used with the SELECT statement.
OrderByClause     Identifies an ORDER BY clause that is to be used with the SELECT statement.
FetchFirstClause  Identifies a FETCH FIRST clause that is to be used with the SELECT statement.

If the DISTINCT clause is specified with the SELECT statement, duplicate rows are not repeated in the final result data set returned. (Rows are considered duplicates if the values in all corresponding columns are identical; for the purpose of determining whether values are identical, null values are considered equal.) However, if the DISTINCT clause is used, the result data set produced must not contain columns that hold LONG VARCHAR, LONG VARGRAPHIC, DATALINK, BLOB, CLOB, or DBCLOB data. So if you wanted to retrieve all values for the columns named WORKDEPT and JOB from a table named EMPLOYEE, you could do so by executing a SELECT statement that looks something like this:

SELECT WORKDEPT, JOB FROM EMPLOYEE

And when this statement is executed, you might see a result data set that looks something like this:

WORKDEPT JOB
-------- --------
A00      PRES
B01      MANAGER
C01      MANAGER
E01      MANAGER
D11      MANAGER
D21      MANAGER
E11      MANAGER
E21      MANAGER
A00      SALESREP
A00      CLERK
C01      ANALYST
C01      ANALYST
D11      DESIGNER
D11      DESIGNER
D11      DESIGNER
D11      DESIGNER
D11      DESIGNER
D11      DESIGNER
D11      DESIGNER
D11      DESIGNER
D21      CLERK
D21      CLERK
D21      CLERK
D21      CLERK
D21      CLERK
E11      OPERATOR
E11      OPERATOR
E11      OPERATOR
E11      OPERATOR
E21      FIELDREP
E21      FIELDREP
E21      FIELDREP

32 record(s) selected.

On the other hand, if you wanted to retrieve the same data values but remove all duplicate records found, you could do so by executing the same SELECT statement using the DISTINCT clause. The resulting SELECT statement would look something like this:

SELECT DISTINCT WORKDEPT, JOB FROM EMPLOYEE

This time, when the SELECT statement is executed, you should see a result data set that looks something like this:

WORKDEPT JOB
-------- --------
C01      ANALYST
A00      CLERK
D21      CLERK
D11      DESIGNER
E21      FIELDREP
B01      MANAGER
C01      MANAGER
D11      MANAGER
D21      MANAGER
E01      MANAGER
E11      MANAGER
E21      MANAGER
E11      OPERATOR
A00      PRES
A00      SALESREP

15 record(s) selected.

Now suppose you wanted to retrieve all unique values (no duplicates) for the column named JOB from a table named EMPLOYEE, and you wanted to change the name of the JOB column in the result data set produced to TITLE. You could do so by executing a SELECT statement that looks something like this:

SELECT DISTINCT JOB AS TITLE FROM EMPLOYEE

When this statement is executed, you should see a result data set that looks something like this:

TITLE
--------
ANALYST
CLERK
DESIGNER
FIELDREP
MANAGER
OPERATOR
PRES
SALESREP

8 record(s) selected.

You could also produce the same result data set by executing the same SELECT SQL statement using the correlation name "EMP" for the table named EMPLOYEE. The only difference is that the SELECT statement would look something like this:

SELECT DISTINCT EMP.JOB AS TITLE FROM EMPLOYEE AS EMP

Notice that the column named JOB is qualified with the correlation name assigned to the table named EMPLOYEE. For this example, this is not really necessary because data is only being retrieved from one table and no two columns in a table can have the same name. However, if data were being retrieved from two or more tables, and if columns in different tables had the same name, the qualifier (either the table name or the correlation name) would be needed to tell the DB2 Database Manager which table to retrieve data from for that particular column. If you were counting when we examined the syntax for the SELECT statement earlier, you may have noticed that a single SELECT statement can contain up to seven different clauses:

The DISTINCT clause
The FROM clause
The WHERE clause
The GROUP BY clause
The HAVING clause
The ORDER BY clause
The FETCH FIRST clause

(Incidentally, these clauses are processed in the order shown.) We saw in the previous SELECT statement examples how the DISTINCT and FROM clauses are used. Now let's turn our attention to the other clauses that the SELECT statement recognizes.
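Before moving on, the effect of the DISTINCT clause shown above can be reproduced in miniature. This sketch uses Python's sqlite3 module as a stand-in for DB2 UDB, with a handful of invented rows modeled on the sample EMPLOYEE table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employee (workdept CHAR(3), job CHAR(8))")
cur.executemany("INSERT INTO employee VALUES (?, ?)",
                [('D11', 'DESIGNER'), ('D11', 'DESIGNER'),
                 ('D21', 'CLERK'), ('D21', 'CLERK'), ('A00', 'PRES')])

# Without DISTINCT, duplicate rows are repeated in the result data set
all_rows = cur.execute("SELECT workdept, job FROM employee").fetchall()

# With DISTINCT, rows whose values match in every column appear only once
distinct = cur.execute("SELECT DISTINCT workdept, job FROM employee "
                       "ORDER BY workdept").fetchall()
print(len(all_rows), len(distinct))  # 5 3
print(distinct)  # [('A00', 'PRES'), ('D11', 'DESIGNER'), ('D21', 'CLERK')]
```

Five stored rows collapse to three distinct (WORKDEPT, JOB) combinations, just as the 32 sample rows above collapsed to 15.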
The WHERE Clause

The WHERE clause is used to tell the DB2 Database Manager how to select the rows that are to be returned in the result data set produced in response to a query. When specified, the WHERE clause is followed by a search condition, which is essentially a simple test that, when applied to a row of data, will evaluate to "TRUE", "FALSE", or "Unknown". If this test evaluates to "TRUE", the row is returned in the result data set produced; if the test evaluates to "FALSE" or "Unknown", the row is skipped. The search condition of a WHERE clause is made up of one or more predicates that are used to compare the contents of a column with a constant value, the contents of a column with the contents of another column from the same table, or the contents of a column in one table with the contents of a column from another table (just to name a few). Some of the more common types of WHERE clause predicates DB2 UDB recognizes include:

Relational predicates (comparisons)
BETWEEN
LIKE
IN
EXISTS
NULL

Each of these predicates can be used alone, or two or more can be combined by using parentheses or logical operators such as AND, OR, and NOT.
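Combining predicates with logical operators can be sketched as follows (sqlite3 standing in for DB2 UDB; the table and salary figures are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employee "
            "(empno CHAR(6), salary DECIMAL(9,2), job CHAR(8))")
cur.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                [('000010', 52750.00, 'PRES'),
                 ('000020', 41250.00, 'MANAGER'),
                 ('000090', 29750.00, 'CLERK')])

# Two predicates combined with AND and NOT: well paid, but not the president
hits = cur.execute("SELECT empno FROM employee "
                   "WHERE salary >= 40000 AND NOT job = 'PRES'").fetchall()
print(hits)  # [('000020',)]

# Predicates combined with OR: low paid, or holding the PRES job
others = cur.execute("SELECT empno FROM employee "
                     "WHERE salary < 30000 OR job = 'PRES' "
                     "ORDER BY empno").fetchall()
print(others)  # [('000010',), ('000090',)]
```

Parentheses can be added around any sub-expression when mixing AND and OR, to make the intended grouping explicit rather than relying on operator precedence.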

Relational Predicates
The relational predicates (or comparison operators) consist of a set of operators that are used to define a comparison relationship between the contents of a column and a constant value, the contents of two columns from the same table, or the contents of a column in one table and the contents of a column from another table. The following comparison operators are available:

<  (Less than)
>  (Greater than)
<= (Less than or equal to)
>= (Greater than or equal to)
=  (Equal to)
<> (Not equal to)

Typically, relational predicates are used to include or exclude specific rows from the final result data set produced in response to a query. Thus, if you wanted to retrieve values for the columns named EMPNO and SALARY in a table named EMPLOYEE in which the value for the SALARY column is greater than or equal to $40,000.00, you could do so by executing a SELECT statement that looks something like this:

SELECT EMPNO, SALARY FROM EMPLOYEE WHERE SALARY >= 40000.00

When this SELECT statement is executed, you might see a result data set that looks something like this:

EMPNO  SALARY
------ --------
000010 52750.00
000020 41250.00
000050 40175.00
000110 46500.00

4 record(s) selected.

It is important to note that the data types of all items involved in a relational predicate comparison must be compatible, or the comparison will fail. If necessary, scalar functions can be used in conjunction with the relational predicate to make the necessary conversions. Also keep in mind that all character data is case sensitive; again, functions are available that can be used to construct queries that will locate character values, regardless of the case used when they were stored.
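The case-sensitivity point above can be demonstrated from a host language. The sketch below uses Python with SQLite standing in for DB2 UDB, and the table contents are invented for illustration; applying the UPPER() scalar function to both sides of the comparison makes the test case-insensitive:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empno TEXT, lastname TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("000250", "Smith"), ("000300", "SMITH"), ("000340", "Gounot")])

# A case-sensitive comparison misses 'SMITH', which was stored in uppercase
exact = conn.execute(
    "SELECT empno FROM employee WHERE lastname = 'Smith'").fetchall()

# Folding both sides to one case with a scalar function finds both rows
folded = conn.execute(
    "SELECT empno FROM employee WHERE UPPER(lastname) = UPPER('smith')").fetchall()

print(len(exact))   # 1
print(len(folded))  # 2
```

The same technique works in DB2 UDB with its UPPER() or UCASE() scalar functions, at the cost of preventing the optimizer from using an index on the raw column value.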

The BETWEEN Predicate


The BETWEEN predicate is used to define a comparison relationship in which a value is checked to determine whether it falls within a range of values. As with relational predicates, the BETWEEN predicate is used to include or exclude specific rows from the result data set produced in response to a query. So, if you wanted to retrieve values for the columns named EMPNO and SALARY in a table named EMPLOYEE in which the value for the SALARY column is greater than or equal to $10,000.00 and less than or equal to $20,000.00, you could do so by executing a SELECT statement that looks something like this:

SELECT EMPNO, SALARY FROM EMPLOYEE WHERE SALARY BETWEEN 10000.00 AND 20000.00

When this SELECT statement is executed, you might see a result data set that looks something like this:

EMPNO  SALARY
------ --------
000210 18270.00
000250 19180.00
000260 17250.00
000290 15340.00
000300 17750.00
000310 15900.00
000320 19950.00

7 record(s) selected.

If the NOT (negation) operator is used with the BETWEEN predicate (or with any other predicate, for that matter), the meaning of the predicate is reversed. (In the case of the BETWEEN predicate, the contents of a column are checked, and only values that fall outside the range of values specified are returned to the final result data set produced.) Thus, if you wanted to retrieve values for the columns named EMPNO and SALARY in a table named EMPLOYEE in which the value for the SALARY column is less than $10,000.00 or greater than $30,000.00, you could do so by executing a SELECT statement that looks something like this:

SELECT EMPNO, SALARY FROM EMPLOYEE WHERE SALARY NOT BETWEEN 10000.00 AND 30000.00

When this SELECT statement is executed, you might see a result data set that looks something like this:

EMPNO  SALARY
------ --------
000010 52750.00
000020 41250.00
000030 38250.00
000050 40175.00
000060 32250.00
000070 36170.00
000110 46500.00

7 record(s) selected.

The LIKE Predicate


The LIKE predicate is used to define a comparison relationship in which a character value is checked to see whether it contains a specific pattern of characters. The pattern of characters specified can consist of regular alphanumeric characters and/or special metacharacters that DB2 UDB recognizes, which are interpreted as follows:

The underscore character (_) is treated as a wild card character that stands for any single alphanumeric character.
The percent character (%) is treated as a wild card character that stands for any sequence of alphanumeric characters.

Thus, if you wanted to retrieve values for the columns named EMPNO and LASTNAME in a table named EMPLOYEE in which the value for the LASTNAME column begins with the letter "S", you could do so by executing a SELECT statement that looks something like this:

SELECT EMPNO, LASTNAME FROM EMPLOYEE WHERE LASTNAME LIKE 'S%'

When this SELECT statement is executed, you might see a result data set that looks something like this:

EMPNO  LASTNAME
------ ---------
000060 STERN
000100 SPENSER
000180 SCOUTTEN
000250 SMITH
000280 SCHNEIDER
000300 SMITH
000310 SETRIGHT

7 record(s) selected.

When using wild card characters, you must take care to ensure that they are placed in the appropriate location in the pattern string specified. Note that in the previous example, only records for employees whose last names begin with the letter "S" are returned. If the character string pattern specified had been '%S%', records for employees whose last name contains the character "S" (anywhere in the name) would have been returned, and the result data set produced might have looked something like this instead:

EMPNO  LASTNAME
------ ----------
000010 HAAS
000020 THOMPSON
000060 STERN
000070 PULASKI
000090 HENDERSON
000100 SPENSER
000110 LUCCHESSI
000140 NICHOLLS
000150 ADAMSON
000170 YOSHIMURA
000180 SCOUTTEN
000210 JONES
000230 JEFFERSON
000250 SMITH
000260 JOHNSON
000280 SCHNEIDER
000300 SMITH
000310 SETRIGHT

18 record(s) selected.

Likewise, you must also be careful about using uppercase and lowercase characters in pattern strings; if the data being examined is stored in a case-sensitive manner, the characters used in a pattern string must match the case that was used to store the data in the column being searched, or no corresponding records will be found.

Note: Although the LIKE predicate provides a relatively easy way to search for data values, it should be used with caution; the overhead involved in processing a LIKE predicate can be extremely resource-intensive.

The IN Predicate
The IN predicate is used to define a comparison relationship in which a value is checked to see whether it matches a value in a finite set of values. This finite set of values can consist of one or more literal values that are coded directly in the SELECT statement, or it can be composed of the non-null values found in the result data set generated by a second SELECT statement (or subquery). Thus, if you wanted to retrieve values for the columns named LASTNAME and WORKDEPT in a table named EMPLOYEE in which the value for the WORKDEPT column matches a value in a list of department codes, you could do so by executing a SELECT statement that looks something like this:

SELECT LASTNAME, WORKDEPT FROM EMPLOYEE WHERE WORKDEPT IN ('E11', 'E21')

When this SELECT statement is executed, you might see a result data set that looks something like this:

LASTNAME  WORKDEPT
--------- --------
HENDERSON E11
SPENSER   E21
SCHNEIDER E11
PARKER    E11
SMITH     E11
SETRIGHT  E11
MEHTA     E21
LEE       E21
GOUNOT    E21

9 record(s) selected.

Assuming we don't know that the values E11 and E21 have been assigned to the departments named OPERATIONS and SOFTWARE SUPPORT, but we do know that department names and numbers are stored (for normalization) in a table named DEPARTMENT that has two columns named DEPTNO and DEPTNAME, we could produce the same result data set by executing a SELECT statement that looks like this:

SELECT LASTNAME, WORKDEPT FROM EMPLOYEE WHERE WORKDEPT IN (SELECT DEPTNO FROM DEPARTMENT WHERE DEPTNAME = 'OPERATIONS' OR DEPTNAME = 'SOFTWARE SUPPORT')

In this case, the subquery SELECT DEPTNO FROM DEPARTMENT WHERE DEPTNAME = 'OPERATIONS' OR DEPTNAME = 'SOFTWARE SUPPORT' produces a result data set that contains the values E11 and E21, and the main query evaluates each value found in the WORKDEPT column of the EMPLOYEE table to determine whether it matches one of the values in the result data set produced by the subquery.

The EXISTS Predicate


The EXISTS predicate is used to determine whether a particular value exists in a set of rows. The EXISTS predicate is always followed by a subquery, and it returns either "TRUE" or "FALSE" to indicate whether a specific value is found in the result data set produced by the subquery. Thus, if you wanted to learn which values found in the column named DEPTNO in a table named DEPARTMENT are used in the column named WORKDEPT found in a table named EMPLOYEE, you could do so by executing a SELECT statement that looks something like this:

SELECT DEPTNO, DEPTNAME FROM DEPARTMENT WHERE EXISTS (SELECT WORKDEPT FROM EMPLOYEE WHERE WORKDEPT = DEPTNO)

When this SELECT statement is executed, you might see a result data set that looks something like this:

DEPTNO DEPTNAME
------ ----------------------------
A00    SPIFFY COMPUTER SERVICE DIV.
B01    PLANNING
C01    INFORMATION CENTER
D11    MANUFACTURING SYSTEMS
D21    ADMINISTRATION SYSTEMS
E01    SUPPORT SERVICES
E11    OPERATIONS
E21    SOFTWARE SUPPORT

8 record(s) selected.

In most situations, EXISTS predicates are AND-ed with other predicates to determine final row selection.
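To illustrate how an EXISTS predicate is typically AND-ed with other predicates, here is a minimal sketch using Python with SQLite standing in for DB2 UDB; the table contents are invented, and the department with no employees is dropped by EXISTS while a second relational predicate narrows the result further:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (deptno TEXT, deptname TEXT)")
conn.execute("CREATE TABLE employee (empno TEXT, workdept TEXT)")
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [("A00", "SPIFFY COMPUTER SERVICE DIV."),
                  ("D01", "DEVELOPMENT CENTER"),   # no employees assigned
                  ("E21", "SOFTWARE SUPPORT")])
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("000010", "A00"), ("000320", "E21")])

# EXISTS keeps only departments that have at least one employee;
# the AND-ed relational predicate then excludes department A00 as well
rows = conn.execute("""
    SELECT deptno FROM department d
    WHERE EXISTS (SELECT 1 FROM employee e WHERE e.workdept = d.deptno)
      AND deptno <> 'A00'
""").fetchall()

print(rows)  # [('E21',)]
```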

The NULL Predicate


The NULL predicate is used to determine whether a particular value is a NULL value. Therefore, if you wanted to retrieve values for the columns named FIRSTNME, MIDINIT, and LASTNAME in a table named EMPLOYEE in which the value for the MIDINIT column is a NULL value, you could do so by executing a SELECT statement that looks something like this:

SELECT FIRSTNME, MIDINIT, LASTNAME FROM EMPLOYEE WHERE MIDINIT IS NULL

When this SELECT statement is executed, you might see a result data set that looks something like this:

FIRSTNME MIDINIT LASTNAME
-------- ------- ---------
SEAN     -       O'CONNELL
BRUCE    -       ADAMSON
DAVID    -       BROWN
WING     -       LEE

4 record(s) selected.

When using the NULL predicate, it is important to keep in mind that null, zero (0), blank (' '), and an empty string ('') are not the same value. Null is a special marker that is used to represent missing information, while zero, blank, and an empty string are actual values that can be stored in a column to indicate a specific value (or lack thereof). Furthermore, some columns accept null values, while other columns do not, depending on their definition. So, before writing SQL statements that check for null values, make sure that the null value is supported by the column(s) being specified.

The GROUP BY Clause

The GROUP BY clause is used to tell the DB2 Database Manager how to organize rows of data returned in the result data set produced in response to a query. In its simplest form, the GROUP BY clause is followed by a grouping expression that is usually one or more column names (that correspond to column names found in the result data set to be organized by the GROUP BY clause). The GROUP BY clause is also used to specify what columns are to be grouped together to provide input to aggregate functions such as SUM() and AVG(). Thus, if you wanted to obtain the average salary for all departments found in the column named DEPTNAME in a table named DEPARTMENT, using salary information stored in a table named EMPLOYEE, and you wanted to organize the data retrieved by department, you could do so by executing a SELECT statement that looks something like this:

SELECT DEPTNAME, AVG(SALARY) AS AVG_SALARY FROM DEPARTMENT D, EMPLOYEE E WHERE E.WORKDEPT = D.DEPTNO GROUP BY DEPTNAME

When this statement is executed, you might see a result data set that looks something like this:

DEPTNAME                     AVG_SALARY
---------------------------- ----------
ADMINISTRATION SYSTEMS         25153.33
INFORMATION CENTER             30156.66
MANUFACTURING SYSTEMS          24677.77
OPERATIONS                     20998.00
PLANNING                       41250.00
SOFTWARE SUPPORT               23827.50
SUPPORT SERVICES               40175.00
SPIFFY COMPUTER SERVICE DIV.   42833.33

8 record(s) selected.

In this example, each row in the result data set produced contains the department name and the average salary for individuals who work in that department. Note A common mistake that is often made when using the GROUP BY clause is the addition of nonaggregate columns to the list of columns that follow the GROUP BY clause. Because grouping is performed by combining all of the nonaggregate columns together into a single concatenated key and breaking whenever that key value changes, extraneous columns can cause unexpected breaks to occur.

The GROUP BY ROLLUP Clause


The GROUP BY ROLLUP clause is used to analyze a collection of data in a single (hierarchical) dimension, but at more than one level of detail. For example, you could group data by successively larger organizational units, such as team, department, and division, or by successively larger geographical units, such as city, county, state or province, country, and continent. Thus, if you were to execute a SELECT statement that looks something like this:

SELECT WORKDEPT AS DEPARTMENT, AVG(SALARY) AS AVG_SALARY FROM EMPLOYEE GROUP BY ROLLUP (WORKDEPT)

you might see a result data set that looks something like this:

DEPARTMENT AVG_SALARY
---------- ----------
-            27303.59
A00          42833.33
B01          41250.00
C01          30156.66
D11          24677.77
D21          25153.33
E01          40175.00
E11          20998.00
E21          23827.50

9 record(s) selected.

This result data set contains average salary information for all employees found in the table named EMPLOYEE, regardless of which department they work in (the first line in the result data set returned), as well as average salary information for each department available (the remaining lines in the result data set returned). In this example, only one expression (known as the grouping expression) is specified in the GROUP BY ROLLUP clause (in this case, the grouping expression is WORKDEPT). However, one or more grouping expressions can be specified in a single GROUP BY ROLLUP clause (for example, GROUP BY ROLLUP (DIVISION, WORKDEPT)). When multiple grouping expressions are specified, the DB2 Database Manager groups the data by all grouping expressions used, then by all but the last grouping expression used, and so on. Then, it makes one final grouping that consists of the entire contents of the specified table. In addition, when specifying multiple grouping expressions, it is important to ensure that they are listed in the appropriate order: if one kind of group is logically contained inside another (for example, departments within a division), that group should be listed after the group that it is contained in (i.e., GROUP BY ROLLUP (DIVISION, DEPARTMENT)), and not before.

The GROUP BY CUBE Clause


The GROUP BY CUBE clause is used to analyze a collection of data by organizing it into groups in multiple dimensions. Thus, if you were to execute a SELECT statement that looks something like this:

SELECT SEX, WORKDEPT, AVG(SALARY) AS AVG_SALARY FROM EMPLOYEE GROUP BY CUBE (SEX, WORKDEPT)

you might see a result data set that looks something like this:

SEX WORKDEPT AVG_SALARY
--- -------- ----------
-   A00        42833.33
-   B01        41250.00
-   C01        30156.66
-   D11        24677.77
-   D21        25153.33
-   E01        40175.00
-   E11        20998.00
-   E21        23827.50
-   -          27303.59
F   -          28411.53
M   -          26545.52
F   A00        52750.00
F   C01        30156.66
F   D11        24476.66
F   D21        26933.33
F   E11        23966.66
M   A00        37875.00
M   B01        41250.00
M   D11        24778.33
M   D21        23373.33
M   E01        40175.00
M   E11        16545.00
M   E21        23827.50

23 record(s) selected.

This result set contains average salary information for each department found in the table named EMPLOYEE (the lines that contain a null value in the SEX column and a value in the WORKDEPT column of the result data set returned); average salary information for all employees found in the table named EMPLOYEE, regardless of which department they work in (the line that contains a NULL value for both the SEX and the WORKDEPT columns of the result data set returned); average salary information for each sex (the lines that contain a value in the SEX column and a NULL value in the WORKDEPT column of the result data set returned); and average salary information for each sex in each department available (the remaining lines in the result data set returned). In other words, the data in the result data set produced is grouped:

By department only
By sex only
By sex and department
As a single group that contains all sexes and all departments

The term CUBE is intended to suggest that data is being analyzed in more than one dimension. As you can see in the previous example, data analysis was performed in two dimensions, which resulted in four types of groupings. If the SELECT statement:

SELECT SEX, WORKDEPT, JOB, AVG(SALARY) AS AVG_SALARY FROM EMPLOYEE GROUP BY CUBE (SEX, WORKDEPT, JOB)

had been used instead, data analysis would have been performed in three dimensions, and the data would have been broken into eight types of groupings. Thus, the number of types of groups produced by a CUBE operation can be determined by the formula 2^n, where n is the number of expressions (dimensions) used in the GROUP BY CUBE clause.

The HAVING Clause

The HAVING clause is used to apply further selection criteria to columns referenced in a GROUP BY clause. This clause behaves like the WHERE clause, except that it refers to data that has already been grouped by a GROUP BY clause.
(The HAVING clause is used to tell the DB2 Database Manager how to select the rows to be returned in a result data set from rows that have already been grouped.) Like the WHERE clause, the HAVING clause is followed by a search condition that acts as a simple test that, when applied to a row of data, will evaluate to "TRUE", "FALSE", or "Unknown". If this test evaluates to "TRUE", the row is returned in the result data set produced; if the test evaluates to "FALSE" or "Unknown", the row is skipped. In addition, the search condition of a HAVING clause can consist of the same predicates that are recognized by the WHERE clause. Thus, if you wanted to obtain the average salary for all departments found in the column named DEPTNAME in a table named DEPARTMENT, using salary information stored in a table named EMPLOYEE, and you wanted to organize the data retrieved by department, but you are only interested in departments whose average salary is greater than $30,000.00, you could do so by executing a SELECT statement that looks something like this:

SELECT DEPTNAME, AVG(SALARY) AS AVG_SALARY FROM DEPARTMENT D, EMPLOYEE E WHERE E.WORKDEPT = D.DEPTNO GROUP BY DEPTNAME HAVING AVG(SALARY) > 30000.00

When this statement is executed, you might see a result data set that looks something like this:

DEPTNAME                     AVG_SALARY
---------------------------- ----------
INFORMATION CENTER             30156.66
PLANNING                       41250.00
SUPPORT SERVICES               40175.00
SPIFFY COMPUTER SERVICE DIV.   42833.33

4 record(s) selected.

In this example, each row in the result data set produced contains the department name for every department whose average salary for individuals working in that department is greater than $30,000.00, along with the actual average salary for each department.

The ORDER BY Clause

The ORDER BY clause is used to tell the DB2 Database Manager how to sort and order the rows that are to be returned in a result data set produced in response to a query. When specified, the ORDER BY clause is followed by the name of the column(s) whose data values are to be sorted. Multiple columns can be used for sorting, and each column used can be ordered in either ascending or descending order. If the keyword ASC follows the column's name, ascending order is used, and if the keyword DESC follows the column name, descending order is used. (If neither keyword is used, ascending order is used by default.) Furthermore, when more than one column is identified in an ORDER BY clause, the corresponding result data set is sorted by the first column specified (the primary sort key), then the sorted data is sorted again by the next column specified, and so on until the data has been sorted by each column identified. So, if you wanted to retrieve values for the columns named LASTNAME, FIRSTNME, and EMPNO in a table named EMPLOYEE in which the value for the EMPNO column is greater than 000200, and you wanted the information sorted by LASTNAME followed by FIRSTNME, you could do so by executing a SELECT statement that looks something like this:

SELECT LASTNAME, FIRSTNME, EMPNO FROM EMPLOYEE WHERE EMPNO > '000200' ORDER BY LASTNAME ASC, FIRSTNME ASC

When this statement is executed, you might see a result data set that looks something like this:

LASTNAME  FIRSTNME  EMPNO
--------- --------- ------
GOUNOT    JASON     000340
JEFFERSON JAMES     000230
JOHNSON   SYBIL     000260
JONES     WILLIAM   000210
LEE       WING      000330
LUTZ      JENNIFER  000220
MARINO    SALVATORE 000240
MEHTA     RAMLAL    000320
PARKER    JOHN      000290
PEREZ     MARIA     000270
SCHNEIDER ETHEL     000280
SETRIGHT  MAUDE     000310
SMITH     DANIEL    000250
SMITH     PHILIP    000300

14 record(s) selected.

As you can see, the data returned is ordered by employee last names and employee first names. (The LASTNAME values are placed in ascending alphabetical order, and the FIRSTNME values are also placed in ascending alphabetical order.) Using the ORDER BY clause is easy if the result data set is composed entirely of named columns. But what happens if the result data set produced needs to be ordered by a summary column or a result column that cannot be specified by name? Because these types of situations can exist, an integer value that corresponds to a particular column's number can be used in place of the column name with the ORDER BY clause. When integer values are used, the first or left-most column in the result data set produced is treated as column 1, the next is column 2, and so on. Therefore, you could have produced the same result data set shown earlier by executing a SELECT statement that looks like this:

SELECT LASTNAME, FIRSTNME, EMPNO FROM EMPLOYEE WHERE EMPNO > '000200' ORDER BY 1 ASC, 2 ASC

It is important to note that even though integer values are primarily used in the ORDER BY clause to specify columns that cannot be specified by name, they can be used in place of any column name as well.

The FETCH FIRST Clause


The FETCH FIRST clause is used to limit the number of rows returned in the result data set produced in response to a query. When used, the FETCH FIRST clause is followed by a positive integer value and the keywords ROWS ONLY (or ROW ONLY). This tells the DB2 Database Manager that the user/application executing the query does not want to see more than n rows, regardless of how many rows might exist in the result data set that would be produced were the FETCH FIRST clause not specified. Thus, if you wanted to retrieve the first 10 values found for the columns named WORKDEPT and JOB from a table named EMPLOYEE, you could do so by executing a SELECT statement that looks something like this:

SELECT WORKDEPT, JOB FROM EMPLOYEE FETCH FIRST 10 ROWS ONLY

When this SELECT statement is executed, you might see a result data set that looks something like this:

WORKDEPT JOB
-------- --------
A00      PRES
B01      MANAGER
C01      MANAGER
E01      MANAGER
D11      MANAGER
D21      MANAGER
E11      MANAGER
E21      MANAGER
A00      SALESREP
A00      CLERK

10 record(s) selected.

Joining Tables
Most of the examples we have looked at so far have involved only one table. However, one of the more powerful features of the SELECT statement (and the element that makes data normalization possible) is the ability to retrieve data from two or more tables by performing what is known as a join operation. A join is a binary operation on two (not necessarily distinct) tables that produces a "derived" table as a result. (If you go back through the examples that have been presented so far, you will see an occasional "sneak preview" of a join operation, particularly in the examples provided for the IN predicate and the HAVING and GROUP BY clauses.) In its simplest form, the syntax for a SELECT statement that performs a join operation is:

SELECT * FROM [ [TableName] | [ViewName] ,...]

where:

TableName   Identifies the name(s) assigned to one or more tables that data is to be retrieved from.
ViewName    Identifies the name(s) assigned to one or more views that data is to be retrieved from.

Consequently, if you wanted to retrieve all values stored in a base table named DEPARTMENT and all values stored in a base table named ORG, you could do so by executing a SELECT statement that looks something like this:

SELECT * FROM DEPARTMENT, ORG

When such a statement is executed, the result data set produced will contain all possible combinations of the rows found in each table specified (otherwise known as the Cartesian product). Every row in the result data set produced is a row from the first referenced table concatenated with a row from the second referenced table, concatenated in turn with a row from the third referenced table, and so on. The total number of rows found in the result data set produced is the product of the number of rows in all the individual tables referenced. Thus, if the table named DEPARTMENT in our previous example contains five rows and the table named ORG contains two rows, the result data set produced by the statement SELECT * FROM DEPARTMENT, ORG will consist of 10 rows (2 × 5 = 10).

Caution: A Cartesian product join operation should be used with extreme caution when working with large tables; the amount of resources required to perform such a join operation can have a serious negative impact on performance.

A more common join operation involves collecting data from two or more tables that have one specific column in common and combining the results to create an intermediate result table that contains the values needed to resolve a query. The syntax for a SELECT statement that performs this type of join operation is:

SELECT [* | [Expression] <<AS> [NewColumnName]> ,...] FROM [[TableName] <<AS> [CorrelationName]> ,...] [JoinCondition]

where:

Expression       Identifies one or more columns whose values are to be returned when the SELECT statement is executed. The value specified for this option can be any valid SQL language element; however, column names that correspond to the table or view specified in the FROM clause are commonly used.
NewColumnName    Identifies a new column name that is to be used in place of the corresponding table or view column name in the result data set returned by the SELECT statement.
TableName        Identifies the name(s) assigned to one or more tables that data is to be retrieved from.
CorrelationName  Identifies a shorthand name that can be used when referencing the table name specified in the TableName parameter.
JoinCondition    Identifies the condition to be used to join the tables specified. Typically, this is a WHERE clause in which the values of a column in one table are compared with the values of a similar column in another table.
Thus, a simple join operation could be conducted by executing a SELECT statement that looks something like this:

SELECT LASTNAME, DEPTNAME FROM EMPLOYEE E, DEPARTMENT D WHERE E.WORKDEPT = D.DEPTNO

When this SELECT statement is executed, you might see a result data set that looks something like this:

LASTNAME  DEPTNAME
--------- ----------------------------
HAAS      SPIFFY COMPUTER SERVICE DIV.
KWAN      INFORMATION CENTER
GEYER     SUPPORT SERVICES
STERN     MANUFACTURING SYSTEMS
THOMPSON  PLANNING
PULASKI   ADMINISTRATION SYSTEMS
HENDERSON OPERATIONS
SPENSER   SOFTWARE SUPPORT
LUCCHESSI SPIFFY COMPUTER SERVICE DIV.
O'CONNELL SPIFFY COMPUTER SERVICE DIV.
QUINTANA  INFORMATION CENTER
NICHOLLS  INFORMATION CENTER
ADAMSON   MANUFACTURING SYSTEMS
LUTZ      MANUFACTURING SYSTEMS
SMITH     ADMINISTRATION SYSTEMS
PEREZ     ADMINISTRATION SYSTEMS
PARKER    OPERATIONS
MEHTA     SOFTWARE SUPPORT
SCHNEIDER OPERATIONS
SETRIGHT  OPERATIONS
GOUNOT    SOFTWARE SUPPORT

21 record(s) selected.

This type of join is referred to as an inner join. Aside from a Cartesian product, only two types of joins can exist: inner joins and outer joins. As you might imagine, a significant difference exists between the two.

Inner Joins
An inner join can be thought of as the cross product of two tables, in which every row in one table that has a corresponding row in another table is combined with that row to produce a new record. This type of join works well as long as every row in the first table has a corresponding row in the second table. However, if this is not the case, the result table produced may be missing rows found in either or both of the tables that were joined. Earlier, we saw the most common SELECT statement syntax used to perform an inner join operation. The following syntax can also be used to create a SELECT statement that performs an inner join operation:

SELECT [* | [Expression] <<AS> [NewColumnName]> ,...] FROM [[TableName1] <<AS> [CorrelationName1]>] <INNER> JOIN [[TableName2] <<AS> [CorrelationName2]>] ON [JoinCondition]

where:

Expression        Identifies one or more columns whose values are to be returned when the SELECT statement is executed. The value specified for this option can be any valid SQL language element; however, column names that correspond to the table or view specified in the FROM clause are commonly used.
NewColumnName     Identifies a new column name to be used in place of the corresponding table or view column name in the result data set returned by the SELECT statement.
TableName1        Identifies the name assigned to the first table that data is to be retrieved from.
CorrelationName1  Identifies a shorthand name that can be used when referencing the leftmost table of the join operation.
TableName2        Identifies the name assigned to the second table that data is to be retrieved from.
CorrelationName2  Identifies a shorthand name that can be used when referencing the rightmost table of the join operation.
JoinCondition     Identifies the condition to be used to join the two specified tables.
Consequently, the same inner join operation we looked at earlier could be conducted by executing a SELECT statement that looks something like this: SELECT LASTNAME, DEPTNAME FROM EMPLOYEE E INNER JOIN DEPARTMENT D ON E.WORKDEPT = D.DEPTNO Figure 3-1 illustrates how such an inner join operation would work.

Figure 3-1: A simple inner join operation. It is important to note that inner join queries like the one just shown are typically written without using the keywords INNER JOIN. Thus, the previous query would be more likely to be coded like this: SELECT LASTNAME, DEPTNAME FROM EMPLOYEE E, DEPARTMENT D WHERE E.WORKDEPT = D.DEPTNO

Outer Joins
Outer join operations are used when a join operation is needed and when any rows that would normally be eliminated by an inner join operation need to be preserved. With DB2 UDB, three types of outer joins are available:

Left outer join. When a left outer join operation is performed, rows that would have been returned by an inner join operation, together with rows stored in the leftmost table of the join operation (i.e., the table listed on the left side of the OUTER JOIN operator) that would have been eliminated by the inner join operation, are returned in the result data set produced.
Right outer join. When a right outer join operation is performed, rows that would have been returned by an inner join operation, together with rows stored in the rightmost table of the join operation (i.e., the table listed on the right side of the OUTER JOIN operator) that would have been eliminated by the inner join operation, are returned in the result data set produced.
Full outer join. When a full outer join operation is performed, rows that would have been returned by an inner join operation, together with rows stored in both tables of the join operation that would have been eliminated by the inner join operation, are returned in the result data set produced.

To understand the basic principles behind an outer join operation, it helps to look at an example. Suppose Table A and Table B are joined by an ordinary inner join operation. Any row in either Table A or Table B that does not have a matching row in the other table (according to the rules of the join condition) is eliminated from the final result data set produced. By contrast, if Table A and Table B are joined by an outer join, any row in either Table A or Table B that does not contain a matching row in the other table is included in the result data set (exactly once), and columns in that row that would have contained matching values from the other table are assigned a null value. Thus, an outer join operation adds nonmatching rows to the final result data set produced where an inner join operation excludes them. A left outer join of Table A with Table B preserves all nonmatching rows found in Table A, a right outer join of Table A with Table B preserves all nonmatching rows found in Table B, and a full outer join preserves nonmatching rows found in both Table A and Table B. The basic syntax for a SELECT statement used to perform an outer join operation is:

SELECT [* | [Expression] <<AS> [NewColumnName]> ,...] FROM [[TableName1] <<AS> [CorrelationName1]>] [LEFT | RIGHT | FULL] OUTER JOIN [[TableName2] <<AS> [CorrelationName2]>] ON [JoinCondition]

where:

Expression        Identifies one or more columns whose values are to be returned when the SELECT statement is executed. The value specified for this option can be any valid SQL language element; however, column names that correspond to the table or view specified in the FROM clause are commonly used.
NewColumnName     Identifies a new column name that is to be used in place of the corresponding table or view column name in the result data set returned by the query.
TableName1        Identifies the name assigned to the first table that data is to be retrieved from. This table is considered the "left" table in an outer join.
CorrelationName1  Identifies a shorthand name that can be used when referencing the leftmost table of the join operation.
TableName2        Identifies the name assigned to the second table that data is to be retrieved from. This table is considered the "right" table in an outer join.
CorrelationName2  Identifies a shorthand name that can be used when referencing the rightmost table of the join operation.
JoinCondition     Identifies the condition to be used to join the two tables specified.
Thus, a simple left outer join operation could be conducted by executing a SELECT statement that looks something like this:

SELECT LASTNAME, DEPTNAME
    FROM EMPLOYEE E LEFT OUTER JOIN DEPARTMENT D
    ON E.WORKDEPT = D.DEPTNO

The same query could be used to perform a right outer join operation or a full outer join operation by substituting the keyword RIGHT or FULL for the keyword LEFT. Figure 3-2 illustrates how such a left outer join operation would work; Figure 3-3 illustrates how such a right outer join operation would work; and Figure 3-4 illustrates how such a full outer join operation would work.

Figure 3-2: A simple left outer join operation.

Figure 3-3: A simple right outer join operation.

Figure 3-4: A simple full outer join operation.
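The nonmatching-row behavior described above can be seen in a few lines of code. The sketch below uses Python's sqlite3 module as a stand-in for DB2 UDB (the outer join semantics are the same), with miniature, hypothetical EMPLOYEE and DEPARTMENT tables:

```python
import sqlite3

# Hypothetical EMPLOYEE/DEPARTMENT tables (names borrowed from the DB2
# SAMPLE database); sqlite3 is used here only to illustrate the join
# semantics -- the outer-join behavior is the same in DB2 UDB.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DEPARTMENT (DEPTNO TEXT, DEPTNAME TEXT);
    CREATE TABLE EMPLOYEE (LASTNAME TEXT, WORKDEPT TEXT);
    INSERT INTO DEPARTMENT VALUES ('A00', 'ADMINISTRATION'),
                                  ('B01', 'PLANNING');
    INSERT INTO EMPLOYEE VALUES ('HAAS', 'A00'),
                                ('SMITH', NULL);  -- no matching department
""")

# SMITH has no matching DEPARTMENT row, so an inner join would drop that
# row; the left outer join keeps it and supplies NULL for DEPTNAME.
rows = conn.execute("""
    SELECT LASTNAME, DEPTNAME
    FROM EMPLOYEE E LEFT OUTER JOIN DEPARTMENT D
    ON E.WORKDEPT = D.DEPTNO
    ORDER BY LASTNAME
""").fetchall()
print(rows)   # [('HAAS', 'ADMINISTRATION'), ('SMITH', None)]
```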

Combining Two or More Queries with a Set Operator


With DB2 UDB, it is possible to combine two or more queries into a single query by using a special operator known as a set operator. When a set operator is used, the results of each query executed are combined in a specific manner to produce a single result data set. The following set operators are available:

UNION. When the UNION set operator is used, the result data sets produced by each individual query are combined, and all duplicate rows are eliminated.

UNION ALL. When the UNION ALL set operator is used, the result data sets produced by each individual query are combined, and any duplicate rows found are retained.

EXCEPT. When the EXCEPT set operator is used, all rows found in the first result data set produced that do not have a matching row in the second result data set are returned, and any duplicate rows are eliminated.

EXCEPT ALL. When the EXCEPT ALL set operator is used, all rows found in the first result data set produced that do not have a matching row in the second result data set are returned. (Duplicate rows in the first result data set are retained.)

INTERSECT. When the INTERSECT set operator is used, the result data sets produced by each individual query are compared, and every record that is found in both result data sets is copied to a new result data set, all duplicate rows in this new result data set are eliminated, and the new result data set is returned.

INTERSECT ALL. When the INTERSECT ALL set operator is used, the result data sets produced by each individual query are compared, and each record that is found in both result data sets is copied to a new result data set; all duplicate rows found in this new result data set are retained.

For two result data sets to be combined with a set operator, both must have the same number of columns, and each of those columns must have the same data type assigned to it. So when would you want to combine the results of two queries by using a set operator?
Suppose your company keeps individual employee expense account information in a table whose contents are archived at the end of each fiscal year. When a new fiscal year begins, expenditures for that year are essentially recorded in a new table. Now suppose that, for tax purposes, you need a record of all employees' expenses for the last two years. To obtain this information, each archived table must be queried, and the results must then be combined. Rather than do this by running individual queries against the archived tables and storing the results in some kind of temporary table, you could perform this operation simply by using the

UNION set operator with two SELECT SQL statements. Such a combination might look something like this:

SELECT * FROM EMP_EXP_02
UNION
SELECT * FROM EMP_EXP_01
ORDER BY EXPENSES DESC

Figure 3-5 illustrates how such a set operation would work.

Figure 3-5: A simple UNION set operation.

The same set of queries could be combined using the UNION ALL, EXCEPT, EXCEPT ALL, INTERSECT, or INTERSECT ALL set operator simply by substituting the appropriate keywords for the keyword UNION. However, the results of each operation would be significantly different.
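To see how the set operators differ, the following sketch runs the same pair of queries under several operators. It uses Python's sqlite3 module as a stand-in for DB2 (SQLite supports UNION, UNION ALL, EXCEPT, and INTERSECT, but not the ALL forms of EXCEPT and INTERSECT), with hypothetical two-row expense tables:

```python
import sqlite3

# Two hypothetical expense tables for consecutive fiscal years; the
# table and column names mirror the example above but the data is
# invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP_EXP_01 (EMPNO TEXT, EXPENSES REAL);
    CREATE TABLE EMP_EXP_02 (EMPNO TEXT, EXPENSES REAL);
    INSERT INTO EMP_EXP_01 VALUES ('000010', 100.0), ('000020', 200.0);
    INSERT INTO EMP_EXP_02 VALUES ('000010', 100.0), ('000030', 300.0);
""")

def q(op):
    # Combine the two queries with the given set operator.
    return conn.execute(
        f"SELECT * FROM EMP_EXP_02 {op} SELECT * FROM EMP_EXP_01 "
        "ORDER BY 1").fetchall()

print(q("UNION"))      # duplicates removed: 3 rows
print(q("UNION ALL"))  # duplicates kept: 4 rows
print(q("EXCEPT"))     # rows only in EMP_EXP_02: [('000030', 300.0)]
print(q("INTERSECT"))  # rows in both tables: [('000010', 100.0)]
```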

Using SQL Functions to Transform Data


Along with a rich set of SQL statements, DB2 UDB comes with a set of built-in functions that can return a variety of data values or convert data values from one data type to another. (A function is an operation denoted by a function name followed by a pair of parentheses enclosing zero or more arguments.) Most of the built-in functions provided by DB2 UDB are classified as aggregate (or column, because they work on all values of a column), scalar (because they work on a single value in a table or view), row (because they return a single row to the SQL statement that references them), or table (because they return multiple rows to the SQL statement that references them).

The argument of a column function is a collection of like values. A column function returns a single value (possibly null) and can be specified in an SQL statement wherever an expression can be used. Some of the more common column functions include:

SUM(Column). Returns the sum of the values in the column specified.

AVG(Column). Returns the sum of the values in the column specified divided by the number of values found in that column (the average).

MIN(Column). Returns the smallest value found in the column specified.

MAX(Column). Returns the largest value found in the column specified.

COUNT(Column). Returns the total number of non-null values found in the column specified.

The arguments of a scalar function are individual scalar values, which can be of different data types and can have different meanings. A scalar function returns a single value (possibly null) and can be specified in an SQL statement wherever an expression can be used. Some of the more common scalar functions include:

ABS(Value). Returns the absolute value of the value specified.

COALESCE(Expression, Expression, ...). Returns the first non-null expression in the list provided (for example, COALESCE(EMPID, 0) returns the value of EMPID unless that value is null, in which case the value 0 is returned instead).

LENGTH(CharacterString). Returns the number of bytes found in the character string value specified.

LCASE(CharacterString) or LOWER(CharacterString). Returns a character string in which all of the characters in the character string value specified are converted to lowercase characters.

UCASE(CharacterString) or UPPER(CharacterString). Returns a character string in which all of the characters in the character string value specified are converted to uppercase characters.

DATE(Value | CharacterString). Returns a date value derived from a numeric value or character string.

MONTH(DateValue). Returns the month portion of the date value specified.

DAY(DateValue). Returns the day portion of the date value specified.

YEAR(DateValue). Returns the year portion of the date value specified.

Row and table functions are special functions because they return one or more rows to the SQL statement that references them and can only be specified in the FROM clause of a SELECT statement. Such functions are typically used to work with data that does not reside in a DB2 UDB database and/or to convert such data into a format that resembles that of a DB2 table. (The built-in function SNAPSHOT_TABLE is an example of a table function.)
Note A complete listing of the functions available with DB2 UDB can be found in the IBM DB2 Universal Database, Version 8 SQL Reference Volume 1 product documentation.
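Most of the column and scalar functions listed above behave the same way in other SQL engines, which makes them easy to experiment with. A minimal sketch using Python's sqlite3 module as a stand-in for DB2, with a hypothetical table T (SUM, AVG, MIN, MAX, COUNT, ABS, COALESCE, LENGTH, UPPER, and LOWER all exist in both products):

```python
import sqlite3

# A tiny invented table: one row has a NULL EMPID so that COUNT's
# "non-null values only" rule is visible.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (EMPID INTEGER, SALARY REAL)")
conn.executemany("INSERT INTO T VALUES (?, ?)",
                 [(1, 100.0), (2, 300.0), (None, 200.0)])

# Column (aggregate) functions operate on all values of a column and
# return a single value.
print(conn.execute("SELECT SUM(SALARY), AVG(SALARY), MIN(SALARY), "
                   "MAX(SALARY), COUNT(EMPID) FROM T").fetchone())
# -> (600.0, 200.0, 100.0, 300.0, 2)   -- COUNT skips the NULL EMPID

# Scalar functions operate on individual values.
print(conn.execute("SELECT ABS(-5), COALESCE(NULL, 0), LENGTH('abc'), "
                   "UPPER('abc'), LOWER('ABC')").fetchone())
# -> (5, 0, 3, 'ABC', 'abc')
```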

A Word About User-Defined Functions


User-defined functions (UDFs) are special objects that are used to extend and enhance the support provided by the built-in functions available with DB2 UDB. Like user-defined data types (UDTs), user-defined functions (or methods) are created and named by a database user. A user-defined function can be an external function written in a high-level programming language, an internal function written entirely in SQL, or a sourced function whose implementation is inherited from another function that already exists. Like built-in functions, user-defined functions are classified as being scalar, column, row, or table in nature. (User-defined functions are often created to provide the same functionality for user-defined data types that built-in functions provide for the built-in data types upon which user-defined data types are based.) User-defined data types and user-defined functions are covered in much more detail in Chapter 8, "User Defined Routines".

Common Table Expressions


Common table expressions are mechanisms that are used to construct local temporary tables that reside in memory and only exist for the life of the SQL statement that defines them. (In fact, the table that is created in response to a common table expression can only be referenced by the SQL statement that created it.) Common table expressions are typically used:

In place of a view (when the creation of a view is undesirable, when general use of a view is not required, and when positioned update or delete operations are not used)

To enable grouping by a column that is derived from a subquery or a scalar function that performs some external action

When the desired result table is based on host variables

When the same result table needs to be used by several different queries

When the results of a query need to be derived using recursion

The temporary table associated with a common table expression is created by prefixing a SELECT SQL statement with the WITH keyword. The basic syntax for using this keyword is:

WITH [CommonTableName] <( [ColumnName] ,... )> AS ([SELECTStatement])

where:

CommonTableName: Identifies the name to be assigned to the temporary table to be created.

ColumnName: Identifies the name(s) of one or more columns that are to be included in the temporary table to be created. If a list of column names is specified, the number of column names provided must match the number of columns that will be returned by the SELECT statement used to create the temporary table. (If a list of column names is not provided, the columns of the temporary table will inherit the names that are assigned to the columns returned by the SELECT statement used to create the temporary table.)

SELECTStatement: Identifies a SELECT SQL statement that, when executed, will produce data that will populate the temporary table to be created.

Thus, if you wanted to use a common table expression to create a temporary table that contains a list of all employees (along with their department names) whose current salary is greater than or equal to $25,000.00, and you then wanted to use this table to find out which of these employees are female, you could do so by executing a SELECT SQL statement that looks something like this:

WITH SOME_EMPLOYEES AS
    (SELECT LASTNAME, SALARY, SEX, DEPTNAME
     FROM EMPLOYEE E, DEPARTMENT D
     WHERE E.WORKDEPT = D.DEPTNO AND SALARY >= 25000.00)
SELECT * FROM SOME_EMPLOYEES WHERE SEX = 'F'

When this statement is executed, you might see a result data set that looks something like this:

LASTNAME  SALARY   SEX DEPTNAME
--------- -------- --- ----------------------------
HAAS      52750.00 F   SPIFFY COMPUTER SERVICE DIV.
KWAN      38250.00 F   INFORMATION CENTER
LUTZ      29840.00 F   MANUFACTURING SYSTEMS
PULASKI   36170.00 F   ADMINISTRATION SYSTEMS
PEREZ     27380.00 F   ADMINISTRATION SYSTEMS
NICHOLLS  28420.00 F   INFORMATION CENTER
HENDERSON 29750.00 F   OPERATIONS
SCHNEIDER 26250.00 F   OPERATIONS

8 record(s) selected.

Multiple common table expressions can be specified following the single WITH keyword, and each common table expression specified can be referenced by name in the FROM clause of subsequent common table expressions. However, if multiple common table expressions are defined within the same WITH keyword, the table name assigned to each temporary table created must be unique from all other table names used in the SELECT statement that creates them. It is also important to note that the table name assigned to the temporary table created by a common table expression will take precedence over any existing table, view, or alias (in the system catalog) that has the same qualified name; any SELECT SQL statement that references the original table, view, or alias will actually be working with the temporary table created. (Existing tables, views, and aliases whose names match that of the temporary table are not altered but are simply no longer accessible.)
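The common table expression shown above can be exercised against miniature tables. The sketch below uses Python's sqlite3 module as a stand-in for DB2 (both support the WITH ... AS (SELECT ...) form), with a handful of hypothetical EMPLOYEE and DEPARTMENT rows:

```python
import sqlite3

# Invented sample rows: one row passes both the salary filter in the
# common table expression and the SEX = 'F' filter in the outer query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (LASTNAME TEXT, SALARY REAL, SEX TEXT,
                           WORKDEPT TEXT);
    CREATE TABLE DEPARTMENT (DEPTNO TEXT, DEPTNAME TEXT);
    INSERT INTO DEPARTMENT VALUES ('C01', 'INFORMATION CENTER');
    INSERT INTO EMPLOYEE VALUES
        ('KWAN',  38250.00, 'F', 'C01'),
        ('SMITH', 18000.00, 'M', 'C01'),  -- filtered out: salary too low
        ('JONES', 40000.00, 'M', 'C01');  -- filtered out: SEX <> 'F'
""")

rows = conn.execute("""
    WITH SOME_EMPLOYEES AS
        (SELECT LASTNAME, SALARY, SEX, DEPTNAME
         FROM EMPLOYEE E, DEPARTMENT D
         WHERE E.WORKDEPT = D.DEPTNO AND SALARY >= 25000.00)
    SELECT * FROM SOME_EMPLOYEES WHERE SEX = 'F'
""").fetchall()
print(rows)   # [('KWAN', 38250.0, 'F', 'INFORMATION CENTER')]
```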

Retrieving Results from a Result Data Set Using a Cursor


So far, we have looked at a variety of ways in which a query can be constructed using the SELECT SQL statement. And we have seen how, in some cases, the results of a query can be returned to the user when an SQL statement is executed from the Command Line Processor. However, we have not seen how the results of a query can be obtained when a SELECT statement is executed from an application program. When a query is executed from within an application, DB2 UDB uses a mechanism known as a cursor to retrieve data values from the result data set produced. The name "cursor" probably originated from the blinking cursor found on early computer screens, and just as that cursor indicated the current position on the screen and identified where typed words would appear next, a DB2 UDB cursor indicates the current position in the result data set (i.e., the current row) and identifies which row of data will be returned to the application next. Depending upon how it has been defined, a cursor can fall into one of three categories:

Read-only. Read-only cursors are cursors that have been constructed in such a way that rows in their corresponding result data set can be read but not modified or deleted. A cursor is considered read-only if it is based on a read-only SELECT statement. (For example, the statement SELECT DEPTNAME FROM DEPARTMENT is a read-only SELECT statement.)

Updatable. Updatable cursors are cursors that have been constructed in such a way that rows in their corresponding result data set can be modified or deleted. A cursor is considered updatable if the FOR UPDATE clause was specified when the cursor was created. (Only one table can be referenced in the SELECT statement that is used to create an updatable cursor.)

Ambiguous. Ambiguous cursors are cursors that have been constructed in such a way that it is impossible to tell whether they are meant to be read-only or updatable.
(Ambiguous cursors are treated as read-only cursors if the BLOCKING ALL option was specified during precompiling or binding. Otherwise, they are considered updatable.)

Regardless of which type of cursor is used, the following steps must be followed if a cursor is to be incorporated into an application program:

1. Declare (define) the cursor along with its type and associate it with the desired query (SELECT SQL statement).
2. Open the cursor. This action will cause the corresponding query to be executed and a result data set to be produced.
3. Retrieve (fetch) each row in the result data set, one by one, until an "End of data" condition occurs; each time a row is retrieved from the result data set, the cursor is automatically moved to the next row.
4. If appropriate, modify or delete the current row (but only if the cursor is updatable).
5. Close the cursor. This action will cause the result data set that was produced when the corresponding query was executed to be deleted.

With DB2 UDB (as with most other relational database management systems), the following SQL statements are used to carry out the preceding steps: DECLARE CURSOR, OPEN, FETCH, and CLOSE.

The DECLARE CURSOR Statement

Before a cursor can be used in an application program, it must be created and associated with the SELECT statement that will be used to generate its corresponding result data set. This is done by executing the DECLARE CURSOR SQL statement. The basic syntax for this statement is:

DECLARE [CursorName] CURSOR

    <WITH HOLD>
    <WITH RETURN <TO CLIENT | TO CALLER>>
    FOR [[SELECTStatement] | [StatementName]]
    <FOR READ ONLY | FOR FETCH ONLY | FOR UPDATE <OF [ColumnName ,...]>>

where:

CursorName: Identifies the name to be assigned to the cursor to be created.

SELECTStatement: Identifies a SELECT SQL statement that, when executed, will produce a result data set that is to be associated with the cursor to be created.

StatementName: Identifies a prepared SELECT SQL statement that, when executed, will produce a result data set that is to be associated with the cursor to be created. (This SELECT statement must be prepared with the PREPARE SQL statement before it is used to create a cursor; this statement can contain parameter markers.)

ColumnName: Identifies the name of one or more columns in the result data set to be produced whose values can be modified by performing a positioned update or a positioned delete operation. (Each name provided must identify an existing column in the result data set produced.)

If the WITH HOLD option is specified when the DECLARE CURSOR statement is executed, the cursor created will remain open (once it has been opened) across transaction boundaries and must be explicitly closed. (If this option is not used, the scope of the cursor is limited to the transaction in which it is defined, and the cursor will be closed automatically when the transaction that declares and opens it is terminated.) If the WITH RETURN option is specified when the DECLARE CURSOR statement is executed, it is assumed that the cursor has been created from within a stored procedure and that, once opened, the cursor is to remain open when control is passed back to either the calling application or the client application, depending on how the WITH RETURN option was specified.

Note The clauses FOR READ ONLY, FOR FETCH ONLY, and FOR UPDATE <OF [ColumnName ,...]> are actually part of the SELECT statement used to build the result data set associated with the cursor and are not part of the DECLARE CURSOR statement's syntax. As you might imagine, the use (or lack) of these clauses determines whether the cursor to be created will be a read-only, updatable, or ambiguous cursor.

Thus, if you wanted to define a read-only cursor named MY_CURSOR that is associated with a result data set that contains values obtained from the columns WORKDEPT and JOB found in a table named EMPLOYEE, you could do so by executing a DECLARE CURSOR statement that looks something like this:

DECLARE MY_CURSOR CURSOR FOR
    SELECT WORKDEPT, JOB FROM EMPLOYEE
    FOR READ ONLY

Multiple cursors can be created within a single application; however, each cursor created must have a unique name (within the same source code file).
The OPEN Statement Although a cursor is defined when the DECLARE CURSOR SQL statement is executed, the result data set associated with the cursor is not actually produced until the cursor is opened; when a cursor is opened, all rows that satisfy the query associated with the cursor's definition are retrieved and copied to a result data set. Cursors are opened by executing the OPEN SQL statement. The basic syntax for this statement is:

OPEN [CursorName]
    <USING [HostVariable] ,... | USING DESCRIPTOR [DescriptorName]>

where:

CursorName: Identifies the name assigned to the cursor that is to be opened.

HostVariable: Identifies one or more host variables that are to be used to provide values for any parameter markers that were coded in the SELECT statement used to create the cursor to be opened. (Host variables and parameter markers are described in detail in Chapter 4, "Embedded SQL Programming.")

DescriptorName: Identifies an SQL Descriptor Area (SQLDA) data structure variable that contains descriptions of each host variable that is to be used to provide values for parameter markers coded in the SELECT statement used to create the cursor to be opened. (The SQLDA data structure variable is described in detail in Chapter 4, "Embedded SQL Programming.")

Thus, if you wanted to open a cursor named MY_CURSOR (which, in turn, would cause the corresponding result data set to be produced), you could do so by executing an OPEN statement that looks like this:

OPEN MY_CURSOR

On the other hand, if you wanted to open a cursor named MY_CURSOR and associate two host variables (named LastName and FirstName) with parameter markers that were coded in the SELECT statement that was used to create the cursor, you could do so by executing an OPEN statement that looks like this:

OPEN MY_CURSOR USING :LastName, :FirstName

It is important to note that the rows of the result data set associated with a query may be derived during the execution of the OPEN statement (in which case a temporary table may be created to hold them), or they may be derived during the execution of each subsequent FETCH statement. In either case, when a cursor is opened, it is placed in the "Open" state, and the cursor pointer is positioned before the first row of data in the result data set produced; if the result data set is empty, the position of the cursor is effectively "after the last row," and any subsequent FETCH operations performed will generate a NOT FOUND (+100) condition.

Note Once a cursor has been opened, it can be in one of three possible positions: "Before a Row of Data," "On a Row of Data," or "After the Last Row of Data." If a cursor is positioned "Before a Row of Data" when the FETCH statement is executed, it will be moved to that row, and the data values stored in that row will be assigned to the appropriate host variables. If a cursor is positioned "On a Row of Data" when the FETCH statement is executed, it will be moved to the next row in the result data set (if one exists), and the data values stored in that row will be assigned to the appropriate host variables.
If a cursor is positioned on the last row of the result data set when the FETCH statement is executed, it will be moved to the "After the Last Row of Data" position, the value +100 will be assigned to the sqlcode field of the current SQLCA data structure variable, and the value "02000" will be assigned to the sqlstate field of the current SQLCA data structure variable. (In this case, no data is copied to the host variables specified.) The FETCH Statement Once a cursor has been opened, data is retrieved from its associated result data set by calling the FETCH statement repeatedly until all records have been processed. The basic syntax for the FETCH statement is:

FETCH <FROM> [CursorName] INTO [HostVariable ,...]

or

FETCH <FROM> [CursorName] USING DESCRIPTOR [DescriptorName]

where:

CursorName: Identifies the name assigned to the cursor that data is to be retrieved from.

HostVariable: Identifies one or more host variables to which values obtained from the result data set associated with the cursor specified are to be copied.

DescriptorName: Identifies an SQL Descriptor Area (SQLDA) data structure variable that contains descriptions of each host variable to which values obtained from the result data set associated with the cursor specified are to be copied.

Thus, if you wanted to retrieve a record from the result data set associated with a cursor named MY_CURSOR and copy the values obtained to two host variables named DeptNumber and DeptName, you could do so by executing a FETCH statement that looks something like this:

FETCH FROM MY_CURSOR INTO :DeptNumber, :DeptName

The CLOSE Statement

When all records stored in the result data set associated with a cursor have been retrieved (and copied to host variables) or when the result data set associated with a cursor is no longer needed, it can be destroyed by executing the CLOSE SQL statement. The syntax for this statement is:

CLOSE [CursorName] <WITH RELEASE>

where:

CursorName: Identifies the name assigned to the cursor to be closed.

If the WITH RELEASE option is specified when the CLOSE statement is executed, an attempt will be made to release all locks that were acquired on behalf of the cursor. (It is important to note that not all of the locks acquired are necessarily released; some locks may be held for other operations or activities.) Therefore, if you wanted to close a cursor named MY_CURSOR and destroy its associated result data set, you could do so by executing a CLOSE statement that looks like this:

CLOSE MY_CURSOR

Putting It All Together


Now that we have seen how each of the available cursor processing statements is used, let's examine how they are typically coded in an application. An embedded SQL application written in the C programming language that uses a cursor to obtain and print employee identification numbers and last names for all employees who have the job title DESIGNER might look something like this:

#include <stdio.h>
#include <stdlib.h>
#include <sql.h>

void main()
{
    /* Include The SQLCA Data Structure Variable */
    EXEC SQL INCLUDE SQLCA;

    /* Declare The SQL Host Memory Variables */
    EXEC SQL BEGIN DECLARE SECTION;
        char EmployeeNo[7];
        char LastName[16];
    EXEC SQL END DECLARE SECTION;

    /* Connect To The SAMPLE Database */
    EXEC SQL CONNECT TO SAMPLE USER db2admin USING ibmdb2;

    /* Declare A Cursor */
    EXEC SQL DECLARE C1 CURSOR FOR
        SELECT EMPNO, LASTNAME FROM EMPLOYEE
        WHERE JOB = 'DESIGNER';

    /* Open The Cursor */
    EXEC SQL OPEN C1;

    /* Fetch The Records */
    while (sqlca.sqlcode == SQL_RC_OK)
    {
        /* Retrieve A Record */
        EXEC SQL FETCH C1 INTO :EmployeeNo, :LastName;

        /* Print The Information Retrieved */
        if (sqlca.sqlcode == SQL_RC_OK)
            printf("%s, %s\n", EmployeeNo, LastName);
    }

    /* Close The Cursor */
    EXEC SQL CLOSE C1;

    /* Issue A COMMIT To Free All Locks */
    EXEC SQL COMMIT;

    /* Disconnect From The SAMPLE Database */
    EXEC SQL DISCONNECT CURRENT;
}

Remember, an application can use several cursors concurrently; however, each cursor must have its own unique name and its own set of DECLARE CURSOR, OPEN, FETCH, and CLOSE SQL statements.
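The same DECLARE/OPEN/FETCH/CLOSE life cycle appears, in compressed form, in the cursor objects of most database APIs. A sketch using Python's sqlite3 module as a stand-in for DB2, with a hypothetical EMPLOYEE table holding invented rows:

```python
import sqlite3

# Miniature EMPLOYEE table in place of the SAMPLE database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMPNO TEXT, LASTNAME TEXT, JOB TEXT)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                 [('000060', 'STERN',   'MANAGER'),
                  ('000150', 'ADAMSON', 'DESIGNER'),
                  ('000160', 'PIANKA',  'DESIGNER')])

# DECLARE + OPEN: executing the query produces the result data set.
cur = conn.execute(
    "SELECT EMPNO, LASTNAME FROM EMPLOYEE WHERE JOB = 'DESIGNER'")

# FETCH until "End of data" (fetchone() returning None plays the role
# of SQLCODE +100 / SQLSTATE 02000).
results = []
while True:
    row = cur.fetchone()
    if row is None:
        break
    results.append(row)
    print("%s, %s" % row)

cur.close()    # CLOSE: the result data set is discarded
conn.commit()  # COMMIT: release locks
```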

Retrieving a Single Row of Data


We have just seen how a cursor can be used to process a result data set, and earlier we saw that a result data set can contain any number of rows (including no rows at all). But is a cursor really necessary if it is known in advance that a result data set will only contain one row? The answer is no. If you know in advance that only one row of data will be produced in response to a query, you can copy the contents of that row (record) to host variables within an application program in one of two ways: by executing a special form of the SELECT SQL statement known as the SELECT INTO statement or by executing the VALUES INTO SQL statement.

The SELECT INTO SQL Statement

The SELECT INTO statement is almost identical to the SELECT statement, in both its syntax and its behavior. However, unlike the SELECT statement, the SELECT INTO statement requires a list of valid host variables to be supplied as part of its syntax and can only be used in an embedded SQL application program. Furthermore, it must not return more than one row of data, and it cannot be dynamically prepared. The basic syntax for the SELECT INTO SQL statement is:

SELECT <DISTINCT> [* | [Expression] ,...]
    INTO [[HostVariable] ,...]
    FROM [[TableName] | [ViewName] <<AS> [CorrelationName]> ,...]
    <WhereClause>
    <GroupByClause>
    <HavingClause>
    <OrderByClause>
    <FetchFirstClause>

where:

Expression: Identifies one or more columns for which values are to be returned when the SELECT statement is executed. The value specified for this option can be any valid SQL language element; however, column names that correspond to the table or view specified in the FROM clause are commonly used.

TableName: Identifies the name(s) assigned to one or more tables that data is to be retrieved from.

ViewName: Identifies the name(s) assigned to one or more views that data is to be retrieved from.

CorrelationName: Identifies a shorthand name that can be used when referencing the table or view that the correlation name is associated with in any of the SELECT statement clauses.

HostVariable: Identifies one or more host variables to which values obtained from the result data set produced are to be copied.

WhereClause: Identifies a WHERE clause that is to be used with the SELECT statement.

GroupByClause: Identifies a GROUP BY clause that is to be used with the SELECT statement.

HavingClause: Identifies a HAVING clause that is to be used with the SELECT statement.

OrderByClause: Identifies an ORDER BY clause that is to be used with the SELECT statement.

FetchFirstClause: Identifies a FETCH FIRST clause that is to be used with the SELECT statement.

When this form of the SELECT statement is executed, all data retrieved is stored in a result data set; if this result data set contains only one record, the first value in that record is copied to the first host variable specified, the second value is copied to the second host variable specified, and so on. On the other hand, if the result data set produced contains more than one record, the operation will fail and an error will be generated. (If the result data set produced is empty, a NOT FOUND warning will be generated.) Thus, if you wanted to query a database for a single row of data and copy the data for that row into host variables without using a cursor, you could do so by executing a SELECT INTO statement that looks something like this:

SELECT DEPTNO, DEPTNAME
    INTO :DeptNumber, :DeptName
    FROM DEPARTMENT
    WHERE DEPTNO = 'B01'

The VALUES INTO SQL Statement

Like the SELECT INTO SQL statement, the VALUES INTO statement can be used to retrieve the data associated with a single record and copy it to one or more host variables. The basic syntax for the VALUES INTO statement is:

VALUES [Expression] INTO [[HostVariable] ,...]

or

VALUES ( [Expression] ,... ) INTO [[HostVariable] ,...]

where:

Expression: Identifies one or more values that are to be returned when the VALUES INTO statement is executed. The value specified for this option can be any valid SQL language element; however, DB2 UDB special registers and SQL functions are typically used.

HostVariable: Identifies one or more host variables to which values obtained from the result data set produced are to be copied.

As you can see, the VALUES INTO statement cannot be used to construct complex queries in the same way the SELECT INTO statement can. Like the SELECT INTO statement, when the VALUES INTO statement is executed, all data retrieved is stored in a result data set, and if this result data set contains only one record, the first value in that record is copied to the first host variable specified, the second value is copied to the second host variable specified, and so on. On the other hand, if the result data set produced contains more than one record, the operation will fail, and an error will be generated. (If the result data set produced is empty, a NOT FOUND warning will be generated.) The VALUES INTO statement is often used to obtain the value assigned to one or more of the DB2 UDB special registers available or to obtain the results of one or more SQL functions. (Refer to Chapter 2, "Database Objects and Programming Methods," for more information about the special registers that are provided with DB2 UDB.) Thus, if you wanted to retrieve the value of the CURRENT PATH special register and copy it to a host variable, you could do so by executing a VALUES INTO statement that looks something like this:

VALUES (CURRENT PATH) INTO :Path
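The single-row contract that SELECT INTO and VALUES INTO enforce can be emulated in any database API by fetching one row and then checking that no second row exists. A sketch using Python's sqlite3 module as a stand-in for DB2 (the error and NOT FOUND cases mirror the behavior described above):

```python
import sqlite3

# Hypothetical DEPARTMENT table with exactly one B01 row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DEPARTMENT (DEPTNO TEXT, DEPTNAME TEXT)")
conn.execute("INSERT INTO DEPARTMENT VALUES ('B01', 'PLANNING')")

cur = conn.execute(
    "SELECT DEPTNO, DEPTNAME FROM DEPARTMENT WHERE DEPTNO = 'B01'")
row = cur.fetchone()
if row is None:
    raise LookupError("NOT FOUND")          # the +100 warning case
if cur.fetchone() is not None:
    raise ValueError("more than one row")   # the error case
dept_number, dept_name = row                # the "host variables"
print(dept_number, dept_name)               # B01 PLANNING
```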

Transactions
A transaction (also known as a unit of work) is a sequence of one or more SQL operations grouped together as a single unit, usually within an application process. Such a unit is called "atomic" because, like atoms (before fission and fusion were discovered), it is indivisible: either all of its work is carried out, or none of it is. A given transaction can perform any number of SQL operations, from a single operation to many hundreds or even thousands, depending on what is considered a "single step" within your business logic. (It is important to note that the longer a transaction is, the more database concurrency decreases and the more resource locks are acquired; a very long transaction is usually a sign of a poorly written application.) The initiation and termination of a single transaction define points of data consistency within a database; either the effects of all operations performed within the transaction are applied to the database and made permanent (committed), or the effects of all operations performed are backed out (rolled back) and the database is returned to the state it was in before the transaction was initiated. In either case, all locks that were acquired on behalf of the transaction, with the exception of those locks acquired for held cursors, are immediately released. (However, any data pages that were copied to a buffer pool on behalf of the transaction will remain in the buffer pool until their storage space is needed; at that time, they will be removed.) In most cases, transactions are initiated the first time an executable SQL statement is executed after a connection to a database has been made or immediately after a preexisting transaction has been terminated.
Once initiated, transactions can be implicitly terminated using a feature known as "automatic commit" (in this case, each executable SQL statement is treated as a single transaction, and any changes made by that statement are applied to the database if the statement executes successfully or discarded if the statement fails), or they can be explicitly terminated by executing the COMMIT or the ROLLBACK SQL statement. The basic syntax for these two statements is:

COMMIT <WORK>

and

ROLLBACK <WORK>

When the COMMIT statement is used to terminate a transaction, all changes made to the database since the transaction began are made permanent. However, when the ROLLBACK statement is used, all changes made are backed out, and the database is returned to the state it was in just before the transaction began. Figure 3-6 shows the effects of a transaction that was terminated with a COMMIT statement; Figure 3-7 shows the effects of a transaction that was terminated with a ROLLBACK statement.
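The effect of each termination choice can be observed with any transactional engine. The following sketch uses Python's sqlite3 as a stand-in for DB2 (the ORDERS table and its rows are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (item TEXT)")
conn.commit()

# Transaction 1: the change is made permanent by COMMIT
conn.execute("INSERT INTO orders VALUES ('Lamp')")
conn.commit()

# Transaction 2: the change is backed out by ROLLBACK
conn.execute("INSERT INTO orders VALUES ('Radio')")
conn.rollback()

# Only the committed row survives
rows = conn.execute("SELECT item FROM orders").fetchall()
# rows == [('Lamp',)]
```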

Figure 3-6: Terminating a transaction with the COMMIT SQL statement.

Figure 3-7: Terminating a transaction with the ROLLBACK SQL statement.

It is important to remember that commit and rollback operations only have an effect on changes that have been made within the transaction that they terminate. So, to evaluate the effects of a series of transactions, you must be able to identify where each transaction begins, as well as when and how each transaction was terminated. Figure 3-8 shows how the effects of a series of transactions can be evaluated.

Figure 3-8: Evaluating the effects of a series of transactions.

Changes made by a transaction that have not been committed are usually inaccessible to other users and applications (unless those users and/or applications are running under the Uncommitted Read isolation level) and can be backed out with a rollback operation. However, once changes made by a transaction have been committed, they become accessible to all other users and/or applications and can only be removed by executing new SQL statements (within a new transaction).

What happens if a system failure occurs before a transaction's changes can be committed? If only the user/application is disconnected (for example, because of a network failure), the DB2 Database Manager backs out all uncommitted changes (by replaying information stored in the transaction log files), and the database is returned to the state it was in just before the transaction that was terminated unexpectedly began. On the other hand, if the database or the DB2 Database Manager is terminated (for example, because of a hard disk failure or a loss of power), the DB2 Database Manager will try to roll back all open transactions that it finds in the transaction log file the next time the database is restarted (which will take place automatically the next time a user attempts to connect to the database if the database configuration parameter autorestart has been set accordingly). Only after this succeeds will the database be placed online again (i.e., made accessible to users and applications).
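The evaluation technique is mechanical: trace each statement forward to the COMMIT or ROLLBACK that terminates its transaction. A sketch of such a trace, again using Python's sqlite3 in place of DB2 (the TAB1 table is invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (col1 INT, col2 TEXT)")
conn.commit()

conn.execute("INSERT INTO tab1 VALUES (123, 'Red')")
conn.execute("INSERT INTO tab1 VALUES (456, 'Yellow')")
conn.commit()                      # both inserts are now permanent

conn.execute("INSERT INTO tab1 VALUES (789, 'Blue')")
conn.rollback()                    # the Blue row is backed out

conn.execute("DELETE FROM tab1 WHERE col1 = 123")
conn.commit()                      # the delete is permanent

rows = conn.execute("SELECT col1, col2 FROM tab1").fetchall()
# rows == [(456, 'Yellow')]
```

Each change survives or disappears depending solely on how the transaction that contains it ended, exactly as Figure 3-8 illustrates.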

Transaction Management and Savepoints


Often, it is desirable to limit the amount of work performed within a single transaction so that locks acquired on behalf of the transaction are released in a timely manner. (When locks are held by one transaction, other transactions may be forced to wait for those locks to be freed before they can continue.) Additionally, if a large number of changes are made within a single transaction, it can take a considerable amount of time to back those changes out if the transaction is rolled back. However, using several small transactions to perform a single large task has its drawbacks as well. For one thing, the opportunity for data inconsistency to occur can be increased because business rules may have to cross several transaction boundaries. Furthermore, each time a COMMIT statement is used to terminate a transaction, the DB2 Database Manager must perform extra work to commit the current transaction and start a new one. (Another drawback of having multiple commit points for a particular operation is that portions of an operation might be committed, and therefore be visible to other applications, before the operation is fully completed.)

To get around these issues, DB2 UDB uses a mechanism known as a savepoint, which allows an application to break the work being performed by a single large transaction into one or more subsets. Using application savepoints avoids the exposure to "dirty data" that might occur when multiple commits are performed, while providing granular control over an operation. (You can use as many savepoints as you want within a single transaction; however, savepoints cannot be nested.) Savepoints are created by executing the SAVEPOINT SQL statement. The basic syntax for this statement is:

SAVEPOINT [SavepointName] <UNIQUE>
    ON ROLLBACK RETAIN CURSORS
    <ON ROLLBACK RETAIN LOCKS>

where:

SavepointName    Identifies the name to be assigned to the savepoint to be created.

If the UNIQUE option is specified when the SAVEPOINT statement is executed, the name assigned to the savepoint created will be unique and cannot be reused by the application that created it as long as the savepoint is active. Thus, if you wanted to create a savepoint named MY_SP, you could do so by executing a SAVEPOINT statement that looks like this:

SAVEPOINT MY_SP ON ROLLBACK RETAIN CURSORS

Once created, a savepoint can be used in conjunction with a special form of the ROLLBACK SQL statement to return a database to the state it was in at the point in time a particular savepoint was created. The syntax for this form of the ROLLBACK statement is:

ROLLBACK <WORK> TO SAVEPOINT <[SavepointName]>

where:

SavepointName    Identifies the name assigned to the savepoint to which all operations performed against the database are to be backed out.

And finally, when a savepoint is no longer needed, it can be released by executing the RELEASE SAVEPOINT SQL statement. The syntax for this statement is:

RELEASE <TO> SAVEPOINT <[SavepointName]>

where:

SavepointName    Identifies the name assigned to the savepoint that is to be released.

The following embedded SQL application (written in the C programming language) illustrates how a savepoint might be used with the ROLLBACK SQL statement in a single transaction to control the behavior of the transaction:

#include <stdio.h>
#include <stdlib.h>
#include <sql.h>

void main()
{
    /* Include The SQLCA Data Structure Variable */
    EXEC SQL INCLUDE SQLCA;

    /* Connect To The SAMPLE Database */
    EXEC SQL CONNECT TO SAMPLE USER db2admin USING ibmdb2;

    /* Add A Record To The ORDER Table */
    EXEC SQL INSERT INTO ORDER (ITEM) VALUES ('Lamp');

    /* Create And Set The First Savepoint */
    EXEC SQL SAVEPOINT SP1 ON ROLLBACK RETAIN CURSORS;

    /* Add Two More Records To The ORDER Table */
    EXEC SQL INSERT INTO ORDER (ITEM) VALUES ('Radio');
    EXEC SQL INSERT INTO ORDER (ITEM) VALUES ('Power Cord');

    /* Remove The Last Two Records Added To The ORDER Table */
    EXEC SQL ROLLBACK TO SAVEPOINT SP1;

    /* Commit The Transaction (Which Releases The Savepoint) */
    EXEC SQL COMMIT;

    /* Disconnect From The SAMPLE Database */
    EXEC SQL DISCONNECT CURRENT;
}

In this example, the last two records added to the ORDER table are removed when the ROLLBACK TO SAVEPOINT SQL statement is executed, but the first record added remains and is externalized to the database when the transaction is committed.

Once a savepoint is created, all subsequent SQL statements executed are associated with that savepoint until it is released, either explicitly by executing the RELEASE SAVEPOINT statement or implicitly by ending the transaction (or unit of work) in which the savepoint was created. In addition, when you issue a ROLLBACK TO SAVEPOINT SQL statement, the corresponding savepoint is not automatically released as soon as the rollback operation is completed. Instead, you can issue multiple ROLLBACK TO SAVEPOINT statements for a given transaction, and each time a ROLLBACK TO SAVEPOINT statement is executed, the database will be returned to the state it was in at the time the savepoint was created. (If multiple savepoints have been created, it is possible to roll back to any savepoint available; you are not required to successively roll back to every savepoint, in the reverse order in which they were created, to return the database to the state it was in when an earlier savepoint was created.)
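SQLite also supports SAVEPOINT and ROLLBACK TO SAVEPOINT, so the behavior described above (including rolling back to the same savepoint more than once) can be sketched with Python's sqlite3; note that DB2's ON ROLLBACK RETAIN CURSORS clause is DB2-specific and is omitted here, and the table is invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (item TEXT)")
conn.commit()

conn.execute("INSERT INTO orders VALUES ('Lamp')")
conn.execute("SAVEPOINT sp1")
conn.execute("INSERT INTO orders VALUES ('Radio')")
conn.execute("INSERT INTO orders VALUES ('Power Cord')")
conn.execute("ROLLBACK TO SAVEPOINT sp1")   # backs out Radio and Power Cord

# sp1 is still active: more work can be done and rolled back to it again
conn.execute("INSERT INTO orders VALUES ('Clock')")
conn.execute("ROLLBACK TO SAVEPOINT sp1")   # backs out Clock as well

conn.commit()   # ends the transaction, implicitly releasing the savepoint
rows = conn.execute("SELECT item FROM orders").fetchall()
# rows == [('Lamp',)]
```

The savepoint survives the first rollback and can be targeted again, and committing the transaction releases it, just as described for DB2.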

Practice Questions
Question 1
Given the following tables:

EMPLOYEES
---------------------------
EMPID   LASTNAME   DEPT
-----   --------   ----
1       Jagger     1
2       Richards   1
3       Watts      2
4       Wood       1

DEPARTMENT
--------------------
DEPTID   DEPTNAME
------   --------
1        Planning
2        Support

Assuming EMPLOYEES.DEPT is a foreign key on DEPARTMENT.DEPTID, if the following statements are executed:

INSERT INTO employees VALUES (5, 'Wyman', 2)
INSERT INTO employees VALUES (6, 'Jones', 1)
INSERT INTO employees VALUES (7, 'Stewart', 3)

How many rows will be stored in the EMPLOYEES table?

A. 4
B. 5
C. 6
D. 7

Question 2
Given the following tables:

COUNTRY
----------------
ID   NAME
--   ------
1    USA
2    Canada
3    Mexico

NATION
-------------------
ID   NAME
--   -------------
1    USA
2    Australia
3    Great Britain

and the code:

EXEC SQL DECLARE c1 CURSOR FOR
    SELECT * FROM country
    UNION ALL
    SELECT * FROM nation
EXEC SQL OPEN c1

How many rows will exist in the result data set produced?

A. 3
B. 4
C. 5
D. 6

Question 3
Given the following tables:

TAB1
----------------
COL_1   COL_2
-----   -----
A       10
B       12
C       14

TAB2
----------------
COL_A   COL_B
-----   -----
A       21
C       23
D       25

and the following query that executes successfully:

SELECT * FROM tab1 LEFT OUTER JOIN tab2 ON tab1.col_1 = tab2.col_a

How many rows will be returned?

A. 4
B. 3
C. 2
D. 1

Question 4
The following CREATE TABLE statement was used to create a table named TABLEA:

CREATE TABLE tablea (col1 INTEGER)

Which of the following statements can be used to remove all records stored in table TABLEA?

A. DELETE FROM tablea
B. DROP * FROM tablea
C. REMOVE ALL FROM tablea
D. UPDATE tablea SET col1 = NULL

Question 5
Given the following tables:

TAB1
------
COL1
----
10
10
12
14

TAB2
------
COL1
----
10
12
12

If the following SQL statement executes successfully:

UPDATE tab1 SET col1 = col1 + 10
    WHERE col1 IN (SELECT * FROM tab2)

How many rows in TAB1 will be modified?

A. 0
B. 1
C. 2
D. 3

Question 6
Which of the following will extend the functionality provided by the MIN() built-in function to a distinct data type?

A. A distinct type extender
B. A user-defined data type
C. A sourced user-defined function
D. A function extender

Question 7
Given the following SQL statements:

CREATE TABLE empinfo (name CHAR(10), salary DEC NOT NULL WITH DEFAULT)
INSERT INTO empinfo VALUES ('Smith', 20000)
INSERT INTO empinfo (name) VALUES ('Jones')
INSERT INTO empinfo VALUES ('Doe', 25000)
INSERT INTO empinfo (name) VALUES (NULL)

Which two of the following statements will only retrieve one row?

A. SELECT salary/(SELECT SUM(salary) FROM empinfo) FROM empinfo
B. SELECT SUM(salary)/COUNT(*) FROM empinfo
C. SELECT COALESCE(LCASE(name), 'Unknown') FROM empinfo
D. SELECT salary FROM empinfo WHERE salary IN (SELECT salary/(SELECT SUM(salary) FROM empinfo) FROM empinfo)
E. SELECT COALESCE(MIN(salary), 0) FROM empinfo

Question 8
Which of the following will return values that are only in upper case?

A. SELECT name FROM employee WHERE UCASE(name) = 'smith'
B. SELECT name FROM employee WHERE UCASE(name) = 'SMITH'
C. SELECT UCASE(name) FROM employee WHERE name = 'smith'
D. SELECT name FROM employee WHERE name IN (SELECT name FROM employee WHERE UCASE(name) = UCASE('smith'))

Question 9
Given the following SQL statement:

WITH emp AS (SELECT lastname, salary FROM employees)
    SELECT * FROM emp WHERE salary >= 35000.00

What does EMP refer to?

A. A user table
B. A declared temporary table
C. A system catalog table
D. A local temporary table that resides in memory

Question 10
Which of the following is NOT used to retrieve a single row of data?

A. SELECT
B. SELECT INTO
C. DECLARE CURSOR
D. VALUES

Question 11
Given the following table and the statements below:

TAB1
----------------
COL_1   COL_2
-----   -----
A       10
B       20
C       30
D       40
E       50

DECLARE c1 CURSOR WITH HOLD FOR SELECT * FROM tab1 ORDER BY col_1
OPEN c1
FETCH c1
FETCH c1
FETCH c1
COMMIT
FETCH c1
CLOSE c1
FETCH c1

Which of the following is the last value obtained for COL_2?

A. 20
B. 30
C. 40
D. 50

Question 12
Which of the following is NOT a characteristic of all cursors used in an embedded SQL application?

A. Must be declared before they can be used
B. Must be reserved in the database
C. Must be unique within a source code file
D. May be updatable

Question 13
Given the following two tables:

TAB1
----------------
COL_1   COL_2
-----   -----
A       10
B       12
C       14

TAB2
----------------
COL_A   COL_B
-----   -----
A       21
C       23
D       25

Assuming the following results are desired:

COL_1   COL_2   COL_A   COL_B
-----   -----   -----   -----
A       10      A       21
B       12      -       -
C       14      C       23
-       -       D       25

Which of the following joins will produce the desired results?

A. SELECT * FROM tab1 INNER JOIN tab2 ON col_1 = col_a
B. SELECT * FROM tab1 LEFT OUTER JOIN tab2 ON col_1 = col_a
C. SELECT * FROM tab1 RIGHT OUTER JOIN tab2 ON col_1 = col_a
D. SELECT * FROM tab1 FULL OUTER JOIN tab2 ON col_1 = col_a

Question 14
Given the following set of statements:

CREATE TABLE tab1 (col1 INTEGER, col2 CHAR(20))
COMMIT
INSERT INTO tab1 VALUES (123, 'Red')
INSERT INTO tab1 VALUES (456, 'Yellow')
COMMIT
DELETE FROM tab1 WHERE col1 = 123
COMMIT
INSERT INTO tab1 VALUES (789, 'Blue')
ROLLBACK
INSERT INTO tab1 VALUES (789, 'Green')
ROLLBACK
UPDATE tab1 SET col2 = NULL
COMMIT

Which of the following records would be returned by the statement SELECT * FROM tab1?

A. COL1    COL2
   ------  -------
   123     Red

   1 record(s) selected.

B. COL1    COL2
   ------  -------
   456     Yellow

   1 record(s) selected.

C. COL1    COL2
   ------  -------
   456     -

   1 record(s) selected.

D. COL1    COL2
   ------  -------
   789     Green

   1 record(s) selected.

Question 15
Which of the following is NOT a benefit of a user-defined function?

A. Simplifies application maintenance
B. Improves application concurrency
C. Provides built-in function support for user-defined data types
D. Name and calling convention controllable

Question 16
Given the following tables:

TABLEA
---------------
EMPID   NAME
-----   -----
1       USER1
2       USER2

TABLEB
--------------------------
EMPID   WEEKNO   PAYAMT
-----   ------   -------
1       1        1000.00
1       2        1000.00
2       1        2000.00

and the fact that TABLEB was defined as follows:

CREATE TABLE tableb (
    empid  SMALLINT,
    weekno SMALLINT,
    payamt DECIMAL(6,2),
    CONSTRAINT const1 FOREIGN KEY (empid)
        REFERENCES tablea(empid) ON DELETE NO ACTION)

If the following command is issued:

DELETE FROM tablea WHERE empid = 2

How many rows will be deleted from TABLEA and TABLEB?

A. 0, 0
B. 0, 1
C. 1, 0
D. 1, 1

Question 17
Given the following table:

TAB1
----------------
COL1    COL2
-----   ----
abc     1
bcde    2
cdefg   4

Which of the following statements will retrieve the largest computed value?

A. SELECT SUM(col2)/COUNT(*) FROM tab1
B. SELECT MAX(col2) FROM tab1
C. SELECT LENGTH(col1) FROM tab1 WHERE col2 = 4
D. SELECT STRLEN(col1) FROM tab1 WHERE col2 = 4

Question 18
Which of the following types of functions requires a collection of like values as its input?

A. SCALAR
B. COLUMN
C. TABLE
D. GROUPING

Question 19
If the following statements are executed:

EXEC SQL DECLARE c1 CURSOR FOR
    SELECT deptid, deptname FROM department
EXEC SQL OPEN c1

and an empty result data set is produced, what will happen when the following statement is executed?

EXEC SQL FETCH c1 INTO :id, :name

A. The cursor will remain open and the return code -100 will be returned
B. The cursor will be closed and the return code -100 will be returned
C. The cursor will remain open and the return code 100 will be returned
D. The cursor will be closed and the return code 100 will be returned

Question 20
If the following SQL statements are executed:

CREATE TABLE tab1 (col1 INT, col2 INT)
COMMIT
INSERT INTO tab1 VALUES (1, 1)
INSERT INTO tab1 VALUES (2, 2)
SAVEPOINT sp1
UPDATE tab1 SET col1 = 10 WHERE col2 = 1
DELETE FROM tab1 WHERE col2 = 2
SAVEPOINT sp2
INSERT INTO tab1 VALUES (20, 20)
INSERT INTO tab1 VALUES (40, 40)
ROLLBACK TO SAVEPOINT sp1
INSERT INTO tab1 VALUES (20, 20)
COMMIT

Assuming auto commit was not used, how many rows will be returned when the following statement is executed?

SELECT * FROM tab1

A. 0
B. 1
C. 2
D. 3

Question 21
Given the following table:

T1
-----------
C1   C2
--   --
1    10
2    20
3    30

Which of the following cursor definitions will create a cursor named CUR1 that can be used to update records found in table T1 without restrictions?

A. DECLARE cur1 CURSOR FOR SELECT * FROM t1 FOR UPDATE
B. DECLARE cur1 CURSOR FOR SELECT * FROM t1 FOR UPDATE OF t1
C. DECLARE cur1 CURSOR FOR UPDATE OF c1 FROM t1
D. DECLARE cur1 CURSOR FOR SELECT * FROM t1 FOR UPDATE OF c1

Question 22
Which of the following does NOT take place when a transaction is committed?

A. All cursors opened within the transaction that were not defined with the WITH HOLD clause are destroyed
B. Locks acquired by the transaction are released
C. Any data pages that were copied to a buffer pool on behalf of the transaction are deleted
D. Changes made to data by the transaction are permanently recorded in the database

Answers

Question 1
The correct answer is C. The EMPLOYEES table contained 4 records initially. Then, an attempt was made to add three more records; however, the statement "INSERT INTO employees VALUES (7, 'Stewart', 3)" failed because it violates the insert rule of the referential constraint between the EMPLOYEES table and the DEPARTMENT table (there is no record in the DEPARTMENT table that has a DEPTID value of 3). Thus, only 2 records were added to the EMPLOYEES table, bringing the total number of records to 6.

Question 2
The correct answer is D. When the UNION ALL set operator is used, the result data sets produced by each individual query are combined, and any duplicate rows found are retained. Thus, the result data set produced for cursor C1 will contain all records found in each table.

Question 3
The correct answer is B. When a left outer join operation is performed, rows that would have been returned by an inner join operation, together with all rows stored in the leftmost table of the join operation (i.e., the table listed first in the OUTER JOIN clause) that would have been eliminated by the inner join operation, are returned in the result data set produced. Therefore, the result data set produced by this join operation would look like this:

COL_1   COL_2   COL_A   COL_B
-----   -----   -----   -----
A       10      A       21
B       12      -       -
C       14      C       23

Question 4
The correct answer is A. Although the UPDATE statement can be used to delete individual values from a base table (by setting those values to NULL), it cannot be used to remove entire rows. When one or more rows of data need to be removed from a base table, the DELETE SQL statement must be used instead. (The DROP statement is used to delete both an object and its data; in this case, the DROP statement is coded incorrectly. There is no REMOVE statement.)

Question 5
The correct answer is D. The IN predicate is used to define a comparison relationship in which a value is checked to see whether it matches a value in a finite set of values. This finite set of values can consist of one or more literal values that are coded directly in the UPDATE statement, or it can be composed of the non-null values found in the result data set generated by a subquery. In this case, an update operation is performed on every row in TAB1 that has a matching value in TAB2.

Question 6
The correct answer is C. User-defined functions are often created to provide the same functionality for user-defined data types that built-in functions provide for the built-in data types upon which user-defined data types are based. (A user-defined function can be an external function written in a high-level programming language or a sourced function whose implementation is inherited from some other function that already exists.)

Question 7
The correct answers are B and E. When only columnar functions are used in a SELECT statement, often one row of data is returned. Just the opposite is true when scalar functions are used. The statement "SELECT SUM(salary)/COUNT(*) FROM empinfo" contains only columnar functions, as does the statement "SELECT COALESCE(MIN(salary), 0) FROM empinfo"; in this case, the COALESCE() function uses output from the MIN() function as its input, so, for all intents and purposes, only one columnar function was used.

Question 8
The correct answer is C. The key wording here is "return values in upper case." In order to return values in upper case, the UCASE() function must be used to convert all values retrieved. If the UCASE() function is used in the WHERE clause of a SELECT statement, the conversion is applied for the purpose of locating matching values, not to return values that have been converted to upper case.

Question 9
The correct answer is D. The SQL statement "WITH emp AS (SELECT lastname, salary FROM employees) SELECT * FROM emp WHERE salary >= 35000.00" is a common table expression, and EMP refers to a local temporary table that resides in memory. (Common table expressions are mechanisms used to construct local temporary tables that reside in memory and exist only for the life of the SQL statement that defines them; additionally, tables that are created by a common table expression can be referenced only by the SQL statement that created them.)

Question 10
The correct answer is C. A SELECT statement can be constructed so it only returns one row of data; the SELECT INTO statement must return no more than one row of data, or it will fail; the same is true for the VALUES and VALUES INTO statements. However, a cursor is always used to retrieve multiple rows of data.

Question 11
The correct answer is C. When a cursor that has been declared with the WITH HOLD option specified (as in the example shown) is opened, it will remain open across transaction boundaries until it is explicitly closed; otherwise, it will be implicitly closed when the transaction that opened it is terminated. In this example, the cursor is opened, the first three rows are fetched from it, the transaction is committed (but the cursor is not closed), another row is fetched from the cursor, and then the cursor is closed. Thus, the last row fetched will be:

COL_1   COL_2
-----   -----
D       40

Question 12
The correct answer is B. All cursors must be declared before they can be used; their names must be unique within the source code file that uses them; and they can be read-only, updatable, or ambiguous. However, because cursors are only used by applications (to retrieve and process multiple rows of data), their definitions are not stored in the database, nor is any space reserved in the database for them.

Question 13
The correct answer is D. When a full outer join operation is performed, rows that would have been returned by an inner join operation and all rows stored in both tables of the join operation that would have been eliminated by the inner join operation are returned in the result data set produced.

Question 14
The correct answer is C. Table TAB1 is created, two rows are inserted, and the first row is deleted because a COMMIT statement follows each of these operations. The next two rows that are inserted are removed because a ROLLBACK statement follows their insertion. And finally, the value for COL2 of all rows is set to null because this operation is followed by a COMMIT statement.

Question 15
The correct answer is B. User-defined functions simplify application maintenance (because the processing they perform is managed by the database and not the application that uses them). They are often used to provide the same functionality for user-defined data types that built-in functions provide for the built-in data types that user-defined data types are based on, and their name and calling convention is controlled by the user who creates them. They have no impact on concurrency.

Question 16
The correct answer is A. The ON DELETE NO ACTION definition ensures that when a delete operation is performed on the parent table in a referential constraint, the value for the foreign key of each row in the child table will have a matching value in the parent key of the parent table (after all other referential constraints have been applied). Therefore, no row will be deleted from TABLEA because a row exists in TABLEB that references the row that the DELETE statement is trying to remove. Because the ON DELETE CASCADE definition was not used, no row will be deleted from TABLEB.

Question 17
The correct answer is C. The statement "SELECT SUM(col2)/COUNT(*) FROM tab1" will return the value 2; the statement "SELECT MAX(col2) FROM tab1" will return the value 4; and the statement "SELECT STRLEN(col1) FROM tab1 WHERE col2 = 4" is invalid because STRLEN() is not a valid function. However, the statement "SELECT LENGTH(col1) FROM tab1 WHERE col2 = 4" will return the value 5, which is the largest computed value.

Question 18
The correct answer is B. The argument of a columnar function is a collection of like values. The arguments of a scalar function are individual scalar values, which can be of different data types and can have different meanings. Table functions are special functions because they return a table to the SQL statement that references them and can only be specified in the FROM clause of a SELECT statement. There are no grouping functions.

Question 19
The correct answer is C. When a cursor is opened, it is placed in the "Open" state, and the cursor pointer is positioned before the first row of data in the result data set produced; if the result data set is empty, the position of the cursor is effectively "after the last row," and any subsequent FETCH operations performed will generate a NOT FOUND (SQLCODE +100, SQLSTATE 02000) condition.

Question 20
The correct answer is D. In this example, the effects of the update and delete operations and the results of the two insert operations that follow the delete operation are removed when the ROLLBACK TO SAVEPOINT sp1 SQL statement is executed. The effects of the first two insert operations and the last insert operation are externalized to the database when the transaction is committed; thus, 3 rows are added to the table TAB1.

Question 21
The correct answer is A. The statement "DECLARE cur1 CURSOR FOR SELECT * FROM t1 FOR UPDATE OF t1" is not valid. Neither is the statement "DECLARE cur1 CURSOR FOR UPDATE OF c1 FROM t1". The statement "DECLARE cur1 CURSOR FOR SELECT * FROM t1 FOR UPDATE OF c1" creates a cursor that allows only update operations to be performed on column C1.

Question 22
The correct answer is C. When the COMMIT statement is used to terminate a transaction, all changes made to the database since the transaction began are made permanent, and any locks that were acquired on behalf of the transaction, with the exception of those locks acquired for held cursors, are immediately released. However, any data pages that were copied to a buffer pool on behalf of a transaction will remain in the buffer pool until their storage space is needed; at that time they will be removed. Furthermore, if a cursor is created within a transaction and the WITH HOLD option is specified when the DECLARE CURSOR statement is executed, the cursor created will remain open (once it has been opened) across transaction boundaries and must be explicitly closed.
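Several of the set-operation and join behaviors tested above can be reproduced with any SQL engine. A sketch using Python's sqlite3, with data mirroring the COUNTRY/NATION and TAB1/TAB2 tables from Questions 2 and 3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (id INT, name TEXT);
    INSERT INTO country VALUES (1,'USA'),(2,'Canada'),(3,'Mexico');
    CREATE TABLE nation (id INT, name TEXT);
    INSERT INTO nation VALUES (1,'USA'),(2,'Australia'),(3,'Great Britain');
    CREATE TABLE tab1 (col_1 TEXT, col_2 INT);
    INSERT INTO tab1 VALUES ('A',10),('B',12),('C',14);
    CREATE TABLE tab2 (col_a TEXT, col_b INT);
    INSERT INTO tab2 VALUES ('A',21),('C',23),('D',25);
""")

# UNION ALL retains duplicates: 3 + 3 = 6 rows (Question 2)
union_all = conn.execute(
    "SELECT * FROM country UNION ALL SELECT * FROM nation").fetchall()

# A left outer join keeps every row of the leftmost table: 3 rows (Question 3)
left_join = conn.execute(
    "SELECT * FROM tab1 LEFT OUTER JOIN tab2 "
    "ON tab1.col_1 = tab2.col_a").fetchall()
# len(union_all) == 6, len(left_join) == 3
```

The unmatched row ('B', 12) comes back padded with NULLs on the TAB2 side, exactly as the answer to Question 3 shows with dashes.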

Chapter 4: Embedded SQL Programming


Overview
Thirteen percent (13%) of the DB2 UDB V8.1 Family Application Development exam (Exam 703) is designed to test your knowledge of embedded SQL programming and to test your ability to create a simple embedded SQL application. The questions that make up this portion of the exam are intended to evaluate the following: Your ability to identify the steps involved in developing an embedded SQL application. Your ability to declare host variables and to use host variables in a query. Your ability to declare indicator variables and to use indicator variables in a query. Your ability to explain and analyze the contents of an SQLCA data structure variable. Your ability to establish a connection to a database within an embedded SQL application. Your ability to capture and process errors when they occur. Your ability to identify the steps required to convert a source code file containing embedded SQL into an executable application.

This chapter is designed to introduce you to embedded SQL programming and to walk you through the basic steps used to construct an embedded SQL application. This chapter is also designed to introduce you to the process used to convert one or more source code files containing embedded SQL into an executable application. Terms you will learn: Structured Query Language (SQL) Embedded SQL EXEC SQL Static SQL Dynamic SQL Host variables Declare section BEGIN DECLARE SECTION END DECLARE SECTION Indicator variables SQL Communications Area (SQLCA) data structure SQL Descriptor Area (SQLDA) data structure SQLVAR variables Base SQLVARs Secondary SQLVARs INCLUDE CONNECT "Type 1" connections Remote unit of work "Type 2" connections Application-directed distributed unit of work Connection states SET CONNECTION PREPARE EXECUTE EXECUTE IMMEDIATE Parameter markers Typed parameter markers Untyped parameter markers Cursor SELECT INTO VALUES INTO DISCONNECT SQL return code WHENEVER Administrative APIs Get Error Message API Get SQLSTATE Message API SQL precompiler Package Precompiling Compiling Linking Binding Deferred binding Techniques you will master: Understanding how embedded SQL applications are developed. Recognizing the difference between static SQL and dynamic SQL. Knowing how to declare host variables and indicator variables.

Knowing how to use host variables and indicator variables in an embedded SQL statement. Knowing how to analyze the contents of an SQL Communications Area (SQLCA) data structure variable. Knowing how to establish a connection to a database server from within an embedded SQL application. Recognizing the difference between Type 1 and Type 2 connections. Knowing how to use the WHENEVER SQL statement and the Get Error Message API to process errors and obtain diagnostic information. Knowing how to convert a source code file that contains embedded SQL into an executable application

An Introduction to Embedded SQL


Earlier, we saw that Structured Query Language (SQL) is a standardized language used to work with database objects and the data they contain. SQL is composed of several statements that are used to define, alter, and destroy database objects as well as add, update, delete, and retrieve data values. However, because SQL is nonprocedural by design, it is not a general-purpose programming language (SQL statements are executed by DB2, not by the operating system). Therefore, database applications are normally developed by combining the decision and sequence control of a high-level programming language with the data storage, manipulation, and retrieval capabilities of SQL. Several methods are available for sending SQL statements from an application to DB2 for processing, but the simplest technique is a method known as embedded SQL. As the name implies, embedded SQL applications are constructed by embedding SQL statements directly into one or more source code files that will be used to create a database application. Embedded SQL statements can be static or dynamic, and as we will soon see, each has its advantages and disadvantages. One of the drawbacks to developing applications using embedded SQL is that high-level programming language compilers do not recognize, and therefore cannot interpret, any SQL statements encountered. Because of this, source code files containing embedded SQL statements must be preprocessed before they can be compiled (and linked) to produce a database application. To facilitate this preprocessing, every SQL statement coded in a high-level programming language source code file must be prefixed with the keywords "EXEC SQL" and terminated with either a semicolon (C/C++, FORTRAN) or the keywords "END-EXEC" (COBOL).
When the preprocessor (a special tool known as the SQL precompiler, which we will look at a little later) encounters these keywords in a source code file, it replaces all text that follows (until a semicolon or the keywords "END-EXEC" are found) with a DB2 UDB-specific function call that forwards the SQL statement specified to the DB2 Database Manager for processing. Additionally, the SQL precompiler performs error checking on each SQL statement used. Likewise, the DB2 Database Manager cannot work directly with high-level programming language variables. Instead, it must use special variables known as host variables to move data between an application and a database. (We will take a closer look at host variables a little later.) Host variables look like any other high-level programming language variables, so to set them apart, they must be defined within a special section known as a declare section. In order for the SQL precompiler to distinguish host variables from other text used in an SQL statement, all references to host variables must be preceded by a colon (:).

Static SQL
You may recall that in Chapter 2, "Database Objects and Programming Methods," we saw that a static SQL statement is an SQL statement that can be hard-coded in an application program at the time of development because information about its structure and the objects (i.e., tables, columns, and data types) it is intended to interact with is known in advance. Because the details of a static SQL statement are known at development time, the work of analyzing the statement and selecting the optimum data access plan to use when executing the statement is done by the DB2 optimizer as part of the development process. Thus, static SQL statements execute quickly because their operational form already exists in the database and does not have to be generated at application run time. The downside to this is that all static SQL statements must be prepared (in other words, their access plans must be generated and stored in the database) before they can be executed. The SQL statements themselves cannot be altered at run time, and each application that uses static SQL must "bind" its operational package(s) to every database the application is to interact with.

Note: Because static SQL applications require prior knowledge of database objects, changes made to these objects after the application is developed can produce undesirable results.

The following are examples of static SQL statements:

SELECT COUNT(*) FROM EMPLOYEES

UPDATE EMPLOYEES SET LASTNAME = 'Jones' WHERE EMPID = '001'

SELECT MAX(SALARY), MIN(SALARY) INTO :MaxSalary, :MinSalary FROM EMPLOYEES

Generally, static SQL statements are well suited for high-performance applications that execute predefined operations against a known set of database objects.

Dynamic SQL
Although static SQL statements are relatively easy to incorporate into an application, their use is somewhat limited because their format must be known in advance and they require host variables to move data between the application and a database. Dynamic SQL statements, however, are much more flexible because they can be constructed at application run time; information about a dynamic SQL statement's structure and the objects (i.e., tables, columns, and data types) it is intended to interact with does not have to be known at development time. Furthermore, because dynamic SQL statements do not have a precoded, fixed format, the data object(s) used can change each time the statement is executed. Dynamic SQL statements can also work directly with application variables, so host variables are not required.

Although dynamic SQL statements are generally more flexible than static SQL statements, they are typically more complicated to incorporate into an application. And because the work of analyzing the statement and selecting the optimum data access plan is done at application run time, dynamic SQL statements can take longer to execute than their equivalent static SQL counterparts. (Because dynamic SQL statements can take advantage of the database statistics available at run time, some cases exist in which a dynamic SQL statement will execute faster than an equivalent static SQL statement, but those are the exception and not the norm.)

The following are examples of dynamic SQL statements:

SELECT COUNT(*) FROM ?

INSERT INTO EMPLOYEES VALUES (?, ?)

DELETE FROM DEPARTMENT WHERE DEPTID = ?

Generally, dynamic SQL statements are well suited for applications that interact with a rapidly changing database or that allow users to define and execute ad-hoc queries. Dynamic SQL statements are also useful when data is to be added to a database and the data values to be added are not known in advance (many applications use dynamic SQL specifically for this reason).

The SQL Statements Available and How They Can Be Used in an Application
Essentially, any SQL statement that is recognized by DB2 UDB can be embedded in an application program source code file. However, how a statement can be used (statically or dynamically) varies. The SQL statements that are recognized by DB2 UDB, along with information on how they can be used, can be seen in Table 4-1.

Table 4-1: DB2 UDB SQL Statements

SQL Statement | Purpose | Usage

Embedded SQL Application Construction
BEGIN DECLARE SECTION | Marks the beginning of a host variable declaration section. | Static only
END DECLARE SECTION | Marks the end of a host variable declaration section. | Static only
DECLARE | Defines an SQL variable or condition (typically used in stored procedures). | Static only
INCLUDE | Inserts declarations into a source code file. | Static only
WHENEVER | Defines (and undefines) actions that are to be taken when SQL errors or warnings are generated. | Static only
BEGIN COMPOUND | Marks the beginning of a compound SQL statement block. (Compound SQL statement blocks are used to combine two or more SQL substatements into one executable block that is treated as a single SQL statement.) | Static only
END COMPOUND | Marks the end of a compound SQL statement block. | Static only
BEGIN ATOMIC | Marks the beginning of a compound dynamic SQL statement block. | Dynamic only
END | Marks the end of a compound dynamic SQL statement block. | Dynamic only
FREE LOCATOR | Removes the association between a large object (LOB) locator variable and its value. | Static only
CREATE TRANSFORM | Defines transformation functions or methods, identified by a group name, that are used to exchange structured data type values with host language programs and external functions and methods. | Static, and in special cases, dynamic

Connection and Transaction Management
CONNECT | Establishes a connection to a specific database and establishes rules for either a remote unit of work (Type 1) or an application-directed unit of work (Type 2). | Static only
SET CONNECTION | Changes the state of a database connection from "Dormant" to "Current," thereby making the specified connection the current active connection. | Static only
RELEASE | Places one or more connections in "Release Pending" state. (All connections in this state are automatically terminated when the current transaction is committed.) | Static only
DISCONNECT | Terminates and closes a database connection. | Static only
LOCK TABLE | Prevents concurrent transactions from changing and/or accessing data stored in a table. | Static or dynamic
COMMIT | Terminates the current transaction and makes all modifications made by the transaction permanent. | Static or dynamic
ROLLBACK | Terminates the current transaction and backs out all modifications made by the transaction. | Static or dynamic
SAVEPOINT | Sets a savepoint within a transaction. | Static or dynamic
RELEASE SAVEPOINT | Releases a savepoint that was set earlier within a transaction. | Static or dynamic

Database Control Language (DCL) Statements
GRANT | Gives one or more users and/or groups one or more authorizations and/or privileges. | Static or dynamic
REVOKE | Takes one or more authorizations and/or privileges away from one or more users and/or groups. | Static or dynamic

Data Definition Language (DDL) Statements
CREATE DATABASE PARTITION GROUP | Defines and creates a new database partition group. | Static, and in special cases, dynamic
ALTER DATABASE PARTITION GROUP | Modifies an existing database partition group. | Static, and in special cases, dynamic
CREATE BUFFERPOOL | Defines and creates a new buffer pool. | Static, and in special cases, dynamic
ALTER BUFFERPOOL | Modifies an existing buffer pool. | Static, and in special cases, dynamic
CREATE TABLESPACE | Defines and creates a new tablespace. | Static, and in special cases, dynamic
ALTER TABLESPACE | Modifies an existing tablespace. | Static, and in special cases, dynamic
RENAME TABLESPACE | Renames an existing tablespace. | Static, and in special cases, dynamic
CREATE TABLE | Defines and creates a new table. | Static, and in special cases, dynamic
ALTER TABLE | Modifies an existing table. | Static, and in special cases, dynamic
CREATE VIEW | Defines and creates a new view. | Static, and in special cases, dynamic
ALTER VIEW | Modifies an existing view by altering a reference type column to add scope. | Static, and in special cases, dynamic
CREATE SCHEMA | Defines and creates a new schema. | Static, and in special cases, dynamic
CREATE ALIAS | Defines and creates a new alias. | Static, and in special cases, dynamic
CREATE INDEX | Defines and creates a new index. | Static, and in special cases, dynamic
CREATE INDEX EXTENSION | Defines and creates a new extension object to use with indexes on tables that have structured type or distinct type columns. | Static, and in special cases, dynamic
CREATE DISTINCT TYPE | Defines and creates a new distinct data type. | Static, and in special cases, dynamic
ALTER TYPE | Modifies an existing user-defined structured type. | Static, and in special cases, dynamic
CREATE FUNCTION | Creates and defines (or registers) a new user-defined function. | Static, and in special cases, dynamic
ALTER FUNCTION | Modifies the properties of an existing user-defined function. | Static, and in special cases, dynamic
CREATE METHOD | Associates a method body with a method specification that is already part of the definition of a user-defined structured type. | Static, and in special cases, dynamic
ALTER METHOD | Modifies an existing method by changing the method body associated with the method. | Static, and in special cases, dynamic
CREATE PROCEDURE | Creates and defines (or registers) a new stored procedure. | Static, and in special cases, dynamic
ALTER PROCEDURE | Modifies an existing stored procedure by changing the properties of the procedure. | Static, and in special cases, dynamic
CREATE SEQUENCE | Creates and defines a new sequence. | Static, and in special cases, dynamic
ALTER SEQUENCE | Modifies an existing sequence. | Static, and in special cases, dynamic
CREATE TRIGGER | Creates and defines a new trigger. | Static, and in special cases, dynamic
DROP | Deletes an existing object. | Static, and in special cases, dynamic
COMMENT | Adds or replaces comments in the catalog descriptions of various objects. | Static, and in special cases, dynamic
RENAME | Renames an existing table or index. | Static, and in special cases, dynamic

Data Manipulation Language (DML) Statements
INSERT | Adds new data to a table (may or may not be via an updatable view). | Static or dynamic
UPDATE | Modifies existing data stored in a table. | Static or dynamic
DELETE | Removes existing data from a table. | Static or dynamic
SELECT | Retrieves one or more rows of data from a database and returns them to a result data set. | Static or dynamic
SELECT INTO | Retrieves one (and only one) row of data from a database and returns it to one or more host variables. | Static only
VALUES | Produces a result data set that contains only one row of data. | Static or dynamic
VALUES INTO | Produces a result data set that contains only one row of data and returns its value(s) to one or more host variables. | Static only

SQL Statement Processing
PREPARE | Prepares an SQL statement for execution. | Static only
DESCRIBE | Obtains information about a prepared SQL statement. | Static only
EXECUTE | Executes a prepared SQL statement. | Static only
EXECUTE IMMEDIATE | Prepares and executes an SQL statement. | Static only
DECLARE CURSOR | Defines a new cursor. | Static only
OPEN | Opens a declared cursor. | Static only
FETCH | Advances the cursor pointer to the next row in a result data set, retrieves a single row of data, and copies any data retrieved to all host variables specified. | Static only
CLOSE | Closes an open cursor. | Static only

Federated Server Management
CREATE SERVER | Defines a new data source to a federated server. | Static, and in special cases, dynamic
ALTER SERVER | Modifies the definition of a data source or makes changes in a data source configuration that will persist across multiple connections. | Static, and in special cases, dynamic
CREATE NICKNAME | Defines and creates a new nickname for a table or view in a federated data source. | Static, and in special cases, dynamic
ALTER NICKNAME | Modifies a federated database's representation of a data source, table, or view. | Static, and in special cases, dynamic
CREATE WRAPPER | Defines and creates a new wrapper. (A wrapper is a mechanism by which a federated server interacts with a certain category of data sources.) | Static, and in special cases, dynamic
ALTER WRAPPER | Modifies the properties of a wrapper. | Static, and in special cases, dynamic
CREATE TYPE MAPPING | Creates a mapping between the data type of a column of a federated data source table or view and a corresponding data source data type. | Static, and in special cases, dynamic
CREATE FUNCTION MAPPING | Creates a mapping between a federated data source function or function template and a corresponding data source function. | Static, and in special cases, dynamic
CREATE USER MAPPING | Creates a mapping between an authorization ID that uses a federated database and the authorization ID and password used at a specified data source. | Static, and in special cases, dynamic
ALTER USER MAPPING | Modifies the authorization ID or password that is used at a data source for a specified federated server authorization ID. | Static, and in special cases, dynamic
SET SERVER OPTION | Specifies a server option setting that is to remain in effect while a user or application is connected to a federated database. | Static or dynamic
SET PASSTHRU | Opens and closes a session for submitting a data source's native SQL directly to that data source. | Static or dynamic

Special Register Manipulation
SET CURRENT DEFAULT TRANSFORM GROUP | Changes the value of the CURRENT DEFAULT TRANSFORM GROUP special register. | Static or dynamic
SET CURRENT DEGREE | Changes the value of the CURRENT DEGREE special register. | Static or dynamic
SET CURRENT EXPLAIN MODE | Changes the value of the CURRENT EXPLAIN MODE special register. | Static or dynamic
SET CURRENT EXPLAIN SNAPSHOT | Changes the value of the CURRENT EXPLAIN SNAPSHOT special register. | Static or dynamic
SET CURRENT MAINTAINED TABLE TYPES | Changes the value of the CURRENT MAINTAINED TABLE TYPES special register. | Static or dynamic
SET CURRENT PACKAGESET | Changes the value of the CURRENT PACKAGESET special register. | Static or dynamic
SET CURRENT QUERY OPTIMIZATION | Changes the value of the CURRENT QUERY OPTIMIZATION special register. | Static or dynamic
SET CURRENT REFRESH AGE | Changes the value of the CURRENT REFRESH AGE special register. | Static or dynamic
SET PATH | Changes the value of the CURRENT PATH special register. | Static or dynamic
SET SCHEMA | Changes the value of the CURRENT SCHEMA special register. | Static or dynamic

Database Monitoring
CREATE EVENT MONITOR | Defines and creates an event monitor (i.e., identifies database events to be monitored). | Static or dynamic
FLUSH EVENT MONITOR | Forces an event monitor to write (flush) all values stored in its active internal buffers to the appropriate output object. | Static or dynamic
SET EVENT MONITOR STATE | Activates or deactivates an event monitor. | Static, and in special cases, dynamic
EXPLAIN | Captures information about the access plan chosen for the supplied SQL statement and stores this information in the Explain tables. | Static or dynamic

Miscellaneous Statements
DECLARE GLOBAL TEMPORARY TABLE | Defines a temporary table for the current session. | Static or dynamic
REFRESH TABLE | Refreshes the data in a materialized query table. | Static or dynamic
FLUSH PACKAGE CACHE | Removes all cached dynamic SQL statements currently in the package cache. | Static or dynamic
SET INTEGRITY | Enables and disables integrity checking on a table. | Static or dynamic
SET ENCRYPTION PASSWORD | Sets the password that will be used by the ENCRYPT, DECRYPT_BIN, and DECRYPT_CHAR functions. | Static or dynamic
CALL | Invokes a stored procedure. | Static or dynamic
SET | Assigns values to local variables or to new transition variables. | Dynamic compound statements, triggers, SQL functions, or SQL methods

Parts of an Embedded SQL Application


Now that we have examined the SQL statements that are available, let's take a look at how embedded SQL applications are developed. Every source code file that makes up an embedded SQL application can be divided into three distinct parts: the Prologue, the Body, and the Epilogue. This division is noticeable because some SQL statements are coded at the beginning of a source code file to handle the transition from high-level programming language to embedded SQL processing, while other statements typically come at the end of the file to handle the transition back and, in some cases, to handle error and warning conditions. The Body can consist of any number of statements, and, with a few exceptions, the statements used can be executed in any order. Figure 4-1 illustrates some of the embedded SQL statements that are typically used to construct the Prologue, Body, and Epilogue of a source code file.

Figure 4-1: Parts of an embedded SQL source code file.

Prologue
As the name implies, the Prologue is located at the beginning of a source code file. This is where all host variables, indicator variables, and SQL data structures that will be used by the application are defined. This is also where SQL statements are coded that tell the SQL precompiler to generate source code that evaluates the SQL Communications Area each time an SQL statement is executed and branch to the epilogue if an error, warning, or exception condition occurs.

Declaring Host Variables


Earlier, we saw that host variables are used to move data between an application and a database. For host variables to be distinguished from other high-level programming language variables, they must be defined within a special section known as the declare section. The beginning of a declare section is defined by the BEGIN DECLARE SECTION SQL statement, and the end is defined by the END DECLARE SECTION SQL statement. Thus, a typical declare section in a C/C++ source code file might look something like this:

EXEC SQL BEGIN DECLARE SECTION;
    char   EmployeeID[7];
    double Salary;
EXEC SQL END DECLARE SECTION;

A declare section may be coded anywhere variable declarations can appear in accordance with the rules of the high-level programming language being used, and although a source code file usually contains only one declare section, multiple sections are allowed. Host variables that receive data from a database are known as output host variables, while those that transfer data to a database are known as input host variables. Regardless of whether a host variable is an input variable or an output variable, its attributes must be appropriate for the context in which it is used. Thus, you must define host variables such that their data types and lengths are compatible with the data types and lengths of the columns they are designed to work with. To determine the appropriate data type to assign to a host variable, you should obtain information about the data type and length of the column or special register that the variable will be associated with and refer to the conversion charts found in the IBM DB2 Universal Database Application Development Guide: Programming Client Applications documentation. (Table 13 addresses SQL data type to C/C++ data type conversion.) Additionally, each host variable used in an application must be assigned a unique name; duplicate names are not allowed, even when host variables are defined in different declare sections.

Note: Whenever possible, a special tool known as the Declaration Generator should be used to generate host variable declarations for the columns of a given table in a database. This tool creates embedded SQL declaration source code files, which can easily be inserted into C/C++, Java, COBOL, and FORTRAN applications. For more information about this tool, refer to the db2dclgen command in the DB2 UDB Command Reference product documentation.

Once a host variable has been created, just how is it used to move data between an application and a database? The easiest way to answer this question is by examining a simple embedded SQL source code fragment in which host variables are used. The following pseudo-source code, written in the C programming language, illustrates the proper use of host variables:

...
// Define The SQL Host Variables Needed
EXEC SQL BEGIN DECLARE SECTION;
    char EmployeeNo[7];
    char LastName[16];
EXEC SQL END DECLARE SECTION;
...
// Retrieve A Record From The Database
EXEC SQL SELECT EMPNO, LASTNAME
    INTO :EmployeeNo, :LastName
    FROM EMPLOYEE
    WHERE EMPNO = '000100';

// Do Something With The Results
...

Declaring Indicator Variables


By default, columns in a table can contain null values, and because null values are not stored the same way that conventional data is stored, special provisions must be made if an application intends to work with null data. Specifically, null values cannot be retrieved and copied to host variables in the same manner that other data values can. Instead, a special flag must be examined to determine whether a value is intended to be null. The value of this flag can only be obtained by associating a special variable known as an indicator variable (or null indicator variable) with a host variable that has been assigned to a "nullable" column. Because indicator variables must be accessible by both the DB2 Database Manager and the application program, they must be defined inside a declare section, and they must be assigned a data type that is compatible with the DB2 UDB SMALLINT data type. Thus, the code used to define an indicator variable in a C/C++ source code file would typically look something like this:

EXEC SQL BEGIN DECLARE SECTION;
    short SalaryNullIndicator;
EXEC SQL END DECLARE SECTION;

Once an indicator variable has been associated with a host variable (an indicator variable is associated with a host variable by immediately following it when the host variable is used in an SQL statement), it can be examined as soon as its corresponding host variable has been populated: if the indicator variable contains a negative value, a null value was found, and the value in the corresponding host variable should be ignored. Again, in order to understand how indicator variables are used, it helps to look at an example embedded SQL source code fragment. The following pseudo-source code, written in the C programming language, shows one example of how indicator variables are defined and used:

...
// Define The SQL Host Variables Needed
EXEC SQL BEGIN DECLARE SECTION;
    char   EmployeeNo[7];
    double Salary;    // Salary Used If SalaryNI Is Positive ( >= 0 )
    short  SalaryNI;  // Salary NULL Indicator Used To Determine If
                      // Salary Value Should Be NULL
EXEC SQL END DECLARE SECTION;
...
// Declare A Static Cursor
EXEC SQL DECLARE C1 CURSOR FOR
    SELECT EMPNO, DOUBLE(SALARY)
    FROM EMPLOYEE;

// Open The Cursor
EXEC SQL OPEN C1;

// If The Cursor Was Opened Successfully, Retrieve And
// Display All Records Available
while (sqlca.sqlcode == SQL_RC_OK)
{
    // Retrieve The Current Record From The Cursor
    EXEC SQL FETCH C1 INTO :EmployeeNo, :Salary :SalaryNI;

    // If The Salary Value For The Record Is NULL, ...
    if (SalaryNI < 0)
    {
        printf("No salary information is available for ");
        printf("employee %s\n", EmployeeNo);
    }
}

// Close The Open Cursor
EXEC SQL CLOSE C1;
...

Indicator variables can also be used to send one or more null values to a database when an insert or update operation is performed. When processing INSERT and UPDATE SQL statements, the DB2 Database Manager examines the value of the indicator variable first (if one exists), and if it contains a negative value, the DB2 Database Manager assigns a null value to the appropriate column, provided null values are allowed. (If the indicator variable is set to zero or contains a positive number, the DB2 Database Manager assigns the value stored in the corresponding host variable to the appropriate column instead.) Thus, the code used in a C/C++ source code file to assign a null value to a column in a table would look something like this:

ValueInd = -1;
EXEC SQL INSERT INTO TAB1 VALUES (:Value :ValueInd);

Declaring SQL Data Structure Variables


So far, we have only looked at how host variables and indicator variables are used to move data between embedded SQL applications and database objects. However, embedded SQL applications also need to communicate with the DB2 Database Manager. Two special SQL data structures are used to establish this vital communication link: the SQL Communications Area (SQLCA) data structure and the SQL Descriptor Area (SQLDA) data structure. The SQL Communications Area (SQLCA) data structure contains a collection of elements that are updated by the DB2 Database Manager each time an SQL statement or a DB2 administrative API function is executed. For the DB2 Database Manager to populate this data structure, a data structure variable must exist; therefore, any application that contains embedded SQL or calls one or more administrative API functions must define at least one SQLCA data structure variable. In fact, such an application will not compile successfully if an SQLCA data structure variable does not exist. (Applications that are precompiled with the LANGLEVEL SQL92E option specified must use an SQLCODE or SQLSTATE variable instead of an SQLCA data structure variable.) Table 4-2 lists the elements that make up an SQLCA data structure variable.

Table 4-2: Elements of an SQLCA Data Structure Variable

Element Name | Data Type | Description
sqlcaid | CHAR(8) | An "eye catcher" for storage dumps. To help visually identify the data structure, this element normally contains the value "SQLCA".
sqlcabc | INTEGER | The size, in bytes, of the SQLCA data structure itself. This element should always contain the value 136.
sqlcode | INTEGER | The SQL return code value. A value of 0 means "successful execution," a positive value means "successful execution with warnings," and a negative value means "error." Refer to the DB2 UDB Message Reference Volume 1 and 2 product manuals to obtain more information about a specific SQL return code value.
sqlerrml | SMALLINT | The size, in bytes, of the data stored in the sqlerrmc element of this structure. This value can be any number between 0 and 70; a value of 0 indicates that no data has been stored in the sqlerrmc field.
sqlerrmc | CHAR(70) | One or more error message tokens, separated by the value 0xFF, that are to be substituted for variables in the descriptions of warning/error conditions. This element is also used when a successful connection is established.
sqlerrp | CHAR(8) | A diagnostic value that represents the type of DB2 server currently being used. This value begins with a three-letter code identifying the product version and release and is followed by five digits that identify the modification level of the product. For example, "SQL08014" means DB2 Universal Database, version 8, release 1, modification level 4. If the sqlcode element contains a negative value, this element will contain an 8-character code that identifies the module that reported the error.
sqlerrd | INTEGER ARRAY | An array of six integer values that provide additional diagnostic information when an error occurs. Table 4-3 describes the type of diagnostic information that can be returned in this element.
sqlwarn | CHAR(11) | An array of character values that serve as warning indicators, each containing either a blank or the letter W. If compound SQL was used, this field will contain an accumulation of the warning indicators that were set for all substatements. Table 4-4 describes the types of warning information that can be returned in this element.
sqlstate | CHAR(5) | The SQLSTATE value that identifies the outcome of the most recently executed SQL statement.

Table 4-3: Elements of the sqlca.sqlerrd Array

Array Element | Description
sqlerrd[0] | If a connection was successfully established, this element will contain the expected difference in length of mixed-character data (CHAR data types) when it is converted from the application code page used to the database code page used. A value of 0 or 1 indicates that no expansion is anticipated; a positive value greater than 1 indicates a possible expansion in length; and a negative value indicates a possible reduction in length.
sqlerrd[1] | If a connection was successfully established, this element will contain the expected difference in length of mixed-character data (CHAR data types) when it is converted from the database code page used to the application code page used. A value of 0 or 1 indicates that no expansion is anticipated; a positive value greater than 1 indicates a possible expansion in length; and a negative value indicates a possible reduction in length. If the SQLCA data structure contains information for compound SQL, this element will contain the number of substatements that failed (if any).
sqlerrd[2] | If the SQLCA data structure contains information for a CONNECT SQL statement that executed successfully, this element will contain the value 1 if the connected database is updatable and the value 2 if the connected database is read-only. If the SQLCA data structure contains information for a PREPARE SQL statement that executed successfully, this element will contain an estimate of the number of rows that will be returned in a result data set in response to the prepared statement. If the SQLCA data structure contains information for an INSERT, UPDATE, or DELETE SQL statement that executed successfully, this element will contain a count of the number of rows affected by the operation. If the SQLCA data structure contains information for compound SQL, this element will contain a count of the number of rows affected by all substatements.
sqlerrd[3] | If the SQLCA data structure contains information for a CONNECT SQL statement that executed successfully, this element will contain the value 0 if one-phase commit from a down-level client is being used; the value 1 if one-phase commit is being used; the value 2 if one-phase, read-only commit is being used; and the value 3 if two-phase commit is being used. If the SQLCA data structure contains information for a PREPARE SQL statement that executed successfully, this element will contain a relative cost estimate of the resources needed to prepare the statement specified. If the SQLCA data structure contains information for compound SQL, this element will contain a count of the number of substatements that executed successfully.
sqlerrd[4] | If the SQLCA data structure contains information for a CONNECT SQL statement that executed successfully, this element will contain the value 0 if server authentication is being used, the value 1 if client authentication is being used, the value 2 if authentication is being handled by DB2 Connect, the value 3 if authentication is being handled by DCE Security Services, and the value 255 if the method of authentication cannot be determined. If the SQLCA data structure contains information for anything else, this element will contain a count of the total number of rows inserted, updated, or deleted as a result of the DELETE rule of one or more referential integrity constraints or the activation of one or more triggers. (If the SQLCA data structure contains information for compound SQL, this element will contain a count of all such rows for each substatement successfully processed.)
sqlerrd[5] | For partitioned databases, this element contains the partition number of the partition that encountered an error or warning. If no errors or warnings were encountered, this element will contain the partition number of the partition that serves as the coordinator node.

Table 4-4: Elements of the sqlca.sqlwarn Array

Array Element | Description
sqlwarn[0] | This element is blank if all other elements in the array are blank; this element contains the character W if one or more of the other elements available is not blank.
sqlwarn[1] | This element contains the character W if the value for a column with a character string data type was truncated when it was assigned to a host variable. (This element contains the character N if the null-terminator for the string was truncated.)
sqlwarn[2] | This element contains the character W if null values were eliminated from the arguments passed to a function.
sqlwarn[3] | This element contains the character W if the number of values retrieved does not equal the number of host variables provided.
sqlwarn[4] | This element contains the character W if a dynamic UPDATE or DELETE SQL statement that does not contain a WHERE clause was prepared.
sqlwarn[5] | This element is reserved for future use.
sqlwarn[6] | This element contains the character W if the result of a date calculation was adjusted to avoid an invalid date value.
sqlwarn[7] | This element is reserved for future use.
sqlwarn[8] | This element contains the character W if a character that could not be converted was replaced with a substitution character.
sqlwarn[9] | This element contains the character W if one or more errors in an arithmetic expression were ignored during column function processing.
sqlwarn[10] | This element contains the character W if there was a conversion error when converting a character data value in one of the other fields of the SQLCA.

This element contains the character W if a conversion error occurred while converting a character data value in another element of the SQLCA data structure variable. The SQL Descriptor Area (SQLDA) data structure contains a collection of elements that are used to provide detailed information to the PREPARE, OPEN, FETCH, and EXECUTE SQL statements. This data structure consists of a header followed by an array of structures, each of which describes a single host variable or a single column in a result data set. Table 4-5 lists the elements that make up an SQLDA data structure variable. Table 4-5: Elements of an SQLDA Data Structure Variable Element Name sqldaid Data Type CHAR(8) Description An "eye catcher" for storage dumps. To help visually identify the data structure, this element normally contains the value "SQLDA." The size, in bytes, of the SQLDA data structure itself. The value assigned to this element is determined using the following equation: sqldabc = 16 + (44 * sqln). The total number of elements in the sqlvar array. The number of columns in the result data set returned by a DESCRIBE or a PREPARE SQL statement or the number of host variables described by the elements in the sqlvar array.

sqldabc

INTEGER

sqln sqld

SMALLINT SMALLINT

STRUCTURE An array of data structures that contain information ARRAY about host variables or result data set columns. As you can see, in addition to some basic information, an SQLDA data structure variable contains an arbitrary number of occurrences of sqlvar data structures that are referred to as SQLVAR variables. The

sqlvar

information stored in each SQLVAR variable used is dependent on where the SQLDA data structure variable is to be usedwhen used with a PREPARE or a DESCRIBE SQL statement, each SQLVAR variable used will contain information about the columns that will exist in the result data set produced on behalf of the prepared SQL statement. (If any of the columns have a large object (LOB) or user-defined data type, the number of SQLVAR variables used will be doubled and the seventh byte of the character string value stored in the sqldaid element of the SQLDA data structure variable will be assigned the value 2.) On the other hand, when the SQLDA data structure variable is to be used with an OPEN, FETCH, or EXECUTE SQL statement, each SQLVAR variable used should contain information about the host variables that are to be passed to the DB2 Database Manager. Two types of SQLVAR variables are used: base SQLVARs and secondary SQLVARs. Base SQLVARs contain base information (such as data type code, length attribute, column name, host variable address, and indicator variable address) for result data set columns or host variables. Table 4-6 lists the elements that make up a base SQLVAR data structure variable. Table 4-6: Elements of an SQLVAR Data Structure Variable Element Name sqltype sqllen sqldata Data Type SMALLINT SMALLINT Pointer Description The data type of a host variable used or the data type of a column in the result data set produced. The size (length), in bytes, of a host variable used or the size of a column in the result data set produced. A pointer to a location in memory where the data for a host variable used is stored or a pointer to a location in memory where data for a column in the result data set produced is to be stored. 
A pointer to a location in memory where the data for the Indicator associated with a host variable used is stored or a pointer to a location in memory where the data for the Indicator associated with a column in the result data set produced is to be stored. The unqualified name of a host variable or a column in the result data set produced.

sqlind

Pointer

sqlname

VARCHAR(30)

For distinct data types, secondary SQLVARs contain the distinct data type name. For LOB data types, these SQLVARs contain the length attribute of the column or host variable and a pointer to the buffer that contains the actual length of the data. Secondary SQLVAR entries are present only if the number of SQLVAR entries is doubled because LOB or distinct data types are usedif locators or file reference variables are used to represent LOB data types, secondary SQLVAR entries are not needed. The information stored in an SQLDA data structure variable, along with the information stored in any corresponding SQLVAR variables, may be placed there manually (using the appropriate programming language statements) or can be generated automatically using the DESCRIBE SQL statement. Both an SQLCA data structure variable and an SQLDA data structure variable can be created by embedding the appropriate form of the INCLUDE SQL statement (INCLUDE SQLCA and INCLUDE SQLDA, respectively) within the Prologue of an embedded SQL source code file. Alternately, these structure variables can be defined like any other variable would be defined with the high-level programming language being used.

The Body
The Body follows the Prologue and makes up the bulk of an embedded SQL source code file. This is where all the SQL statements that enable an application program to access and manipulate data stored

in a database reside. The Body usually begins by establishing a connection to a database server. Once a connection has been established, SQL statements that define and maintain database objects, manipulate data, and initiate control operations such as granting or revoking user authority are issued as appropriate. These statements can be issued within any number of transactions, and the SQL statements and transactions used compose the remainder of the Body.

Establishing a Database Connection


In order to perform any type of operation against a database, a connection to that database must first be established. With embedded SQL applications, database connections are made (and in some cases are terminated) by executing the CONNECT SQL statement. (The RESET option of the CONNECT statement is used to terminate a connection.) During the connection process, the information needed to establish a connection (such as the authorization ID and corresponding password of an authorized user) is passed to the database specified for validation. (Often, this information is collected at application run time and passed to the CONNECT statement by way of one or more host variables.)

Embedded SQL applications have the option of using two different types of connection semantics. These two types, known simply as "Type 1" and "Type 2", support two types of transaction behavior: Type 1 connections support only one database connection per transaction (referred to as a remote unit of work), while Type 2 connections support any number of database connections per transaction (referred to as an application-directed distributed unit of work). Essentially, when Type 1 connections are used, an application can be connected to only one database at a time; once a connection to a database has been established and a transaction started, that transaction must be either committed or rolled back before another database connection can be established. When Type 2 connections are used, however, an application can be connected to several different databases at the same time, and each database connection will have its own transaction boundary. The actual type of connection semantics an application will use is determined by the values assigned to the CONNECT, SQLRULES, DISCONNECT, and SYNCPOINT SQL precompiler options at the time the application is precompiled.
Note: If an application program is made up of several source code files and Type 1 connections are being used, the CONNECT statement needs to be in the Body of the source code file that will be executed first. In addition, the CONNECT statement must be coded as a static SQL statement; it cannot be dynamically prepared.

A word about connection states. When Type 2 connections are used, each time the CONNECT statement is executed, any database connection that existed before the CONNECT statement was executed is placed in the "Dormant" state; the new database server name is added to the list of available servers; and the new connection is placed into both the "Current" state and the "Held" state. (Initially, all database connections are placed in the "Held" state, which means that the connection will not be terminated the next time a commit operation is performed.) When the RELEASE SQL statement is executed, a connection is removed from the "Held" state and placed in the "Release-Pending" state, which means that the connection will be terminated by the next successful commit operation (rollback operations have no effect on connections). Regardless of whether a connection is in the "Held" or "Release-Pending" state, it can also be in the "Current" or "Dormant" state. When a connection is in the "Current" state, SQL statements executed by the application can reference data objects that are managed by the corresponding database server. (You can find out which connection is in the "Current" state by examining the value of the CURRENT SERVER special register.) When a connection is in the "Dormant" state, it is no longer current, and no SQL statement is allowed to reference its data objects. Either the SET CONNECTION SQL statement or the CONNECT SQL statement can be used to change the state of a specific connection from the "Dormant" state to the "Current" state, which automatically places all other existing connections in the "Dormant" state. (Only one connection can be in the "Current" state at any given point in time.)

Preparing and Executing SQL Statements


When static SQL statements are embedded in an application, they are executed as they are encountered. However, when dynamic SQL statements are used, there are two ways in which they can be processed:

Prepare and Execute. This approach separates the preparation of the SQL statement from its actual execution and is typically used when an SQL statement is to be executed repeatedly. This method is also used when an application needs advance information about the columns that will exist in the result data set produced when a query is executed. The SQL statements PREPARE and EXECUTE are used to process dynamic SQL statements in this manner.

Execute Immediately. This approach combines the preparation and the execution of an SQL statement into a single step and is typically used when an SQL statement is to be executed only once. This method is also used when the application does not need additional information about the result data set that will be produced, if any, when the SQL statement is executed. The SQL statement EXECUTE IMMEDIATE is used to process dynamic SQL statements in this manner.

Dynamic SQL statements that are prepared and executed (using either method) at run time are not allowed to contain references to host variables. They can, however, contain parameter markers in place of constants and/or expressions. Parameter markers are represented by the question mark (?) character and indicate the position in the SQL statement where the current value of one or more host variables or elements of an SQLDA data structure variable are to be substituted when the statement is actually executed. (Parameter markers are typically used where a host variable would be referenced if the SQL statement being executed were static.) Two types of parameter markers are available: typed and untyped.

Typed parameter markers. A typed parameter marker is a parameter marker that is specified along with its target data type. Typed parameter markers have the general form:

CAST(? AS DataType)

This notation does not imply that a function is called, but rather "promises" that the data type of the value replacing the parameter marker at application run time will either be the data type specified or a data type that can be converted to the data type specified. For example, in the SQL statement:

UPDATE EMPLOYEE SET LASTNAME = CAST(? AS VARCHAR(12))
    WHERE EMPNO = '000050'

the value for the LASTNAME column will be provided at application run time, and the data type of that value will be either VARCHAR(12) or a data type that can be converted to VARCHAR(12).

Untyped parameter markers. An untyped parameter marker is a parameter marker that is specified without a target data type and has the form of a single question mark (?). The data type of an untyped parameter marker is determined by the context in which it is used. For example, in the SQL statement:

UPDATE EMPLOYEE SET LASTNAME = ? WHERE EMPNO = '000050'

the value for the LASTNAME column is provided at application run time, and the data type of that value will be compatible with the data type that has been assigned to the LASTNAME column of the EMPLOYEE table.

When parameter markers are used in embedded SQL applications, values that are to be substituted for parameter markers placed in a dynamic SQL statement must be provided as additional parameters to the EXECUTE or the EXECUTE IMMEDIATE SQL statement when either is used to execute the dynamic SQL statement specified. The following pseudo-source code example, written in the C programming language, illustrates how actual values would be provided for parameter markers that have been coded in a simple UPDATE SQL statement:

...
// Define The SQL Host Variables Needed
EXEC SQL BEGIN DECLARE SECTION;
    char SQLStmt[80];
    char JobType[10];
EXEC SQL END DECLARE SECTION;
...
// Define A Dynamic UPDATE SQL Statement That Uses A
// Parameter Marker
strcpy(SQLStmt, "UPDATE EMPLOYEE SET JOB = ? ");
strcat(SQLStmt, "WHERE JOB = 'DESIGNER'");

// Populate The Host Variable That Will Be Used In
// Place Of The Parameter Marker
strcpy(JobType, "MANAGER");

// Prepare The SQL Statement
EXEC SQL PREPARE SQL_STMT FROM :SQLStmt;

// Execute The SQL Statement
EXEC SQL EXECUTE SQL_STMT USING :JobType;
...

Retrieving and Processing Results


Regardless of whether an SQL statement is static or dynamic, once it has been executed from within an embedded SQL application, any results produced will need to be retrieved and processed. If the SQL statement was anything other than a SELECT or VALUES statement, the only additional processing required after execution is a check of the SQLCA data structure to ensure that the statement executed as expected. However, if a SELECT or VALUES statement was executed and a result data set was produced, additional steps are needed to retrieve each row of data from the result data set.

You may recall that in Chapter 3, "Data Manipulation," we saw that when a SELECT statement is executed from within an application, DB2 UDB uses a mechanism known as a cursor to retrieve data values from any result data set produced. A cursor indicates the current position in the result data set (i.e., the current row) and identifies which row of data will be returned to the application next. The following steps must be followed (in the order shown) if a cursor is to be incorporated into an embedded SQL application:

1. Declare (define) a cursor along with its type (read-only or updatable) and associate it with the desired query (SELECT SQL statement). (This is done by executing the DECLARE CURSOR statement.)

2. Open the cursor. This will cause the corresponding query to be executed and a result data set to be produced. (This is done by executing the OPEN statement.)

3. Retrieve (fetch) each row in the result data set, one by one, until an "End of data" condition occurs; each time a row is retrieved from the result data set, the cursor is automatically moved to the next row. (This is done by repeatedly executing the FETCH statement; host variables or an SQLDA data structure variable are used in conjunction with a FETCH statement to extract a row of data from a result data set.)

4. If appropriate, modify or delete the current row (but only if the cursor is an updatable cursor). (This is done by executing the UPDATE or DELETE statement.)

5. Close the cursor. This will cause the result data set (that was produced when the corresponding query was executed) to be deleted. (This is done by executing the CLOSE statement.)

Now that we have seen the steps that must be performed in order to use a cursor, let's examine how these steps are typically coded in an application. The following pseudo-source code example, written in the C programming language, illustrates how a cursor would be used to retrieve the results of a SELECT SQL statement:

...
// Declare The SQL Host Memory Variables
EXEC SQL BEGIN DECLARE SECTION;
    char EmployeeNo[7];
    char LastName[16];
EXEC SQL END DECLARE SECTION;
...
// Declare A Cursor
EXEC SQL DECLARE C1 CURSOR FOR
    SELECT EMPNO, LASTNAME FROM EMPLOYEE
    WHERE JOB = 'DESIGNER';

// Open The Cursor
EXEC SQL OPEN C1;

// Fetch The Records
while (sqlca.sqlcode == SQL_RC_OK)
{
    // Retrieve A Record
    EXEC SQL FETCH C1 INTO :EmployeeNo, :LastName;

    // Process The Information Retrieved
    if (sqlca.sqlcode == SQL_RC_OK)
        ...
}

// Close The Cursor
EXEC SQL CLOSE C1;
...

If it is known in advance that only one row of data will be produced in response to a query, the contents of that row can be copied to host variables within an application program by executing one of two statements: the SELECT INTO statement or the VALUES INTO statement. Like the SELECT SQL statement, the SELECT INTO statement can be used to construct complex queries. However, unlike the SELECT statement, the SELECT INTO statement requires a list of valid host variables to be supplied as part of its syntax, and it cannot be used dynamically. Similar to the SELECT INTO statement, the VALUES INTO statement can be used to retrieve the data associated with a single record and copy it to one or more host variables. The VALUES INTO statement cannot be used to construct complex queries in the same way the SELECT INTO statement can;

however, like the SELECT INTO statement, when the VALUES INTO statement is executed, all data retrieved is stored in a result data set and if this result data set contains only one record, the first value in that record is copied to the first host variable specified, the second value is copied to the second host variable specified, and so on. On the other hand, if the result data set produced contains more than one record, the operation will fail and an error will be generated. (If the result data set produced is empty, a NOT FOUND warning will be generated.)

Managing Transactions
In Chapter 3, "Data Manipulation," we saw that a transaction (also known as a unit of work) is a sequence of one or more SQL operations grouped together as a single unit, usually within an application process. A given transaction can comprise any number of SQL operations, from a single operation to many hundreds or even thousands, depending upon what is considered a "single step" within your business logic. The initiation and termination of a single transaction defines points of data consistency within a database; the effects of all operations performed within a transaction are either applied to the database and made permanent (committed), or backed out (rolled back), in which case the database is returned to the state it was in before the transaction was initiated.

In most cases, transactions are initiated the first time an executable SQL statement is executed after a connection to a database has been established or immediately after a pre-existing transaction has been terminated. (Transactions are initiated in the Body of an embedded SQL application source code file rather than in the Prologue because the SQL statements used in the Prologue are non-executable statements.) Once initiated, transactions can be implicitly terminated using a feature known as "automatic commit" (in which case each executable SQL statement is treated as a single transaction, and any changes made by that statement are applied to the database if the statement executes successfully or discarded if the statement fails), or they can be explicitly terminated by executing the COMMIT or ROLLBACK SQL statement. In either case, all transactions associated with a particular database should be completed before the connection to that database is terminated. (Open transactions are committed automatically on termination of a database connection; however, this behavior may change in future versions of DB2 UDB.)

It is important to note that although all transactions associated with a particular database should be completed before the connection to that database is terminated, you should not wait until you are about to terminate a database connection before you decide to end a transaction; doing so can cause concurrency and locking problems among other applications. That's because whenever a transaction is initiated, one or more locks may be acquired on the transaction's behalf, even if the transaction does not perform any data modification operations. However, as soon as a transaction is committed or rolled back, all locks that were acquired on behalf of the transaction, with the exception of those locks acquired for held cursors, are immediately released. Therefore, it is a good idea to commit transactions as soon as application requirements permit so that locks (and, consequently, other resources) are not held needlessly.

Epilogue
The Epilogue, as the name implies, is located at the end of an embedded SQL source code file. This is where any remaining active transactions are committed or rolled back and all database connections that were established within the Body are terminated. This is also where all generic error handling routines that are used with WHENEVER SQL statements are stored. Existing database connections can be terminated by executing the DISCONNECT SQL statement. (The CONNECT RESET statement can also be used to terminate a database connection provided Type 1 connections are used.) If Type 1 connection semantics are being used, only the DISCONNECT statement (or CONNECT RESET statement) is needed. However, if Type 2 connection semantics are used, one DISCONNECT statement must be executed, either in the Body or in the Epilogue, for each database connection that exists.

Note

If an application program is made up of several source code files and Type 1 connections are being used, the DISCONNECT (or CONNECT RESET) statement needs to be in the Epilogue of the source code file that will be executed last.

Putting It All Together


Now that we have examined the three distinct parts (Prologue, Body, and Epilogue) of which all embedded SQL applications are composed, let's see how each of these parts is typically coded in an embedded SQL application. A simple embedded SQL application, written in the C programming language, that obtains and prints employee identification numbers, last names, and salaries for all employees who have the job title DESIGNER using static SQL might look something like this:

#include <stdio.h>
#include <string.h>
#include <sql.h>

int main()
{
    /*-----------------------------------------------------*/
    /* PROLOGUE                                            */
    /*-----------------------------------------------------*/

    // Include The SQLCA Data Structure Variable
    EXEC SQL INCLUDE SQLCA;

    // Define The SQL Host Variables Needed
    EXEC SQL BEGIN DECLARE SECTION;
        char EmployeeNo[7];
        char LastName[16];
        double Salary;
        short SalaryNI;
    EXEC SQL END DECLARE SECTION;

    /*-----------------------------------------------------*/
    /* BODY                                                */
    /*-----------------------------------------------------*/

    // Connect To The Appropriate Database
    EXEC SQL CONNECT TO SAMPLE USER db2admin USING ibmdb2;

    // Declare A Static Cursor
    EXEC SQL DECLARE C1 CURSOR FOR
        SELECT EMPNO, LASTNAME, DOUBLE(SALARY)
        FROM EMPLOYEE
        WHERE JOB = 'DESIGNER';

    // Open The Cursor
    EXEC SQL OPEN C1;

    // If The Cursor Was Opened Successfully, Retrieve
    // And Display All Records Available
    while (sqlca.sqlcode == SQL_RC_OK)
    {
        // Retrieve The Current Record From The Cursor
        EXEC SQL FETCH C1 INTO :EmployeeNo, :LastName,
            :Salary :SalaryNI;

        // Display The Record Retrieved (A Negative
        // Indicator Value Means The Salary Is NULL)
        if (sqlca.sqlcode == SQL_RC_OK)
        {
            printf("%-8s %-16s ", EmployeeNo, LastName);
            if (SalaryNI >= 0)
                printf("%lf\n", Salary);
            else
                printf("Unknown\n");
        }
    }

    // Close The Open Cursor
    EXEC SQL CLOSE C1;

    // Commit The Transaction
    EXEC SQL COMMIT;

    /*-----------------------------------------------------*/
    /* EPILOGUE                                            */
    /*-----------------------------------------------------*/

    // Terminate The Database Connection
    EXEC SQL DISCONNECT CURRENT;

    // Return Control To The Operating System
    return(0);
}

On the other hand, a simple embedded SQL application, written in the C programming language, that changes the job title for all employees who have the job title DESIGNER to MANAGER using dynamic SQL might look something like this:

#include <stdio.h>
#include <string.h>
#include <sql.h>

int main()
{
    /*-----------------------------------------------------*/
    /* PROLOGUE                                            */
    /*-----------------------------------------------------*/

    // Include The SQLCA Data Structure Variable
    EXEC SQL INCLUDE SQLCA;

    // Define The SQL Host Variables Needed
    EXEC SQL BEGIN DECLARE SECTION;
        char SQLStmt[80];
        char JobType[10];
    EXEC SQL END DECLARE SECTION;

    /*-----------------------------------------------------*/
    /* BODY                                                */
    /*-----------------------------------------------------*/

    // Connect To The Appropriate Database
    EXEC SQL CONNECT TO SAMPLE USER db2admin USING ibmdb2;

    // Define A Dynamic UPDATE SQL Statement That Uses A
    // Parameter Marker
    strcpy(SQLStmt, "UPDATE EMPLOYEE SET JOB = ? ");
    strcat(SQLStmt, "WHERE JOB = 'DESIGNER'");

    // Populate The Host Variable That Will Be Used In
    // Place Of The Parameter Marker
    strcpy(JobType, "MANAGER");

    // Prepare The SQL Statement
    EXEC SQL PREPARE SQL_STMT FROM :SQLStmt;

    // Execute The SQL Statement
    EXEC SQL EXECUTE SQL_STMT USING :JobType;

    // Commit The Transaction
    EXEC SQL COMMIT;

    /*-----------------------------------------------------*/
    /* EPILOGUE                                            */
    /*-----------------------------------------------------*/

    // Terminate The Database Connection
    EXEC SQL DISCONNECT CURRENT;

    // Return Control To The Operating System
    return(0);
}

Diagnostics and Error Handling


Earlier, we saw that the SQL Communications Area (SQLCA) data structure contains a collection of elements that are updated by the DB2 Database Manager each time an SQL statement or a DB2 administrative API function is executed. One element of that structure, the sqlcode element, is assigned a value that indicates the success or failure of the SQL statement or DB2 administrative API function executed. (A value of 0 means "successful execution," a positive value means "successful execution with warnings," and a negative value means "error.") Error handling is an important part of any application; embedded SQL applications are no exception. At a minimum, an embedded SQL application should always check the sqlcode value produced (often referred to as the SQL return code) immediately after an SQL statement is executed; when an SQL statement fails to execute as expected, users should be notified that an error or warning condition has occurred and whenever possible, they should be provided with sufficient diagnostic information so they can locate and correct the problem. As you might imagine, checking the sqlcode value after an SQL statement is executed can add additional overhead to an applicationespecially when an application contains a large number of SQL statements. However, because every SQL statement coded in an embedded SQL application source code file must be processed by the SQL precompiler, it is possible to have the precompiler automatically generate the source code that is needed to check SQL statement return codes. This is accomplished by embedding one or more forms of the WHENEVER SQL statement into a source code file, usually within the Prologue. The WHENEVER statement tells the precompiler to generate source code that evaluates SQL return codes and branches to a specified label whenever an error, warning, or "out of data" condition occurs. 
(If the WHENEVER statement is not used, the default behavior is to ignore SQL return codes and continue processing as if no problems have been encountered.) Four forms of the WHENEVER statement are availableone for each of the three types of error/warning conditions the WHENEVER statement can be used to check for and one to turn error checking off: WHENEVER SQLERROR GOTO [Label]. Instructs the precompiler to generate source code that evaluates SQL return codes and branches to the label specified when a negative sqlcode value is generated. WHENEVER SQLWARNING GOTO [Label]. Instructs the precompiler to generate source code that evaluates SQL return codes and branches to the label specified when a positive sqlcode value (other than the value +100) is generated.

WHENEVER NOT FOUND GOTO [Label]. Instructs the precompiler to generate source code that evaluates SQL return codes and branches to the label specified when an sqlcode value of +100 or an sqlstate value of 02000 is generated. WHENEVER [SQLERROR | SQL WARNING | NOT FOUND] CONTINUE. Instructs the precompiler to ignore the SQL return code and continue with the next instruction in the application. A source code file can contain any combination of these four forms of the WHENEVER statement; the order in which the first three forms appear is insignificant. However, once any form of the WHENEVER statement has been used, the SQL return codes of all subsequent SQL statements executed will be evaluated and processed accordingly until the application ends or until another WHENEVER statement alters this behavior. A simple embedded SQL application written in the C programming language that uses every form of the WHENEVER statement available to trap and process errors, warnings, and "out of data" conditions might look something like this: #include <stdio.h> #include <string.h> #include <sql.h> int main() { /*-----------------------------------------------------*/ /* PROLOGUE */ /*-----------------------------------------------------*/ // Include The SQLCA Data Structure Variable EXEC SQL INCLUDE SQLCA; // Define The SQL Host Variables Needed EXEC SQL BEGIN DECLARE SECTION; char EmployeeNo[7]; EXEC SQL END DECLARE SECTION; // Set Up Error Handlers EXEC SQL WHENEVER SQLERROR GOTO ERROR_HANDLER; EXEC SQL WHENEVER SQLWARNING GOTO WARNING_HANDLER; EXEC SQL WHENEVER NOT FOUND GOTO NOT_FOUND_HANDLER; /*-----------------------------------------------------*/ /* BODY */ /*-----------------------------------------------------*/ // Connect To The Appropriate Database EXEC SQL CONNECT TO SAMPLE USER db2admin USING ibmdb2;

    // Execute A SELECT INTO SQL Statement (This Will Cause
    // A "DATA NOT FOUND" Situation To Occur And The Code To
    // Branch To The NOT_FOUND_HANDLER Label)
    EXEC SQL SELECT EMPNO INTO :EmployeeNo
        FROM RSANDERS.EMPLOYEE
        WHERE JOB = 'CODER';

    // Commit The Transaction
    EXEC SQL COMMIT;

    // Disable All Error Handling
    EXEC SQL WHENEVER SQLERROR CONTINUE;
    EXEC SQL WHENEVER SQLWARNING CONTINUE;
    EXEC SQL WHENEVER NOT FOUND CONTINUE;

    /*-----------------------------------------------------*/
    /* EPILOGUE                                            */
    /*-----------------------------------------------------*/

    // Prepare To Return To The Operating System
    goto EXIT;

    // Define A Generic Error Handler
    ERROR_HANDLER:
        printf("ERROR: SQL Code = %d\n", sqlca.sqlcode);
        EXEC SQL ROLLBACK;
        goto EXIT;

    // Define A Generic Warning Handler
    WARNING_HANDLER:
        printf("WARNING: SQL Code = %d\n", sqlca.sqlcode);
        EXEC SQL ROLLBACK;
        goto EXIT;

    // Define A Generic "Data Not Found" Handler
    NOT_FOUND_HANDLER:
        printf("NOT FOUND: SQL Code = %d\n", sqlca.sqlcode);
        EXEC SQL ROLLBACK;
        goto EXIT;

    EXIT:

    // Terminate The Database Connection
    EXEC SQL DISCONNECT CURRENT;

    // Return Control To The Operating System
    return(0);
}

Unfortunately, the code that is generated when the WHENEVER statement is used relies on GO TO branching instead of call/return interfaces to transfer control to the appropriate error-handling section of an embedded SQL application. As a result, when control is passed to the source code that is used to process errors and warnings, the application has no way of knowing where control came from, nor does it have any way of knowing where to return control after the error or warning has been properly handled. For this reason, about the only thing an application can do when control is passed to a WHENEVER statement error-handling label is display the error code generated, roll back the current transaction, and return control to the operating system. Because of these limitations, many application developers opt to develop their own error-handling routines rather than relying on the basic error-handling functionality that is provided when the WHENEVER statement is used.

The Get Error Message API


Among other things, most editions of DB2 UDB, along with the DB2 Application Development Client, contain a rich set of functions that are referred to as the administrative APIs (Application Programming Interfaces). These APIs are designed to provide services other than the data storage, manipulation, and retrieval functionality that SQL provides to DB2 UDB applications. (Essentially, any database operation that can be performed from the Command Line Processor by executing a DB2 command can be performed from within an application by calling an administrative API.) Any of the administrative APIs available can be called from within a high-level programming language source code file. When called, they operate in a manner similar to other host-language functions: each API has both a call and a return interface, and the calling application must wait until the requested API completes before it can continue.

Earlier, we saw that each time an SQL statement or an administrative API is executed, the DB2 Database Manager updates an SQLCA data structure variable, and the sqlcode element of that variable is assigned a value that indicates the success or failure of the operation. This value is actually a coded number, and a special administrative API can be used to translate any coded number produced into a meaningful description, which can then be displayed to the user. This API is known as the Get Error Message API, and the basic syntax used to call it from an application source code file is:

sqlaintp (char   *pBuffer,
          short  sBufferSize,
          short  sLineWidth,
          struct sqlca *pSQLCA);

(C/C++ high-level programming language applications)

or

sqlgintp (short  sBufferSize,
          short  sLineWidth,
          struct sqlca *pSQLCA,
          char   *pBuffer);

(Other high-level programming language applications)

where:

pBuffer - Identifies a location in memory where the Get Error Message API is to store any message text retrieved.

sBufferSize - Identifies the size, in bytes, of the memory storage buffer to which any message text retrieved is to be written.

sLineWidth - Identifies the maximum number of characters that one line of message text should contain before a line break is inserted. A value of 0 indicates that the message text is to be returned without line breaks.

pSQLCA - Identifies a location in memory where an SQL Communications Area (SQLCA) data structure variable is stored.

Each time this API is called, the value stored in the sqlcode element of the SQLCA data structure variable provided is used to locate and retrieve the appropriate error message text from a message file that is provided with DB2 UDB. Thus, a simple embedded SQL application written in the C programming language that uses the Get Error Message API to obtain and display the message associated with the SQL return code generated when an attempt is made to establish a connection to a database using an invalid user ID might look something like this:

#include <stdio.h>
#include <string.h>
#include <sql.h>

int main()
{
    // Include The SQLCA Data Structure Variable
    EXEC SQL INCLUDE SQLCA;

    // Declare The Local Memory Variables
    long RetCode = SQL_RC_OK;
    char ErrorMsg[1024];

    // Attempt To Connect To A Data Source Using An Invalid
    // User ID (This Will Cause An Error To Be Generated)
    EXEC SQL CONNECT TO SAMPLE USER db2_admin USING ibmdb2;

    // If Unable To Establish A Data Source Connection,
    // Obtain Any Diagnostic Information Available
    if (sqlca.sqlcode != SQL_RC_OK)
    {
        // Retrieve The Error Message Text For The Error
        // Code Generated
        RetCode = sqlaintp(ErrorMsg, sizeof(ErrorMsg), 70, &sqlca);
        switch (RetCode)
        {
            case -1:
                printf("ERROR : Insufficient memory.\n");
                break;
            case -3:
                printf("ERROR : Message file is ");
                printf("inaccessible.\n");
                break;
            case -5:
                printf("ERROR : Invalid SQLCA, bad buffer, ");
                printf("or bad buffer length specified.\n");
                break;
            default:
                printf("%s\n", ErrorMsg);
                break;
        }
    }

    // Return Control To The Operating System
    return(0);
}

As you can see in this example, when the Get Error Message API is called, it returns a value that indicates whether it executed successfully. In this case, the return code produced is checked, and if an error occurred, a message explaining why the API failed is returned to the user.

A Word about SQLSTATEs


DB2 UDB (as well as other relational database products) uses a set of error message codes known as SQLSTATEs to provide supplementary diagnostic information for warnings and errors. SQLSTATEs are alphanumeric strings that are five characters (bytes) in length and have the format ccsss, where cc indicates the error message class and sss indicates the error message subclass. Like SQL return code values, SQLSTATE values are written to an element (the sqlstate element) of the SQLCA data structure variable used each time an SQL statement is executed. Just as the Get Error Message API can be used to convert any SQL return code value generated into a meaningful description, another API, the Get SQLSTATE Message API, can be used to convert an SQLSTATE value into a meaningful description as well. By including either (or both) of these APIs in your embedded SQL applications, you can always return meaningful information to the end user when error and/or warning conditions occur.

Creating Executable Applications

So far, we have looked at some of the basic steps used to embed SQL statements in application source code files, but we have only hinted at how source code files containing embedded SQL statements are converted into an actual working program. Once a source code file has been written, the following steps must be performed (in the order shown) before an application that interacts with a DB2 UDB database will be created:

1. All source code files containing embedded SQL statements must be precompiled to convert the embedded SQL statements used into DB2-specific function calls and to create a corresponding package. You must be connected to a database to run the SQL precompiler, and all packages created can be stored in the database being used by the SQL precompiler or written to a special file known as a "bind file," which can then be bound to any valid DB2 UDB database later (deferred binding).

2. All high-level programming language source code files produced by the SQL precompiler (and any additional source code files that are needed) must be compiled to create object modules.

3. All appropriate object modules must be linked with high-level programming language libraries and DB2 UDB libraries to create an executable program.

4. If the packages for the files that were processed by the SQL precompiler have not already been bound to the appropriate database, they must be bound using the bind files produced by the SQL precompiler.

Figure 4-2 illustrates the embedded SQL source code file-to-executable application conversion process when deferred binding is used.

Figure 4-2: Converting a source code file containing embedded SQL statements into an executable application when deferred binding is used.

Precompiling Source Code Files


We learned earlier that, by design, high-level programming language compilers do not recognize, and therefore cannot interpret, SQL statements. Therefore, when SQL statements are embedded in a high-level programming language source code file, they must be converted to source code that a high-level programming language compiler can understand, and this conversion process is performed by a special tool known as the SQL precompiler. (The SQL precompiler is included with DB2 UDB and is normally invoked from the Command Line Processor, a batch file, or a make utility file.) During the precompile process, a source code file containing embedded SQL statements is converted into a source code file that is made up entirely of high-level programming language statements. (The embedded SQL statements themselves are commented out, and DB2-specific function calls are stored in their place.) At the same time, a corresponding package that contains, among other things, the access plans that are to be used to process each static SQL statement embedded in the source code file is also produced. (Access plans contain optimized information that the DB2 Database Manager uses to execute SQL statements; access plans for static SQL statements are produced at precompile time, but access plans for dynamic SQL statements are produced at application run time.) Packages produced by the SQL precompiler can be stored in the database being used by the precompiler as they are generated, or they can be written to an external bind file and "bound" to any valid DB2 UDB database later (the process of storing this package in the appropriate database is known as "binding"). By default, packages are automatically bound to the database used for precompiling during the precompile process.

Note: Never make changes to SQL precompiler-generated source code files. Any changes made will be lost the next time the original source code file is precompiled.
Unless otherwise specified, the SQL precompiler is also responsible for verifying that all database objects (such as tables and columns) that have been referenced in static SQL statements actually exist and that all application data types used are compatible with their database counterparts (that's why a database connection must exist in order to use the SQL precompiler).
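Put together, the precompile/compile/link/bind sequence might look something like the following when driven from the Command Line Processor. This is a sketch only: PREP (PRECOMPILE) and BIND are genuine DB2 CLP commands, but the file names, the cc compiler invocation, and the library name shown here are hypothetical and vary by platform and compiler:

```
db2 connect to sample             -- the precompiler requires a connection
db2 prep myapp.sqc bindfile       -- precompile; produces myapp.c and myapp.bnd
cc -c myapp.c                     -- compile the generated source file
cc -o myapp myapp.o -ldb2         -- link with the DB2 libraries
db2 bind myapp.bnd                -- deferred binding of the package
db2 connect reset
```

If the BINDFILE option is omitted from the PREP command, the package is bound to the connected database automatically during precompilation and the separate BIND step is unnecessary.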

Compiling Source Code Files


Once a source code file containing embedded SQL statements has been processed by the SQL precompiler, the high-level programming language source code file produced, as well as any other source code files used, must be compiled by a high-level programming language compiler. The high-level programming language compiler is responsible for converting source code files into object modules that the linker can use to create an executable program.

Linking Object Modules


When the source code files needed to build an application have been compiled successfully, the resulting object modules can be provided as input to the linker. The linker combines object modules, high-level programming language libraries, and DB2 UDB libraries to produce an executable application. In most cases, this executable application exists as an executable file. However, it can also exist as a shared library or dynamic-link library (DLL) that is loaded and executed by other executable applications.

Creating and Binding Packages


Earlier, we saw that when a source code file containing embedded SQL statements is processed by the SQL precompiler, a package containing data access plans is produced along with a source code file that is made up entirely of high-level programming language statements. This package must reside in an appropriate DB2 UDB database (i.e., a database that contains the data objects referenced by the package) before the corresponding application can be executed against that database. The process of storing such a package in a DB2 UDB database is known as "binding," and by default, packages are automatically bound to the database being used by the SQL precompiler during the precompile process. However, by specifying the appropriate precompiler options, you can elect to store the steps needed to create the package in a separate file (rather than in a database) and complete the binding process at a later point in time, using a tool known as the SQL Binder (or simply the Binder). This is referred to as deferred binding. Deferred binding is preferable if you want to perform any of the following:

Defer binding until you have an application program that compiles and links successfully.

Create a package under a different schema or under multiple schemas.

Run an application against a database using different options (isolation level, Explain settings, etc.). By deferring the bind process, you can dynamically change these options without having to rebuild the application.

Run an application against several different databases. By deferring the bind process, you can build your program once and bind it to any number of appropriate databases. Otherwise, you will have to rebuild the entire application each time you want to run it against a new database.

Run an application against a database that has been duplicated on several different servers. By deferring the bind process, you can dynamically create your application database on each machine and then bind your program to the newly created database (possibly as part of your application's installation process).

Practice Questions
Question 1
Which of the following indicates when the access plan is built for a static SQL statement?
A. When the statement is executed
B. When the application is precompiled
C. When the statement is prepared
D. When the application is compiled

Question 2
Under which of the following situations should dynamic SQL NOT be used?
A. When the name of the view that is to be queried is not known at compile time
B. When temporary tables that are referenced do not exist at compile time
C. When optimum run-time performance is desired
D. When the columns in an INSERT statement are not known at compile time

Question 3
Which of the following does NOT have to be available in order to develop embedded SQL applications?
A. A database driver
B. A precompiler
C. A compiler
D. A database

Question 4
Which of the following SQL statements illustrates the proper use of host variables?
A. EXEC SQL CONNECT TO sample USER $UserID USING $Password
B. EXEC SQL DECLARE c1 CURSOR FOR SELECT deptname FROM department WHERE deptnum = %DeptNumber%
C. EXEC SQL SELECT workdept INTO :wdept :wdind FROM employee WHERE lastname = 'SMITH'
D. EXEC SQL UPDATE employee SET lastname = &lname WHERE lastname = 'Johnson'

Question 5
Given the following table:

TAB1
EMPID  NAME
-----  ------
1      USER1
2      USER2
       USER3
4      USER4

Assuming the EMPID column is defined as INTEGER and the NAME column is defined as VARCHAR(10) NOT NULL, how many host variables must be included in the DECLARE section of an embedded SQL application in order to retrieve all of the data found in table TAB1, one row at a time?
A. 1
B. 2
C. 3
D. 4

Question 6
Which two of the following can modify the contents of host variables within a program?
A. Only SQL statements that are coded in the DECLARE section
B. Only program statements that are coded in the DECLARE section
C. Any user that can execute the program
D. SELECT INTO, VALUES INTO, and FETCH statements that are coded outside of the DECLARE section
E. Any program statement, regardless of where it is coded

Question 7
Which of the following SQLCA elements contains information about the DB2 UDB product version, release number, and modification number for the DB2 UDB server being used?
A. sqlcode
B. sqlerrp
C. sqlcaid
D. sqlerrd

Question 8
Given the following source code:

EXEC SQL CONNECT TO sales USER db2admin USING ibmdb2;
if (sqlca.sqlcode == SQL_RC_OK)
    printf("Connected to SALES\n");
EXEC SQL CONNECT TO payroll USER db2admin USING ibmdb2;
if (sqlca.sqlcode == SQL_RC_OK)
    printf("Connected to PAYROLL\n");

Assuming Type 2 connections are used, which of the following statements will change the connection state for the PAYROLL database from "Current" to "Dormant"?
A. EXEC SQL CONNECT sales
B. EXEC SQL SET CONNECTION sales
C. EXEC SQL SET CONNECTION payroll
D. EXEC SQL DISCONNECT payroll

Question 9
Which two of the following can be used to dynamically insert data within an embedded SQL application?
A. ADD
B. INSERT INTO
C. EXECUTE
D. INSERT
E. EXECUTE IMMEDIATE

Question 10
Which of the following is used to convert information about SQL statements stored in a file generated by the SQL precompiler into a package that is stored in a database?
A. Precompiler
B. Bind utility
C. Host language compiler
D. Host language linker

Question 11
Which of the following takes place in the declare section of an embedded SQL application?
A. Dynamic SQL statements are constructed
B. Generic errors are processed
C. Functions are declared
D. Host variables are defined

Question 12
Which of the following is NOT a valid way to connect to a DB2 UDB database from an embedded SQL application?
A. EXEC SQL CONNECT TO sample;
B. EXEC SQL CONNECT TO sample USER :userID USING :password;
C. EXEC SQL CONNECT TO :dbalias USER :userID USING :password;
D. EXEC SQL CONNECT TO :dbalias USER ? USING ?;

Question 13
Assuming deferred binding is not used, during which two of the following are SQL statements optimized when embedded SQL is used?
A. Cursor open
B. Precompile
C. Application binding
D. Statement preparation
E. Statement execution

Question 14
Assuming all referenced database objects exist and all referenced host variables have been properly declared, which of the following embedded SQL statements can be successfully precompiled?
A. SQL DELETE FROM tab1 WHERE col2 = :var1;
B. EXEC SQL FETCH c1 INTO :var1 :var2 :var3;
C. SQL EXEC SELECT col1, col2 FROM tab1 WHERE col2 = :var1;
D. EXEC SQL SELECT col1, col2 FROM tab1 WHERE col1 = 001 INTO :var1:var2;

Question 15
Which two of the following static SQL statements require the use of host variables?
A. DELETE
B. WHENEVER
C. VALUES INTO
D. CONNECT
E. SELECT INTO

Question 16
Given the following table:

TAB1
EMPNAME  HIREDATE
-------  ----------
USER1    11/18/1961
USER2    06/06/1964
USER3    11/18/1987
USER4    04/09/1993

And the following embedded SQL source code:

EXEC SQL DECLARE c1 CURSOR FOR SELECT * FROM tab1;
EXEC SQL OPEN c1;
EXEC SQL FETCH c1 INTO :var1, :var2;
EXEC SQL CLOSE c1;

Which of the following host variable data types is required for VAR2?
A. Float
B. Decimal
C. Integer array
D. Character string

Answers

Question 1
The correct answer is B. The access plans for static SQL statements are generated when the application is precompiled, unless binding is deferred. (The access plans are stored in the database as packages during binding; binding can be performed automatically when a source code file containing embedded static SQL statements is precompiled, or it can occur later, provided a bind file is generated by the SQL precompiler.) The access plan for a dynamic SQL statement, however, is generated when the statement is prepared, by either the PREPARE SQL statement or the EXECUTE IMMEDIATE statement. (Remember, when the EXECUTE IMMEDIATE statement is used to execute an SQL statement, the statement is prepared and executed in a single step.)

Question 2
The correct answer is C. Although dynamic SQL statements are generally more flexible than static SQL statements, they are typically more complicated to incorporate into an application. And because the work of analyzing the statement and selecting the optimum data access plan to use to execute the statement is done at application run time, dynamic SQL statements can take longer to execute than their equivalent static SQL counterparts. Therefore, static SQL statements are better suited for high-performance applications than dynamic SQL statements are.

Question 3
The correct answer is A. Database drivers are required to develop CLI/ODBC applications but are not needed to develop embedded SQL applications. However, in order to develop an embedded SQL application, you must have an SQL precompiler, a compiler, and a linker. Because the SQL precompiler must be connected to a database before it can be used, you must have a valid database as well.

Question 4
The correct answer is C. All references to host variables must be preceded by a colon (:) (remember, an indicator variable is a special type of host variable), and if an indicator variable is used (which is the case in answer C), it must follow the host variable with which it is associated.

Question 5
The correct answer is C. One host variable is needed for values stored in the EMPID column, one host variable is needed to store null indicator values for the EMPID column, and one host variable is needed for values stored in the NAME column. An additional host variable is not needed to store null indicator values for the NAME column because that column cannot contain null values.

Question 6
The correct answers are D and E. Once a host variable has been declared, any statement in the source code file in which it is used can modify the host variable's contents. (Such a file will be comprised of embedded SQL statements and high-level programming language statements.)

Question 7
The correct answer is B. After an SQL statement has been processed, the sqlerrp element of the SQLCA data structure variable being used will contain a diagnostic value that identifies the type of DB2 server currently being used. This value begins with a three-letter product code and is followed by five digits that identify the version, release, and modification level of the product. For example, "SQL08014" means DB2 Universal Database, version 8, release 1, modification level 4. If the sqlcode element contains a negative value, this element will contain an 8-character code that identifies the module that reported the error.

Question 8
The correct answer is B. When Type 2 connections are used, either the SET CONNECTION SQL statement or the CONNECT RESET statement can be used to change the state of a specific connection from "Dormant" to "Current," which will automatically place all other connections in the "Dormant" state, because only one connection can be in the "Current" state at any given point in time. In this example, the connection to the PAYROLL database is in the "Current" state, and when executed, the statement EXEC SQL SET CONNECTION sales will place the connection to the SALES database in the "Current" state, causing the connection to the PAYROLL database to be placed in the "Dormant" state.

Question 9
The correct answers are C and E. When dynamic SQL statements are used in an embedded SQL application, they can be processed in one of two ways:

Prepare and Execute. This approach separates the preparation of the SQL statement from its actual execution and is typically used when an SQL statement is to be executed repeatedly. The SQL statements PREPARE and EXECUTE are used to process dynamic SQL statements in this manner.

Execute Immediately. This approach combines the preparation and execution of an SQL statement into a single step and is typically used when an SQL statement is to be executed only once. The SQL statement EXECUTE IMMEDIATE is used to process dynamic SQL statements in this manner.

Question 10
The correct answer is B. When a source code file containing embedded SQL statements is processed by the SQL precompiler, a package containing data access plans is produced, along with a source code file that is made up entirely of high-level programming language statements. The process of storing this package in a DB2 UDB database is known as "binding," and by default, packages are automatically bound to the database being used by the SQL precompiler during the precompile process. However, by specifying the appropriate precompiler options, you can elect to store the steps needed to create the package in a separate file (rather than in a database) and complete the binding process at a later point in time using the Bind utility (which is invoked by executing the BIND command).

Question 11
The correct answer is D. To distinguish host variables from other high-level programming language variables, they must be defined within a special section known as the declare section. The beginning of a declare section is defined by the BEGIN DECLARE SECTION SQL statement, and the end is defined by the END DECLARE SECTION SQL statement.

Question 12
The correct answer is D. With embedded SQL applications, database connections are made by executing the CONNECT SQL statement, and this statement cannot be dynamically prepared. Dynamic SQL statements are not allowed to contain references to host variables; however, they can contain parameter markers, which are represented by the question mark (?) character, in place of constants and/or expressions. Therefore, because the SQL statement shown in answer D appears to be a dynamic SQL statement (because parameter markers are used), the statement will fail.

Question 13
The correct answers are B and D. Access plans are generated once an SQL statement has been optimized; access plans for static SQL statements are produced at precompile time when deferred binding is not used, but access plans for dynamic SQL statements are produced at application run time during statement preparation.

Question 14
The correct answer is D. Source code files containing embedded SQL statements must be preprocessed by an SQL precompiler before they can be compiled by a high-level language compiler. To facilitate this preprocessing, each SQL statement used in an embedded SQL application must be prefixed with the keywords "EXEC SQL". Answers A and C do not begin with these keywords; therefore, they cannot be precompiled. Answer B will fail because there are no commas between the host variables used. So the correct answer is D; in this case, the lack of a comma between the host variables used implies that var2 is a null indicator variable for var1.

Question 15
The correct answers are C and E. If you know in advance that only one row of data will be produced in response to a query, you can copy the contents of that row to host variables without using a cursor by executing either a special form of the SELECT SQL statement known as the SELECT INTO statement or the VALUES INTO SQL statement. Both statements require a list of valid host variables to be supplied as part of their syntax, and neither can be used dynamically, so parameter markers are not allowed.

Question 16
The correct answer is D. The attributes of a host variable that is to receive data from a database must be appropriate for the context in which the host variable is used. Thus, you must define host variables such that their data types and lengths are similar to the data types and lengths of the columns they are designed to work with. To determine the appropriate data type to assign to a host variable, you should obtain information about the data type and length of the column or special register that the variable will be associated with and refer to the conversion charts found in the IBM DB2 Universal Database Application Development Guide: Programming Client Applications documentation. In this example, the data type for the HIREDATE column is obviously a date data type, and according to Table 13 in the IBM DB2 Universal Database Application Development Guide: Programming Client Applications documentation, which addresses SQL-to-C/C++ data type conversion, a character string host variable must be used when retrieving date values from a DB2 UDB database.
