
Information Integration Solutions

DataStage Enterprise Edition


Configuring Parallel DB2 Remote Connectivity



February 1, 2006
INFORMATION INTEGRATION SOLUTIONS

version 2.5 DataStage Enterprise Edition DB2 Configuration Page 2 of 19


1 Preface
This document is intended for those who are planning for and implementing IBM
WebSphere DataStage Enterprise Edition (DS/EE) requiring connectivity to DB2
Enterprise Server Edition (DB2). It is intended to replace product documentation and
provide guidance when implementing systems where the database server is distinct from
the DS/EE server. Such an implementation is referred to as a remote server
implementation. In the following sections, we discuss the issues that should be resolved
prior to installation, installation requirements, and configuration.

To demonstrate the processes discussed in this document, we will implement the DB2
Enterprise stage and use screen shots and actual files from this implementation.

Our example system will run DS/EE version 7.5.1a.
1.1 Organization
This document contains the following sections:
1 Preface
    1.1 Organization
    1.2 Documentation Conventions
    1.3 Goals and Target Audience
2 Background
    2.1 DB2 Stage Types within DataStage EE
    2.2 DB2 Enterprise Stage Architecture
3 Prerequisites
4 How-To Set Up DB2 Connectivity for Remote Servers
5 Using the DB2 Enterprise Stage
6 Configuring Multiple Instances in One Job
7 Troubleshooting
8 Performance Notes
9 Summary of Settings

1.2 Documentation Conventions
This document uses the following conventions:
Convention      Usage
Bold            In syntax, bold indicates commands, function names, keywords, and
                options that must be input exactly as shown. In text, bold indicates
                keys to press, function names, and menu selections.
Italic          In syntax, italic indicates information that you supply. In text,
                italic also indicates UNIX commands and options, file names, and
                pathnames.
Plain           In text, plain indicates prompts, commands and options, file names,
                and pathnames.
Bold Italic     Indicates important information.
Courier         Courier indicates examples of source code and system output and
                prompts.
Tahoma Bold     In examples, Tahoma bold indicates characters that the user types or
                keys the user presses (for example, <Return>).
->              A right arrow between menu commands indicates you should choose
                each command in sequence. For example, "Choose File -> Exit" means
                you should choose File from the menu bar, and then choose Exit from
                the File pull-down menu.
This line       The continuation character is used in source code examples to
continues       indicate a line that is too long to fit on the page, but must be
                entered as a single line on screen.

The following are also used:
- Syntax definitions and examples are indented for ease in reading.
- All punctuation marks included in the syntax (for example, commas, parentheses,
  or quotation marks) are required unless otherwise indicated.
- Syntax lines that do not fit on one line in this manual are continued on subsequent
  lines. The continuation lines are indented. When entering syntax, type the entire
  syntax entry, including the continuation lines, on the same input line.
- Text enclosed in parentheses and underlined (like this) following the first use of
  proper terms will be used instead of the proper term.

Interaction with our example system will usually include the system prompt and the
command, most often on 2 or more lines. For example:

/home/dsadm @ database_server >
/bin/tar -cvf /dev/rmt0 /usr/dsadm/Ascential/DataStage/Projects
1.3 Goals and Target Audience
This document presents a detailed set of instructions for configuring connectivity from
DS/EE to a remote DB2 instance using the native parallel DB2 Enterprise stage.

The primary audience for this document is DataStage administrators and DB2 DBAs.
Information in certain sections may also be relevant for Technical Architects and System
Administrators.

For additional tips and best practices:
The Ascential Developer Net (ADN) is a set of online services designed to help project
managers, architects, analysts and developers with their data integration tasks using IBM
Information Integration products and technologies.

The Ascential Developer Net allows you to share ideas, ask questions among your peers at
other companies around the world, share files, tips & tricks, and search the archive.
Ascential Developer Net provides interactive forums with subscription capabilities that
automatically build a valuable knowledgebase from which Ascential can build better
products and lasting customer partnerships.

The Ascential Developer Net will additionally be able to share documents, configuration
files, code samples, and more. Ascential is committed to making ADN the premier location
for developer resources for data integration.

A link to the Ascential Developer Net can be found in the Help->About dialog on the
DataStage clients. Or, you can access it at the following URL:
http://developernet.ascential.com/

2 Background
2.1 DB2 Stage Types within DataStage EE
There are five stages available on the DS/EE Designer canvas that can access DB2:
- DB2 API - plug-in data access for read, insert, update-insertion (upsert) and delete.
- DB2 Load - plug-in data access for load.
- Dynamic RDBMS - plug-in data access for read, insert, upsert and delete.
- Enterprise ODBC - native non-parallel data access for read, insert, upsert and
  delete.
- DB2 Enterprise - native parallel data access for read, insert, upsert, delete and
  load.

The plug-in stages are designed for lower-volume access to DB2 databases without the
DPF option installed (prior to DB2 UDB v8, "DB2 EE"). These stages also provide
connectivity to non-UNIX DB2 databases, databases on UNIX platforms that differ from
the platform of the DataStage ETL server, or DB2 databases on Windows or Mainframe
platforms (except for the "Load" stage against a mainframe DB2 instance, which is not
supported).



Figure 1: DB2 stages available on the DS/EE Parallel Job design palette

By facilitating flexible connectivity to multiple types of remote DB2 database servers, the
use of DataStage plug-in stages expands the range of options available to the designer.
However, this flexibility limits overall performance and scalability. Furthermore, when
used as data sources, plug-in stages cannot read from DB2 in parallel.

Using the DB2 API stage or the Dynamic RDBMS stage, it is possible to access a DB2
database with the Data Partitioning Facility (DPF) in parallel by manually partitioning data and
stages on the canvas for each partition of the database. Because each plug-in invocation
will open a separate connection to the same target DB2 database table, the ability to
function in parallel may be limited by the table and index configuration set by the DB2
database administrator. This document does not provide any further discussion of this
technique.

The capabilities of each DB2 stage are summarized in the following table. For specific
details on the stage capabilities, consult the DataStage documentation (DataStage Parallel
Job Developer's Guide, DataStage Plug-In guides).

DataStage EE    | Stage Type      | DB2 Requirement                        | Supports Partitioned           | Parallel | Parallel             | Parallel Sparse | SQL Open /
Stage Name      |                 |                                        | DB2?                           | Read?    | Write?               | Lookup          | Close
----------------+-----------------+----------------------------------------+--------------------------------+----------+----------------------+-----------------+-----------
DB2 Enterprise  | Native Parallel | DPF, homogeneous hardware and          | Yes, directly to each DB2 node | Yes      | Yes                  | Yes             | Yes
                |                 | operating system (see Note 1)          |                                |          |                      |                 |
DB2 API         | Plug-In         | Any DB2 via DB2 Client or DB2-Connect  | Yes, through DB2 node 0        | No       | Possible limitations | No              | No
Dynamic RDBMS   | Plug-In         | Any DB2 via DB2 Client or DB2-Connect  | Yes, through DB2 node 0        | No       | Possible limitations | No              | No
Enterprise ODBC | Native          | Any DB2 via DB2 Client or DB2-Connect  | Yes, through DB2 node 0        | No       | No                   | No              | No
DB2 Load        | Plug-In         | Subject to DB2 Loader limitations      | No                             | No       | No                   | No              | No

Figure 2: DS/EE DB2 Communication Options and Capabilities

Note 1: It is possible to connect the DB2 UDB stage to a remote database by cataloging the remote database in the
local instance and then using it as if it were a local database. This will only work when the authentication
mode of the database on the remote instance is set to "client authentication". If you use the stage in this
way, you may experience data duplication when working in partitioned instances since the node
configuration of the local instance may not be the same as the remote instance. For this reason, the "client
authentication" configuration of a remote instance is not recommended.
2.2 DB2 Enterprise Stage Architecture
As a native, parallel component, the DB2 Enterprise stage is designed for maximum
performance and scalability. These goals are achieved through tight integration with DB2,
including direct communication with each DB2 database node, and reading from or writing
to DB2 in parallel (where appropriate), using the same data partitioning as the referenced
DB2 tables.

This section outlines the high-level architecture of the native parallel DB2 Enterprise stage
providing relevant background to understand its configuration as detailed in the remaining
sections of this document.

Prior to v7, DS/EE required the primary DataStage ETL server (aka "conductor node") to
be installed on the DB2 coordinator server. Starting with v7 and later releases, DS/EE
provides "remote DB2" configuration, separating the primary ETL server ("conductor
node") from the primary DB2 server (coordinator node or "node zero") using the native
parallel DB2 Enterprise stage. Because DS/EE is tightly integrated with the DB2 servers
and routes data to individual nodes based on DB2 table partitioning, configuration is
provided by a combination of DB2 client and DS/EE clustered processing.

As outlined in Figure 3, the primary ETL server ("conductor node") must have the 32-bit
DB2 client installed and configured to connect to the remote DB2 server instance. This is
the same DB2 client that DataStage uses to connect to DB2 databases through the DB2
plug-in stages (DB2 API, DB2 Load, Dynamic RDBMS) for reading, writing, and import of
metadata.


[Diagram: Primary ("conductor node") DataStage EE Server with 32-bit DB2 client and
DSEE engine; DB2 DPF node 0, DB2 DPF node 1, ... DB2 DPF node n, each with a DSEE engine]

Figure 3: DS/EE DB2 Communication Architecture

The native parallel DB2 Enterprise stage of DataStage EE uses the DB2 client connection
to "pre-query" the DB2 instance and determine partitioning of the source or target table.
This partitioning information is then used to read/write/load data directly from/to the
remote DB2 nodes based on the actual table configuration. This tight integration is
provided by routing data within the DS/EE engine to DS/EE engine nodes configured on
the DB2 instance server(s), which requires a clustered configuration of the DS/EE engine.

As with any clustered DS/EE configuration, the DS/EE engine and libraries must be
installed in the same location on all ETL and DB2 servers in the cluster. This is most easily
achieved by creating a shared mount point on the remote DS/EE and DB2 nodes through
NFS or similar directory sharing methods.

The DB2 client does not have to be installed in the same location on all servers, as long as
all locations are included in the $PATH and $LIBPATH, $LD_LIBRARY_PATH, or
$SHLIB_PATH environment variable settings.

The connectivity scenario for a DataStage EE DB2 Enterprise stage is:

1) The DS/EE conductor node uses the DB2 environment variable
APT_DB2INSTANCE_HOME as the location on the ETL server where the remote
DB2 server's db2nodes.cfg has been copied.

2) DS/EE reads the file db2nodes.cfg from a sqllib subdirectory identified for the
specified DB2 instance. This file allows DS/EE to determine the individual network
node names of each DB2 node.

3) DS/EE scans the current DS/EE configuration file specified by the environment
variable $APT_CONFIG_FILE (APT_CONFIG_FILE) for node names whose
fastname properties match the node names provided in db2nodes.cfg. DS/EE
must find each DB2 node name in the APT_CONFIG_FILE or the job will fail.

4) The DS/EE conductor node queries the local DB2 instance via the DB2 client to
determine table partitioning information. The results of this query are then used to
route data directly to or from the appropriate DB2 nodes.

5) DS/EE starts up processes across all ETL and DB2 nodes in the cluster. This can be
easily verified by setting the environment variable $APT_DUMP_SCORE to TRUE,
and examining the corresponding score entry placed in the job log within
DataStage Director.
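
The matching described in step 3 can be checked before a job is run. The following is a minimal sketch, not part of the product: the function name missing_fastnames is hypothetical, and it assumes that the host name is the second column of db2nodes.cfg and that each fastname appears on its own line in the configuration file, as in the examples in this document.

```shell
#!/bin/sh
# Sketch: print every DB2 host listed in db2nodes.cfg that has no matching
# fastname entry in the parallel configuration file.
# The function name is hypothetical; it is not a DataStage utility.
missing_fastnames() {
    nodes_cfg=$1   # path to the copied db2nodes.cfg
    apt_cfg=$2     # path to the file named by $APT_CONFIG_FILE
    # Column 2 of db2nodes.cfg holds the host name of each DB2 partition.
    for host in $(awk 'NF >= 2 { print $2 }' "$nodes_cfg" | sort -u); do
        # Look for a line of the form: fastname "host"
        if ! grep -q "fastname[[:space:]]*\"$host\"" "$apt_cfg"; then
            echo "$host"
        fi
    done
}
```

For example, `missing_fastnames "$APT_DB2INSTANCE_HOME/sqllib/db2nodes.cfg" "$APT_CONFIG_FILE"` prints nothing when every DB2 host has a fastname entry.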

3 Prerequisites
- The DB2 database schema to be accessed must NOT have any columns with User
Defined Types (UDTs). Use the "db2 describe table [table-name]" command on the
DB2 client for each table to be accessed to determine if UDTs are in use.
Alternatively, examine the DDL for each schema to be accessed.

- DS/EE must be installed on all ETL server(s) as well as each DB2 node in the DB2
cluster. The DS/EE server version demonstrated in this document is 7.5.1a.

- The hardware and operating system of the ETL server and DB2 nodes must be the
same. The systems demonstrated in this document were running AIX v5.3.

- A DB2 32-bit client must be installed on the primary (conductor) ETL server. The
DB2 client demonstrated in this document is v8.1 FixPack10 aka v8.2. Use the
"db2level" command on the ETL server to identify the version of the client.

- The database must be DB2 Enterprise Server Edition with the Data Partitioning
Facility (DPF) option installed. The DB2 UDB server demonstrated in this document
is v8.1 FixPack9 aka v8.2. Use the "db2level" command on the DB2 server to
identify the version of the database.

4 How-To Set Up DB2 Connectivity for Remote Servers
Our example systems are 2 AIX systems, one with 4 CPUs used as the DB2 UDB server,
and one with 2 CPUs used as the DS/EE server. In this How-To, we will demonstrate
using the DS/EE super-user, by default dsadm.

Note that dsadm does NOT have to be the local database instance owner.


Figure 4: DS/EE DB2 Example System

1) Perform the following on ALL members of the cluster BEFORE installing DS/EE on
the ETL server:
a. Create the primary group to which the DS/EE users will belong (in this
document, this group is the recommended default dstage) and ensure that
this group has the same UNIX group id (like 127) on all the systems.
b. Create DS/EE users on all members of the cluster. Make sure that each user
has the same user id (like 204) on all the systems, and that every user has
the correct group memberships, minimally with dstage as the primary group,
and the DB2 group in the list of secondary groups.
c. Add these users to the DB2 database and ensure they can log in to DB2 on
db2_server. At this step, we are on the DB2 server, and NOT the ETL
server. If you fail here, contact your DB2 DBA for support - this is NOT a
DS/EE issue.

/db2home/db2inst1@db2_server> . /db2home/db2inst1/sqllib/db2profile
/db2home/db2inst1@db2_server> db2 connect to db2_dpf1_db user dsadm using db2_psword

Database Connection Information

Database server = DB2/6000 8.2.2
SQL authorization ID = DSADM
Local database alias = db2dev1

2) Enable the rsh command on all servers in the cluster. The simplest way to do this
is to create a .rhosts file in the home directory of each DS/EE user that has the host
name or IP address of all members of the cluster, and then set the permissions
on this file to 600. This must be done for each user on all members of the cluster.
Note that modern security systems may prohibit this method, but it will serve as an
adequate example of the requirement. Contact the System Administrators for the
cluster for assistance. Here are the commands to be performed on each node of
our example system to implement the rhosts method:
echo "etl_server dsadm" > ~/.rhosts
echo "db2_server dsadm" >> ~/.rhosts
chmod 600 ~/.rhosts
Here is an example of the validation, run from the etl_server:

/home/dsadm@etl_server> rsh db2_server date
Wed Jan 18 15:40:51 CST 2006
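
The same validation can be repeated for every member of the cluster. The sketch below is illustrative only: the function name unreachable_hosts and the RSH_CMD override are hypothetical, and it assumes that a failed remote connection makes the remote shell command exit with a non-zero status.

```shell
#!/bin/sh
# Sketch: print each host on which a trivial remote command fails.
# RSH_CMD is a hypothetical override for sites that use a different remote
# shell; it defaults to rsh, the command used in this document.
unreachable_hosts() {
    for host in "$@"; do
        if ! ${RSH_CMD:-rsh} "$host" date >/dev/null 2>&1; then
            echo "$host"
        fi
    done
}
```

On the example system, `unreachable_hosts etl_server db2_server` would be run as each DS/EE user on each member of the cluster; no output means every host answered.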

3) Install a 32-bit DB2 client if one is not installed on the primary ETL server (server
on which DS/EE is installed and on which the DS repository resides, also known as
the "conductor node").
a. Make dsadm the owner of the client. While the software will be installed in
/usr, management directories and components appear under the home
directory of this owner, the top of which is ~/sqllib. For dsadm on our
sample AIX system, this is /home/dsadm/sqllib.
b. Comment out the call to ~/sqllib/db2profile that the client install puts into
the .profile of dsadm. If you don't, DS/EE will not operate - it will find DB2
libraries before it finds DS/EE libraries.
c. Edit ~/sqllib/db2profile to export INSTHOME, DB2DIR and DB2INSTANCE.

4) The DB2 DBA must now catalog all the databases you wish to access on the DB2
server into this instance of the DB2 client.
a. Ensure that dsadm can log in to DB2 on the db2_server. At this step, we
are on the ETL server, and NOT the DB2 server. If you fail here, contact
your DB2 DBA for support - this is NOT a DS/EE issue.

/home/dsadm@etl_server> . /home/dsadm/sqllib/db2profile
/home/dsadm@etl_server> db2 connect to db2dev1 user dsadm using db2_psword

Database Connection Information

Database server = DB2/6000 8.2.2
SQL authorization ID = DSADM
Local database alias = db2dev1

5) Ensure that the remote database is cataloged.

/home/dsadm@etl_server> db2 "LIST DATABASE DIRECTORY"

Database alias = db2dev1
Database name = db2_dpf1_db
Node name = db2_server
Database release level = a.00
Comment =
Directory entry type = Remote
Authentication = SERVER
Catalog database partition number = -1


6) Log out of the ETL server and log back in to reset all the environment variables to
their original state. Edit $DSHOME/dsenv to include the following information (note
that underlined items in blue should be substituted with appropriate values for your
configuration). We are assuming that the $DB2DIR directory is the same on all
nodes in our cluster. This ensures that $PATH and $LIBPATH are correctly set for
the remote sessions as well as the local session without resorting to individual files
on each member of the cluster.

Note that on operating systems other than AIX (our example system), $LIBPATH
may be $SHLIB_PATH or $LD_LIBRARY_PATH.

################################################
# DB2 Setup section of dsenv
################################################
# DB2DIR is where the DB2 home is located
DB2DIR=/usr/opt/db2_08_01; export DB2DIR

# DB2INSTANCE is the name of the DB2 client where the databases are
# cataloged
DB2INSTANCE=dsadm; export DB2INSTANCE

# INSTHOME is the path where the client instance is located, usually the
# home directory of the instance owner.
INSTHOME=/home/dsadm; export INSTHOME

# Append the DB2 directories to the PATH
PATH=$PATH:$DB2DIR/bin; export PATH
THREADS_FLAG=native; export THREADS_FLAG

# Add the DB2 libraries to the END of the LIBPATH on AIX, or of
# LD_LIBRARY_PATH on SUN and Linux
LIBPATH=$LIBPATH:$DB2DIR/lib; export LIBPATH

IMPORTANT: the DataStage libraries MUST be placed BEFORE the DB2 entries in
$LIBPATH ($SHLIB_PATH or $LD_LIBRARY_PATH). DataStage and DB2 use the
same library name "librwtool".
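
The ordering requirement can be tested after dsenv has been sourced. This is a minimal sketch under stated assumptions: the function name lib_order_ok is hypothetical, and the DataStage library directory passed to it must be supplied for your installation, since this document does not name it.

```shell
#!/bin/sh
# Sketch: succeed only when the DataStage library directory appears before
# the DB2 library directory in a colon-separated library path.
# The function name is hypothetical; the directories are supplied by the caller.
lib_order_ok() {
    path_value=$1   # e.g. the value of $LIBPATH
    ds_lib=$2       # DataStage library directory (site-specific assumption)
    db2_lib=$3      # DB2 library directory, e.g. $DB2DIR/lib
    ds_pos=0
    db2_pos=0
    i=0
    old_ifs=$IFS
    IFS=:
    for dir in $path_value; do
        i=$((i + 1))
        if [ "$dir" = "$ds_lib" ] && [ "$ds_pos" -eq 0 ]; then ds_pos=$i; fi
        if [ "$dir" = "$db2_lib" ] && [ "$db2_pos" -eq 0 ]; then db2_pos=$i; fi
    done
    IFS=$old_ifs
    # The DataStage directory must be present at all.
    if [ "$ds_pos" -eq 0 ]; then return 1; fi
    # With no DB2 entry there is nothing that could shadow it.
    if [ "$db2_pos" -eq 0 ]; then return 0; fi
    [ "$ds_pos" -lt "$db2_pos" ]
}
```

A call might look like `lib_order_ok "$LIBPATH" /path/to/datastage/lib "$DB2DIR/lib" || echo "DB2 libraries precede DataStage libraries"`, where the first directory is a placeholder.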

7) Copy the db2nodes.cfg file from the remote instance to the DataStage server. If
you create a user on the DataStage server with the same name as the DB2 remote
instance owner (for example, db2inst1), then the db2nodes.cfg can be placed in
that user's "home directory/sqllib" on the DataStage server. Otherwise, create a
user defined environment variable APT_DB2INSTANCE_HOME in the DataStage
Administrator, add it to a test job and have it point to the location of the sqllib
subdirectory where the db2nodes.cfg has been placed. Avoid setting this at the
Project level so that other DB2 jobs which are connecting locally do not pick up this
value.

In our example, the DB2 server has four processing nodes (logical nodes), the
instance owner is db2inst1, the db2nodes.cfg file on the DB2 server is
/home/db2inst1/sqllib/db2nodes.cfg, and this file has these contents:

0 db2_server 0
1 db2_server 1
2 db2_server 2
3 db2_server 3

In our example, the ETL server client is owned by dsadm, the
APT_DB2INSTANCE_HOME environment variable has been set to
"/home/dsadm/remote_db2config", and this file was copied to
/home/dsadm/remote_db2config/sqllib/db2nodes.cfg on the ETL server.
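
Placing the file in the expected layout can be scripted. The following sketch assumes the db2nodes.cfg has already been fetched from the DB2 server to a local path; the function name stage_db2nodes is hypothetical, and the sanity check only confirms that each non-empty line has at least a partition number and a host name.

```shell
#!/bin/sh
# Sketch: copy a fetched db2nodes.cfg into <instance home>/sqllib, the layout
# that APT_DB2INSTANCE_HOME is expected to point at.
# The function name is hypothetical.
stage_db2nodes() {
    src=$1         # local copy of the remote db2nodes.cfg
    inst_home=$2   # directory intended as the APT_DB2INSTANCE_HOME value
    mkdir -p "$inst_home/sqllib" || return 1
    cp "$src" "$inst_home/sqllib/db2nodes.cfg" || return 1
    # Fail if any non-empty line has fewer than two fields.
    awk 'BEGIN { bad = 0 } NF > 0 && NF < 2 { bad = 1 } END { exit bad }' \
        "$inst_home/sqllib/db2nodes.cfg"
}
```

In the example system this would be called as `stage_db2nodes ./db2nodes.cfg /home/dsadm/remote_db2config`.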

8) Ensure that dsadm can connect to the instance using the values in
$DSHOME/dsenv instead of ~/sqllib/db2profile. Log out of the ETL server and log
back in to reset all the environment variables to their original state.

/home/dsadm@etl_server> cd `cat /.dshome`
/home/dsadm@etl_server> . ./dsenv
/home/dsadm@etl_server> db2 connect to db2dev1 user dsadm using db2_psword

Database Connection Information

Database server = DB2/6000 8.2.2
SQL authorization ID = DSADM
Local database alias = db2dev1

9) Implement a DS/EE cluster (please refer to the Install and Upgrade guide for more
details). In this example, /etl/Ascential is the file system that contains the DS/EE
software system, and it is NFS-exported from the ETL server to the DB2 server, and
NFS-mounted exactly on /etl/Ascential, a file system owned by dsadm on the DB2
server.

10) Verify that the DB2 operator library has been properly configured by making sure
the link "orchdb2op" exists in the $PXEngine/lib directory. Normally this link is
configured on install, but if it does not exist, you must run the script
$PXEngine/install/install.liborchdb2op. You will be prompted to specify DB2
version 7 or 8, in our case, version 8.

11) The db2setup.sh script located in $PXHOME/bin/ can run without reporting
errors even if they occur, and if there are errors, DS/EE will not be able to connect
to the database(s). Run the following commands and ensure that no errors occur.

/home/dsadm@etl_server> db2 connect reset
/home/dsadm@etl_server> db2 connect terminate
/home/dsadm@etl_server> db2 connect to db2dev1 user dsadm using db2_psword
/home/dsadm@etl_server> db2 bind ${APT_ORCHHOME}/bin/db2esql.bnd datetime ISO
    blocking all grant public
/home/dsadm@etl_server> cd ${INSTHOME}/sqllib/bnd
/home/dsadm@etl_server> db2 bind @db2bind.lst datetime ISO
    blocking all grant public
/home/dsadm@etl_server> db2 bind @db2cli.lst datetime ISO
    blocking all grant public
/home/dsadm@etl_server> db2 connect reset
/home/dsadm@etl_server> db2 connect terminate

/home/dsadm@etl_server> db2 connect to db2dev1 user dsadm using db2_psword
/home/dsadm@etl_server> db2 grant bind, execute on package dsadm.db2.esql to group
    dstage
/home/dsadm@etl_server> db2 connect reset
/home/dsadm@etl_server> db2 connect terminate

Note: Datetime ISO currently prevents the db2bind.lst and db2cli.lst binds from
succeeding. Omit this option when issuing these two binds until this issue has been
resolved by development.

12) The db2grant.sh script located in $PXHOME/bin/ can run without reporting
errors even if they occur, and if there are errors, DS/EE will not operate correctly.
Run the following commands and ensure that no errors occur. Grant bind and
execute privileges to every member of the primary DS/EE group, in our case
dstage.

/home/dsadm@etl_server> db2 connect to db2dev1 user dsadm using dsadm_db2_psword
/home/dsadm@etl_server> db2 grant bind, execute on package dsadm.db2.esql to group
dstage
/home/dsadm@etl_server> db2 connect reset
/home/dsadm@etl_server> db2 connect terminate

13) Create a DS/EE configuration file that includes nodes to be used for ETL processing
and a node entry for each physical server in the remote DB2 instance.

Unless ETL processing is to be performed on the remote DB2 instance nodes, these
entries should be removed from the default node pool (pools ""). Each node in the
DB2 instance should be part of the same node pool (e.g. pools "db2"). An example
configuration file is shown below:

{
node "node1"
{
fastname "etl_server"
pools ""
resource disk "/worknode1/datasets" {pools ""}
resource scratchdisk "/worknode1/scratch" {pools ""}
}
node "db2node1"
{
fastname "db2_server"
pools "db2"
resource disk "/tmp" {pools ""}
resource scratchdisk "/tmp" {pools ""}
}
}
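
The DB2 node entries can be derived from db2nodes.cfg rather than typed by hand. The sketch below is one possible approach, not a product utility: the function name db2_node_entries and the generated node names (db2node1, db2node2, ...) are hypothetical, and the /tmp resource paths simply mirror the example above.

```shell
#!/bin/sh
# Sketch: emit one configuration-file node entry per distinct host found in
# db2nodes.cfg, each placed in the "db2" node pool only.
# The function name and the generated node names are hypothetical.
db2_node_entries() {
    nodes_cfg=$1   # path to the copied db2nodes.cfg
    awk 'NF >= 2 && !seen[$2]++ {
        n++
        printf "    node \"db2node%d\"\n    {\n", n
        printf "        fastname \"%s\"\n", $2
        printf "        pools \"db2\"\n"
        printf "        resource disk \"/tmp\" {pools \"\"}\n"
        printf "        resource scratchdisk \"/tmp\" {pools \"\"}\n"
        printf "    }\n"
    }' "$nodes_cfg"
}
```

The output is a fragment to paste inside the outer braces of the configuration file, next to the ETL node entries. Logical partitions that share a host produce a single entry, matching the "one entry per physical server" rule in step 13.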

14) Restart the DataStage server.

15) Test server connectivity by trying to import a table definition within DataStage
Designer (or DataStage Manager) using the DB2 API plug-in (Server plug-in). If
this fails, you do not have connectivity to the DB2 server and need to revisit all the
previous steps until this succeeds.

If this succeeds, check the imported TableDefs to be sure the data types are
legitimate.

16) Create a user defined variable APT_DB2INSTANCE_HOME in the DS/EE project
using the DataStage Administrator client for use in jobs that access DB2. Avoid
setting this at the Project level so that other DB2 jobs which are connecting locally
do not pick up this value. Set this variable in each job to the location of the
sqllib/db2nodes.cfg file, in our case /home/dsadm/remote_db2config.
5 Using the DB2 Enterprise Stage
Create a Parallel job and add a DB2 Enterprise stage and a Sequential File stage. Set the
file path in the Sequential File stage to /dev/null. Set or add the following properties to the
DB2 Enterprise stage (see image below).


Figure 5: DS/EE DB2 Enterprise Stage Properties

For connection to a remote DB2 instance, you need to set the following properties on the
DB2 Enterprise stage in your parallel job:
- Client Instance Name. Set this to the DB2 client instance name. If you set
  this property, DataStage assumes you require remote connection.
- Server. Set this to the name of the DB2 server OR use the DB2 environment
  variable DB2INSTANCE to identify the name of the DB2 server.
- Client Alias DB Name. Set this to the DB2 client's alias database name for the
  remote DB2 server database. [This is required only if the client's alias is different
  from the actual name of the remote server database.]
- Database. Set this to the remote server database name OR use the environment
  variables APT_DBNAME or APT_DB2DBDFT to identify the database.
- User. Enter the user name for connecting to DB2. This is required for a remote
  connection in order to retrieve the catalog information from the local instance of
  DB2 and thus must have privileges for that local instance.
- Password. Enter the password for connecting to DB2. This is required for a
  remote connection in order to retrieve the catalog information from the local
  instance of DB2 and thus must have privileges for that local instance.

This stage has been parameterized in the following example:

Figure 6: DS/EE Parallel Job Properties Tab

Figure 7: DS/EE DB2 Enterprise Stage Properties Using Job Parameters

Set the APT_DB2INSTANCE_HOME variable in the Parameters panel to
/home/dsadm/remote_db2config.

Figure 8: Sample Job Properties Panel

Test the connection using View Data on the Output / Properties panel:

Figure 9: Sample View Data Output
6 Configuring Multiple Instances in One Job
Although it is not officially supported, it is possible to connect to more than one DB2
instance within a single job. Your job must meet one of the following configurations
(note: the use of the word "stream" refers to a contiguous flow of one stage to another
within a single job):

1. Single stream - Two Instances Only
reading from one instance and writing to another instance with no other DB2
instances (not sure how many stages of these 2 instances can be added to the
canvas for this configuration for lookups)

2. Two Stream - One Instance per Stream
reading from instance A and writing to instance A and reading from instance B and
writing to instance B (not sure how many stages of these 2 instances can be added
to the canvas for this configuration for lookups)

3. Multiple Stream with N DB2 sources with no DB2 targets
reading from 1 to n DB2 instances in separate source stages with no downstream
other DB2 stages

In order to get this configuration to work correctly, you must adhere to all of the
directions specified for connecting to a remote instance AND the following:

You must not set the APT_DB2INSTANCE_HOME environment variable. Once this
variable is set, it will try to use it for each of the connections in the job. Since a
db2nodes.cfg file can only contain information for one instance, this will create
problems.

In order for DS to locate the db2nodes.cfg, you must build a user on the DS server
with the same name as the instance you are trying to connect to (the default logic
for the DB2 Enterprise stage is to use the instance's home directory as defined for
the UNIX user with the same name as the DB2 instance). In the user's UNIX home
directory, create a sqllib subdirectory and place the remote instance's db2nodes.cfg
there. Since the APT_DB2INSTANCE_HOME is not set, DS will default to this
directory to find the configuration file for the remote instance.
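
The lookup order described above can be modeled in a few lines, which helps when checking where a job will look for db2nodes.cfg. This is only an illustration of the described behavior, not the stage's actual code: the function name db2nodes_path and the PASSWD_FILE override are hypothetical, and reading /etc/passwd assumes local UNIX accounts rather than NIS or LDAP.

```shell
#!/bin/sh
# Sketch: print the db2nodes.cfg path that would be used for an instance.
# If APT_DB2INSTANCE_HOME is set it wins; otherwise the home directory of the
# UNIX user named after the instance is used.
# The function name and PASSWD_FILE are hypothetical.
db2nodes_path() {
    instance=$1
    if [ -n "${APT_DB2INSTANCE_HOME:-}" ]; then
        echo "$APT_DB2INSTANCE_HOME/sqllib/db2nodes.cfg"
        return 0
    fi
    # Field 6 of a passwd entry is the home directory.
    home=$(awk -F: -v u="$instance" '$1 == u { print $6 }' \
        "${PASSWD_FILE:-/etc/passwd}")
    if [ -z "$home" ]; then
        return 1   # no UNIX user with the instance name
    fi
    echo "$home/sqllib/db2nodes.cfg"
}
```

For a multi-instance job, running `db2nodes_path db2inst1` for each instance name with APT_DB2INSTANCE_HOME unset shows whether each instance resolves to its own file.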

7 Troubleshooting
1) If you get an error while performing the binds and grants, make sure dsadm has
privileges to create schema, can select on the sysibm.sysdummy1 table, and can bind
packages (see installation documentation for the DB2 grants necessary to run the
scripts).

2) Several errors that can appear while trying to view data from the DB2 Enterprise
stage do not represent the actual issue:
- If you log into DS with a username (e.g. dsadm) and try to view data with a
different user in the plug-in (username and password inside of the plug-in), you
could get a failed connection. This is because the username and password inside of
the stage are only used to create a connection to DB2 via the client, and then the job
actually runs using the DS user (username used to log into DS either from the
Designer or the Director).
- The user doesn't have permission to read the catalog tables.

3) The userid used to access the DB2 remote servers has to be set in each of the
servers. For example, the dsadm user has to exist as a UNIX user on the ETL
server and all of the DB2 nodes. Also make sure the groups are set correctly since
the db2grant.sh script only grants permission to the group (in our example, dstage
or something like db2group).

4) The DB2 client instance is a service that needs to be running before you can
connect to any of the cataloged databases.

5) The permissions on the resource disk or scratch are not set correctly (mainly for
performing a load). When performing a load, make sure the resource disk and
scratch are read/write for the dstage group as well as the DB2 instance owner
where the data is going to be loaded. Usually the groups are different, so the
permission needs to be set to 777.
8 Performance Notes
In some cases, when using user-defined SQL without partitioning against large volumes of
DB2 data, the overhead of routing information through a remote DB2 coordinator may be
significant. In these instances, it may be beneficial to have the DB2 DBA configure
separate DB2 coordinator nodes (no local data) on each ETL server (in clustered ETL
configurations). In this configuration, the DB2 Enterprise stage should not include the Client
Instance Name property, forcing the DB2 Enterprise stages on each ETL server to
communicate directly with their local DB2 coordinator.
9 Summary of Settings
The DB2 libraries must come after the DataStage libraries because both products have
libraries with identical names. The DB2 client alters the .profile of the DB2 owner, and
this must be removed or DataStage will not function. Here is the .profile for user dsadm
on the ETL server:
/home/dsadm @ etl_server >> tail -4 .profile
# The following three lines were added by UDB and removed by IBM IIS.
# if [ -f /home/dsadm/sqllib/db2profile ]; then
#     . /home/dsadm/sqllib/db2profile
# fi
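
Whether a .profile still sources db2profile can be checked mechanically. The sketch below is a simple text check under one assumption: any uncommented line that mentions db2profile counts as an active call. The function name profile_sources_db2 is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed if the given profile contains an uncommented line that
# mentions db2profile, i.e. the call added by the DB2 client is still active.
# The function name is hypothetical.
profile_sources_db2() {
    profile=$1
    # Skip lines whose first non-blank character is '#', then look for the name.
    awk '!/^[ \t]*#/ && /db2profile/ { found = 1 } END { exit !found }' "$profile"
}
```

For the example user, `profile_sources_db2 /home/dsadm/.profile && echo "db2profile is still sourced"` should print nothing once the lines are commented out as shown above.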

Environment variables set by /home/dsadm/sqllib/db2profile must be supplied after the
native DataStage environment variables. This is done with the dsenv file for the
DataStage server. Here are the last lines of the dsenv file with DB2 setup information
added:
/etl/Ascential/DataStage/DSEngine @ etl_server >> tail -8 dsenv
# DB2 setup section
DB2DIR=/usr/opt/db2_08_01; export DB2DIR
DB2INSTANCE=dsadm; export DB2INSTANCE
INSTHOME=/home/dsadm; export INSTHOME
PATH=$PATH:$DB2DIR/bin; export PATH
THREADS_FLAG=native; export THREADS_FLAG
LIBPATH=$LIBPATH:$DB2DIR/lib; export LIBPATH

Here are the contents of the db2nodes.cfg file located in
/home/dsadm/remote_db2config/sqllib:
/home/dsadm/remote_db2config/sqllib @ etl_server >> cat db2nodes.cfg
0 db2_server 0
1 db2_server 1
