
Informatica Data Replication FAQs

© 2011 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation.

Abstract
This article describes frequently asked questions about using Informatica Data Replication for transactional data replication. It includes information about Data Replication features.

Supported Versions
Informatica Data Replication 3.0.0

Table of Contents
General Questions
Microsoft SQL Server Sources
Oracle Sources
Destinations

General Questions
What is Informatica Data Replication?
Informatica Data Replication is a transactional data replication tool. It extracts DML changes (inserts, deletes, and updates) from DB2 for Linux, UNIX, and Windows, Oracle, and Microsoft SQL Server databases and replicates the changes to heterogeneous destinations. Destinations include relational databases and flat files. Data Replication can replicate large volumes of data with low latency and without degrading source system performance. You can use it for the following purposes:
- Update data warehouses with changes from OLTP systems in a timely manner.
- Update ODS systems to provide the latest data for making critical and timely business decisions.
- Extract transactional changes in text format for migration to a non-database target such as Hadoop or Hive.
- Use Data Replication in conjunction with Fast Clone to migrate data to different platforms with near-zero downtime.

Can Data Replication replicate changes from a single source to multiple destinations?
Yes. The destinations must be of the same type and use the same target schema. For example, you can replicate changes from a SQL Server source to multiple Oracle destinations. Data Replication has no limit on the number of destinations.
In the Data Replication Console, you can define selection conditions to filter rows to different destinations. For example, you could send rows for a specific region to one destination and rows for another region to a different destination. However, you must use the same table- or column-level filtering criteria and the same apply mode for all destinations. For example, if you select a subset of columns, you must replicate those same columns to all destinations.
Note: If an apply process for one destination fails, the entire replication job fails. However, the data files are retained. If the job is scheduled to run again, it can recover from the last checkpoint.

Does Data Replication extract data from the columns that have user-defined data types (UDTs)?
Data Replication can extract change data from UDT columns for Microsoft SQL Server sources only. Data Replication does not extract data from UDT columns for DB2 or Oracle sources.

What type of data compression algorithm does Data Replication use on intermediate files that are sent over the network?
If you select the Compression on the fly option on the Runtime Settings tab > Miscellaneous Conditions view, Data Replication uses the QuickLZ software to compress and decompress intermediate files during a data replication job. You cannot specify a compression ratio or select another compression method. Even so, data replication jobs that use QuickLZ compression are often much faster than jobs without compression because writing compressed data to a destination reduces I/O. For more information about QuickLZ, see http://www.quicklz.com/

Can Data Replication replicate DDL changes?
Data Replication 3.0 can replicate most DDL changes from Oracle sources to Oracle destinations. However, Data Replication does not replicate DDL changes for other source and destination combinations and does not replicate TRUNCATE statements.

Does the Data Replication Console use JDBC drivers to connect to destinations?
Yes. Data Replication supplies JDBC drivers for DB2, Greenplum, Microsoft SQL Server, Oracle, PostgreSQL, and Vertica databases in the DataReplication_installation\lib subdirectory. For MySQL, Netezza, and Teradata databases, you must download the JDBC drivers to the DataReplication_installation\lib subdirectory.

Are the JDBC drivers also used to connect to sources from the console?

When do I need to install ODBC drivers?
The following table describes the sources and destinations for which to use ODBC drivers:
Database                           Source  Destination  Comment
DB2 for Linux, UNIX, and Windows   Yes     Yes          For DB2, install the DB2 Client, DB2 Runtime Client, or DB2 Connect, which include the DB2 ODBC driver.
Greenplum                          No      Yes          For Greenplum destinations, install the Greenplum ODBC driver.
Microsoft SQL Server               Yes     Yes          For Microsoft SQL Server sources and destinations, install the SQL Server ODBC driver. On Linux and UNIX systems, if you want to run InitialSync with Microsoft SQL Server sources, install the FreeTDS or DataDirect driver for SQL Server.
MySQL                              No      Yes          For MySQL destinations, install the MySQL ODBC driver.
Netezza                            Yes     Yes          For Netezza sources and destinations, install the Netezza ODBC driver. Note: The Netezza ODBC driver is not publicly available. You must request a copy of the ODBC driver from Netezza Support.
Oracle                             No      No           Install the Oracle Client to connect to Oracle sources and destinations.
PostgreSQL                         No      Yes          For PostgreSQL destinations, install the PostgreSQL ODBC driver.
Teradata                           No      Yes          For Teradata destinations, install the TPT libraries that include the Teradata ODBC driver.
Vertica                            No      Yes          For Vertica destinations, install the Vertica ODBC driver.

Tip: Download the driver or client for your database version.

Do I need to install an ODBC driver manager?
On Linux and UNIX systems, Data Replication requires an ODBC driver manager in addition to the ODBC drivers that it uses to connect to most destinations and some sources. The driver manager provides an ODBC API that enables Data Replication to communicate with the ODBC drivers. Data Replication provides the unixODBC driver manager in the DataReplication_installation/odbc_driver subdirectory. Depending on the operating system, use one of the following environment variables to point to the directory that contains the unixODBC driver manager:
Operating System   Environment Variable
AIX                LIBPATH
HP-UX              SHLIB_PATH
Linux              LD_LIBRARY_PATH
Solaris            LD_LIBRARY_PATH_64
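The mapping in the table above can be expressed as a small lookup. The following Python sketch is illustrative only: the helper name driver_manager_env and the example directory are hypothetical, and it assumes that platform.system() returns "AIX", "HP-UX", "Linux", or "SunOS" (the name that Solaris reports) on the respective systems.

```python
import os
import platform

# Mapping taken from the table above. platform.system() reports Solaris as
# "SunOS", so that key is used here (an assumption of this sketch).
LIBRARY_PATH_VARS = {
    "AIX": "LIBPATH",
    "HP-UX": "SHLIB_PATH",
    "Linux": "LD_LIBRARY_PATH",
    "SunOS": "LD_LIBRARY_PATH_64",
}


def driver_manager_env(odbc_dir, system=None, current=None):
    """Return {variable: value} with odbc_dir placed first on the library path."""
    system = system or platform.system()
    if system not in LIBRARY_PATH_VARS:
        raise ValueError(f"no library path variable known for {system!r}")
    variable = LIBRARY_PATH_VARS[system]
    if current is None:
        current = os.environ.get(variable, "")
    # All four systems are UNIX-like, so ":" separates path entries.
    parts = [odbc_dir] + ([current] if current else [])
    return {variable: ":".join(parts)}


if __name__ == "__main__":
    # Hypothetical installation path, shown only as an example.
    print(driver_manager_env("/opt/DataReplication/odbc_driver", system="Linux"))
```

The function only computes the variable and value. Exporting the result in the shell profile of the user that runs Data Replication is left to the administrator.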

Also, set the ODBCINST environment variable to point to the odbcinst.ini file.
Note: Instead of the supplied unixODBC driver manager, you can use another ODBC driver manager.

How can I get the commit timestamp of the last replicated record?
Commit timestamps are stored in the Data Replication audit tables.

What is the DBSYNC_SYNC_INFO table and why do we need it?
Data Replication creates the DBSYNC_SYNC_INFO table when the InitialSync component performs an initial synchronization of an Oracle source and Oracle destination using a database link (dblink). The DBSYNC_SYNC_INFO table contains information about the InitialSync operation. The table has the following structure:
CREATE TABLE %s.dbsync_sync_info
  (id VARCHAR2(20),
   sync_date DATE,
   source_db VARCHAR2(512),
   source_schema VARCHAR2(512),
   source_table VARCHAR2(512),
   dest_db VARCHAR2(512),
   dest_schema VARCHAR2(512),
   dest_table VARCHAR2(512),
   scn_wrap NUMBER(12),
   scn_base NUMBER(12),
   rows_transfered NUMBER(18),
   duration NUMBER(18),
   PRIMARY KEY (id) ) INITRANS 10

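The scn_wrap and scn_base columns of DBSYNC_SYNC_INFO appear to hold the two parts of an Oracle system change number (SCN). This article does not say how they are encoded. The Python sketch below assumes the conventional Oracle split, in which the wrap is the high-order part and the base is the low-order 32 bits; the function name combine_scn is hypothetical.

```python
SCN_BASE_BITS = 32  # assumption: scn_base holds the low-order 32 bits


def combine_scn(scn_wrap, scn_base):
    """Combine wrap and base values into a single SCN integer."""
    if scn_wrap < 0:
        raise ValueError("scn_wrap must not be negative")
    if not 0 <= scn_base < 2 ** SCN_BASE_BITS:
        raise ValueError("scn_base must fit in 32 bits")
    return scn_wrap * 2 ** SCN_BASE_BITS + scn_base


if __name__ == "__main__":
    print(combine_scn(0, 12345))
    print(combine_scn(1, 0))
```

If the assumption holds, the combined value identifies the source SCN at which the initial synchronization was taken, which can help when deciding where change capture should resume.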
How do I start the Informatica Data Replication Console?
The executable for the Informatica Data Replication Console is in the top-level Data Replication installation directory. To start the console:
- On Windows, run gui.cmd.
- On Linux or UNIX, run gui.sh.
Before starting the console, verify that the following prerequisites are met:
- The Data Replication Console requires the Java Runtime Environment (JRE) 1.5 or later. If the JRE is not installed, you can download it for free from http://www.oracle.com/technetwork/java/javase/downloads/index.html. You must define a JAVA_HOME environment variable that points to the JRE base directory. Also, include the jre_home/bin directory in the PATH environment variable.
- On Linux and UNIX, the Data Replication Console requires an X Window environment. Configure an X Window environment if one does not exist.
- For Microsoft SQL Server sources, enable the TCP/IP protocol under Network Configuration in SQL Server Configuration Manager.

How do I start the Data Replication Extractor and Applier? Do I need to start the Extractor first?
You can start the Extractor and Applier from the command line or the Informatica Data Replication Console. Start the Extractor before you start the Applier. Otherwise, the Applier cannot find the files that it needs to process.
To start the Extractor:
- From a Windows command prompt, run extract.cmd.
- From a Linux or UNIX command line, run extract.sh.
- In the Data Replication Console, click the Runtime Settings tab > File Locations view and enter the Extractor script name. The default name is dbsync_extract.exe. Then either click the Capture extract changes transactions from source icon on the toolbar or click Data > Capture extract changes transactions from source on the menu bar.
To start the Applier:
- From a Windows command prompt, run apply.cmd.
- From a Linux or UNIX command line, run apply.sh.
- From the Data Replication Console, enter the Applier script name on the Runtime Settings tab > File Locations view. The default name is dbsync_apply_odbc.exe. Then either click the Start Applier module as defined in Runtime settings icon on the toolbar or click Data > Apply changes transactions to the destination on the menu bar.

How do I start the InitialSync component to perform an initial synchronization of the source and destination?
You can start the InitialSync component from the command line or the Informatica Data Replication Console:
- From a Windows command prompt, run initialsync.cmd.
- From a Linux or UNIX command line, run initialsync.sh.
- From the Data Replication Console, enter the InitialSync script name on the Runtime Settings tab > File Locations view. The default name is dbsync_initial.exe. Then either click the Start Initial Sync module as defined in Runtime settings icon on the toolbar or click Data > Synchronize databases on the menu bar.
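For command-line starts, the rule that the Extractor starts before the Applier can be sketched as a small helper that returns the scripts in start order. The script names come from this article. The helper name startup_scripts and the example directories are hypothetical, and the sketch assumes that the scripts reside in the installation directory, which this article does not state.

```python
import sys
from pathlib import Path


def startup_scripts(install_dir, platform_name=sys.platform):
    """Return the Extractor and Applier scripts in the order to start them."""
    # Windows uses the .cmd wrappers; Linux and UNIX use the .sh wrappers.
    suffix = ".cmd" if platform_name.startswith("win") else ".sh"
    # Extractor first, so that the Applier finds the files it needs.
    return [Path(install_dir) / f"{name}{suffix}" for name in ("extract", "apply")]


if __name__ == "__main__":
    # Hypothetical installation directory, shown only as an example.
    for script in startup_scripts("/opt/DataReplication"):
        print(script)
```

A wrapper could pass each returned path to subprocess.Popen in turn. Whether the scripts take arguments is not covered here, so the sketch stops at producing the ordered list.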

Microsoft SQL Server Sources


Does the Data Replication Extractor for SQL Server sources use a Microsoft API to read data?
No. The Extractor reads the SQL Server transaction logs directly.

Can Data Replication extract change data from a SQL Server database while SQL Server native transactional replication is active?
No. Data Replication cannot extract change data while SQL Server native replication is running. Both products need to manage the secondary truncation point in the SQL Server log, but only one can do so at a time.

Can a Data Replication Extractor process multiple SQL Server databases?
No. The Extractor reads data from the SQL Server transaction log. Because each SQL Server database has a separate transaction log, you must run a separate Extractor process for each database.

Can Data Replication capture data from SQL Server online logs and backup logs?
To capture data from online logs, Data Replication must run on the same system as the SQL Server database. To capture data from backup logs, Data Replication requires the backup logs to be in SQL Server native format. Data Replication does not capture data from backup logs that were created by a third-party tool.
You can run Data Replication on the SQL Server source system or on a separate system that has the same or a different operating system. To run Data Replication on a separate system, configure the Data Replication Management Server on the SQL Server system in one of the following ways:
- Configure the Data Replication Management Server to create backup logs periodically and send the logs to another Management Server. Install the target Management Server on the system where the Extractor runs.
- Configure the Data Replication Management Server to create backup logs periodically in a shared directory that can be accessed over the network. The Extractor runs on another system and accesses the shared directory to read the backup logs.
Note: If you capture data from backup logs, you cannot also capture data from online logs.

Does Data Replication support capture of change data from Microsoft SQL Server from a Linux or UNIX system?
Yes. On Linux and UNIX systems, Data Replication supports Microsoft SQL Server as a source database provided that you use Data Replication Management Server, a mounted file system, or a copy of the SQL Server archive logs.

Does Data Replication support Microsoft SQL Server tables that do not have primary keys?
Yes. Data Replication supports Microsoft SQL Server tables that do not have primary keys.

Oracle Sources
Does Data Replication extract change data from Oracle online redo logs and archived logs?
In a non-RAC environment, Data Replication can extract change data from both Oracle online redo logs and archived logs. In a RAC environment, Data Replication extracts change data from archived logs only.

Can Data Replication extract change data from an Oracle source in a Real Application Cluster (RAC) environment that uses Automatic Storage Management (ASM)?
Yes. In a RAC, Data Replication can extract change data from Oracle archived logs that are managed by ASM. However, Data Replication cannot extract change data from Oracle online redo logs in a RAC.

How do I configure Data Replication to extract change data from an Oracle source in a RAC environment that uses ASM?
On the Source Database tab of the Informatica Data Replication Console, enter the following information:
- In the Database Connection view, select the RAC support box.
- In the ASM Settings view, enter the following information that Data Replication uses to connect to the ASM instance:
  - ASM instance hostname. The host name or IP address of the system with the ASM instance.
  - ASM instance port. The port number that is used to connect to the ASM instance. The default value is 1521.
  - ASM sys username. The user name of the default Oracle user that was created at Oracle installation, or another user that you define. This user must have SYSDBA privileges.
  - ASM sys password. A clear text password for the ASM sys user.
  - ASM instance. A service name for the ASM instance. The default value is +ASM. If you want to use the Oracle instance name instead, select Use SID instead of SERVICE_NAME and then enter an Oracle instance name that is defined in an ORACLE_SID environment variable.
- If Data Replication extracts data from a remote ASM instance, also enter the following general setting on the Runtime Settings tab:
  extract.read_asm_direct=true

Does Data Replication support Oracle compressed tables?
In Oracle 9i to Oracle 11g Release 1, Data Replication can extract only SQL inserts, with or without the append hint, from tables that have table compression or OLTP compression. In Oracle 11g Release 2 and later, Data Replication can extract all SQL change operations, including inserts, updates, and deletes. Data Replication does not support Oracle Exadata Hybrid Columnar Compression.

Does Data Replication extract changes from Oracle index-organized tables (IOTs)?
Yes. Data Replication can extract changes from IOTs.

Does Data Replication extract changes from Oracle materialized views?
Yes. Data Replication can extract changes from Oracle materialized views.

Does Data Replication extract changes from an Oracle logical standby database?
Yes. Data Replication can extract changes from Oracle logical standby databases. However, if you run the InitialSync component, you must use the original database as the source for the initial synchronization.
Note: This limitation does not apply to Oracle physical standby databases. You can run the InitialSync component with an Oracle physical standby database as the source.

Does Data Replication support Oracle Transparent Data Encryption (TDE)?
No. Data Replication does not extract change data from tablespaces or columns that are encrypted with Oracle TDE/TSE.

Does Data Replication extract change data from Oracle redo logs on raw devices on Linux?
Yes. Data Replication can extract change data from both Oracle online redo logs and archived logs on a raw device on Linux.

Can Data Replication extract Oracle direct-path inserts and apply them to an Oracle destination as direct-path inserts?
Yes. Data Replication replicates direct-path inserts.

Does the Oracle source database need to have supplemental logging enabled?
Yes. Data Replication requires minimal global supplemental logging at the database level. To enable supplemental logging, use the following SQL statement:
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
COMMIT;
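After you enable supplemental logging, you might want to confirm the setting. The Python sketch below is one way to do so and is not part of Data Replication. It queries the SUPPLEMENTAL_LOG_DATA_MIN column of the Oracle V$DATABASE view, which reports YES or IMPLICIT when minimal supplemental logging is active. The function name is hypothetical, and the cursor is assumed to be any Python DB-API cursor that is already connected to the Oracle source with sufficient privileges.

```python
def minimal_supplemental_logging_enabled(cursor):
    """Return True if V$DATABASE reports minimal supplemental logging."""
    # 'cursor' is an assumption: a DB-API 2.0 cursor on the Oracle source.
    cursor.execute("SELECT supplemental_log_data_min FROM v$database")
    row = cursor.fetchone()
    # Oracle reports NO, YES, or IMPLICIT in this column.
    return row is not None and row[0] in ("YES", "IMPLICIT")
```

A result of False suggests that the ALTER DATABASE statement has not taken effect on the database that the cursor is connected to.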

Destinations
What driver is required for connection to Greenplum destinations?
For Greenplum destinations, you must install the Greenplum or PostgreSQL ODBC driver on the system where the Data Replication Applier or InitialSync components run. Use the driver that has the same bit-level (32- or 64-bit) as Data Replication. On Linux and UNIX, add an entry for the driver in the odbcinst.ini file. For example:
[PostgreSQL]
Description=ODBC for PostgreSQL
Driver=/usr/local/greenplum-connectivity-4.0.3.0-build-5/drivers/odbc/psqlodbc-08.02.0500/unixodbc-2.2.12/psqlodbcw.so
FileUsage=1
Threading=0

Note: Specify Threading=0 to allow multithreaded ODBC calls. Also, if you store the odbcinst.ini file in a location other than the default location, define the following environment variables:
- ODBCINST. Provides the full path to the odbcinst.ini file.
- ODBCSYSINI. Provides the path to your ODBC home directory that contains the odbcinst.ini file.

What driver is required for connection to Vertica destinations?
For Vertica destinations, you must install the Vertica ODBC driver on the system where the Data Replication Applier and InitialSync components run. Use the driver that has the same bit-level (32- or 64-bit) as Data Replication. On Linux and UNIX, add an entry for the driver in the odbcinst.ini file. For example:
[Vertica]
Description=ODBC for Vertica
Driver=/usr/lib64/vertica_4.0.12_odbc_3.5_unixodbc_x86_64_linux.so
FileUsage=1
Threading=0

Note: Specify Threading=0 to allow multithreaded ODBC calls. Also, if you store the odbcinst.ini file in a location other than the default location, define the following environment variables:
- ODBCINST. Provides the full path to the odbcinst.ini file.
- ODBCSYSINI. Provides the path to your ODBC home directory that contains the odbcinst.ini file.

Which driver is required for connection to Netezza destinations?
For Netezza destinations, you must install the Netezza ODBC driver on the system where the Data Replication Applier and InitialSync components run. Use the driver that has the same bit-level (32- or 64-bit) as Data Replication. On Linux and UNIX, add an entry for the driver in the odbcinst.ini file. For example:
[NetezzaSQL]
Driver=/usr/local/nz/lib/libnzsqlodbc3_64bit.so
Setup=/usr/local/nz/lib/libnzsqlodbc3_64bit.so
APILevel=1
ConnectFunctions=YYN
Description=Netezza ODBC driver
DriverODBCVer=03.51
DebugLogging=false
LogPath=/tmp
UnicodeTranslationOption=utf8
CharacterTranslationOption=all
PreFetch=256
Socket=16384

Note: If you store the odbcinst.ini file in a location other than the default location, define the following environment variables:
- ODBCINST. Provides the full path to the odbcinst.ini file.
- ODBCSYSINI. Provides the path to your ODBC home directory that contains the odbcinst.ini file.
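An odbcinst.ini entry like the Greenplum, Vertica, and Netezza examples can be checked before you start a job. The following Python sketch uses the standard configparser module to confirm that a section exists, that it names a driver, and, optionally, that it sets Threading to an expected value. The function name check_odbcinst and the message wording are hypothetical; the checks are illustrative and are not requirements published by Informatica.

```python
import configparser


def check_odbcinst(text, section, expect_threading=None):
    """Return a list of problems found in one odbcinst.ini driver entry."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names such as Driver case-sensitive
    parser.read_string(text)
    if not parser.has_section(section):
        return [f"missing [{section}] section"]
    entry = parser[section]
    problems = []
    if not entry.get("Driver"):
        problems.append("no Driver key")
    if expect_threading is not None and entry.get("Threading") != expect_threading:
        problems.append(f"Threading is not {expect_threading}")
    return problems


if __name__ == "__main__":
    sample = (
        "[Vertica]\n"
        "Description=ODBC for Vertica\n"
        "Driver=/usr/lib64/vertica_4.0.12_odbc_3.5_unixodbc_x86_64_linux.so\n"
        "FileUsage=1\n"
        "Threading=0\n"
    )
    print(check_odbcinst(sample, "Vertica", expect_threading="0"))
```

In practice you would read the text from the file that the ODBCINST environment variable points to. The sketch does not verify that the driver library exists on disk or that its bit-level matches Data Replication.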

Which destinations are supported for Netezza sources?
Data Replication applies data captured from Netezza sources only to Netezza destinations.

Is Data Replication Merge Apply mode supported for Microsoft SQL Server destinations?
No. For SQL Server destinations, the Merge Apply mode is not supported because it would not significantly improve performance.

Does Data Replication support SQL Server Express Edition as a destination?
Yes. Data Replication supports the Express Edition of SQL Server 2005, 2008, and 2008 R2 as a destination. Data Replication also supports these Express Edition versions as a source.

Which load utility does the Data Replication InitialSync component use to load data to a SQL Server destination?
InitialSync uses the SQL Server bcp load utility.

Author
Anna Turukina
Technical Writer

The author would like to acknowledge Virginia Pfeifle for her assistance with this article.