Pushdown optimization is a way of balancing the processing load among servers in order to achieve optimal
performance. ETL developers often have to decide where a given piece of ETL logic should run. Suppose the
logic needs to filter data based on some condition. One can do this either in the database, by using a
WHERE clause in the SQL query, or inside Informatica, by using a Filter transformation. Sometimes we can
even "push" some transformation logic to the target database instead of performing it on the source side
(especially in the case of ELT rather than ETL). Choosing the right place is crucial for overall ETL performance.




One can push transformation logic to the source or target database using pushdown optimization. The Integration
Service translates the transformation logic into SQL queries and sends the SQL queries to the source or the target
database which executes the SQL queries to process the transformations. The amount of transformation logic one
can push to the database depends on the database, transformation logic, and mapping and session configuration.
The Integration Service analyzes the transformation logic it can push to the database and executes the SQL
statement generated against the source or target tables, and it processes any transformation logic that it cannot push
to the database.

Use the Pushdown Optimization Viewer to preview the SQL statements and mapping logic that the Integration
Service can push to the source or target database. You can also use the Pushdown Optimization Viewer to view the
messages related to pushdown optimization.

Let us take an example: Image: Pushdown Optimization Example 1


Filter Condition used in this mapping is: DEPTNO>40

Suppose a mapping contains a Filter transformation that keeps only those employees with a DEPTNO greater
than 40. The Integration Service can push this transformation logic to the database. To do so, it generates
a single INSERT SELECT statement that reads from the source table, filters the data using a WHERE clause
built from the filter condition, and writes to the target table. Because the whole statement runs inside
the database, the Integration Service does not extract data from the database at this time.
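As a rough illustration of the shape of such a statement, the sketch below runs an INSERT SELECT with the DEPTNO > 40 filter against an in-memory SQLite database. The table names EMP_SRC and EMP_TGT, the columns, and the sample rows are assumptions made up for this example; the SQL that the Integration Service actually generates depends on the mapping and the database.

```python
import sqlite3

# Hypothetical schema and data: EMP_SRC / EMP_TGT are illustrative names only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP_SRC (EMPNO INTEGER, ENAME TEXT, DEPTNO INTEGER);
    CREATE TABLE EMP_TGT (EMPNO INTEGER, ENAME TEXT, DEPTNO INTEGER);
    INSERT INTO EMP_SRC VALUES (1, 'SMITH', 20), (2, 'ALLEN', 50), (3, 'WARD', 60);
""")

# The filter condition becomes a WHERE clause, and the source read and
# target write collapse into one INSERT ... SELECT run by the database.
pushdown_sql = """
    INSERT INTO EMP_TGT (EMPNO, ENAME, DEPTNO)
    SELECT EMP_SRC.EMPNO, EMP_SRC.ENAME, EMP_SRC.DEPTNO
    FROM EMP_SRC
    WHERE EMP_SRC.DEPTNO > 40
"""
conn.execute(pushdown_sql)
conn.commit()

# No row ever travelled to the client; we only read back to inspect the result.
loaded = conn.execute("SELECT EMPNO, DEPTNO FROM EMP_TGT ORDER BY EMPNO").fetchall()
print(loaded)
```

Only the two rows with DEPTNO above 40 reach the target, and the client never sees the source rows while loading.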

We can configure pushdown optimization in the following ways:

Using source-side pushdown optimization:

The Integration Service pushes as much transformation logic as possible to the source database. It analyzes
the mapping from the source toward the target until it reaches a downstream transformation that it cannot
push to the source database, and then executes the corresponding SELECT statement.

Using target-side pushdown optimization:

The Integration Service pushes as much transformation logic as possible to the target database. It analyzes
the mapping from the target toward the source until it reaches an upstream transformation that it cannot
push to the target database. It generates an INSERT, DELETE, or UPDATE statement based on the transformation
logic for each transformation it can push to the database and executes the DML.

Using full pushdown optimization:

The Integration Service attempts to push all transformation logic to the database. For this, the source and
target must be on the same database. The Integration Service analyzes the mapping starting with the source and
analyzes each transformation in the pipeline until it reaches the target. When it can push all transformation
logic to the database, it generates an INSERT SELECT statement to run on the database. The statement incorporates
transformation logic from all the transformations in the mapping. If the Integration Service can push only part of
the transformation logic to the database, it does not fail the session. Instead, it performs source-side and
target-side pushdown optimization: it pushes as much transformation logic as possible to the source and target
databases and then processes the remaining transformation logic itself.

For example, a mapping contains the following transformations:


SourceDefn -> SourceQualifier -> Aggregator -> Rank -> Expression -> TargetDefn

    


 
Image: Pushdown Optimization Example 2

The Rank transformation cannot be pushed to the database. If the session is configured for full pushdown
optimization, the Integration Service pushes the Source Qualifier transformation and the Aggregator transformation to
the source, processes the Rank transformation, and pushes the Expression transformation and target to the target
database.
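The split described above can be sketched in a few lines. This is only a simplified model for a linear pipeline, not how the Integration Service is implemented; the function name `plan_full_pushdown` and the idea of passing in a set of non-pushable transformations are assumptions for illustration.

```python
def plan_full_pushdown(pipeline, not_pushable):
    """Split a linear pipeline into (source_side, integration_service, target_side).

    Simplified model: everything upstream of the first non-pushable
    transformation goes to the source, everything downstream of the last
    one goes to the target, and the rest stays in the Integration Service.
    """
    blocked = [i for i, name in enumerate(pipeline) if name in not_pushable]
    if not blocked:
        # Everything can be pushed: one INSERT ... SELECT, nothing left over.
        return list(pipeline), [], []
    first, last = blocked[0], blocked[-1]
    return pipeline[:first], pipeline[first:last + 1], pipeline[last + 1:]


pipeline = ["Source Qualifier", "Aggregator", "Rank", "Expression", "Target"]
source_side, service_side, target_side = plan_full_pushdown(pipeline, {"Rank"})
print(source_side)   # pushed to the source database
print(service_side)  # processed by the Integration Service
print(target_side)   # pushed to the target database
```

For the example mapping this reproduces the split in the text: Source Qualifier and Aggregator on the source side, Rank in the Integration Service, and Expression plus the target on the target side.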

When we use pushdown optimization, the Integration Service converts the expression in the transformation or in the
workflow link by determining equivalent operators, variables, and functions in the database. If there is no equivalent
operator, variable, or function, the Integration Service itself processes the transformation logic. The Integration
Service logs a message in the workflow log and the Pushdown Optimization Viewer when it cannot push an
expression to the database. Use the message to determine the reason why it could not push the expression to the
database.


To push transformation logic to a database, the Integration Service might create temporary objects in the database.
It creates a temporary sequence object in the database to push Sequence Generator transformation logic to the
database. It creates temporary views in the database when it pushes a Source Qualifier transformation or a Lookup
transformation with a SQL override, an unconnected relational lookup, or a filtered lookup.

1. To push Sequence Generator transformation logic to a database, we must configure the session for pushdown
optimization with Sequence.
2. To enable the Integration Service to create the view objects in the database, we must configure the session for
pushdown optimization with View.
3. After the database transaction completes, the Integration Service drops the sequence and view objects created for
pushdown optimization.
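The create, use, and drop lifecycle of a temporary view can be sketched as follows. SQLite stands in for the real database and has no sequence objects, so only the view case is shown. The table names and the view name PM_V_DEMO are hypothetical, and the view body plays the role of a SQL override.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORDERS_SRC (ORDER_ID INTEGER, STATUS TEXT);
    CREATE TABLE ORDERS_TGT (ORDER_ID INTEGER);
    INSERT INTO ORDERS_SRC VALUES (1, 'OPEN'), (2, 'CLOSED'), (3, 'OPEN');
""")

view_name = "PM_V_DEMO"  # hypothetical name for the temporary view
try:
    # 1. Wrap the SQL override in a temporary view.
    conn.execute(
        f"CREATE VIEW {view_name} AS "
        "SELECT ORDER_ID FROM ORDERS_SRC WHERE STATUS = 'OPEN'"
    )
    # 2. Reference the view from the pushed-down statement.
    conn.execute(f"INSERT INTO ORDERS_TGT (ORDER_ID) SELECT ORDER_ID FROM {view_name}")
    conn.commit()
finally:
    # 3. Drop the view once the work is done, even if the load failed.
    conn.execute(f"DROP VIEW IF EXISTS {view_name}")

remaining_views = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'view'"
).fetchall()
loaded_ids = [row[0] for row in conn.execute("SELECT ORDER_ID FROM ORDERS_TGT ORDER BY ORDER_ID")]
print(loaded_ids, remaining_views)
```

The connection user needs permission to create and drop the view, which is why the session must be explicitly configured to allow it.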



Depending on the database workload, we might want to use source-side, target-side, or full pushdown optimization at
different times. For that we can use the $$PushdownConfig mapping parameter. The settings in the
$$PushdownConfig parameter override the pushdown optimization settings in the session properties. Create the
$$PushdownConfig parameter in the Mapping Designer, select $$PushdownConfig for the Pushdown Optimization
attribute in the session properties, and define the parameter in the parameter file.

The possible values are:


1. None, i.e. the Integration Service itself processes all the transformations
2. Source [Seq View]
3. Target [Seq View]
4. Full [Seq View]
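A small helper can make the value list above concrete. The sketch below builds a $$PushdownConfig value from a mode plus the optional Seq and View options and formats a parameter file entry for it. The function names, the folder, workflow, and session names, and the exact section heading layout are assumptions for illustration; check the parameter file conventions of your PowerCenter version before relying on them.

```python
VALID_MODES = ("None", "Source", "Target", "Full")


def pushdown_config_value(mode, seq=False, view=False):
    """Build a $$PushdownConfig value such as 'Full Seq View' (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown pushdown mode: {mode!r}")
    if mode == "None":
        if seq or view:
            raise ValueError("Seq/View options make no sense without pushdown")
        return "None"
    # Append only the options that were requested, in a fixed order.
    options = [name for name, enabled in (("Seq", seq), ("View", view)) if enabled]
    return " ".join([mode] + options)


def parameter_file_entry(folder, workflow, session, value):
    """Format a parameter file section for one session (layout is an assumption)."""
    return f"[{folder}.WF:{workflow}.ST:{session}]\n$$PushdownConfig={value}\n"


# Example: full pushdown with sequences and views for a hypothetical session.
entry = parameter_file_entry(
    "DemoFolder", "wf_load_emp", "s_load_emp",
    pushdown_config_value("Full", seq=True, view=True),
)
print(entry)
```

Keeping the value in the parameter file means the pushdown type can be switched, for example to Source during peak target load, without editing the session.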
Use the Pushdown Optimization Viewer to examine the transformations that can be pushed to the database. Select a
pushdown option or pushdown group in the Pushdown Optimization Viewer to view the corresponding SQL statement
that is generated for the specified selections. When we select a pushdown option or pushdown group, we do not
change the pushdown configuration. To change the configuration, we must update the pushdown option in the
session properties.


We can configure sessions for pushdown optimization against databases such as Oracle, IBM DB2, Teradata,
Microsoft SQL Server, and Sybase ASE, or against databases that are accessed through ODBC drivers.

When we use native drivers, the Integration Service generates SQL statements using native database SQL. When
we use ODBC drivers, the Integration Service generates SQL statements using ANSI SQL. The Integration Service
can translate more functions when it generates SQL statements in the native database SQL instead of ANSI SQL.

When the Integration Service pushes transformation logic to the database, it cannot track errors that occur in the
database.

When the Integration Service runs a session configured for full pushdown optimization and an error occurs, the
database handles the errors. When the database handles errors, the Integration Service does not write reject rows to
the reject file.

If we configure a session for full pushdown optimization and the session fails, the Integration Service cannot perform
incremental recovery because the database processes the transformations. Instead, the database rolls back the
transactions. If the database server fails, it rolls back transactions when it restarts. If the Integration Service fails, the
database server rolls back the transaction.
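The all-or-nothing behaviour can be illustrated with a small experiment. In the sketch below, SQLite stands in for the target database and the table names and rows are made up. One bad row makes the pushed-down INSERT SELECT fail, the database rolls the statement back, and no rows (and no reject file) are produced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SALES_SRC (ID INTEGER, AMOUNT INTEGER);
    CREATE TABLE SALES_TGT (ID INTEGER, AMOUNT INTEGER NOT NULL);
    INSERT INTO SALES_SRC VALUES (1, 100), (2, NULL), (3, 300);
""")

error = None
try:
    # One set-based statement: the row with a NULL AMOUNT violates NOT NULL.
    conn.execute("INSERT INTO SALES_TGT (ID, AMOUNT) SELECT ID, AMOUNT FROM SALES_SRC")
    conn.commit()
except sqlite3.IntegrityError as exc:
    # The database reports the failure; the caller sees no per-row detail.
    conn.rollback()
    error = str(exc)

target_rows = conn.execute("SELECT COUNT(*) FROM SALES_TGT").fetchone()[0]
print(error, target_rows)
```

With row-by-row processing, the two valid rows could have been loaded and the bad one written to a reject file; with the statement pushed down, the good rows are lost along with the bad one.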

What is Pushdown Optimization and things to consider


The process of pushing transformation logic to the source or target database by Informatica
Integration service is known as Pushdown Optimization. When a session is configured to run for
Pushdown Optimization, the Integration Service translates the transformation logic into SQL
queries and sends the SQL queries to the database. The Source or Target Database executes the
SQL queries to process the transformations.
How does Pushdown Optimization (PO) work?
The Integration Service generates native database SQL statements when a native database driver is used. With
ODBC drivers, the Integration Service cannot detect the database type and generates ANSI
SQL. The Integration Service can usually push more transformation logic to a database if a
native driver is used instead of an ODBC driver.

For any SQL override, the Integration Service creates a view (PM_*) in the database while
executing the session task and drops the view after the task completes. Similarly, it also creates
sequences (PM_*) in the database.

The database schema used by the connections (Source Qualifier connection, Lookup connection) must have
the Create View / Create Sequence privilege, or else the session will fail.

A few benefits of using PO


• No memory or disk space is required to manage caches on the Informatica server
for the Aggregator, Lookup, Sorter, and Joiner transformations, because the transformation logic is
pushed to the database.
• The SQL generated by the Integration Service can be viewed before running the
session through the Pushdown Optimization Viewer, which makes it easier to debug.
• When inserting into targets without pushdown, the Integration Service processes data row by row using
bind variables (the statement is only soft parsed, so each execution costs processing time but no parsing
time). With pushdown optimization, a single statement is executed once for the whole data set.

Without pushdown optimization:

INSERT INTO EMPLOYEES(ID_EMPLOYEE, EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL,
    PHONE_NUMBER, HIRE_DATE, JOB_ID, SALARY, COMMISSION_PCT, MANAGER_ID, MANAGER_NAME,
    DEPARTMENT_ID)
VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13)


With pushdown optimization:

INSERT INTO EMPLOYEES(ID_EMPLOYEE, EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL,
    PHONE_NUMBER, HIRE_DATE, JOB_ID, SALARY, COMMISSION_PCT, MANAGER_ID, MANAGER_NAME,
    DEPARTMENT_ID)
SELECT CAST(PM_SJEAIJTJRNWT45X3OO5ZZLJYJRY.NEXTVAL AS NUMBER(15, 2)),
    EMPLOYEES_SRC.EMPLOYEE_ID,
    EMPLOYEES_SRC.FIRST_NAME,
    EMPLOYEES_SRC.LAST_NAME,
    CAST((EMPLOYEES_SRC.EMAIL || '@gmail.com') AS VARCHAR2(25)),
    EMPLOYEES_SRC.PHONE_NUMBER,
    CAST(EMPLOYEES_SRC.HIRE_DATE AS date),
    EMPLOYEES_SRC.JOB_ID,
    EMPLOYEES_SRC.SALARY,
    EMPLOYEES_SRC.COMMISSION_PCT,
    EMPLOYEES_SRC.MANAGER_ID,
    NULL,
    EMPLOYEES_SRC.DEPARTMENT_ID
FROM (EMPLOYEES_SRC LEFT OUTER JOIN EMPLOYEES PM_Alkp_emp_mgr_1
    ON (PM_Alkp_emp_mgr_1.EMPLOYEE_ID = EMPLOYEES_SRC.MANAGER_ID))
WHERE ((EMPLOYEES_SRC.MANAGER_ID = (SELECT PM_Alkp_emp_mgr_1.EMPLOYEE_ID
    FROM EMPLOYEES PM_Alkp_emp_mgr_1
    WHERE (PM_Alkp_emp_mgr_1.EMPLOYEE_ID = EMPLOYEES_SRC.MANAGER_ID))) OR (0=0))
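The difference between the two loading styles can be tried out on a small scale. The sketch below uses SQLite and made-up table names and rows. The first load fetches rows, transforms them on the client, and inserts them through bind variables; the second lets the database do the same work in one INSERT SELECT. It illustrates the pattern only and says nothing about actual Informatica timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEES_SRC (EMPLOYEE_ID INTEGER, EMAIL TEXT);
    CREATE TABLE TGT_ROW_BY_ROW (EMPLOYEE_ID INTEGER, EMAIL TEXT);
    CREATE TABLE TGT_PUSHDOWN (EMPLOYEE_ID INTEGER, EMAIL TEXT);
    INSERT INTO EMPLOYEES_SRC VALUES (100, 'sking'), (101, 'nkochhar');
""")

# Without pushdown: rows leave the database, are transformed on the client,
# and come back one at a time through a parameterized INSERT.
rows = conn.execute("SELECT EMPLOYEE_ID, EMAIL FROM EMPLOYEES_SRC").fetchall()
transformed = [(emp_id, email + "@gmail.com") for emp_id, email in rows]
conn.executemany("INSERT INTO TGT_ROW_BY_ROW VALUES (?, ?)", transformed)

# With pushdown: the same expression is evaluated inside the database,
# and a single statement is executed once for the whole set.
conn.execute("""
    INSERT INTO TGT_PUSHDOWN (EMPLOYEE_ID, EMAIL)
    SELECT EMPLOYEE_ID, EMAIL || '@gmail.com' FROM EMPLOYEES_SRC
""")
conn.commit()

row_by_row = conn.execute("SELECT * FROM TGT_ROW_BY_ROW ORDER BY EMPLOYEE_ID").fetchall()
pushed_down = conn.execute("SELECT * FROM TGT_PUSHDOWN ORDER BY EMPLOYEE_ID").fetchall()
print(row_by_row == pushed_down)
```

Both targets end up with identical rows; what changes is where the work happens and how many statements the database has to execute.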

Things to note when using PO


There are cases where the Integration Service and Pushdown Optimization can produce different
result sets for the same transformation logic. This can happen during data type conversion,
handling null values, case sensitivity, sequence generation, and sorting of data.

The database and Integration Service produce different output when the following settings and
conversions are different:

• Nulls treated as the highest or lowest value: While sorting the data, the Integration
Service can treat null values as the lowest value, but the database may treat null values as the
highest value in the sort order.
• SYSDATE built-in variable: The built-in variable SYSDATE in the Integration Service
returns the current date and time for the node running the service process. However, in
the database, SYSDATE returns the current date and time for the machine hosting the
database. If the time zone of the machine hosting the database is not the same as the time
zone of the machine running the Integration Service process, the results can vary.
• Date conversion: The Integration Service converts all dates before pushing
transformations to the database, and if the format is not supported by the database, the
session fails.
• Logging: When the Integration Service pushes transformation logic to the database, it
cannot trace all the events that occur inside the database server. The statistics the
Integration Service can trace depend on the type of pushdown optimization. When the
Integration Service runs a session configured for full pushdown optimization and an error
occurs, the database handles the errors. When the database handles errors, the Integration
Service does not write reject rows to the reject file.
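The null-ordering difference in the first point is easy to reproduce. The sketch below sorts the same values twice, once with nulls treated as the lowest value and once as the highest; the helper `sort_with_nulls` and the sample salaries are made up for illustration.

```python
def sort_with_nulls(values, nulls="lowest"):
    """Sort ascending, placing None first ('lowest') or last ('highest')."""
    if nulls not in ("lowest", "highest"):
        raise ValueError("nulls must be 'lowest' or 'highest'")
    present = sorted(v for v in values if v is not None)
    missing = [None] * (len(values) - len(present))
    return missing + present if nulls == "lowest" else present + missing


salaries = [3000, None, 1200, 4500]

service_order = sort_with_nulls(salaries, "lowest")    # nulls sorted first
database_order = sort_with_nulls(salaries, "highest")  # nulls sorted last

# Any logic that depends on position, such as "take the first row",
# now sees different data depending on where the sort ran.
print(service_order[0], database_order[0])
```

A mapping that picks the first or last row after a sort can therefore return different results once the sort is pushed to the database.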
