Agenda
Service-oriented architecture
Domains, nodes, and services
Administering the domain
Log Service and Log Viewer
Managing folders, users, and permissions
Service-Oriented Architecture
Product Overview
Informatica components
Repository Service
Integration Service
Informatica Client
Repository Manager
Designer
Workflow Manager
Workflow Monitor
SOA Cont.
PowerCenter Domain
Nodes
Gateway Node
Service Manager
Services
Core Services
Log Services
Administration Console
Domain Property
Node Property
Designer
To add source and target definitions to the repository
To create mappings that contain data transformation instructions
Overview: Sources
Relational - Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server, and Teradata
File - Fixed and delimited flat file, COBOL file, and XML
Extended - PowerConnect products for PeopleSoft, SAP R/3, Siebel, and IBM MQSeries
Mainframe - PowerConnect for IBM DB2 on MVS
Other - MS Excel and Access
Overview: Targets
Relational - Oracle, Sybase, Sybase IQ, Informix, IBM DB2, Microsoft SQL Server, and Teradata
File - Fixed and delimited flat files and XML
Extended - PowerConnect for SAP BW (uses the Integration Service to load data into SAP BW) and PowerConnect for IBM MQSeries (to load data into IBM MQSeries message queues)
Repository Manager
Repository
The Informatica repository tables have an open architecture. Metadata can include information such as:
Mappings describing how to transform source data
Sessions indicating when you want the Integration Service to perform the transformations
Connect strings for sources and targets
Repository
Can create and store the following types of metadata in the repository:
Database connections
Global objects
Mappings
Mapplets
Multi-dimensional metadata
Reusable transformations
Sessions and batches
Shortcuts
Source definitions
Target definitions
Transformations
Repository
Exchange of metadata with other BI tools - Metadata can be exported to and imported from BO, Cognos, etc.; the exported or imported objects can be compared in their XML formats themselves.
MX Views - This feature enables the user to view information on the server grids and the object history in the Repository Service.
Import and export repository connection information in the registry
Analyze source/target, mapping, and shortcut dependencies
Repository Security
Can plan and implement security using the following features:
User groups
Repository users
Repository privileges
Folder permissions
Locking
Can assign users to multiple groups
Privileges are assigned to groups
Can assign privileges to individual usernames; each user must be assigned to at least one user group
Types of Locks
There are five kinds of locks on repository objects:
Read lock - Created when you open a repository object in a folder for which you do not have write permission
Write lock - Created when you create or edit a repository object
Execute lock - Created when you start a session or batch
Fetch lock - Created when the repository reads information about repository objects from the database
Save lock - Created when the repository is being saved
Folders
Folders provide a way to organize and store all metadata in the repository, including mappings and sessions
They are used to store sources, transformations, cubes, dimensions, Mapplets, business components, targets, mappings, sessions, and batches
Can copy objects from one folder to another
Can copy objects across repositories
The Designer allows you to create multiple versions within a folder
When a new version is created, the Designer creates a copy of all existing mapping metadata in the folder and places it into the new version
Can copy a session within a folder, but you cannot copy an individual session to a different folder
Folders
To copy all sessions within a folder to a different location, you can copy the entire folder
Any mapping in a folder can use only those source and target definitions or reusable transformations that are stored:
in the same folder
in a shared folder and accessed through a shortcut
Folders
Folders have the following permission types:
Read permission
Write permission
Execute permission
Shared folders allow users to create shortcuts to objects in the folder
Shortcuts inherit changes to their shared object
Once you make a folder shared, you cannot reverse it
Copying Folders
Each time you copy a folder, the Repository Manager copies the following:
Sources, transformations, Mapplets, targets, mappings, and business components
Sessions and batches
Folder versions
Comparing Folders
The Compare Folders Wizard allows to perform the following comparisons:
Compare objects between two folders in the same repository
Compare objects between two folders in different repositories
Compare objects between two folder versions in the same folder
Whether the Repository Manager notes a similarity or difference between two folders depends on the direction of the comparison
One-way comparisons check the selected objects of Folder1 against the objects in Folder2
Two-way comparisons check objects in Folder1 against those in Folder2 and also check objects in Folder2 against those in Folder1
Comparing Folders
The comparison wizard displays the following user-customized information:
Similarities between objects
Differences between objects
Outdated objects
Can edit and save the result of the comparison
The Repository Manager does not compare the field attributes of the objects in the folders when performing the comparison
A two-way comparison can sometimes reveal information a one-way comparison cannot
A one-way comparison does not note a difference if an object is present in the target folder but not in the source folder
Folder Versions
Added Features in PowerCenter 8.1
Can run object queries that return shortcut objects
Can also run object queries based on the latest status of an object; the query can return local objects that are checked out, the latest version of checked-in objects, or a collection of all older versions of objects
Can share objects by exporting and importing objects between repositories with the same version
Designer
Designer Appendix
Informatica's Designer is the client application used to create and manage sources, targets, and the associated mappings between them. The Integration Service uses the instructions configured in the mapping and its associated session to move data from sources to targets. The Designer allows you to work with multiple tools, in multiple folders, and in multiple repositories at a time. The client application provides five tools with which to create mappings:
Source Analyzer. Use to import or create source definitions for flat file, ERP, and relational sources.
Target Designer. Use to import or create target definitions.
Transformation Developer. Use to create reusable objects that generate or modify data.
Mapplet Designer. Use to create a reusable object that represents a set of transformations.
Mapping Designer. Use to create mappings.
Designer Appendix
The Designer consists of the following windows:
Navigator. Use to connect to and work in multiple repositories and folders. You can also copy objects and create shortcuts using the Navigator.
Workspace. Use to view or edit sources, targets, Mapplets, transformations, and mappings. You can work with a single tool at a time in the workspace.
Output. Provides details when you perform certain tasks, such as saving your work or validating a mapping. Right-click the Output window to access window options, such as printing output text, saving text to a file, and changing the font size.
Overview. An optional window to simplify viewing workbooks containing large mappings or a large number of objects.
Source Analyzer
The following types of source definitions can be imported or created or modified in the Source Analyzer:
Relational sources - Tables, views, synonyms
Files - Fixed-width or delimited flat files, COBOL files
Microsoft Excel sources
XML sources - XML files, DTD files, XML schema files
Data models - Using MX Data Model PowerPlug
SAP R/3, SAP BW, Siebel, IBM MQSeries - By using PowerConnect
After importing a relational source definition, business names for the table and columns can be entered
The source definition appears in the Source Analyzer. In the Navigator, the new source definition appears in the Sources node of the active repository folder, under the source database name
Warehouse Designer
To create target definitions for file and relational sources
Import the definition for an existing target - Import the target definition from a relational target
Create a target definition based on a source definition - Relational source definition or flat file source definition
Manually create a target definition, or design several related targets at the same time
Target
Can edit Business Names, Constraints, Creation Options, Description, and Keywords on the Table tab of the target definition
Can edit Column Name, Datatype, Precision and Scale, Not Null, Key Type, and Business Name on the Columns tab of the target definition
Mapping
Mappings represent the data flow between sources and targets
When the Integration Service runs a session, it uses the instructions configured in the mapping to read, transform, and write data
Every mapping must contain the following components:
Source and target definitions
Transformation or transformations
Connectors or links
Sample Mapping
[Diagram: sources flow through Source Qualifiers and transformations to targets, connected by links (connectors)]
Mapping - Invalidation
On editing a mapping, the Designer invalidates sessions under the following circumstances:
Add or remove sources or targets
Remove Mapplets or transformations
Replace a source, target, Mapplet, or transformation while importing or copying objects
Add or remove Source Qualifiers or COBOL Normalizers, or change the list of associated sources for these transformations
Add or remove a Joiner or Update Strategy transformation
Add or remove transformations from a Mapplet in the mapping
Change the database type for a source
Mapping - Components
Every mapping requires at least one transformation object that determines how the Integration Service reads the source data:
Source Qualifier transformation
Normalizer transformation
ERP Source Qualifier transformation
XML Source Qualifier transformation
Transformations can be created for use in a single mapping, or as reusable transformations for use in multiple mappings
Mapping - Updates
By default, the Integration Service updates targets based on key values
The default UPDATE statement for each target in a mapping can be overridden
For a mapping without an Update Strategy transformation, configure the session to mark source records as update
Mapping - Validation
The Designer marks a mapping valid for the following reasons:
Connection validation - Required ports are connected and all connections are valid
Expression validation - All expressions are valid
Object validation - The independent object definition matches the instance in the mapping
The Designer performs connection validation each time you connect ports in a mapping and each time you validate or save a mapping
You can validate an expression in a transformation while you are developing a mapping
Transformations
A transformation is a repository object that generates, modifies, or passes data
The Designer provides a set of transformations that perform specific functions
Transformations in a mapping represent the operations the Integration Service performs on data
Data passes into and out of transformations through ports that you connect in a mapping or mapplet
Transformations can be active or passive
Transformation - Types
An active transformation can change the number of rows that pass through it
A passive transformation does not change the number of rows that pass through it
Transformations can be connected to the data flow, or they can be unconnected
An unconnected transformation is not connected to other transformations in the mapping; it is called within another transformation, and returns a value to that transformation
Transformations - Types
Source Qualifier Transformation Object
This object is used to define the reader process, or the selection, for relational sources. The Designer can generate the SQL automatically, and the user can override it and write their own SQL. In the Source Qualifier you can also specify filters on the source data or a distinct selection.
Transformations - Types
Expression Transformation Object
The Expression Transformation is used for data cleansing and scrubbing. There are over 80 functions within PowerCenter, such as CONCAT, INSTR, RPAD, and LTRIM, and many of them are used in the Expression Transformation. Derived columns and variables can also be created in the Expression Transformation. An example follows.
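As a minimal sketch of a derived column (the FIRST_NAME and LAST_NAME port names are hypothetical), the following expression trims and concatenates two input ports into one output port:
LTRIM( RTRIM( FIRST_NAME )) || ' ' || LTRIM( RTRIM( LAST_NAME ))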
Filter Transformation
The Filter Transformation is used to do just that: filter data. We can filter at the source using the Source Qualifier, but sometimes data must be filtered somewhere within the pipeline, perhaps on an aggregate that has been calculated, and the Filter Transformation is used for that. The Filter can also be used to branch and load a portion of the data into one target based on some condition and the rest of the data into another target.
NOTE: The Router transformation is also used for branching; it is much more efficient than the Filter transformation for this purpose. An example filter condition follows.
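As a minimal sketch, a filter condition is any expression that evaluates to TRUE or FALSE per row; the SALES and REGION port names here are assumptions for illustration:
SALES > 0 AND REGION = 'WEST'
Rows for which the condition returns FALSE (or NULL) are dropped by the Filter.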
Transformations - Types
Aggregator Transformation
The Aggregator Transformation is used for functions that require sorting or a group-by, such as SUM and AVG. Aggregates are computed in memory, and the transformation can accept pre-sorted input to reduce the amount of memory necessary.
Lookup Transformation
Use the Lookup Transformation to look up tables in the source database, the target database, or any other database, as long as there is connectivity. Lookups are cached in memory by default; caching can be turned off through the transformation option. The lookup condition can be any type of Boolean expression.
Like the Update Strategy, the Lookup Transformation is used frequently for handling slowly changing dimensions.
Transformations - Types
Sequence Generator Transformation
A Sequence Generator Transformation is provided for those target databases that do not have sequence capabilities (such as MS SQL Server 6.5), and it works just like Oracle's sequence generator. You can specify the starting value, the increment value, the upper limit, and whether or not you want to reinitialize.
(Extended discussion: the Sequence Generator can also be used to stamp a batch load id into every new row loaded in a given update/refresh load. This is very useful for backing out loads if that becomes necessary.)
Stored Procedure Transformation
One of the main benefits of using PowerCenter is that it eliminates the need for coding; however, if you have already written a stored procedure that you cannot duplicate within PowerCenter, you can call it from within a mapping using the Stored Procedure Transformation. (The stored procedure arguments are imported the same way the source catalog definition is imported.)
Transformations - Types
External Procedure Transformation
The same holds true for external procedures written in C, C++, or Visual Basic: they can be called from within the mapping process using the External Procedure transformation. External Procedure transformations are reusable transformations, defined in the Transformation Developer (the fourth component tool in the Designer).
Joiner Transformation
The Joiner Transformation is used for heterogeneous joins within a mapping; perhaps we need to join a flat file with an Oracle table, an Oracle and a Sybase table, or two flat files together. (Joins are done in memory, and the memory profile is configurable.)
Sorter Transformation
The Sorter is used to sort data in ascending or descending order, generally before aggregation of data.
Transformations - Types
Normalizer Transformation
The Normalizer Transformation is used when you have OCCURS clauses or arrays in your source data and you want to flatten the data into a normalized table structure; PowerCenter does this automatically for you. The Normalizer is also used to pivot data.
Ranking Transformation
This is used when you are building a very specific data mart, such as Top Performers, and you want to load the top 5 products, the top 100 customers, or the bottom 5 products.
Transformation
Added Features in PowerCenter 8.1 (features up to 8.6 are covered at the end of this presentation)
Flat file lookup: Can now perform lookups on flat files. To create a Lookup transformation using a flat file as a lookup source, the Designer invokes the Flat File Wizard. To change the name or location of a lookup between session runs, the lookup file parameter can be used.
Dynamic lookup cache enhancements: When you use a dynamic lookup cache, the PowerCenter Service can ignore some ports when it compares values in lookup and input ports before it updates a row in the cache. Also, you can choose whether the PowerCenter Service outputs old or new values from the lookup/output ports when it updates a row. You might want to output old values from lookup/output ports when you use the Lookup transformation in a mapping that updates slowly changing dimension tables.
Union transformation: Can use the Union transformation to merge multiple sources into a single pipeline. The Union transformation is similar to using the UNION ALL SQL statement to combine the results from two or more SQL statements.
Transformation
Custom transformation API enhancements: The Custom transformation API includes new array-based functions that allow you to create procedure code that receives and outputs a block of rows at a time. Use these functions to take advantage of the PowerCenter Server processing enhancements.
Midstream XML transformations: Can now create an XML Parser transformation or an XML Generator transformation to parse or generate XML inside a pipeline. The XML transformations enable you to extract XML data stored in relational tables, such as data stored in a CLOB column. You can also extract data from messaging systems, such as TIBCO or IBM MQSeries.
Transformations - Properties
Port Name
Copied ports inherit the name of the contributing port
Copied ports with the same name are appended with a number
Types of ports:
Input - Data input from the previous stage
Output - Data output to the next stage
Lookup - Port used to compare data
Return - The port (value) returned from a lookup
Variable - A port that stores a value temporarily
Data types
Transformations use internal data types
Data types of input ports must be compatible with the data types of the feeding output port
Default values - Can be set on ports to handle nulls and errors
Description - Can enter port comments
Aggregator Transformation
Performs aggregate calculations. Components of the Aggregator Transformation:
Aggregate expression
Group by port
Sorted Input option
Aggregate cache
Expression Transformation
Can use the Expression transformation to
Perform any non-aggregate calculations
Calculate values in a single row
Test conditional statements before you output the results to target tables or other transformations
Expression Transformation
Can enter multiple expressions in a single Expression transformation
Can enter only one expression for each output port
Can create any number of output ports in the transformation
Can create variable ports to store data temporarily
Filter Transformation
Note: A Filter should be used as close to the source as possible, so that the volume of data carried through the rest of the pipeline is reduced early.
It provides the means for filtering rows in a mapping
All ports in a Filter transformation are input/output
Only rows that meet the condition pass through it
Cannot concatenate ports from more than one transformation into the Filter transformation
To maximize session performance, include the Filter transformation as close to the sources in the mapping as possible
Does not allow setting output default values
Joiner Transformation
Joins two related heterogeneous sources residing in different locations or file systems
Can be used to join:
Two relational tables existing in separate databases
Two flat files in potentially different file systems
Two different ODBC sources
Two instances of the same XML source
A relational table and a flat file source
A relational table and an XML source
Joiner Transformation
Use the Joiner transformation to join two sources with at least one matching port
It uses a condition that matches one or more pairs of ports between the two sources
Requires two input transformations from two separate data flows
It supports the following join types:
Normal (default)
Master Outer
Detail Outer
Full Outer
Can be used to join two data streams that originate from the same source.
Joiner Transformation
Joiner on multiple sources
Joiner on the same source - Used when a calculation has to be done on one part of the data and the transformed data is then joined back to the original data set
Lookup Transformation
Used to look up data in a relational table, view, synonym, or flat file. The Integration Service queries the lookup table based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup table column values based on the lookup condition. Can use the Lookup transformation to perform many tasks, including:
Get a related value
Perform a calculation
Update slowly changing dimension tables
Types Of Lookups
Connected
Unconnected
Connected Lookup
Receives input values directly from another transformation in the pipeline
For each input row, the Integration Service queries the lookup table or cache based on the lookup ports and the condition in the transformation
Passes return values from the query to the next transformation
Unconnected Lookup
Unconnected Lookup Transformation
Receives input values from an expression that uses the :LKP reference qualifier (:LKP.lookup_transformation_name(argument, argument, ...)) to call the lookup, and returns one value. Common uses for unconnected lookups include:
Testing the results of a lookup in an expression
Filtering records based on the lookup results
Marking records for update based on the result of a lookup (for example, updating slowly changing dimension tables)
Calling the same lookup multiple times in one mapping
With unconnected lookups, you can pass multiple input values into the transformation, but only one column of data comes out of the transformation. Use the return port to specify the return value. An example follows.
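As a hedged sketch, an output port in an Expression transformation might call an unconnected lookup like this (the lookup name LKP_CUST_RATING and the CUST_ID port are hypothetical; the value comes back through the lookup's return port):
:LKP.LKP_CUST_RATING( CUST_ID )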
Lookup Caching
Session performance can be improved by caching the lookup table
Caching can be static or dynamic; by default, the lookup cache remains static and does not change during the session
Types of caching:
Persistent - Cache used across sessions
Recache from source - To synchronize a persistent cache
Static - Read-only cache for a single lookup
Dynamic - To reflect changed data directly; used when the target table is looked up
Shared - Cache shared among multiple transformations
Router Transformation
A Router transformation tests data for one or more conditions and gives the option to route rows of data that do not meet any of the conditions to a default output group
It has the following types of groups:
Input
Output
Create one user-defined group for each condition that you want to specify
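As an illustrative sketch (the group names and the REGION port are assumptions), a Router might define two user-defined groups with the following group filter conditions, with rows matching neither routed to the default group:
EAST_GROUP: REGION = 'EAST'
WEST_GROUP: REGION = 'WEST'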
Sequence Generator Transformation
It provides two output ports: NEXTVAL and CURRVAL
These ports cannot be edited or deleted, and ports cannot be added to the Sequence Generator transformation
When NEXTVAL is connected to the input port of another transformation, the Integration Service generates a sequence of numbers
Connect the NEXTVAL port to a downstream transformation to generate the sequence based on the Current Value and Increment By properties
The CURRVAL port is connected only when the NEXTVAL port is already connected to a downstream transformation
Source Qualifier Transformation
For relational sources, the Integration Service generates a query for each Source Qualifier (SQ) when it runs a session
The default query is a SELECT statement for each source column used in the mapping; the Integration Service reads only those columns in the SQ that are connected to another transformation
The target load order can be specified per SQ
The SQ override can be used to select only the required ports from the source, and it can contain parameter variables such as $$$SessStartTime or mapping parameters such as $$IPAddress
Clicking Generate SQL generates the default SELECT query, which can then be altered by the user
Note: for the Designer to generate a default query, the SQ must have a linked output port
The SQ override can be used to join two different sources based on a key (in the WHERE clause) and select only the required ports from either source
The SQ can define a source filter, which filters off unwanted records at the source itself, reducing cache time
The SQ can define a user-defined join condition; this join condition is added to the default query generated by Informatica as a WHERE clause
The SQ can sort ports if it is succeeded by an Aggregator; the Select Distinct check box, when checked, performs a distinct select on the input data
Pre-SQL can be used to run a command before caching, such as deleting target rows; similarly, a post-SQL command can also be used. An example override follows.
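As a hedged sketch of an SQ override that joins two sources and filters at the source (the CUSTOMERS and ORDERS tables and their columns are hypothetical):
SELECT CUSTOMERS.CUST_ID, CUSTOMERS.CUST_NAME, ORDERS.ORDER_ID, ORDERS.AMOUNT
FROM CUSTOMERS, ORDERS
WHERE CUSTOMERS.CUST_ID = ORDERS.CUST_ID
AND ORDERS.ORDER_DATE >= TO_DATE('2005-01-01', 'YYYY-MM-DD')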
Rank Transformation
Allows selecting only the top or bottom rank of data, not just one value
Can use it to return:
The largest or smallest numeric value in a port or group
The strings at the top or the bottom of a session sort order
During the session, the Integration Service caches input data until it can perform the rank calculations
Can select only one port to define a rank
Rank Transformation
When you create a Rank transformation, you can configure the following properties:
Enter a cache directory
Select the top or bottom rank
Select the input/output port that contains values used to determine the rank (you can select only one port to define a rank)
Select the number of rows falling within a rank
Define groups for ranks
Rank Transformation
Properties
Top/Bottom - Specifies whether the top or the bottom rank is output
Number of Ranks - Specifies the number of rows to be ranked
Transformation Scope - Transaction applies the transformation logic to all rows in a transaction
Stored Procedure Transformation
The stored procedure must exist in the database before creating a Stored Procedure transformation
One of the most useful features of stored procedures is the ability to send data to the stored procedure, and receive data from the stored procedure
There are three types of data that pass between the Integration Service and the stored procedure:
Input/output parameters - For many stored procedures, you provide a value and receive a value in return
Return values - Most databases provide a return value after running a stored procedure
Status codes - Status codes provide error handling for the Integration Service during a session
An unconnected Stored Procedure transformation can be called by passing the input parameters through an expression, in the following format: :SP.<Proc_Name>(Input, Output). An example follows.
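As a hedged sketch (the procedure name SP_GET_RATING and the CUST_ID port are hypothetical), an expression port can call an unconnected Stored Procedure transformation and capture its result through the PROC_RESULT system variable:
:SP.SP_GET_RATING( CUST_ID, PROC_RESULT )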
Custom Transformation
A Custom transformation operates in conjunction with procedures that are created outside the Designer. The PowerCenter Service uses generated functions to interface with the procedure, and the procedure code should be developed using API functions. The PowerCenter Service can pass a single row of data or an array, depending on the custom function.
The types of generated functions are:
Initialization functions - To initialize processes before data is passed to the custom function
Notification functions - To send notifications
Deinitialization functions - To deinitialize processes after data is passed to the custom function
The types of API functions are:
Set data access mode functions
Navigation functions
Property functions
Data handling functions
Most of these functions are associated with handles; handles are internal Informatica structures that guide the process flow.
Sorter Transformation
A Sorter transformation is used to sort data. Types of sorting:
Ascending/Descending
Case Sensitive - Performs a case-sensitive ascending or descending sort
Distinct - Gives distinctly sorted output rows
A Sorter can be used to sort data from a relational source or a flat file. The sort key, on which the sort is done, has to be specified.
Sorter Transformation
Fields can be sorted in either ascending or descending order by specifying the type of sort required in the Direction field. Multiple fields can be marked as keys, and different sorts can be done on them.
When the Case Sensitive check box is checked, the sorting is done on a case-sensitive basis. The Working Directory is the directory in which Informatica caches the files for the sort. The Distinct check box is used to get distinct sorted output.
Transformation Language
The Designer provides a transformation language to help you write expressions to transform source data
With the transformation language, you can create a transformation expression that takes the data from a port and changes it
Can write expressions in the following transformations:
Aggregator
Expression
Filter
Rank
Router
Update Strategy
Transformation Language
Expressions can consist of any combination of the following components:
Ports (input, input/output, variable)
String literals, numeric literals
Constants
Functions
Local and system variables
Mapping parameters and mapping variables
Operators
Return values
Transformation Language
The functions available in PowerCenter are
Aggregate functions, e.g. AVG, MIN, MAX
Character functions, e.g. CONCAT, LENGTH
Conversion functions, e.g. TO_CHAR, TO_DATE
Date functions, e.g. DATE_DIFF, LAST_DAY
Numeric functions, e.g. ABS, CEIL, LOG
Scientific functions, e.g. COS, SINH
Special functions, e.g. DECODE, IIF, ABORT
Test functions, e.g. ISNULL, IS_DATE
Variable functions, e.g. SETMAXVARIABLE
Transformation Expressions
The pre-compiled and tested transformation expressions help you create simple or complex transformation expressions:
Functions. Over 60 SQL-like functions allow you to change data in a mapping.
Aggregates.
Calculate a single value for all records in a group
Return a single value for each group in an Aggregator transformation
Apply filters to calculate values for specific records in the selected ports
Use operators to perform arithmetic within the function
Calculate two or more aggregate values derived from the same source columns in a single pass
A filter condition can be applied to all aggregate functions. The filter condition must evaluate to TRUE, FALSE, or NULL. If the filter condition evaluates to NULL or FALSE, the Integration Service does not select the record. For example, the following expression calculates the median salary for all employees who make more than $50,000:
MEDIAN( SALARY, SALARY > 50000 )
Transformation Expressions
Characters.
Character functions assist in the conversion, extraction, and identification of substrings. For example, the following expression searches a string from the end for the first space (that is, the last space) and removes the characters that follow it:
SUBSTR( CUST_NAME, 1, INSTR( CUST_NAME, ' ', -1, 1 ))
Conversions.
Conversion functions assist in the transformation of data from one type to another. For example, the following expression converts the dates in the DATE_PROMISED port to text in the format MON DD YYYY:
TO_CHAR( DATE_PROMISED, 'MON DD YYYY' )
DATE_PROMISED: Apr 1 1998 12:00:10AM -> RETURN VALUE: 'Apr 01 1998'
Transformation Expressions
Dates.
Date functions help you round, truncate, or compare dates; extract one part of a date; or perform arithmetic on a date. For example, the following expression returns the month portion of a date (for Apr 1 1997 00:00:00 it returns 4):
GET_DATE_PART( DATE_PROMISED, 'MM' )
Transformation Expressions
Miscellaneous.
Informatica also provides functions to assist in:
Aborting or erroring out records
Developing if-then-else structures
Looking up values from a specified external or static table
Testing values for validity (such as date or number format)
For example, the following expression might cause the Service to skip a value:
IIF( SALES < 0, ERROR( 'Negative value found' ), EMP_SALARY )
SALES      RETURN VALUE
100        100
-500       (record skipped)
400.55     400.55
800.10     800.10
Transformation Expressions
Operators. Use transformation operators to create transformation expressions to perform mathematical computations, combine data, or compare data.
Constants. Use built-in constants to reference values that remain constant, such as TRUE, FALSE, and NULL.
Variables. Use built-in variables to write expressions that reference values that vary, such as the system date. You can also create local variables within a transformation.
Return values. You can also write expressions that include the return values from Lookup, Stored Procedure, and External Procedure transformations.
Reusable Transformation
A transformation is reusable when multiple instances of the same transformation can be created. Reusable transformations can be used in multiple mappings. Ways to create reusable transformations:
Design one in the Transformation Developer
Promote a standard transformation from the Mapping Designer
Mapplet
A Mapplet is a reusable object that represents a set of transformations
It allows transformation logic to be reused and can contain as many transformations as needed
Mapplets can:
Include source definitions
Accept data from sources in a mapping
Include multiple transformations
Pass data to multiple pipelines
Contain unused ports
Expanded Mapplet
Mapplet - Components
Each Mapplet must include the following:
One Input transformation and/or Source Qualifier transformation
At least one Output transformation
Workflow Monitor
1. Gantt Chart
2. Task View
Workflow Manager
The Workflow Manager replaces the Server Manager from version 5.0. Instead of running sessions, you now create a process called a workflow in the Workflow Manager. A workflow is a set of instructions on how to execute tasks such as sessions, emails, and shell commands. A session is now one of the many tasks you can execute in the Workflow Manager. The Workflow Manager provides other tasks such as Assignment, Decision, and Event tasks. You can also create branches with conditional links. In addition, you can group tasks into reusable worklets in the Workflow Manager.
Task Developer
Use the Task Developer to create tasks you want to execute in the workflow.
Workflow Designer
Use the Workflow Designer to create a workflow by connecting tasks with links. You can also create tasks in the Workflow Designer as you develop the workflow.
Worklet Designer
Use the Worklet Designer to create a worklet.
Workflow Tasks
Command - Specifies a shell command run during the workflow
Control - Stops or aborts the workflow
Decision - Specifies a condition to evaluate
Email - Sends email during the workflow
Event-Raise - Notifies the Event-Wait task that an event has occurred
Event-Wait - Waits for an event to occur before executing the next task
Session - Runs a mapping you create in the Designer
Assignment - Assigns a value to a workflow variable
Timer - Waits for a timed event to trigger
Create Task
Workflow Monitor
PowerCenter 6.0 provides a new tool, the Workflow Monitor, to monitor workflows, worklets, and tasks. The Workflow Monitor displays information about workflows in two views:
1. Gantt Chart view
2. Task view
You can monitor workflows in online and offline mode.
Workflow Monitor
The Workflow Monitor includes the following performance and usability enhancements:
When you connect to the PowerCenter Service, you no longer distinguish between online and offline mode
You can open multiple instances of the Workflow Monitor on one machine
The Workflow Monitor includes improved options for filtering tasks by start and end time
The Workflow Monitor displays workflow runs in Task view chronologically, with the most recent run at the top; it displays folders alphabetically
You can remove the Navigator and Output windows
pmrepagent
Discontinued. Use replacement commands in pmrep.
pmcmd
Updated to support new Integration Service functionality.
pmrep
Includes former pmrepagent commands. Also includes new syntax to connect to a domain.
pmcmd Changes
pingservice instead of pingserver, getservicedetails instead of getserverdetails, etc.
Includes syntax to specify domain and Integration Service information instead of PowerCenter Server information
Commands include: aborttask, connect, gettaskdetails, startworkflow, and more. An example follows.
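As a hedged sketch of the new syntax (the service, domain, user, folder, and workflow names are hypothetical), starting a workflow against a named Integration Service in a domain might look like:
pmcmd startworkflow -sv IS_Dev -d Domain_Dev -u admin -p mypassword -f SalesFolder wf_load_sales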
pmrep Changes
infacmd EnableService instead of pmrep EnableRepository; infacmd DefineRepositoryService instead of pmrep AddRepository
Includes syntax to specify domain and Repository Service information instead of Repository Server information
Commands include: Connect, DeployFolder, Notify, Register, and more. An example follows.
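As a hedged sketch (the repository, domain, and user names are hypothetical), connecting to a Repository Service through a domain might look like:
pmrep connect -r Rep_Dev -d Domain_Dev -n admin -x mypassword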
Ported pmrepagent Commands
Ported commands use new syntax for domain information:
Backup
Create
Delete
Registerplugin
Restore
Unregisterplugin
Upgrade
Grid Option
Automatic failover, restart, and recovery
Dynamic distribution: Sessions on Grid (SonG)
Heterogeneous hardware grid
Pushdown Optimization
A session option that causes the Integration Service to push some transformation logic to the source and/or target database
You can choose source-side or target-side optimization, or both, via the $$PushdownConfig mapping parameter
Benefits:
Can increase session performance
Maintains metadata and lineage in the PowerCenter repository
Reduces movement of data (when source and target are on the same database)
Use Cases
Batch transformation and load - staging and target tables in the same target database
Transformation and load from a real-time status table to a data warehouse in the same database
A sketch of what pushdown can generate follows.
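As a hedged illustration of the idea (table and column names are hypothetical, and the exact SQL depends on the database and the mapping), pushdown can collapse a read-transform-load pipeline into a single statement executed inside the target database:
INSERT INTO WAREHOUSE_SALES (REGION, TOTAL_AMT)
SELECT REGION, SUM(AMOUNT)
FROM STAGING_SALES
WHERE AMOUNT > 0
GROUP BY REGION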
[Diagram: Step 1 - data from the data sources is joined, aggregated, and transformed into staging tables in the target database; Step 2 - staging data is transformed (Lookup, Expression, Aggregator) into the warehouse tables in the same database]
Databases Supported
Oracle 9.x and above
IBM DB2
Teradata
Microsoft SQL Server
Sybase ASE
Databases that use ODBC drivers
Supported Transformations
To source:
Aggregator
Expression
Filter
Joiner
Lookup
Sorter
Union
To target:
Expression
Lookup
Target definition
Deployment
Can assign owners and groups to folders and deployment groups
Can generate a deployment control file (XML) to deploy folders and deployment groups with pmrep
Partitioning Changes
Database partitioning
Works with Oracle in addition to DB2
Dynamic partitioning
The Integration Service determines the number of partitions to create at run time
It scales the number of session partitions based on factors such as source database partitions or the number of nodes in a grid
Useful if the volume of data increases over time, or if you add more CPUs
Performance Tuning
The first step in performance tuning is to identify the performance bottleneck, in the following order:
1. Target
2. Source
3. Mapping
4. Session
5. System
The most common performance bottleneck occurs when the Integration Service writes to a target database.
Target Bottlenecks
Identifying
A target bottleneck can be identified by configuring the session to write to a flat file target.
Optimizing
Drop indexes and key constraints before loading
Increase commit intervals
Use bulk loading / external loading
Source Bottlenecks
Identifying
Add a Filter transformation after the Source Qualifier with its condition set to false, so that no data is processed past the Filter transformation. If the time it takes to run the new session remains about the same, there is a source bottleneck.
Alternatively, in a test mapping, remove all the transformations; if the performance is similar, there is a source bottleneck.
Optimizing
Optimize the query by using hints.
Use Informatica conditional filters if the source system lacks indexes.
Mapping Bottlenecks
Identifying
If there is no source bottleneck, add a Filter transformation in the mapping before each target definition and set the filter condition to false, so that no data is loaded into the target tables. If the time it takes to run the new session is the same as the original session, there is a mapping bottleneck.
Optimizing
Configure for single-pass reading.
Avoid unnecessary data type conversions.
Avoid database reject errors.
Use a shared cache / persistent cache.
Session Bottlenecks
Identifying
If there is no source, target, or mapping bottleneck, there may be a session bottleneck. Use Collect Performance Details. Any value other than zero in the readfromdisk and writetodisk counters for Aggregator, Joiner, or Rank transformations indicates a session bottleneck. Low (0-20%) BufferInput_efficiency and BufferOutput_efficiency counter values also indicate a session bottleneck.
(continued.)
Session Bottlenecks
Optimizing
Increase the number of partitions.
Tune session parameters:
DTM buffer size (6 MB - 128 MB)
Buffer block size (4 KB - 128 KB)
Data cache size (2 MB - 24 MB) / index cache size (1 MB - 12 MB)
Use incremental aggregation if possible.
Incremental Aggregation
The first run creates the idx and dat cache files. The second run performs the following actions: for each input record, the Server checks the historical information in the index file for a corresponding group, then:
If it finds a corresponding group, it performs the aggregate operation incrementally, using the aggregate data for that group, and saves the incremental change
If it does not find a corresponding group, it creates a new group and saves the record data
(continued)
Incremental Aggregation
When writing to the target, the Integration Service:
Updates modified aggregate groups in the target
Inserts new aggregate data
Deletes removed aggregate data
Ignores unchanged aggregate data
Saves modified aggregate data in the index and data files
Incremental Aggregation
Options for incremental aggregation are on the Transformations tab in the session properties. The Server Manager displays a warning indicating that the Informatica Service overwrites the existing cache, and a reminder to clear this option after running the session.
System Bottlenecks
Identifying
If there is no source, target, mapping, or session bottleneck, there may be a system bottleneck. Use system tools to monitor CPU usage, memory usage, and paging:
On Windows - Task Manager
On UNIX systems - tools like sar and iostat; for example, sar -u reports user time, idle time, and I/O wait time
Optimizing
Improve network speed
Improve CPU performance
Check hard disks on related machines
Reduce paging
Commit Points
A commit interval is the interval at which the Integration Service commits data to relational targets during a session
The commit point can be a factor of the commit interval, the commit interval type, and the size of the buffer blocks
The commit interval is the number of rows you want to use as a basis for the commit point
The commit interval type is the type of rows that you want to use as a basis for the commit point
Can choose between the following types of commit interval:
Target-based commit
Source-based commit
During a source-based commit session, the Integration Service commits data to the target based on the number of rows from an active source in a single pipeline
Commit Points
During a target-based commit session, the Integration Service continues to fill the writer buffer after it reaches the commit interval
When the buffer block is filled, the Integration Service issues a commit command
As a result, the amount of data committed at the commit point generally exceeds the commit interval; a worked example follows
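As a worked example with assumed numbers: if the commit interval is 10,000 rows and each writer buffer block holds 7,500 rows, the first block fills at 7,500 rows (below the interval, so no commit is issued); the interval is crossed partway through the second block, so the commit is issued when that block fills, at 15,000 rows, which exceeds the 10,000-row interval.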
Commit Points
During a source-based commit session, the Integration Service commits data to the target based on the number of rows from an active source in a single pipeline; these rows are referred to as source rows
A pipeline consists of a source qualifier and all the transformations and targets that receive data from the source qualifier
An active source can be any of the following active transformations:
Advanced External Procedure
Source Qualifier
Normalizer
Aggregator
Joiner
Rank
Mapplet, if it contains one of the above transformations
Commit Points
When the Integration Service runs a source-based commit session, it identifies the active source for each pipeline in the mapping
The Integration Service generates a commit row from the active source at every commit interval
When each target in the pipeline receives the commit row, the Integration Service performs the commit
Commit Points
Questions???
Informatica Debugger
Debugger
Can debug a valid mapping to gain troubleshooting information about data and error conditions
To debug a mapping, you configure and run the Debugger from within the Mapping Designer
When you run the Debugger, it pauses at breakpoints and allows you to view and edit transformation output data
After you save a mapping, you can run some initial tests with a debug session before you configure and run a session in the Server Manager
Debugger
Debugger
Can use the following process to debug a mapping:
Create breakpoints
Configure the Debugger
Run the Debugger
Monitor the Debugger (debug log, session log, Target window, Instance window)
Modify data and breakpoints
A breakpoint can consist of an instance name, a breakpoint type, and a condition.
Debugger
After you set the instance name, breakpoint type, and optional data condition, you can view each parameter in the Breakpoints section of the Breakpoint Editor
Questions???
Reporting Services
Reporting Service
An application service that runs the Data Analyzer application in a PowerCenter domain. When you log in to Data Analyzer, you can create and run reports on data in a relational database or run the following PowerCenter reports: PowerCenter Repository Reports, Data Analyzer Data Profiling Reports, or Metadata Manager Reports.
The status indicator at the top of the right pane indicates when the service has started running. To disable a Reporting Service:
1. Select the Reporting Service in the Navigator.
2. Click Disable. The Disable Reporting Service dialog box appears.
3. Click OK to stop the Reporting Service.
Note: Before you disable a Reporting Service, ensure that all users are disconnected from Data Analyzer.
PowerExchange for Web Services
PowerExchange for Web Services uses the Simple Object Access Protocol (SOAP) to exchange information with the web services provider and request web services. SOAP is a protocol for exchanging information between computers. It specifies how to encode XML data so that programs on different operating systems can pass information to each other. Web services hosts contain WSDL files and web services.
Before you configure PowerExchange for Web Services, install or upgrade PowerCenter.
PowerCenter Transformations
HTTP transformation - You can let the Integration Service determine the authentication type of the HTTP server when the HTTP server does not return an authentication type to the Integration Service, or you can specify the authentication type for the Integration Service to use.
Unstructured Data transformation - You can define additional ports and pass output rows to relational targets from the Unstructured Data transformation. You can create ports by populating the transformation from a Data Transformation service. You can pass a dynamic service name to the Unstructured Data transformation with source data.
Metadata Manager
Thank You !!