
Where (and How) to Use the Token Parser Transformation in Informatica Data Quality


IDQ (Informatica Data Quality) is a superset of PowerCenter; to use IDQ you need a separate license. Alternatively, you can create a custom transformation in IDQ which in turn can be used in PowerCenter. The Informatica Data Quality tool is different from the profiling feature of Informatica. IDQ gives many built-in features to streamline data, and it gives the business tremendous capability to review data, which ensures greater business involvement and helps achieve a single version of the truth. So if your organization is considering addressing big data challenges, this tool can help get the job done easily.

Parsing refers to an activity where a system analyzes and breaks down information into meaningful chunks. That is exactly what we are going to achieve with this Data Quality transformation.
The IDQ Parser is meant to identify one or more data elements in an input field and to write each element to a different output field.
Parsing allows you to have greater control over the information in each column. For example:

Your source file name includes a generic file name, date, country and option type (say
coseorder_xxxxsecurity.20121016113056.EU.CNC.dat); you may have to preserve them as
independent columns in an audit table.

A data field that contains a person's full name, e.g. William Shakespeare. You can use the Parser
transformation to split the full name into separate data columns for the first name and last name.

Any data field that has a structure that can be described by a regular expression: VAT numbers,
SSN, PAN, credit card numbers etc. are all valid candidates.
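
To make the full-name example concrete, here is a minimal Python sketch of the same idea, entirely outside IDQ: one input field split into two output columns by a regex with capture groups, and anything that does not fit the expected shape flagged as a reject candidate. The field names are purely illustrative.

```python
import re

# Minimal sketch of what the Parser transformation does with a full name:
# one input field, two output fields, driven by regex capture groups.
full_name = "William Shakespeare"

match = re.fullmatch(r"([A-Za-z]+) ([A-Za-z]+)", full_name)
if match:
    first_name, last_name = match.groups()
    print(first_name, last_name)   # -> William Shakespeare, as two separate columns
else:
    print("Unexpected format - candidate for the reject file")
```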

After the data is parsed into new columns, one can create custom data quality operations for each column.
There are primarily two types of parser transformations:

TOKEN BASED: Parses the input based on matching token sets, expressions or reference
table entries in the Token Parser transformation.
This parser transformation can be configured in two ways:

Predefined token sets, which parse data columns into component strings. For example, ZIP
codes, phone numbers and Social Security numbers all follow particular formats,
and IDQ has them all in place, so just pick the token you need.

Custom token sets or expressions, which parse data that matches reference table entries or custom
regular expressions that you enter. Example 1 from above falls in this category.
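
To illustrate the difference, the sketch below mimics a handful of predefined token sets with simplified regular expressions (these patterns are stand-ins for illustration, not the definitions IDQ actually ships with) and checks which one a value conforms to; a custom token set or expression is simply a pattern you supply yourself in the same way.

```python
import re

# Simplified stand-ins for predefined token sets; IDQ's built-in definitions
# are richer, these patterns are for illustration only.
token_sets = {
    "zip_code": r"\d{5}(?:-\d{4})?",        # 10001 or 10001-1234
    "ssn":      r"\d{3}-\d{2}-\d{4}",       # 123-45-6789
    "us_phone": r"\(\d{3}\) \d{3}-\d{4}",   # (212) 555-0199
}

def matching_token(value):
    """Return the name of the first token set the value matches, else None."""
    for name, pattern in token_sets.items():
        if re.fullmatch(pattern, value):
            return name
    return None

print(matching_token("123-45-6789"))   # ssn
print(matching_token("10001-1234"))    # zip_code
print(matching_token("hello"))         # None -> would need a custom expression
```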

PATTERN BASED: Parses patterns made of multiple strings; you can define custom
patterns for this in the parser transformation.
This kind of parser has to be designed and configured along with a Labeler
transformation, which I will discuss later.
Let us discuss a case pertaining to this. I want to load data from a source file to a target, applying some
business rules (which are not the point of discussion now). Two things are to be achieved:

R1. The full name is to be parsed into first and last names. Don't process the record if the full name format is anything other
than First_Name Last_Name; such records need to be written to the reject file;

R2. The file name is to be parsed and the data preserved in an audit file

Token Parser Transformation Sample Mapping


Before we proceed with the solution: why regex? Because it allows you to output the desired number of fields.

10 STEPS to configure a simple parser transformation with a regular expression:

Create a parser transformation

Go to Properties -> Strategies -> Add New Parser

Enter parser name, description, choose input fields. Go to Next

Choose Parse using Token set; Go to Next

Select Regular Expression and click Choose (this opens the Regex editor window)

Name it, go to next

Enter the desired number of output fields; for this example:

Number of outputs: 5

Regular expression: ([A-Za-z._%-]+)\.([0-9]+)\.([A-Z]{2,3})\.([A-Z]{3})\.([A-Za-z]{3}) (validated in the sketch after these steps)

Enter sample data to validate expression

Enter output fields

Click Finish

Select the newly added regular expression and go to Next.

Done
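
Before wiring this into the mapping, it is worth sanity-checking the expression against the sample file name from earlier (this also covers R2). A quick Python sketch follows; the variable names below are labels I am assuming for the five output fields, not IDQ port names.

```python
import re

# Quick check that the expression from the steps above yields the five expected
# output fields for the sample file name; field labels are assumed, not IDQ ports.
FILENAME_REGEX = re.compile(
    r"([A-Za-z._%-]+)\.([0-9]+)\.([A-Z]{2,3})\.([A-Z]{3})\.([A-Za-z]{3})"
)

sample = "coseorder_xxxxsecurity.20121016113056.EU.CNC.dat"
match = FILENAME_REGEX.fullmatch(sample)

if match:
    generic_name, timestamp, country, option_type, extension = match.groups()
    print(generic_name)   # coseorder_xxxxsecurity
    print(timestamp)      # 20121016113056
    print(country)        # EU
    print(option_type)    # CNC
    print(extension)      # dat
else:
    print("Sample does not match - revisit the expression")
```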

Token Parser Transformation


This can either be exported to PowerCenter as a transformation, or you can create a simple mapping (as shown) and
export it as a mapplet; I chose the latter here.
In Data Quality, define the format and preserve it in the Model Repository so that the format is available across projects,
then deploy it to PowerCenter to use in your mapping. In addition, it can be profiled and shared across your systems.
