You are on page 1of 13

Version History:

Version Date By Changes

Contributors:
Name Role Location Remarks

Approval:
Name Role Location Remarks

Reference Documents:
Name Author Version Date

CONTENTS
1 DOCUMENT DESCRIPTION 5
2 MANAGEMENT SUMMARY 5
3 SCOPE 6
4 ASSUMPTIONS / DEPENDENCIES 6
5 OPEN ISSUES 6
6 TECHNICAL DESIGN 6
6.1 VALIDATIONS & DERIVATIONS 6
6.1.1 Standard Validations 7
6.1.2 Standard Derivations 7
6.2 REFERENCE DATA 8
6.3 ERROR HANDLING 8
6.3.1 Validation and Error handling10
7 OTHER ISSUES 10
7.1 AUTHORITY 10
8 USER ACCEPTANCE CRITERIA 10
9 RE-USABLE COMPONENTS 10
9.1 DBF_GFIS13_REULKP_FEED_INFO 11
9.2 DBF_GFIS13_REULKP_VALIDATION_VAL1 12
9.3 DBF_GFIS13_REULKP_VALIDATION_VAL2 13
9.4 DBF_GFIS13_REULKP_VALIDATION_ERROR 14
9.5 DBF_REUSEQ_DBFEED_ID 16
9.6 DBF_GFIS13_REUSEQ_ERROR 17
9.7 DBF_GFIS13_REUEXP_CON_REF_MANDATORY_CHECK 18
9.8 DBF_GFIS13_REUEXP_CPY_REF_MANDATORY_CHK 19
9.9 DBF_GFIS13_REUEXP_MANDATORY_COLS 19
9.10 DBF_GFIS13_REUEXP_HEADER 21
9.11 DBF_GFIS13_REULKP_MAP_DATA 22
9.12 DBF_GFIS13_REULKP_DM1_VAL1 AND DBF_GFIS13_REULKP_DMVAL1
23
9.13 DBF_GFIS13_REULKP_DM1_VAL2 24
9.14 DBF_GFIS13_REULKP_DM2_DATA 25
9.15 DBF_GFIS13_REULKP_DM2_VAL1 26
9.16 DBF_GFIS13_REULKP_DM2_VAL2 27
9.17 DBF_GFIS13_REULKP_DM3_VAL1 28
9.18 DBF_GFIS13_REULKP_DM3_VAL2 29
9.19 DBF_GFIS13_REUMPL_ERROR_TABLE 30

1 Document Description
The purpose of this document is to detail the Reusable components that will be used by
the ETL workflow. This document will serve as a reference for developers as the reusable
components that are available during development and logical methodologies that need to
be followed while creating the mappings.
2 Management Summary
This document deals with the ETL objects that can be reused in multiple mappings.
Below is a high level component map of the areas where this logic can be implemented.
Each of these areas is discussed in detail later in this document.

• Validations will encompass rules on which the data will be validated.


• Derivations are values that will be derived based on an existing value.
• Error Handling will be discussed as to how this process will be generic to every
feed.
• Reference Data will be the standard lookup transformations for the data coming
from dbNexus and dbStaging.
3 Scope
This document gives the technical overview of the reusable components in the scope of
ETL.
4 Assumptions / Dependencies
• Informatica will be the ETL tool used.
• Oracle is the RDMS that will hold the repository and Meta data tables.
5 Open Issues
6 Technical Design
6.1 Validations & Derivations

• Metadata tables will drive validations and derivations.


• This will be driven by the rules that are setup on each attribute.
• Any given attribute can have any number of rules for validations and derivations.
• This logic will be implemented using a Materialized view for a given feed that
will passed to a reusable lookup transformation in Informatica.
6.1.1 Standard Validations
Data Type
• Checks whether the value passed is that of date, number, boolean etc.
Valid Values
• Checks if the value is present in a specific list. For example certain fields should
contain the values ‘Y’ or ‘N’.
Ranges
• This rule ensures that data passed does not exceed the thresholds. For example
percentage columns do not exceed 100.
6.1.2 Standard Derivations
This will be responsible for transforming attributes based on predefined rules
stored in the Domain tables. For example product type from AcS ‘MMLOAN’ being
transformed to ‘10110’.
6.2 Reference Data

• Reference data tables from GFIS will be created as reusable lookups.


• Apart from the standard reference data tables domain tables will be also setup as
reusable lookups.
6.3 Error Handling
• Error handling in this release will be to generate error reports.
• Any reconciliation of errors using GUIs is out of scope for the current release.
• This component will be reusable only if the error report has to be the same
structure as the source, GFIS or the target formats.
• Customized error reports cannot be standardised.

• Below is the flow of the proposed error handling with a Web GUI.

6.3.1 Validation and Error handling


• Each GFIS source will be validated to ensure the data type is consistent with the
received contents.
• The structure of the GFIS tables will be validated according to the published
schema.
• Field level validations will be done based on business rules in the data dictionary.
(Section 12.1.1 – data dictionary)
• Source file column data type will be validated against the data type specified for
ICON MM file.
• Mandatory field check will be done to make sure that the outbound file has data in
all the mandatory fields.
• In case of any processing failure during the extraction/transformation/loading
process the corresponding log files will be examined.
7 Other Issues
7.1 Authority
ETL IT.

8 User Acceptance Criteria


• Test cases, to be prepared later for checking the transformations, computation and
validations for user acceptance. This will be done along with the testing of each
individual interface.
9 Re-usable components
Reusable components are built on common logic, which can be used across different
interfaces, so that there is no need to build the same logic again and again. We can
implement this by copying the logic but if there are any updates we need to implement in
all places.
Using the reusable components minimizes the build time and we can make the updates at
one place where it actually reside and all instances will get updated.

Component Name Advantages Disadvantages


Mapplet All the transformations used in the logic kept together and reused across
all the interfaces so that duplication of the same logic can be avoided. And we can change
the logic with minimal effort. Performance view there might be a small degrade, but
coming to effort needed to build and update will be less.
Unconnected Lookup It caches the data of the database object that avoids calling
database for each row. This contains the logic necessary to compare and can be used
across all interfaces. We can return only one value.
Expression Transformation It holds common logic like calculations, conditions and
getting the lookup values by calling an connected lookup. As the no. of ports increases
performance may degrade.
Sequence Generator We can provide a unique id by using a single sequence generator.
So that we can identify the row across different tables. Sequence may not be in
order. As we are caching some values, the values which we are not used in the current
session will be missed out.
9.1 DBF_GFIS13_reuLKP_FEED_INFO

Name DBF_GFIS13_reuLKP_FEED_INFO
Type Lookup Transformation (reusable / unconnected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS, ACS_LONDON and DBSTAGING
Parent dependency This lookup was created on the ETL metadata table
DBF_T_FEED_INFO. It is a master table for feed details and contains the data like
feed_id, feed_name and feed_desc are available in this table. Every new feed that comes
in should have an entry in this table.
Purpose It will return the corresponding unique feed_id for the feed_name that was
passed to it.
Advantages As we have more than a single feed going into the same target table we
have to identify the records of each feed separately. We can use feed_id for this purpose.
Reusability This lookup can be used across all the interfaces of dbFeeds.

9.2 DBF_GFIS13_reuLKP_VALIDATION_VAL1

Name DBF_GFIS13_reuLKP_VALIDATION_VAL1
Type Look Up Transformation (reusable / unconnected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS, ACS_LONDON and DBSTAGING
Parent dependency This lookup was created on the ETL metadata materialized view
DBF_MV_CONSTRAINTS. This view was created against the ETL meta data tables
DBF_T_DATASTD_INFO. DBF_T_FEED_INFO, DBF_T_TABLE_INFO,
DBF_T_FEED_TABLE, DBF_T_RULE_TYPE, DBF_T_TAB_ATTRIBUTES and
DBF_T_CONSTRAINTS.
Purpose This lookup is used to get the default and derived values. We need to
insert data for default & derivations for all the required attributes in the respective
tables that are involved in this view creation. This view will get the data in a de-
normalized way. It will return these default or derived values by passing the static
information like feed_name, table_name, attribute name and rule name.
Advantages We can effect The changes in the default values and derivation logic with
minimal effort , as updation to the corresponding meta data tables will happen once &
mapping logic need not be touched.
Reusability This lookup can be used across all the interfaces of dbFeeds for getting
default values and derived values.

9.3 DBF_GFIS13_reuLKP_VALIDATION_VAL2
Name DBF_GFIS13_reuLKP_VALIDATION_VAL2
Type Lookup Transformation (reusable / unconnected)
Actual Location DbFeedsDev
Locations Used EPLUS and IDMS
Parent dependency This look up was created on the ETL metadata materialized view
DBF_MV_CONSTRAINTS. This view was created on the ETL meta data tables
DBF_T_DATASTD_INFO, DBF_T_FEED_INFO, DBF_T_TABLE_INFO,
DBF_T_FEED_TABLE, DBF_T_RULE_TYPE, DBF_T_TAB_ATTRIBUTES and
DBF_T_CONSTRAINTS.
Purpose This lookup is used to get the derived values. We need to insert data for
derivations for all the required attributes in the respective tables that are involved in this
view creation. This view will get the data in a de-normalized way. It will return these
derived values by passing the static information like feed_name, table_name, attribute
name and rule name.
Advantages We can effect changes in the existing derivation logic by updating the
corresponding metadata tables without touching the mapping logic.
Reusability This lookup can be used across all the interfaces of dbFeeds for getting
derived values.

9.4 DBF_GFIS13_reuLKP_VALIDATION_ERROR

Name DBF_GFIS13_reuLKP_VALIDATION_ERROR
Type Lookup Transformation (reusable / unconnected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS, ACS_LONDON.
Parent dependency This look up was created on the ETL metadata materialized view
DBF_MV_CONSTRAINTS. This view was created on the ETL metadata tables
DBF_T_DATASTD_INFO. DBF_T_FEED_INFO, DBF_T_TABLE_INFO,
DBF_T_FEED_TABLE, DBF_T_RULE_TYPE, DBF_T_TAB_ATTRIBUTES and
DBF_T_CONSTRAINTS.
Purpose This lookup is used to get error ids. Incase of domain value mismatch or
mapping value not found or mandatory column value with null we have to raise an error
and need to deliver this consolidated error report. Error ids for the corresponding
attributes for all the interfaces need to be inserted into DBF_T_ERRORS error meta data
master table. It will return these error ids by passing the static information like
feed_name, table_name, attribute name and rule name.
Advantages All the error ids would available in the same table and the same error id
can be used across different interfaces.
Reusability This lookup can be used across all the interfaces of dbFeeds to get the
error ids.

9.5 DBF_reuSEQ_DBFEED_ID
Name DBF_ reuSEQ_DBFEED_ID
Type Sequence generator Transformation (reusable / connected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS.
Parent dependency Informatica sequence generator transformation cached for 1000
values
Purpose ETL would generate a unique sequence called dbfeeds_id for all the IDMS
and EPLUS feeds received.
Advantages There is no chance for the duplication of the same dbfeed id and by using
the same reusable sequence generator across different feeds and it will increase the
performance.
Reusability Used for both IDMS and EPLUS interfaces.

9.6 DBF_GFIS13_reuSEQ_ERROR

Name DBF_GFIS13_ reuSEQ_ERROR


Type Sequence generator Transformation (reusable / connected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS and ACS_LONDON.
Parent dependency Informatica sequence generator transformation cached for 1000
values. This is used in the mapplet DBF_GFIS13_reuMPL_ERROR_TABLE.
Purpose ETL would generate a unique sequence for errors (feed_error_id) raised
while processing so that the error sequence generated for theses errors will be unique
Advantages There is no chance for the duplication of the same dbfeed id and by using
the same reusable sequence generator across different feeds and it will increase the
performance.
Reusability Used across all the interfaces of dbFeeds

9.7 DBF_GFIS13_reuEXP_CON_REF_MANDATORY_CHECK

Name DBF_GFIS13_reuEXP_CON_REF_MANDATORY_CHECK
Type Expression Transformation (reusable / connected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS.
Parent dependency It was created depending on the listed mandatory columns for
contract reference GFIS table in the GFIS 1.31 version.
Purpose ETL has to check all the mandatory attributes for not null values, incase of
null values need to raise an error. This expression would check these mandatory columns
for null and incase of null values it returns the error id and actual values coming from the
source system.
Advantages All the mandatory columns would be checked at the same place and we
can effect any changes in the mandatory columns with minimal effort.
Reusability This expression can be used for mandatory column checks of GFIS
contact reference table in any interface of dbFeeds.
9.8 DBF_GFIS13_reuEXP_CPY_REF_MANDATORY_CHK

Name DBF_GFIS13_reuEXP_CPY_REF_MANDATORY_CHK
Type Expression Transformation (reusable / connected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS.
Parent dependency It was created depending on the listed mandatory columns for
counterparty reference GFIS table in the GFIS 1.31 version.
Purpose ETL has to check all the mandatory attributes for not null values, incase of
null values need to raise an error. This expression would check these mandatory columns
for null and incase of null values it returns the error id and actual values coming from the
source system.
Advantages All the mandatory columns would be checked at the same place and we
can effect any changes in the mandatory columns with minimal effort.
Reusability This expression can be used for mandatory column checks of GFIS
counterparty reference table in any interface of dbFeeds.

9.9 DBF_GFIS13_reuEXP_MANDATORY_COLS

Name DBF_GFIS13_reuEXP_ MANDATORY_COLS


Type Expression Transformation (reusable / connected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS.
Parent dependency It was created depending on the listed mandatory columns for ddm
& eod GFIS tables in the GFIS 1.31 version.
Purpose ETL has to check all the mandatory attributes for not null values, incase of
null values need to raise an error. This expression would check these mandatory columns
for null and incase of null values it returns the error id and actual values coming from the
source system.
Advantages All the mandatory columns would be checked at the same place and we
can effect any changes in the mandatory columns with minimal effort.
Reusability This expression can be used for mandatory column checks of GFIS ddm &
eod table in any interface of dbFeeds.

9.10 DBF_GFIS13_reuEXP_HEADER
Name DBF_GFIS13_reuEXP_ HEADER
Type Expression Transformation (reusable / connected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS.
Parent dependency It was created depending on the header format specified for IDMS
& EPLUS interfaces.
Purpose ETL needs header data in its staging table. This expression would extract
this header data from the file and makes it available for further processing in the ETL.
Advantages ETL would get the header data with out impacting on the further
processing and validation rules.
Reusability This can be used in all the interfaces of dbFeeds where the header data
needs to extracted from the inbound flat file. But the format needs to be same as
specified for Singapore.

9.11 DBF_GFIS13_reuLKP_MAP_DATA
Reusable component for mapping data
DbNexus would deliver the mapping data for more than a single table. ETL store this
data in a single table DBF_MAP_T_MAPPING. As history (previous data) also available
in this table we have to check the validity of the record for the defined business day.

Name DBF_GFIS13_reuLKP_MAP_DATA
Type Lookup Transformation (reusable / connected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS.
Parent dependency It was created on the ETL mapping data table
DBF_MAP_T_MAPPING.
Purpose ETL has to pick a global standard value corresponding to the value
delivered by the source system and has to check the validity by that data by passing
business date. This lookup would fetch the standard value from the ETL mapping data
table DBF_MAP_T_MAPPING by passing process date to check the validity and other
information like mapping table name and the values delivered by the source system.
Advantages It holds all the mapping data in its cache and corresponding data would be
delivered without referring to actual database table. As it will be refreshed at the starting
of each workflow run it will keeps latest mapping data. As we are not calling the database
table each time it would increase the performance.
Reusability It can be used in all the interfaces of dbFeeds wherever we need to fetch
the mapping data.

Reusable components for domain data


ETL classified the domain data delivered by dbNexus into three types. All the domain
tables having less than 5 columns would come under ‘DM01’ and the domain tables
having columns from 1 to 30 would come under ‘DM02’ and upto 90 columns would
come under ‘DM03’.
As this table also keeps history with it we have to check the validity by passing business
date.

9.12 DBF_GFIS13_reuLKP_DM1_val1 and DBF_GFIS13_reuLKP_DMvaL1

Name DBF_GFIS13_reuLKP_DM1_VAL1
Type Lookup Transformation (reusable / unconnected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS and ACS_LONDON.
Parent dependency It was created on the ETL domain data table
DBF_DM_T_DOMAIN01.
Purpose ETL has to check the validity of data delivered by the source system
against domain data delivered by the dbNexus This lookup would holds domain data of
the domain tables that comes under classification DM01. By passing process date to
check the validity and other information like domain table name and the value delivered
by the source system.
Advantages It holds all the DM01 domain data in its cache and corresponding data
would be delivered without referring to actual database table. As it will be refreshed at
the starting of each workflow run it will keeps latest domain data. As we are not calling
the database table each time it would increase the performance.
Reusability It can be used in all the interfaces of dbFeeds wherever we need to use
first value of domain tables DM01.

9.13 DBF_GFIS13_reuLKP_DM1_val2

Name DBF_GFIS13_reuLKP_DM1_VAL2
Type Lookup Transformation (reusable / unconnected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS and ACS_LONDON.
Parent dependency It was created on the ETL domain data table
DBF_DM_T_DOMAIN01.
Purpose ETL has to check the validity of data delivered by the source system
against domain data delivered by the dbNexus This lookup would holds domain data of
the domain tables that comes under classification DM01. By passing process date to
check the validity and other information like domain table name and the value delivered
by the source system.
Advantages It holds all the DM01 domain data in its cache and corresponding data
would be delivered without referring to actual database table. As it will be refreshed at
the starting of each workflow run it will keeps latest domain data. As we are not calling
the database table each time it would increase the performance.
Reusability It can be used in all the interfaces of dbFeeds wherever we need to use
second value of domain tables DM01.
9.14 DBF_GFIS13_reuLKP_DM2_DATA

Name DBF_GFIS13_reuLKP_DM2_DATA
Type Lookup Transformation (reusable / connected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS and ACS_LONDON.
Parent dependency It was created on the ETL domain data table
DBF_DM_T_DOMAIN02.
Purpose It can be used to check the validity as well as to return a different value
corresponding to the source value linked. This lookup would holds domain data of the
domain tables that comes under classification DM02. By passing process date to check
the validity and other information like domain table name and the values delivered by the
source system.
Advantages It holds all the DM02 domain data in its cache and corresponding data
would be delivered without referring to actual database table. As it will be refreshed at
the starting of each workflow run it will keeps latest domain data. As we are not calling
the database table each time it would increase the performance.
Reusability It can be used in all the interfaces of dbFeeds wherever we need to use any
value of domain tables DM02.

9.15 DBF_GFIS13_reuLKP_DM2_val1

Name DBF_GFIS13_reuLKP_DM2_VAL1
Type Lookup Transformation (reusable / unconnected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS and ACS_LONDON.
Parent dependency It was created on the ETL domain data table
DBF_DM_T_DOMAIN02.
Purpose ETL has to check the validity of data delivered by the source system
against domain data delivered by the dbNexus This lookup would holds domain data of
the domain tables that comes under classification DM02. By passing process date to
check the validity and other information like domain table name and the value delivered
by the source system.
Advantages It holds all the DM02 domain data in its cache and corresponding data
would be delivered without referring to actual database table. As it will be refreshed at
the starting of each workflow run it will keeps latest domain data. As we are not calling
the database table each time it would increase the performance.
Reusability It can be used in all the interfaces of dbFeeds wherever we need to use
first value of domain tables DM02.

9.16 DBF_GFIS13_reuLKP_DM2_VAL2
Name DBF_GFIS13_reuLKP_DM2_VAL2
Type Lookup Transformation (reusable / unconnected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS and ACS_LONDON.
Parent dependency It was created on the ETL domain data table
DBF_DM_T_DOMAIN02.
Purpose ETL has to check the validity of data delivered by the source system
against domain data delivered by the dbNexus This lookup would holds domain data of
the domain tables that comes under classification DM02. By passing process date to
check the validity and other information like domain table name and the value delivered
by the source system.
Advantages It holds all the DM02 domain data in its cache and corresponding data
would be delivered without referring to actual database table. As it will be refreshed at
the starting of each workflow run it will keeps latest domain data. As we are not calling
the database table each time it would increase the performance.
Reusability It can be used in all the interfaces of dbFeeds wherever we need to use
second value of domain tables DM02.

9.17 DBF_GFIS13_reuLKP_DM3_val1

Name DBF_GFIS13_reuLKP_DM3_VAL1
Type Lookup Transformation (reusable / unconnected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS and ACS_LONDON.
Parent dependency It was created on the ETL domain data table
DBF_DM_T_DOMAIN03.
Purpose ETL has to check the validity of data delivered by the source system
against domain data delivered by the dbNexus This lookup would holds domain data of
the domain tables that comes under classification DM03. By passing process date to
check the validity and other information like domain table name and the value delivered
by the source system.
Advantages It holds all the DM03 domain data in its cache and corresponding data
would be delivered without referring to actual database table. As it will be refreshed at
the starting of each workflow run it will keeps latest domain data. As we are not calling
the database table each time it would increase the performance.
Reusability It can be used in all the interfaces of dbFeeds wherever we need to use
first value of domain tables DM03.

9.18 DBF_GFIS13_reuLKP_DM3_VAL2

Name DBF_GFIS13_reuLKP_DM3_VAL2
Type Lookup Transformation (reusable / unconnected)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS and ACS_LONDON.
Parent dependency It was created on the ETL domain data table
DBF_DM_T_DOMAIN03.
Purpose ETL has to check the validity of data delivered by the source system
against domain data delivered by the dbNexus This lookup would holds domain data of
the domain tables that comes under classification DM03. By passing process date to
check the validity and other information like domain table name and the value delivered
by the source system.
Advantages It holds all the DM03 domain data in its cache and corresponding data
would be delivered without referring to actual database table. As it will be refreshed at
the starting of each workflow run it will keeps latest domain data. As we are not calling
the database table each time it would increase the performance.
Reusability It can be used in all the interfaces of dbFeeds wherever we need to use
second value of domain tables DM03.

9.19 DBF_GFIS13_reuMPL_ERROR_TABLE

Name DBF_GFIS13_reuMPL_ERROR_TABLE
Type Mapplet (reusable)
Actual Location DbFeedsDev
Locations Used EPLUS, IDMS and ACS_LONDON.
Parent dependency Made of transformations like mapplet input through which we can
pass the input values, lookup created on the metadata table DBF_T_TABLE_INFO to get
the table_id, sequence generator to generate error sequence, expression which keeps all
the values mentioned together and generates created date, update strategy which checks
the error ids for 0 and incase of 0 rejects and mapplet output which sends all these values
outside.
Purpose ETL has to generate the error sequence incase of errors and has to get the
corresponding table_id, feed_id and created date and need to reject incase of error ids
coming with 0.
Advantages All the transformations used in raising errors kept together and reused
across all the interfaces so that duplication of the same logic can be avoided. And we can
changes the logic with minimal effort.
Reusability It can be used in all the interfaces of dbFeeds wherever we are going for
error handling.

You might also like