Trainer Introduction
Pre-Requisites
Participant Introduction
4/9/17 All Rights Reserved
TRAINER
INTRODUCTION
Facilitator : Pranay Kumar P
Email.id : pranay.pothuganti@gmail.com
Ph.No : +91-9160619276
PRE-REQUISITES
Oracle SQL
General relational database (Oracle)
Data Warehousing.
LAB SETUP DETAILS
Within LAN
PARTICIPANT
INTRODUCTION
About Yourself
Education Background
SESSION I
INTRODUCTION
AB-INITIO
HISTORY OF AB INITIO
Ab Initio Software Corporation was founded in the mid-1990s by
Sheryl Handler, the former CEO of Thinking Machines Corporation
(TMC), after TMC filed for bankruptcy.
HISTORY OF AB INITIO
Ab Initio software is a fourth-generation, GUI-based parallel
processing tool for data analysis, batch processing, and data
manipulation. It is used mainly to extract, transform, and load
(ETL) data.
HISTORY OF AB INITIO
"Ab initio" is Latin for "from the beginning."
AB INITIO'S FOCUS
Moving Data
Move small and large volumes of data in an efficient manner
Deal with the complexity associated with business data
High Performance
Scalable solutions
Better Productivity
AB INITIO PLATFORMS
No process is too big or too small for Ab Initio.
It runs on a few processors or a few hundred processors,
and on virtually every kind of hardware.
AB INITIO RUNS ON MANY
OPERATING SYSTEMS
Compaq Tru64 UNIX
Digital UNIX
Hewlett-Packard HP-UX
IBM AIX
NCR MP-RAS
Red Hat Linux
IBM/Sequent DYNIX/ptx
Siemens Pyramid Reliant UNIX
Silicon Graphics IRIX
Sun Solaris
Windows NT and Windows 2000
AB INITIO BASE
SOFTWARE - TWO
MAIN PIECES:
AB INITIO PRODUCT
ARCHITECTURE
User Applications
Development Environments
GDE Shell Ab Initio
ANATOMY OF A
RUNNING JOB
What happens when you push the RUN button?
Your graph is translated into a script that can be executed in the Shell Development Environment.
This script and any metadata files stored on the GDE client machine are shipped (via FTP) to the server.
The script is invoked (via REXEC or TELNET) on the server.
The script creates and runs a job, which in turn may run
ANATOMY OF A RUNNING
JOB
Host Process Creation
Pushing the RUN button generates a script.
The script is transmitted to the Host node.
The script is invoked, creating the Host process.
Diagram: GDE, Host
ANATOMY OF A RUNNING
JOB
Agent Process Creation
The Host process spawns Agent processes.
Diagram: Host, two Agents
ANATOMY OF A RUNNING
JOB
Component Process Creation
Agent processes create Component processes on each processing node.
Diagram: Host, two Agents
ANATOMY OF A RUNNING
JOB
Component Execution
Component processes do their jobs.
They communicate directly with datasets and with each other to move the data around.
Diagram: GDE, Host, two Agents
ANATOMY OF A RUNNING
JOB
Successful Component Termination
As soon as each Component process finishes with its data, it exits with a success status.
Diagram: GDE, Host, two Agents
ANATOMY OF A RUNNING
JOB
Agent Termination
When all of an Agent's Component processes exit, the Agent informs the Host process that those components are finished.
The Agent process then exits.
Diagram: Host
ANATOMY OF A RUNNING
JOB
Host Termination
When all Agents exit, the Host process informs the GDE that the job is complete.
The Host process then exits.
Diagram: Host
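The job lifecycle on the "Anatomy of a Running Job" slides can be sketched as a small simulation. This is an illustrative Python model only, not Ab Initio code: the function name `run_job`, the node names, and the component names are all hypothetical, and real processes are replaced by log entries.

```python
def run_job(layout):
    """Simulate the Host / Agent / Component lifecycle and return an event log.

    `layout` maps each (hypothetical) processing node to the list of
    components that run on it.
    """
    events = ["GDE: script generated and shipped to host", "host: started"]

    # The Host process spawns one Agent per processing node.
    for node in layout:
        events.append(f"agent[{node}]: started")

    # Each Agent creates the Component processes on its own node.
    for node, components in layout.items():
        for comp in components:
            events.append(f"component[{node}/{comp}]: started")

    # Components exit with success; their Agent then reports to the Host and exits.
    for node, components in layout.items():
        for comp in components:
            events.append(f"component[{node}/{comp}]: exited with success")
        events.append(f"agent[{node}]: reported to host and exited")

    # After every Agent has exited, the Host informs the GDE and exits.
    events.append("host: reported completion to GDE and exited")
    return events


if __name__ == "__main__":
    for line in run_job({"node1": ["reformat"], "node2": ["sort", "join"]}):
        print(line)
```

The ordering of the log entries is the point of the sketch: Agents start after the Host, Components start after their Agent, and termination runs in the reverse direction.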
LAB EXERCISE
DAY II
COMPONENTS
REFORMAT COMPONENT
Reformatted Output
DEPTNO DNAME
10 ACCOUNTING
20 RESEARCH
30 SALES
40 OPERATIONS
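A rough Python sketch of what produces the reformatted output above: a transform is applied to each input record and yields one output record. The input records, including the LOC field, and the names `reformat` and `keep_deptno_dname` are assumptions made for illustration; in Ab Initio the transform would be written in DML, not Python.

```python
def reformat(records, transform):
    """Apply `transform` to every input record, one output record per input."""
    return [transform(rec) for rec in records]


def keep_deptno_dname(rec):
    # Hypothetical transform: copy two fields and drop everything else.
    return {"DEPTNO": rec["DEPTNO"], "DNAME": rec["DNAME"]}


# Assumed input; the LOC column does not appear on the slide.
dept = [
    {"DEPTNO": 10, "DNAME": "ACCOUNTING", "LOC": "NEW YORK"},
    {"DEPTNO": 20, "DNAME": "RESEARCH", "LOC": "DALLAS"},
    {"DEPTNO": 30, "DNAME": "SALES", "LOC": "CHICAGO"},
    {"DEPTNO": 40, "DNAME": "OPERATIONS", "LOC": "BOSTON"},
]

reformatted = reformat(dept, keep_deptno_dname)

if __name__ == "__main__":
    for rec in reformatted:
        print(rec["DEPTNO"], rec["DNAME"])
```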
LAB EXERCISE
EXERCISE_QUESTIONS\REFORMAT.txt
Reformat
FILTER BY EXPRESSION
COMPONENT
FBE
CONT.,
If the expression returns:
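The list of outcomes is cut off on this slide, so the routing in the following Python sketch is an assumption: records for which the expression is true go to one output, and all other records go to a second output. The names `filter_by_expression`, `selected`, and `deselected` are hypothetical.

```python
def filter_by_expression(records, expression):
    """Split records into (selected, deselected) according to `expression`."""
    selected, deselected = [], []
    for rec in records:
        # Assumed routing: a true result selects the record.
        if expression(rec):
            selected.append(rec)
        else:
            deselected.append(rec)
    return selected, deselected


# Hypothetical input records.
dept = [
    {"DEPTNO": 10, "DNAME": "ACCOUNTING"},
    {"DEPTNO": 20, "DNAME": "RESEARCH"},
    {"DEPTNO": 30, "DNAME": "SALES"},
]

selected, deselected = filter_by_expression(dept, lambda rec: rec["DEPTNO"] > 10)
```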
LAB EXERCISE
EXERCISE_QUESTIONS\FILTER.txt
Filter By Expression
EXERCISE_QUESTIONS\EXP.txt
BROADCAST
COMPONENT
BROADCAST
LAB EXERCISE
EXERCISE_QUESTIONS\MULTIPLE TARGETS.txt
DAY III
COMPONENTS
WORKING WITH FILES
Types of Files :
Delimited
Fixed Width
Mixed type
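To make the difference between the first two file types concrete, here is a small Python sketch that parses the same record from a delimited line and from a fixed-width line. The field names, the comma delimiter, and the column widths are assumptions for illustration; in Ab Initio the layout would be described in a DML record format.

```python
def parse_delimited(line, field_names, delimiter=","):
    """Split a line on `delimiter` and pair the pieces with field names."""
    return dict(zip(field_names, line.split(delimiter)))


def parse_fixed_width(line, layout):
    """Cut a line into fields using a list of (name, width) pairs."""
    record, pos = {}, 0
    for name, width in layout:
        record[name] = line[pos:pos + width].strip()  # drop padding spaces
        pos += width
    return record


# Same logical record in both representations (hypothetical data).
delimited = parse_delimited("10,ACCOUNTING", ["DEPTNO", "DNAME"])
fixed = parse_fixed_width("10ACCOUNTING    ", [("DEPTNO", 2), ("DNAME", 14)])
```

A mixed-type record would combine both approaches, for example fixed-width leading fields followed by a delimited tail.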
EXERCISE_QUESTIONS\SRC_TRG_FILES.txt
JOIN COMPONENT
JOIN TYPES
Inner Join: sets the record-required parameters for all ports to True.
JOIN TYPES -
CONT.,
Case 1: Inner Join join-type
Driving:
Number of the port to which you want to connect the
driving input. The driving input is the largest input. All
other inputs are read into memory.
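The idea behind the driving input can be sketched in Python: the smaller input is loaded into an in-memory index, and the driving (largest) input is streamed past it. This is a simplified two-input inner join under assumed data; the function name `inner_join` and the sample records are hypothetical.

```python
from collections import defaultdict


def inner_join(driving, other, key):
    """Inner-join two inputs on `key`, streaming the driving input."""
    # The non-driving input is read into memory and indexed by key.
    index = defaultdict(list)
    for rec in other:
        index[rec[key]].append(rec)

    # The driving input is processed record by record. Records without a
    # match produce no output, which is what makes this an inner join.
    for rec in driving:
        for match in index.get(rec[key], []):
            yield {**match, **rec}


# Hypothetical data: `emp` plays the role of the large driving input.
emp = [
    {"ENAME": "SMITH", "DEPTNO": 20},
    {"ENAME": "ALLEN", "DEPTNO": 30},
    {"ENAME": "NOBODY", "DEPTNO": 99},
]
dept = [
    {"DEPTNO": 10, "DNAME": "ACCOUNTING"},
    {"DEPTNO": 20, "DNAME": "RESEARCH"},
    {"DEPTNO": 30, "DNAME": "SALES"},
]

joined = list(inner_join(emp, dept, "DEPTNO"))
```

Choosing the largest input as the driving one keeps memory use proportional to the smaller inputs.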
EXERCISE_QUESTIONS\JOIN.txt
CONCATENATE
COMPONENT
Not key-based
Result ordering is by partition
Serializes pipelined computation
Useful for:
Creating serial flow from partitioned data
Appending headers and trailers
Writing DML
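A minimal Python sketch of the behaviour listed above: flows are emitted one after another in a fixed order, with no key involved, which is also how a header and trailer can be attached to partitioned data. The function name `concatenate` and the sample records are hypothetical.

```python
from itertools import chain


def concatenate(*flows):
    """Emit every record of the first flow, then the second, and so on."""
    return chain(*flows)


# Hypothetical partitioned data plus a header and a trailer.
partitions = [["r1", "r2"], ["r3"]]
header = ["HEADER"]
trailer = ["TRAILER count=3"]

serial = list(concatenate(header, *partitions, trailer))
```

Because each flow must be fully consumed before the next one starts, downstream processing sees the partitions serially rather than in parallel.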
CONT.,
The Concatenate component:
CONT.,
The Sort component:
EXERCISE_QUESTIONS\CONCAT.txt
ROLLUP COMPONENT
ROLLUP COMPONENT
Create a DML type named temporary_type
Create the required transform functions
CONT.,
At runtime, Rollup executes the following steps:
1. Input selection
2. Temporary initialization
3. Computation
4. Finalization
5. Output selection
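The five steps can be mirrored in a short Python sketch. It is only a model of the control flow: the parameter names (`input_select`, `initialize`, `compute`, `finalize`, `output_select`) and the sample data are chosen for this illustration, and the per-group temporary value plays the role of the temporary type mentioned earlier.

```python
def rollup(records, key, input_select, initialize, compute, finalize, output_select):
    """Group `records` by `key` and produce one summary record per group."""
    temporaries = {}
    for rec in records:
        if not input_select(rec):                      # 1. input selection
            continue
        group = rec[key]
        if group not in temporaries:                   # 2. temporary initialization
            temporaries[group] = initialize(rec)
        temporaries[group] = compute(temporaries[group], rec)  # 3. computation

    results = []
    for group, temp in temporaries.items():
        summary = finalize(group, temp)                # 4. finalization
        if output_select(summary):                     # 5. output selection
            results.append(summary)
    return results


# Hypothetical employee records (not from the slides).
emp = [
    {"DEPTNO": 10, "SAL": 100},
    {"DEPTNO": 20, "SAL": 50},
    {"DEPTNO": 10, "SAL": 200},
    {"DEPTNO": 30, "SAL": 0},
]

totals = rollup(
    emp,
    key="DEPTNO",
    input_select=lambda rec: rec["SAL"] > 0,
    initialize=lambda rec: {"total": 0, "count": 0},
    compute=lambda temp, rec: {"total": temp["total"] + rec["SAL"],
                               "count": temp["count"] + 1},
    finalize=lambda group, temp: {"DEPTNO": group,
                                  "TOTAL_SAL": temp["total"],
                                  "COUNT": temp["count"]},
    output_select=lambda out: out["TOTAL_SAL"] >= 100,
)
```

In this run, department 30 is dropped at input selection and department 20 at output selection, leaving a single summary record for department 10.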
LAB EXERCISE
EXERCISE_QUESTIONS\MFS.txt
FUSE COMPONENT
Fuse:
Combines multiple input flows into a single output flow by
applying a transform function to corresponding records of
each flow.
CONT.,
Fuse sends the result of the transform
function to the out port.
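In Python terms, Fuse resembles zipping the flows together and applying a function to each tuple of corresponding records. The sketch below assumes two flows of equal length; the function name `fuse` and the data are hypothetical, and the slides do not say what happens when flows differ in length.

```python
def fuse(transform, *flows):
    """Apply `transform` to the i-th records of all flows, for each i."""
    # zip stops at the shortest flow; that is a simplification of this sketch.
    return [transform(*records) for records in zip(*flows)]


# Hypothetical flows whose records correspond by position.
numbers = [{"DEPTNO": 10}, {"DEPTNO": 20}]
names = [{"DNAME": "ACCOUNTING"}, {"DNAME": "RESEARCH"}]

fused = fuse(lambda a, b: {**a, **b}, numbers, names)
```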
CONT.,
At runtime, Scan executes the
following steps:
1. Input selection
2. Temporary initialization
3. Computation
4. Finalization
5. Output selection
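A Python sketch of these steps for Scan, on the common understanding that Scan emits a cumulative result for every input record (for example a running total) rather than one summary per group. The parameter names and the sample data are assumptions made for this illustration.

```python
def scan(records, key, input_select, initialize, compute, finalize, output_select):
    """Emit one cumulative output record per selected input record."""
    temporaries = {}
    results = []
    for rec in records:
        if not input_select(rec):                      # 1. input selection
            continue
        group = rec[key]
        if group not in temporaries:                   # 2. temporary initialization
            temporaries[group] = initialize(rec)
        temporaries[group] = compute(temporaries[group], rec)  # 3. computation
        out = finalize(temporaries[group], rec)        # 4. finalization, per record
        if output_select(out):                         # 5. output selection
            results.append(out)
    return results


# Hypothetical employee records, already grouped by DEPTNO.
emp = [
    {"DEPTNO": 10, "SAL": 100},
    {"DEPTNO": 10, "SAL": 200},
    {"DEPTNO": 20, "SAL": 50},
]

running = scan(
    emp,
    key="DEPTNO",
    input_select=lambda rec: True,
    initialize=lambda rec: 0,
    compute=lambda temp, rec: temp + rec["SAL"],
    finalize=lambda temp, rec: {"DEPTNO": rec["DEPTNO"],
                                "SAL": rec["SAL"],
                                "RUNNING_SAL": temp},
    output_select=lambda out: True,
)
```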
LAB EXERCISE
EXERCISE_QUESTIONS\SANDBOX_FUSE_SCAN.txt
REDEFINE FORMAT
COMPONENT
CONT.,
The Redefine Format component:
GATHER COMPONENT
MERGE COMPONENT
Useful for creating ordered data flows.
Used more often than Concatenate, but still infrequently.
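One way to picture an ordered merge is Python's `heapq.merge`, which combines inputs that are each already sorted into a single sorted stream. The sketch assumes every input flow is sorted on the key; the function name `merge` and the data are hypothetical.

```python
import heapq


def merge(key, *sorted_flows):
    """Combine flows that are each sorted on `key` into one sorted flow."""
    return list(heapq.merge(*sorted_flows, key=lambda rec: rec[key]))


# Hypothetical partitions, each already sorted on DEPTNO.
flow_a = [{"DEPTNO": 10}, {"DEPTNO": 30}]
flow_b = [{"DEPTNO": 20}, {"DEPTNO": 40}]

merged = merge("DEPTNO", flow_a, flow_b)
```

Unlike concatenation, the output order here depends on the key values, not on which flow a record came from.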
INTERLEAVE
COMPONENT
LOOKUP FILE
Lookup File:
Represents one or more serial files or a multifile
The amount of data is small enough to be held in main memory
This allows a transform function to retrieve records much more
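The in-memory idea behind a lookup file can be sketched with a Python dictionary: the small dataset is loaded once and indexed by key, and a transform function then fetches matching records by key instead of rereading a file. The names `build_lookup` and `add_dname` and the data are hypothetical.

```python
def build_lookup(records, key):
    """Load a small dataset into memory, indexed by `key`."""
    return {rec[key]: rec for rec in records}


# Hypothetical lookup data, small enough to hold in memory.
dept_lookup = build_lookup(
    [
        {"DEPTNO": 10, "DNAME": "ACCOUNTING"},
        {"DEPTNO": 20, "DNAME": "RESEARCH"},
    ],
    "DEPTNO",
)


def add_dname(emp_rec):
    # Transform function that consults the in-memory lookup.
    match = dept_lookup.get(emp_rec["DEPTNO"])
    return {**emp_rec, "DNAME": match["DNAME"] if match else None}


enriched = [
    add_dname(rec)
    for rec in [{"ENAME": "SMITH", "DEPTNO": 20}, {"ENAME": "GHOST", "DEPTNO": 99}]
]
```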
LAB EXERCISE
EXERCISE_QUESTIONS\PARAMETERS.txt
Parameters
EXERCISE_QUESTIONS\LOOKUP.txt
Lookup file
PARTITION BY KEY
CONT.,
You can supply the percentages that Partition by Percentage uses to
distribute data records in either of two ways:
By specifying the percentages in the percentages parameter.
By connecting the output of any component that produces a list of
percentages to the pct port of Partition by Percentage.
CONT.,
You can assign a different percentage to each
output flow
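The slides do not describe how the component decides which record goes where, so the following Python sketch uses one possible scheme purely as an assumption: within every block of 100 records, the first p0 go to flow 0, the next p1 to flow 1, and so on. The function name `partition_by_percentage` and the requirement that the percentages add up to 100 are also assumptions.

```python
from bisect import bisect_right
from itertools import accumulate


def partition_by_percentage(records, percentages):
    """Distribute records across output flows in the given proportions."""
    if sum(percentages) != 100:
        raise ValueError("percentages must add up to 100")
    bounds = list(accumulate(percentages))   # e.g. [50, 30, 20] -> [50, 80, 100]
    flows = [[] for _ in percentages]
    for i, rec in enumerate(records):
        # Position inside the current block of 100 selects the output flow.
        flows[bisect_right(bounds, i % 100)].append(rec)
    return flows


flows = partition_by_percentage(range(200), [50, 30, 20])
sizes = [len(flow) for flow in flows]
```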
PARTITION BY RANGE
Reads splitter records from the split port, and assumes that
these records are sorted according to the key parameter.
CONT.,
Determines whether the number of flows connected to the out port
is equal to n (where n-1 represents the number of splitter records).
If not, Partition by Range writes an error message and stops the
execution of the graph.
Reads data records from the flows connected to the in port in
arbitrary order.
CONT.,
Distributes the data records to the flows connected to the out port
according to the values of the key field(s), as follows:
Assigns records with key values less than or equal to the
first splitter record to the first output flow.
Assigns records with key values greater than t
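The range rules can be sketched in Python with a binary search over the sorted splitter values. Only the rule for the first output flow is fully visible on the slide; the sketch assumes the same pattern continues, so that keys above the last splitter land in the last flow. The function name `partition_by_range` and the data are hypothetical.

```python
from bisect import bisect_left


def partition_by_range(records, key, splitters, n_flows):
    """Route each record to an output flow according to sorted splitter values."""
    # n output flows require exactly n-1 splitter records.
    if n_flows != len(splitters) + 1:
        raise ValueError("number of output flows must be len(splitters) + 1")
    flows = [[] for _ in range(n_flows)]
    for rec in records:
        # bisect_left returns 0 for keys <= splitters[0], 1 for keys in
        # (splitters[0], splitters[1]], and so on (assumed pattern).
        flows[bisect_left(splitters, rec[key])].append(rec)
    return flows


records = [{"K": k} for k in (25, 5, 10, 20, 11)]
flows = partition_by_range(records, "K", splitters=[10, 20], n_flows=3)
keys_per_flow = [[rec["K"] for rec in flow] for flow in flows]
```

The input order is arbitrary, as on the slide; only the key value determines the destination flow.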
PARTITION BY ROUND-
ROBIN
Run Program: Runs an executable program.
The Run Program component:
Reads data records from the in port if you connect a flow to
the in port.
Runs the data records through the program named in the
command line parameter.
Writes the data records to the out port if you connect a flow
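The pattern of piping records through an external program can be sketched with Python's `subprocess` module. The helper name `run_program` is hypothetical, and the external program here is just a one-line Python script that upper-cases its input, chosen so the example runs anywhere Python is installed.

```python
import subprocess
import sys


def run_program(command, records):
    """Pipe records (one per line) through `command` and return its output lines."""
    completed = subprocess.run(
        command,
        input="\n".join(records) + "\n",  # records arrive on the program's stdin
        capture_output=True,              # records leave on the program's stdout
        text=True,
        check=True,                       # raise if the program exits with an error
    )
    return completed.stdout.splitlines()


# Hypothetical command: a tiny script that upper-cases whatever it reads.
upper = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]

result = run_program(upper, ["accounting", "research"])
```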
DAY V
COMPONENTS
INFORMATION
REGARDING PORTS
CONT.,
The Gather Logs component:
Collects log records generated by components through
their log ports
Writes a record containing the text from the StartText
parameter to the file specified in the LogFile parameter
Writes any log records from its in port to the file specified in the
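A Python sketch of the behaviour that is visible on this slide: a start-text record is written first, followed by the collected log records. The function name `gather_logs`, its arguments, the file name, and the log lines are hypothetical stand-ins for the StartText and LogFile parameters and real component logs.

```python
import os
import tempfile


def gather_logs(log_records, log_file, start_text):
    """Write `start_text`, then every collected log record, to `log_file`."""
    with open(log_file, "w") as handle:
        handle.write(start_text + "\n")
        for rec in log_records:
            handle.write(rec + "\n")


# Write to a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "graph.log")
    gather_logs(["reformat: 4 records", "join: 2 records"], path, "=== run started ===")
    with open(path) as handle:
        written = handle.read().splitlines()
```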