Category: ETL BODS
Last Updated on Monday, 17 December 2012 20:57
Written by Saurav Mitra
Error handling and recoverability are important aspects of any ETL tool. Some ETL tools have
in-built error-handling and automatic recovery mechanisms in place; alternatively, we usually
design our ETL framework to handle runtime errors or exceptions. The three main goals of a
robust ETL framework are error handling and logging, recoverability, and restartability. If a
job does not complete properly, we must fix the problems that prevented its successful
execution and run the job again. However, during the failed execution some dataflows in the
job may have completed, and some tables may have been loaded, partially loaded, or altered.
Therefore, we need to design ETL jobs to be recoverable, i.e. we can rerun the job and
retrieve all the data without duplicate or missing data. There are various techniques to
recover from unsuccessful job executions. Let us check the in-built features available in
SAP Data Services. Recovery mechanisms in SAP Data Services are available for batch jobs only.
The error log records a TimeStamp (the date and time when the thread generated the message)
and an error category, for example CON (connection errors: the connection indicated could not
be initialized or failed during execution).
SAP Data Services can trap two types of flat-file source errors:
Data Conversion Errors: for example, a field is defined in the File Format
Editor with a data type of integer, but the data encountered is actually varchar.
Row Format Errors: in the case of a fixed-width file, the software identifies a row
that does not match the expected width value.
In the File Format Editor, the Error Handling set of properties performs the following actions
on selection:
Limits the number of warnings to log without stopping the job.
Checks for either of the two types of source flat-file errors (data conversion and row
format).
Writes the invalid rows to a specified error file on the Job Server.
Stops processing the source file after reaching a specified number of invalid rows.
For each rejected record, the error file captures:
1. Source file path and name, as multiple input source files can be processed using the
same file format in the dataflow.
2. Row number of the error record in the source file.
3. Detailed error description for the rejected record.
4. Column number where the error occurred.
5. All column values of the invalid row, separated by semicolons.
If the file format's Parallel process threads option is set to any value greater than 0
(instead of {none}), then the row number in source file value will be -1 for the invalid
records.
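To make the error-file layout concrete, here is a minimal Python sketch that tallies rejected rows from such a file. The semicolon delimiter for the whole record and the field order are assumptions drawn from the list above, not a documented specification; check an actual error file from your Job Server before relying on them.

```python
# Hypothetical parser for a Data Services flat-file error file.
# Assumption: each record carries the five pieces of information listed
# above, semicolon-separated, with the invalid row's own column values
# occupying the remaining fields.
import csv
from collections import Counter

def summarize_error_file(path):
    """Count rejected rows per source file and per error description."""
    by_source = Counter()
    by_error = Counter()
    unnumbered = 0  # rows whose source row number could not be tracked
    with open(path, newline="") as fh:
        for record in csv.reader(fh, delimiter=";"):
            if len(record) < 4:
                continue  # skip malformed or empty lines
            source_file, row_number, error_desc, column_no = record[:4]
            invalid_row = record[4:]  # all column values of the bad row
            by_source[source_file] += 1
            by_error[error_desc] += 1
            if row_number.strip() == "-1":
                # Parallel process threads > 0: row numbers are -1
                unnumbered += 1
    return by_source, by_error, unnumbered
```

Such a summary is handy for deciding whether rejects cluster in one source file or one error type before reprocessing.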
Often we come across scenarios where we have the flat file definition in an Excel sheet and
need to create the corresponding File Format in SAP Data Services. Alternatively, we import
the file format definition from a sample source file. SAP Data Services tries to infer the
best data types for the input columns based on the first 20 records of the flat file, and in
most cases we then need to modify the data type, length, or precision.
1. First of all, open a text editor, then copy the code below and paste it in.
2. Next, modify all the lines marked in BOLD with the information as per your requirement.
3. After alteration, save the file to a desired location with a .ATL extension.
4. Next, from the SAP Data Services Designer, go to Tools, select Import From File, and
choose the saved .ATL file.
"reader_skip_empty_files" = 'yes',
"reader_write_error_rows_to_file" = 'no',
"root_dir" = 'E:\\POC\\',
"row_delimiter" = '\\n',
"table_weight" = '0',
"time_format" = 'hh24:mi:ss',
"transfer_custom" = 'no',
"use_root_dir" = 'no',
"write_bom" = 'no' );
Points to Remember
First of all, modify the file format datastore name; in my case it is FF_MyFileFormat.
Next, add the field definition section, i.e. field name, data type, field size, precision
and scale.
Modify the column delimiter of the file format metadata; in my case it is Tab (\\t).
Next, modify the width section of the file format metadata after the column delimiter
section. Here "column_width1" signifies the width of our first field, "column_width2" that
of the second field, and so on.
Set the width only for the VARCHAR data types, corresponding to their field size.
For all the other data type fields, put the column_width as 1.
Lastly, modify the file location, i.e. the "root_dir"; in my case it is 'E:\\POC\\'.
Finally, give a name to the File Format, along with the location and file name, and we are
good to go.
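The width rules above are mechanical, so they can be scripted. The sketch below generates the "column_widthN" lines of the ATL fragment from a simple column list; the tuple format of the column specification is my own convention for illustration, not anything SAP Data Services defines.

```python
# Sketch: generate the column-width properties of an ATL file format
# from a (name, data_type, size) column list, following the rules above:
# VARCHAR columns get their field size as the width, all others get 1.

def atl_width_lines(columns):
    """columns: list of (name, data_type, size) tuples."""
    lines = []
    for i, (name, dtype, size) in enumerate(columns, start=1):
        width = size if dtype.upper() == "VARCHAR" else 1
        lines.append('"column_width%d" = \'%d\',' % (i, width))
    return lines

# Hypothetical column specification for demonstration.
spec = [("EMP_ID", "INT", 10),
        ("EMP_NAME", "VARCHAR", 50),
        ("HIRE_DATE", "DATETIME", 24)]
for line in atl_width_lines(spec):
    print(line)
```

The printed lines can be pasted into the width section of the .ATL file before importing it through Tools, Import From File.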
Alternatively, sometimes we want to copy the source or target table definition into an Excel
sheet for analysis. The simple way here is to select the Schema-Out of the source or target
definition name and right-click to select Create File Format.
Next, double-click the square to the left of the first field name, or select the first row
definition and then Shift-click the last row definition, to select the entire content.
Then simply copy (Ctrl+C) and paste (Ctrl+V) into the Excel sheet, and we get the table
metadata information there.
Again, if we want to get the DDL script for a flat file, template table, or source/target, we
can simply right-click the schema and click Create File Format. Next, click Show ATL; from
there we can copy the definition.
Many times we may need to replace the Data Services supported data types with native database
data types.
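As an illustration of such a replacement, the sketch below maps a few common Data Services data types to native Oracle types. The mapping choices (e.g. int to NUMBER(10), datetime to DATE) are conventional assumptions, not an official reference; adjust them for your target database and version.

```python
# Illustrative (not exhaustive) mapping of Data Services data types to
# native Oracle types; verify against your own database's documentation.
DS_TO_ORACLE = {
    "varchar":  "VARCHAR2",    # carries the declared length across
    "int":      "NUMBER(10)",
    "decimal":  "NUMBER",      # becomes NUMBER(p,s) with precision/scale
    "double":   "FLOAT",
    "date":     "DATE",
    "datetime": "DATE",        # or TIMESTAMP if sub-seconds matter
}

def to_native(ds_type, length=None, precision=None, scale=None):
    """Translate a Data Services type name into an Oracle column type."""
    key = ds_type.lower()
    base = DS_TO_ORACLE.get(key)
    if base is None:
        raise ValueError("unmapped Data Services type: %s" % ds_type)
    if key == "varchar" and length:
        return "VARCHAR2(%d)" % length
    if key == "decimal" and precision:
        return "NUMBER(%d,%d)" % (precision, scale or 0)
    return base
```

For example, a Data Services varchar(50) column would come out as VARCHAR2(50), and decimal(10,2) as NUMBER(10,2).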