© 2010 Informatica
Abstract
You can parse data from a Microsoft Excel spreadsheet with a Custom Data transformation in Informatica Developer. The
Custom Data transformation returns row data to relational tables. This article describes how to configure the Custom Data
transformation.
Supported Versions
• Informatica Developer 9.0.1
Table of Contents
Overview
Data Transformation
Logical Data Object Model Overview
Custom Data Transformation
  Configure Relational Hierarchy Ports
  Export the XML Schema
Create and Deploy the Data Transformation Project
  Create the Data Transformation Project
  Deploy the Project
Configure the Service Name
Preview the Data
Deploy the Application and the Service
Overview
Your organization maintains employee records in Microsoft Excel sheets. Before you purchase a Human Resources
application, you want to expose the Excel data as a virtual database that you can query.
This article explains how to design a data object mapping in the Developer tool. The data object mapping includes a Custom
Data transformation. The Custom Data transformation calls a Data Transformation service to parse the data from the Excel
spreadsheets. The Custom Data transformation returns row data.
To migrate the data from Microsoft Excel spreadsheets, complete the following tasks:
• Design a flat file data object to pass Excel file names to the Custom Data transformation.
• Create a Data Transformation Parser project in the Data Transformation Developer Studio.
• Deploy the project as a Data Transformation service in the Data Transformation repository.
• Add the Data Transformation service name to the Custom Data transformation in the Developer tool.
• Deploy the Data Transformation service to the same machine as the Data Integration Service that runs the
application.
Data Transformation
Data Transformation is an application that transforms file formats such as Excel spreadsheets or PDF documents. You can
transform data in formats such as HL7, EDI-X12, EDIFACT, SWIFT, NACHA, FIX, BAI2, and DTCC.
Develop Data Transformation projects in the Data Transformation Studio visual editor. Deploy the projects from the Data
Transformation Studio to the Data Transformation repository. Informatica accesses the services in the Data Transformation
repository when you create Custom Data transformation mappings and when you run them.
The Data Transformation Engine is the process that runs a Data Transformation service from the repository.
ExelSrc
Physical data object that determines which Microsoft Excel files to process. The input type is command. The
command lists all the Excel files in a directory. The Data Integration Service passes the name of each
Microsoft Excel file in the directory to the Custom Data transformation.
The following image shows the runtime properties for the physical data object:
The list.bat command lists all .xls files in the NewComp directory.
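@echo off
rem Print the full path of every .xls file in C:\NewComp, one path per line.
rem "delims=" keeps file names that contain spaces intact.
for /f "delims=" %%a in ('dir /b C:\NewComp\*.xls') do echo C:\NewComp\%%a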
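To check what the command source is expected to emit, the same listing can be sketched in Python. This is an illustration only and is not part of the Informatica configuration; the function name list_excel_files is hypothetical.

```python
from pathlib import Path


def list_excel_files(directory):
    """Return the full path of every .xls file in `directory`, sorted by path."""
    folder = Path(directory)
    # Compare the suffix case-insensitively so that EMP.XLS is also matched.
    return sorted(
        str(p) for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".xls"
    )
```

Calling list_excel_files(r"C:\NewComp") and printing each entry on its own line should give the same kind of output as list.bat: one full file path per line, which the Data Integration Service reads as input rows.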
CDT_Employees
The Custom Data transformation receives the Microsoft Excel file name in the InputFileName port. The Custom
Data transformation passes the Data Transformation Engine the name of a Data Transformation service to run and
the Excel file name to process. The Data Transformation Engine opens the Microsoft Excel file, parses the data,
and returns XML to the Custom Data transformation. The Custom Data transformation passes rows of data to the
EMPL logical data object.
EMPL
The EMPL logical data object receives rows of employee data from the Custom Data transformation.
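The flow described for CDT_Employees, in which the service returns XML and the transformation passes rows downstream, can be pictured with a short Python sketch. The element names (Employees, Employee, Name, Dept) and the sample values are hypothetical; the real structure is defined by the XML schema exported from the Developer tool, and this sketch does not represent how the Custom Data transformation works internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML standing in for what a parser service might return.
# The real element names come from the exported XML schema.
SAMPLE_XML = """
<Employees>
  <Employee><Name>Ann Lee</Name><Dept>Sales</Dept></Employee>
  <Employee><Name>Raj Patel</Name><Dept>Finance</Dept></Employee>
</Employees>
"""


def xml_to_rows(xml_text, record_tag="Employee"):
    """Flatten each <record_tag> element into a dict keyed by child tag."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for record in root.iter(record_tag):
        # Each child element becomes one column; an empty element yields None.
        rows.append({child.tag: child.text for child in record})
    return rows


for row in xml_to_rows(SAMPLE_XML):
    print(row)
```

Each dictionary corresponds to one row that a relational target such as the EMPL logical data object would receive.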
Excel Files
The employee data is in multiple Microsoft Excel files. Each Excel file contains a heading row and a row for each
employee.
The following table shows the first seven rows of employee data in a spreadsheet:
The following table shows the configuration:
Attribute         Description
Location          The folder and the project location in the Model repository.
Create Options    Create As Empty. Do not generate ports from a service. Manually define the output ports on the Custom Data transformation Structure view.
Input Type        File. The input is a file that contains the path to EMP.xls.
The output data is optional in the ports unless Not Null is enabled.
The Developer Studio prompts for a schema to define the output data structure. Browse for the
Employees_Tables_Schema.xsd that you exported from the Developer tool.
The Developer Studio prompts for sample data for the project:
Browse for one of the Excel spreadsheets of employee data and import it into the project. The Developer Studio shows the
sample data.
After you configure the project, you can run it in the Developer tool. View the output.xml file to verify that the data the service
returns to the Custom Data transformation is correct.
The following figure shows the output.xml results in the Data Transformation Studio:
Deploy the Project
Deploy the Data Transformation project to the Data Transformation repository that is on the same machine as the Developer
tool. The project becomes a runnable service.
The Developer tool retrieves available service names from the Data Transformation repository. The service name for the
Excel Parser project is Emp_Excel_2:
Run the Data Viewer for the Custom Data transformation. The Data Transformation Engine runs the Emp_Excel_2 service in
the local Data Transformation repository. The service parses each Excel spreadsheet and returns XML to the Data
Integration Service. The Custom Data transformation returns row data in the Data Viewer.
Deploy the Application and the Service
Create a data service with the EMPL logical data object. Define the names for the virtual table and schema. Create an
application for the data service and deploy the application to a Data Integration Service. You must also deploy the Data
Transformation service to the Data Transformation repository on the same machine that runs the Data Integration Service.
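Once the application is deployed, the virtual table can be queried with SQL. The sketch below only illustrates the kind of query this makes possible: an in-memory SQLite table stands in for the virtual table, and the table name, column names, and sample rows are all hypothetical. It does not connect to a Data Integration Service.

```python
import sqlite3

# Stand-in for the deployed virtual table: an in-memory SQLite table.
# Table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employees (name, dept) VALUES (?, ?)",
    [("Ann Lee", "Sales"), ("Raj Patel", "Finance"), ("Mia Chen", "Sales")],
)


def employees_in_dept(connection, dept):
    """Return the names of employees in `dept`, ordered alphabetically."""
    cursor = connection.execute(
        "SELECT name FROM employees WHERE dept = ? ORDER BY name", (dept,)
    )
    return [name for (name,) in cursor]


print(employees_in_dept(conn, "Sales"))
```

Against a real deployment, the equivalent query would run through whichever SQL client connects to the data service, using the virtual table and schema names defined when the data service was created.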
Author
Ellen Chandler
Principal Technical Writer
Acknowledgements
The author would like to acknowledge Thiagu Sundaramurthy for his help with this article.