What is Data Validation? Identifying errors in data sets that have been Moved or Transformed to ensure they are Complete and Accurate and meet Expectations or Requirements.
Customers estimate data testing SHOULD take 25-30% of all hours spent on Data Integration
Most customers admit they do not do enough data validation, resulting in poorer data quality and higher project risk
PowerCenter upgrades can take weeks or months to complete because of the manual testing effort
It takes one day to upgrade the ETL software
Data is Identical
ETL version upgrade
ETL migration
Database migration
Application retirement
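A check that "data is identical" after an upgrade or migration can be sketched as a comparison of the rows produced before and after the change. The example below is only an illustration in plain Python, not how Data Validation Option works internally; the function name `diff_rowsets` and the sample rows are hypothetical.

```python
from collections import Counter


def diff_rowsets(before, after):
    """Compare two row sets as multisets, ignoring row order.

    Returns (missing, unexpected): rows present in `before` but not in
    `after`, and rows present in `after` but not in `before`.
    """
    before_counts = Counter(map(tuple, before))
    after_counts = Counter(map(tuple, after))
    missing = before_counts - after_counts      # lost or changed rows
    unexpected = after_counts - before_counts   # new or changed rows
    return sorted(missing.elements()), sorted(unexpected.elements())


# Hypothetical output of the same job before and after an ETL upgrade.
old_output = [(1, "alice", 100.0), (2, "bob", 250.5), (3, "carol", 75.0)]
new_output = [(2, "bob", 250.5), (1, "alice", 100.0), (3, "carol", 80.0)]

missing, unexpected = diff_rowsets(old_output, new_output)
print("missing after upgrade:", missing)
print("unexpected after upgrade:", unexpected)
```

Comparing as multisets means duplicate rows are counted and row order does not matter, which suits loads where ordering is not guaranteed.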
Production Reconciliation
Protect the integrity of data that is loaded into production systems. Erroneous data due to failed loads, faulty logic or operational issues is caught in a proactive automated manner and can be addressed as needed
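As a rough illustration of an automated reconciliation check, the sketch below compares a row count and a column total between a source and a target table. It uses an in-memory SQLite database from the Python standard library; the table names, the column name and the helpers `table_profile` and `reconcile` are assumptions made for the example and are not part of any Informatica product.

```python
import sqlite3


def table_profile(conn, table, amount_column):
    """Return the row count and column total for one table.

    Table and column names are interpolated into the SQL text, so they
    must come from trusted configuration, never from user input.
    """
    count, total = conn.execute(
        f"SELECT COUNT(*), COALESCE(SUM({amount_column}), 0) FROM {table}"
    ).fetchone()
    return {"row_count": count, "total": total}


def reconcile(conn, source, target, amount_column):
    """Return the metrics that differ, as {metric: (source, target)}."""
    src = table_profile(conn, source, amount_column)
    tgt = table_profile(conn, target, amount_column)
    return {k: (src[k], tgt[k]) for k in src if src[k] != tgt[k]}


# Hypothetical source and target tables; the target is missing one row.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE src_orders (id INTEGER PRIMARY KEY, price INTEGER);
    CREATE TABLE tgt_orders (id INTEGER PRIMARY KEY, price INTEGER);
    INSERT INTO src_orders VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO tgt_orders VALUES (1, 10), (2, 20);
    """
)

mismatches = reconcile(conn, "src_orders", "tgt_orders", "price")
print(mismatches or "source and target reconcile")
```

An empty result means the two tables agree on these aggregates; it does not prove the rows are identical, so a check like this is usually paired with row-level comparisons.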
Database Views
V_Summary, V_Tests, V_Results
[Diagram: sample tables with typed columns such as name (string), Price (integer), Salary (float), Date in (date) and Date out (date)]
Reports
Execute Tests
Repository PowerCenter
Enterprise Data
Other
Run from GUI or CLI (DVOCmd)
Built-in reporting
Technology Company Reduced data testing time by 80% with Data Validation Option
SaaS provider of sales compensation and analytics
Data absolutely has to be correct as it affects people's paychecks
Very high visibility of the data with users
Trust in the data is key
THE CHALLENGE
New release roughly every month
One full week of data testing by the QA team per release
Developers wrote SQL for testing the data
Testers would execute the SQL, track errors and work with developers to resolve them
And who was testing the SQL to make sure it was correct?
INFORMATICA ADVANTAGE
With DVO they are able to test hundreds of thousands of rows of data in regression tests
Developers are no longer required to write SQL
Testers are now empowered and independent of developers
RESULTS/BENEFITS
Created a test suite of over 1000 tests
Testers can manage the testing environment
Can test large volumes of data
Testing time reduced from 1 week to 1 day (80% less)
Freed-up time is spent on higher-level tasks
Production Reconciliation
Financial Services Company Ensures DW is Complete and Accurate with Data Validation Option
Good data is essential to good business decisions.
Their calculations of portfolio risk and value must be correct.
The company spends hundreds of millions purchasing troubled debt in the USA.
The data and risk calculations on those assets must be correct.
Bad data could cost them millions and put them out of business.
THE CHALLENGE
Business users were complaining about missing data in the systems.
Data errors can lead to very costly bad business decisions.
They were doing manual testing via developer-written mappings and PL/SQL.
Other products available today could not meet their requirements.
INFORMATICA ADVANTAGE
With DVO they are able to perform detailed reconciliations across source and target systems. With DVO, they have a complete audit trail.
RESULTS/BENEFITS
DVO found where data was missing.
Found thousands of missing records due to bad coding and improperly rerun failed jobs.
Reloaded all missing data in two weeks.
They are looking to implement ongoing incremental validation for all new data loaded into tables.
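Finding records that exist in a source system but never reached the target can be sketched as an anti-join on the business key. The example below is a minimal illustration using SQLite from the Python standard library; the tables, the key column and the helper `missing_keys` are hypothetical and do not represent how DVO performs its reconciliation.

```python
import sqlite3


def missing_keys(conn, source, target, key):
    """Return key values present in `source` but absent from `target`.

    Uses a LEFT JOIN anti-join. Table and column names are interpolated
    into the SQL text, so they must come from trusted configuration.
    """
    sql = (
        f"SELECT s.{key} FROM {source} AS s "
        f"LEFT JOIN {target} AS t ON s.{key} = t.{key} "
        f"WHERE t.{key} IS NULL ORDER BY s.{key}"
    )
    return [row[0] for row in conn.execute(sql)]


# Hypothetical source and warehouse tables; two accounts were never loaded.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE src_accounts (account_id INTEGER PRIMARY KEY, balance REAL);
    CREATE TABLE dw_accounts (account_id INTEGER PRIMARY KEY, balance REAL);
    INSERT INTO src_accounts VALUES (101, 5.0), (102, 7.5), (103, 1.0), (104, 9.0);
    INSERT INTO dw_accounts VALUES (101, 5.0), (103, 1.0);
    """
)

not_loaded = missing_keys(conn, "src_accounts", "dw_accounts", "account_id")
print("accounts missing from the warehouse:", not_loaded)
```

For ongoing incremental validation, the same query could be restricted to rows loaded since the last run, assuming the tables carry a load timestamp or batch identifier.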
Production Reconciliation
Mid-size Technology Company Reconciling MDM data using Data Validation Option
KEY BUSINESS IMPERATIVE AND IT INITIATIVE
Customer and contact hub is pivotal to efficient business operations
Millions of records processed across various systems
Ensure BAs, line managers and customers have access to accurate and complete data based on their needs
THE CHALLENGE
No easy way to reconcile data across systems to identify bad data or the extent of errors
Incorrectly augmented data in systems
Gold record data didn't always match across systems
Faulty records propagated downstream
INFORMATICA ADVANTAGE
DVO reconciled data across systems (e.g. SalesForce and Hub) and found:
1000s of missing records between systems
Incorrectly augmented D&B data
Improperly coded golden records
RESULTS/BENEFITS
Identified errors due to faulty DI logic and the error handling process
Ensured incorrect records are no longer being used in marketing campaigns
Bad customer data no longer reaches customers in the portal