
Data Masking using Enterprise Manager
Managing Sensitive Information in Non-Production Environments
Ofir Manor, Senior Technology Specialist, Oracle

ofir.manor@oracle.com

Agenda
- Introduction
- Data Masking Overview
- Data Masking Examples
- Related EM technology


Securing Production Environment


In recent years, increasing attention has been given to securing the production environment:
- Regulatory requirements (you know the list)
- Internet access everywhere (customers, partners)
- Increasing threats
- Increasing awareness of inside and outside threats

Oracle Database has a lot of functionality for this. For example:
- Authentication: ASO (Advanced Security Option)
- Network traffic encryption: ASO
- Data-at-rest encryption: ASO's Transparent Data Encryption, Oracle Secure Backup
- Access control: privileges, roles, VPD, Label Security
- Auditing: regular audit, Fine-Grained Audit, Oracle Audit Vault
- Limiting super users: Oracle Database Vault

What About Other Environments?


Important systems -> many environments
- Pre-prod, test, dev, training
- Usually more than one of each type
- Sensitive information all over the place

QA / dev can usually do anything in these environments.
DBAs / sys admins can usually do anything in these environments.
Sometimes partners have full access to these environments (consultants, outsourced dev / testing / monitoring, etc.).
Are these environments audited? Do you practice careful access control?

What Can Be Done?


There are two options:
1. Invest heavily in securing all your database environments
   - Adds IT administrative overhead: auditing, privilege management, etc.
   - Annoys QA / dev
   - Not fun
   - Will always be a lower priority
   - Might be neglected or worked around over time

2. Make sure no sensitive data reaches these environments
   - Mask the data while provisioning these environments
   - Sensitive data cannot leak if it's not there
   - An elegant, compliant solution


What is data masking?


What: The act of anonymizing customer, financial, or company confidential data to create new, legible data that retains the data's properties, such as its width, type, and format.
Why:
- To protect confidential data in test environments when the data is used by developers or offshore vendors
- When customer data is shared with third parties without revealing personally identifiable information

Before masking:
LAST_NAME   SSN          SALARY
AGUILAR     203-33-3234  40,000
BENSON      323-22-2943  60,000
DSOUZA      989-22-2403  80,000
FIORANO     093-44-3823  45,000

After masking:
LAST_NAME   SSN          SALARY
ANSKEKSL    111-23-1111  40,000
BKJHHEIEDK  111-34-1345  60,000
KDDEHLHESA  111-97-2749  80,000
FPENZXIEK   111-49-3849  45,000
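
To make the idea of "retaining type and format" concrete, here is a minimal Python sketch based on the sample data above. It is not how the Data Masking Pack works internally; the function name mask_preserving_format is hypothetical, and this particular sketch also keeps the exact width of each value, which is only one possible choice.

```python
import random
import string

def mask_preserving_format(value: str, rng: random.Random) -> str:
    """Replace each letter or digit with a random one of the same class."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            letters = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(letters))
        else:
            out.append(ch)  # keep separators such as '-' so the layout survives
    return "".join(out)

rng = random.Random(42)  # fixed seed only so the demo is repeatable
row = {"LAST_NAME": "AGUILAR", "SSN": "203-33-3234", "SALARY": "40,000"}
masked = {
    "LAST_NAME": mask_preserving_format(row["LAST_NAME"], rng),
    "SSN": mask_preserving_format(row["SSN"], rng),
    "SALARY": row["SALARY"],  # left unmasked, as in the sample data
}
print(masked)
```

The masked row still looks like a name and an SSN to the application, but no longer carries the real values.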

Enterprise Manager Data Masking Pack


[Diagram: Production is cloned to Staging, and Staging is cloned to the Test databases]

Major features
- Data mask format library
- Define once; execute multiple times
- View sample data before masking
- Automatic database referential integrity when masking primary keys
   - Implicit: database enforced
   - Explicit: application enforced
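
Referential integrity when masking a primary key comes down to applying the same old-to-new mapping to the parent column and to every dependent column. The following Python sketch illustrates that idea only; the customers / orders tables and the build_key_map helper are hypothetical, and the real product does this inside the database.

```python
import random

def build_key_map(keys, rng):
    """Map every distinct original key to a distinct random replacement."""
    distinct = sorted(set(keys))
    replacements = rng.sample(range(100000, 1000000), len(distinct))
    return dict(zip(distinct, replacements))

# Hypothetical parent and child tables sharing the key cust_id.
customers = [{"cust_id": 1, "name": "AGUILAR"}, {"cust_id": 2, "name": "BENSON"}]
orders = [
    {"order_id": 10, "cust_id": 1},
    {"order_id": 11, "cust_id": 2},
    {"order_id": 12, "cust_id": 1},
]

rng = random.Random(7)  # fixed seed only so the demo is repeatable
key_map = build_key_map([c["cust_id"] for c in customers], rng)

# The same map is applied to the primary key and to the foreign key.
masked_customers = [{**c, "cust_id": key_map[c["cust_id"]]} for c in customers]
masked_orders = [{**o, "cust_id": key_map[o["cust_id"]]} for o in orders]
```

For an application-enforced relationship the mechanics are the same; the difference is that the dependent column has to be named explicitly because no constraint declares it.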

Installed as part of Oracle Enterprise Manager (Grid Control) 10g Release 4 (10.2.0.4)

Format Libraries
Mask Primitives
- Random Number
- Random String
- Random Date within range
- Shuffle
- Substring of original value
- Table Column
- User Defined Function

- National Identifiers
- Social Security Numbers
- Credit Card Numbers
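
As a rough illustration of what some of these primitives do, the sketch below implements a few of them in Python and composes two of them into an SSN-like format. All function names are hypothetical and the composition rule (fixed 111 prefix plus random digits) is an assumption taken from the sample data, not the product's predefined Social Security Number format.

```python
import random
import string
from datetime import date, timedelta

def random_number(rng, low, high):
    """Random Number: an integer in the inclusive range [low, high]."""
    return rng.randint(low, high)

def random_string(rng, min_len, max_len):
    """Random String: uppercase letters, length between min_len and max_len."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_uppercase) for _ in range(length))

def random_date(rng, start, end):
    """Random Date within range: a date between start and end, inclusive."""
    span_days = (end - start).days
    return start + timedelta(days=rng.randint(0, span_days))

def substring(value, start, length):
    """Substring of original value: keep only part of the real data."""
    return value[start:start + length]

def ssn_like(rng):
    """A composed format: constant prefix followed by two random numbers."""
    return f"111-{random_number(rng, 0, 99):02d}-{random_number(rng, 0, 9999):04d}"

rng = random.Random(3)  # fixed seed only so the demo is repeatable
print(random_string(rng, 6, 10))
print(random_date(rng, date(1950, 1, 1), date(1990, 12, 31)))
print(substring("203-33-3234", 7, 4))
print(ssn_like(rng))
```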

Example: Create a New Format

User-defined mask formats


Email notification testing

Masking Definitions
Associates formats with database
- Maps formats to table columns being masked
- Defines dependent columns
- Associated database target

- Automatically identifies foreign key relationships
- Can specify undeclared constraints as related columns
- Import from or export to XML
- "Create Like" to apply to similar databases

Referential Integrity Enforcement

Database-enforced

Application-enforced

Pre-Masking Validation
- Ensure uniqueness can be maintained
- Ensure formats match column data types
- Check space availability
- Warn about check constraints
- Check presence of default partitions
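
Two of these checks are simple arithmetic, sketched below in Python. The function names and the example numbers are hypothetical; the point is only that a format must offer at least as many distinct outputs as the column has distinct values, and that its longest output must fit the column.

```python
def can_stay_unique(distinct_values: int, low: int, high: int) -> bool:
    """A random-number format over [low, high] has high - low + 1 possible outputs."""
    return (high - low + 1) >= distinct_values

def fits_column(max_output_len: int, column_width: int) -> bool:
    """The longest value the format can produce must not overflow the column."""
    return max_output_len <= column_width

# A unique column with 20,000 rows cannot be masked with a 4-digit random number.
print(can_stay_unique(20000, 0, 9999))   # False
print(can_stay_unique(5000, 0, 9999))    # True
# An 11-character SSN-style format overflows a 9-character column.
print(fits_column(11, 9))                # False
```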

Masking Workflow
Security Admin: Identify Sensitive Information -> Identify Data Formats -> Format Library -> Masking Definition
DBA: Clone Prod to Staging -> Review Mask Definition -> Execute Mask -> Clone Staging to Test
Databases: Prod -> Staging -> Test

Performance
Optimizations
- SQL parallelism for tables > 1 million rows
- Statistics collection before and after masking
- CTAS statement with NOLOGGING

Test results
- Case 1: 60 GB database, 100 tables, 215 columns: 20 minutes
- Case 2: 6-column, 100-million-row table, Random Number: 1.3 hours

Data Masking Pack feature details


Data Masking primitives
- Random numbers
- Random digits
- Random strings
- Random date
- User defined function (PL/SQL)
- Exportable and importable format definition (XML-based)

Validation
- Mask validation with data type
- Data overflow validation
- Multiple parent FKs, circular dependency, constraints
- Automatic exclusion of CLOB, BLOB, NCLOB, LONG, LONG RAW, XML column types
- Imported mask definition validated against database schema
- Space availability check

Masking algorithms
- Unique value generation
- Shuffle
- Constant

Mask definition
- Association of masking formats with application schema
- Related application columns without defined constraints in data dictionary
- Exportable and importable XML mask definitions
- "Create Like" to apply mask definition to other databases

Efficiency
- One bulk operation per table regardless of number of masked columns
- CTAS to recreate masked table
- Leverage database features, e.g. parallelism, no logging
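
The "one bulk operation per table" and CTAS points can be illustrated with a small, self-contained analogue. The sketch uses Python's built-in sqlite3 module rather than Oracle, so there is no parallelism or NOLOGGING here; the emp table, its columns, and the mask_name / mask_ssn helpers are all hypothetical.

```python
import random
import sqlite3
import string

rng = random.Random(1)  # fixed seed only so the demo is repeatable

def mask_name(original):
    # Hypothetical format: random uppercase string of the same length.
    return "".join(rng.choice(string.ascii_uppercase) for _ in original)

def mask_ssn(_original):
    # Hypothetical format: fixed 111 prefix plus random digits.
    return f"111-{rng.randint(0, 99):02d}-{rng.randint(0, 9999):04d}"

conn = sqlite3.connect(":memory:")
conn.create_function("mask_name", 1, mask_name)
conn.create_function("mask_ssn", 1, mask_ssn)

conn.execute("CREATE TABLE emp (last_name TEXT, ssn TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("AGUILAR", "203-33-3234", 40000), ("BENSON", "323-22-2943", 60000)],
)

# One pass over the table masks every sensitive column at once (CTAS),
# then the masked copy replaces the original table.
conn.execute(
    "CREATE TABLE emp_masked AS "
    "SELECT mask_name(last_name) AS last_name, mask_ssn(ssn) AS ssn, salary FROM emp"
)
conn.execute("DROP TABLE emp")
conn.execute("ALTER TABLE emp_masked RENAME TO emp")
conn.commit()

rows = conn.execute("SELECT last_name, ssn, salary FROM emp ORDER BY salary").fetchall()
print(rows)
```

Recreating the table in a single statement avoids one update pass per masked column, which is the efficiency argument made above.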


Handling First / Last Name


Using Shuffle
- Useful if first name and last name are in different columns
- Preserves real values and real data distribution
- Bigger data sets minimize leak risk
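
A minimal Python sketch of the shuffle idea, assuming first and last names live in separate columns: each column is permuted independently, so every real value survives but the pairings are broken. The shuffle_column helper and the sample rows are hypothetical.

```python
import random

def shuffle_column(rows, column, rng):
    """Return new rows with the values of one column randomly permuted."""
    values = [r[column] for r in rows]
    rng.shuffle(values)
    return [{**r, column: v} for r, v in zip(rows, values)]

people = [
    {"first": "Dana", "last": "Aguilar"},
    {"first": "Omar", "last": "Benson"},
    {"first": "Ruth", "last": "Dsouza"},
    {"first": "Noam", "last": "Fiorano"},
]

rng = random.Random(11)  # fixed seed only so the demo is repeatable
masked = shuffle_column(people, "first", rng)
masked = shuffle_column(masked, "last", rng)
print(masked)
```

With only four rows a shuffle can easily leave some pairs intact, which is why bigger data sets reduce the leak risk.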

Using Random Strings
- Really random
- Not real names, different data distribution

Using a table-based lookup
- Example: fakenamegenerator.com
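
The lookup approach can be sketched as follows: real names are replaced by values drawn from a separate table of plausible fake names. The list FAKE_LAST_NAMES and the helper mask_with_lookup are hypothetical; in practice the lookup table would be loaded from a generated data set. Reusing the same replacement for repeated originals is a design choice of this sketch, not something the slides specify.

```python
import random

# Hypothetical lookup table of realistic-looking replacement names.
FAKE_LAST_NAMES = ["Rivers", "Caldwell", "Okafor", "Lindqvist", "Moreno"]

def mask_with_lookup(values, lookup, rng):
    """Replace each value with a random lookup entry, consistently per original."""
    mapping = {}
    masked = []
    for v in values:
        if v not in mapping:
            mapping[v] = rng.choice(lookup)
        masked.append(mapping[v])
    return masked

rng = random.Random(5)  # fixed seed only so the demo is repeatable
originals = ["AGUILAR", "BENSON", "AGUILAR", "FIORANO"]
masked_names = mask_with_lookup(originals, FAKE_LAST_NAMES, rng)
print(masked_names)
```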

ID Number
- Israeli ID Number uses a check digit
- IsraCard, Mastercard, etc. also use some kind of check digit

The check digit protects against:
- A single-digit error
- Two adjacent digits swapped

The algorithm is well documented, and it is easy to write a function that implements it.
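
As a sketch of such a function, the Python code below computes and verifies the check digit of a 9-digit Israeli ID number using the commonly documented scheme: weights 1,2,1,2,... from the left, products above 9 reduced by 9, and a total that must be divisible by 10. In the Data Masking Pack this logic would be written as a PL/SQL user-defined function; the Python function names here are hypothetical.

```python
import random

def israeli_id_check_digit(first8: str) -> int:
    """Compute the ninth (check) digit for the first eight digits of an ID."""
    if len(first8) != 8 or not (first8.isascii() and first8.isdigit()):
        raise ValueError("expected exactly 8 digits")
    total = 0
    for i, ch in enumerate(first8):
        product = int(ch) * (1 if i % 2 == 0 else 2)
        total += product if product < 10 else product - 9  # same as summing its digits
    return (10 - total % 10) % 10

def is_valid_israeli_id(id_number: str) -> bool:
    """Validate a 9-digit ID (shorter inputs are left-padded with zeros)."""
    id9 = id_number.zfill(9)
    if len(id9) != 9 or not (id9.isascii() and id9.isdigit()):
        return False
    return israeli_id_check_digit(id9[:8]) == int(id9[8])

def random_valid_id(rng: random.Random) -> str:
    """Generate a random masked ID that still passes check-digit validation."""
    body = f"{rng.randint(0, 99999999):08d}"
    return body + str(israeli_id_check_digit(body))

print(is_valid_israeli_id("123456782"))  # True
print(is_valid_israeli_id("123456783"))  # False: single-digit error
print(is_valid_israeli_id("213456782"))  # False: first two digits swapped
```

Generating the check digit as part of the mask keeps masked IDs acceptable to applications that validate their input.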

ID Number Algorithm


Israeli ID Number Algorithm

Israeli ID Number - Format
