
Oracle 10g Data Pump

2006/02/23 Say goodbye to exp and imp (or not)!

Simon Pane
Agenda

 Data Pump Overview

 Using Data Pump

 Demonstration

 Data Pump Test Cases

2
Data Pump Overview

3
What is Data Pump?

 A replacement of the traditional export/import utilities?

 The evolution of the traditional export/import utilities?

 A completely new 10g utility serving a similar yet slightly different purpose?

4
Other Options for Moving Data

 Traditional Export and Import
Pros
 Easy to use – most DBAs have years of experience using these utilities
 Versatile – various options available; can specify what to include
 Platform independent
 Serial output (can be written to tape or a pipe)
Cons
 Comparatively slow
 Can be network intensive
 Not interruptible / resumable
 Limited filtering options (for example, cannot exclude just VIEWS)
 Limited remapping options (e.g., cannot remap one tablespace to another)

5
Other Options for Moving Data

 Transportable Tablespaces
Pros
 Undoubtedly the fastest way to move data
 Can use the traditional exp/imp or Data Pump to move meta-data
 Cross-platform support if the platform byte-order is the same
Cons
 Tablespaces must be made read-only
 Not selective (must move the entire tablespace)
 Flashback is not possible (tablespace is read only when copied)
 No physical reorganization is performed
 Datafile sizes remain constant
 Must use RMAN to convert the datafile if migrating to a platform with
a different byte-order (check V$TRANSPORTABLE_PLATFORM)

6
Other Options Used Less Frequently

 Extraction to a flat file and loading using SQL*Loader
 Direct copy using database links (SQL*Plus COPY command)
 Oracle Streams
 3rd party data ETL or reorg tools

7
Top 10 Reasons to Love Data Pump

1. Similar look and feel to the old exp/imp
2. Can filter on the full range of object types
3. Can re-map datafiles and/or tablespaces on import
4. Estimates the export file size (space needed)
5. Parallelizable
6. Significantly faster than the traditional exp/imp
7. PL/SQL interface – programmable
8. A file is not actually required – can import through a network link (see the sketch after this list)
9. Can track progress in v$session_longops
10. Resumable (interruptible and restartable)
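
A sketch of point 8: with a database link pointing at the source database (the link name SRC_DB is an assumption for illustration, not something from this deck), an import can pull the data directly using the NETWORK_LINK parameter, with no intermediate dump file:

 impdp system/oracle@ORA1020 network_link=SRC_DB
 schemas=SCOTT directory=dpump_demo logfile=scott_netimp.log

A DIRECTORY is still needed for the log file even though no dump file is written.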
8
Top 10 Reasons Not to Love Data Pump

1. Still generates redo (unlike direct path inserts)
2. Aggregation of exported data is not possible (sort only)
3. Processing load is placed on the database server
4. Harder to tell what it's doing at any given time
5. No equivalent to the STATISTICS option
6. Cannot be used with sequential media such as tapes and pipes (not read/written serially)
7. Only accesses files on the server, never the client
8. Oracle directories are required in the DB to access the files
9. Does not support COMMIT on imp or CONSISTENT on exp
10. If constraints are violated on import, the load is discontinued
9
Operation Fundamentals

 Export/Import
These utilities connect to the Oracle database via Oracle Net and run queries or DDL/DML
Processing of the returned results and the I/O operations is done on the client
 Data Pump
The executables call PL/SQL APIs
Therefore processing is done on the database server
This can be an advantage or a disadvantage depending on the situation
"Self-tuning": no longer any need for the BUFFER or RECORDLENGTH parameters

10
Export Operation

[Diagram: the exp.exe client connects across the network to the Oracle database; the export file(s) are written on the client machine]
11
Data Pump Export Operation

[Diagram: the expdp.exe client connects across the network to the Oracle database; the export file(s) are written on the database server]
12
Key Differences

 Dump and log files are on the server, not the client
 Must have a DIRECTORY created in the Oracle
database for I/O
Permissions are granted to the userid connecting to the instance, not to the schemas being exported or imported
 Canceling the client process does not stop the job
 Doesn’t automatically overwrite dump file if it already
exists – returns an error instead
 Parameters (command line) are reported in the log file
 Objects are exported in order of table size (descending) instead of alphabetically

13
Multiple Interfaces

1. Command line utilities expdp and impdp
 Similar to the familiar exp and imp in usage
 Use HELP=Y for a list of commands
 Oracle documentation provides a comparison table to exp/imp
2. Enterprise Manager
3. PL/SQL
 Can be used independently but is more difficult

 All of these call the DBMS_DATAPUMP API (a minimal sketch follows)
 Uses Oracle Advanced Queuing
 Uses DBMS_METADATA
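
A minimal PL/SQL sketch of driving DBMS_DATAPUMP directly; the job name, file names and the DPUMP_DEMO directory (created later in this presentation) are illustrative assumptions:

DECLARE
  h  NUMBER;
  js VARCHAR2(30);
BEGIN
  -- define a schema-mode export job
  h := dbms_datapump.open(operation => 'EXPORT',
                          job_mode  => 'SCHEMA',
                          job_name  => 'SCOTT_API_EXP');
  -- dump file and log file are written to the directory object
  dbms_datapump.add_file(h, 'scott_api.dmp', 'DPUMP_DEMO');
  dbms_datapump.add_file(h, 'scott_api.log', 'DPUMP_DEMO',
                         filetype => dbms_datapump.ku$_file_type_log_file);
  -- limit the job to the SCOTT schema
  dbms_datapump.metadata_filter(h, 'SCHEMA_EXPR', 'IN (''SCOTT'')');
  dbms_datapump.start_job(h);
  dbms_datapump.wait_for_job(h, js);
  -- set serveroutput on to see the final job state
  dbms_output.put_line('Job completed with state: ' || js);
END;
/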

14
Unload Mechanisms

 Data Pump automatically chooses to unload data using either:
Direct path
External Tables (new access driver called ORACLE_DATAPUMP – see the sketch after this list)

 Same "External Tables" mechanism that was introduced in Oracle9i
 When will it use External Tables?
When parallelism can be used
When the table contains a complex data type or structure that prevents direct path unloads
Many tables fall into this category – see the Oracle documentation for a complete list
 It doesn't really matter to us which method is used
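
The same ORACLE_DATAPUMP access driver can also be used by hand through an external table; a minimal sketch, with the table and file names assumed for illustration:

 SQL> create table emp_unload
      organization external (
        type ORACLE_DATAPUMP
        default directory dpump_demo
        location ('emp_unload.dmp')
      )
      as select * from scott.emp;

The resulting file is in Data Pump format and can be read back on another database by creating an external table that points at it.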
15
Multiple Processes

 Master Control Process


Spawns worker processes
Populates the master control table and log file
The master control table can be queried to track the job's progress
At the end of an export, the master control table is written to
the dump file and dropped from the database
 Worker Processes
Perform the actual loading/unloading
Number of processes depends on the degree of parallelism
(the PARALLEL option – see the example below)
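
A sketch of a parallel export, with illustrative file and log names; the %U substitution variable lets each worker write to its own dump file:

 expdp system/oracle@ORA1020 schemas=SCOTT
   directory=dpump_demo dumpfile=scott_%U.dmp
   parallel=4 logfile=scott_par.log

The PARALLEL value should generally not exceed the number of dump files available.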

16
Detaching and Re-Attaching

 Issuing "Ctrl-C" from the Data Pump export or import client will detach it
The job is running on the server so it will continue
Brings you into "interactive-command" mode

 To re-attach, run expdp or impdp with the ATTACH= option
Example: impdp userid=system/oracle attach=JOB_01
Brings you back into "interactive-command" mode

17
New Views

 DBA_DATAPUMP_JOBS and USER_DATAPUMP_JOBS
Identify all jobs regardless of their state
Identify any master tables not associated with an active job

 DBA_DATAPUMP_SESSIONS
Identify user sessions that are attached to a job (sample queries below)

 Data pump sessions populate v$session_longops


Documentation says that it is 100% accurate for imports but
testing proves otherwise!!!
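
A sketch of the kind of monitoring these views allow (standard dictionary columns, trimmed to fit the slide):

 SQL> select owner_name, job_name, operation, job_mode, state
      from dba_datapump_jobs;

 SQL> select sid, serial#, opname, sofar, totalwork
      from v$session_longops
      where opname like 'SYS_EXPORT%' and sofar <> totalwork;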

18
Security Considerations

 Still uses the EXP_FULL_DATABASE and IMP_FULL_DATABASE roles
 A privileged user will have these two roles
 A privileged user can:
Export/import objects owned by other schemas
Export non-schema objects (metadata)
Attach to, monitor, and control jobs initiated by others
Perform schema, datafile, and tablespace remapping

 Similar to the traditional export/import


 Supports label security
If the exporting user has the EXEMPT ACCESS POLICY privilege

19
Object Statistics

 From Oracle documentation regarding data pump


exports:
“A parameter comparable to STATISTICS is not needed.
Statistics are always saved for tables.”
 From Oracle documentation regarding data pump
imports:
“A parameter comparable to STATISTICS is not needed. If the
source table has statistics, they are imported.”

20
Other Random Points

 Can still use a parameter file via the PARFILE command line option (sample below)
 Fully supports Automatic Storage Management (ASM)
 Can still flashback to a specified time or SCN
 Can still extract (or back up) DDL (metadata)
Using the SQLFILE option instead of the traditional INDEXFILE or SHOW options
 Full support for LOBs
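
A sample parameter file and its invocation; the file name and contents are illustrative, reusing options shown elsewhere in this presentation:

 # scott_exp.par
 directory=dpump_demo
 dumpfile=scott_20060223_%U.dmp
 logfile=scott_20060223.log
 schemas=SCOTT
 exclude=INDEX:"LIKE 'EMP%'"
 filesize=500M

 expdp system/oracle@ORA1020 parfile=scott_exp.par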

21
Using Data Pump

22
Oracle Directory Objects

 Must first create an Oracle directory object and give the user who will be performing the Data Pump activities permission to use it (or rely on defaults):

SQL> create or replace directory dpump_demo as 'C:\temp';

Directory created.

SQL> grant read,write on directory dpump_demo to simon;

Grant succeeded.

SQL>
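
A quick way to confirm the directory and grant are in place (a sketch using the standard dictionary views):

SQL> select directory_name, directory_path
     from dba_directories
     where directory_name = 'DPUMP_DEMO';

SQL> select grantee, privilege
     from dba_tab_privs
     where table_name = 'DPUMP_DEMO';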

23
Key Data Pump Export Parameters

 CONTENT={ALL | DATA_ONLY | METADATA_ONLY}


 DIRECTORY=directory_object (default=DATA_PUMP_DIR)
 DUMPFILE=[directory_object:]file_name [,…]
 ESTIMATE={BLOCKS | STATISTICS}
 ESTIMATE_ONLY={Y | N}
 EXCLUDE=object_type[:name_clause] [,…]
 FILESIZE=integer[B | K | M | G] (default=unlimited)
 FLASHBACK_SCN=scn_value
 FLASHBACK_TIME=“TO_TIMESTAMP(time_value)”
 FULL={Y | N}
 INCLUDE=object_type[:name_clause] [,…]
 JOBNAME=jobname_string

24
Key Data Pump Export Parameters

 LOGFILE=[directory_object:]file_name
 NOLOGFILE={Y | N}
 PARALLEL=integer (default=1)
 QUERY=[schema.][table_name:]query_clause
 SCHEMAS=schema_name [,…]
 TABLES=[schema_name.]table_name[:partition_name] [,…]
 TABLESPACES=tablespace_name [,…]

25
Data Pump Export Parameter Samples

 Multiple dump files using a substitution variable (%U):
DUMPFILE=DP_DIR1:SCOTT_20060223_%U.dmp

 Excluding indexes that start with "EMP":
EXCLUDE=INDEX:"LIKE 'EMP%'"

 Excluding the SCOTT schema from a FULL export:
EXCLUDE=SCHEMA:"='SCOTT'"

 Mimicking the traditional CONSISTENT parameter:
FLASHBACK_TIME="TO_TIMESTAMP(time_value)"

 Exporting only TABLES, FUNCTIONS and VIEWS:
INCLUDE=TABLE,FUNCTION,VIEW

 Using a query clause:
QUERY=emp:'"WHERE salary > 100000"'
26
Key Data Pump Import Parameters

 CONTENT={ALL | DATA_ONLY | METADATA_ONLY}


 DIRECTORY=directory_object (default=DATA_PUMP_DIR)
 DUMPFILE=[directory_object:]file_name [,…]
 EXCLUDE=object_type[:name_clause] [,…]
 FULL={Y | N}
 INCLUDE=object_type[:name_clause] [,…]
 JOBNAME=jobname_string
 LOGFILE=[directory_object:]file_name
 NOLOGFILE={Y | N}
 PARALLEL=integer (default=1)
 QUERY=[schema.][table_name:]query_clause

27
Key Data Pump Import Parameters

 REMAP_DATAFILE=source_datafile:target_datafile
 REMAP_SCHEMA=source_schema:target_schema
 REMAP_TABLESPACE=source_tablespace:target_tablespace
 REUSE_DATAFILES={Y | N}
 SCHEMAS=schema_name [,…]
 SKIP_UNUSABLE_INDEXES={Y | N}
 SQLFILE=[directory_object:]file_name
 TABLE_EXISTS_ACTION={SKIP|APPEND|TRUNCATE|REPLACE}
 TABLES=[schema_name.]table_name[:partition_name] [,…]
 TABLESPACES=tablespace_name [,…]

28
Interactive Mode Commands

 ADD_FILE=[directory_object:]file_name [,…]
 CONTINUE_CLIENT
 EXIT_CLIENT
 FILESIZE=number
 KILL_JOB
 PARALLEL=integer
 START_JOB
 STATUS
 STOP_JOB

29
Demonstration

30
Exporting and Importing Sample Schemas

 expdp system/oracle@ORA1020
dumpfile=scott.dmp schemas=scott

 impdp system/oracle@ORA1020
dumpfile=scott.dmp schemas=SCOTT
remap_schema=SCOTT:LARRY

 expdp system/oracle@ORA1020
dumpfile=larry.dmp schemas=larry

 SELECT * FROM DBA_DATAPUMP_JOBS;

31
Using Interactive Mode

 Ctrl-C to detach from the current export

 Export> status
 Export> stop_job

 expdp system/oracle@ORA1020
attach=SYS_EXPORT_SCHEMA_01

 Export> start_job
 Export> exit_client

32
Data Pump Test Cases

33
Test Scenario #1

 Generated sample SIEBEL data


Brushed off the dust on some old SIEBEL data population
scripts (circa 07/2000)
Designed for SIEBEL 6 on Oracle 8.1.6
Actual data is not important

 Schema objects
Tables: 218 (many empty tables)
Indexes: 1180 (SIEBEL is a heavily indexed application)

 Schema size (from DBA_SEGMENTS)


Tables: 1255MB
Indexes: 148MB

34
Export Performance Test Criteria

 All work performed on laptop C681


 SGA remains constant
SGA_TARGET=0
SGA_MAX_SIZE=256MB
BUFFER CACHE=152MB
SHARED_POOL=60MB

 Mostly default parameters


 Not monitoring CPU utilization
 Performed 4 runs
Disregarded results from 1st run and averaged the other 3

35
Export Scripts

 EXP
 sqlplus -s system/oracle @timestamp.sql > exp.log
 exp.exe userid=system/oracle@ORA1020 file=SIEBEL.dmp
log=SIEBEL.log owner='SIEBEL'
 sqlplus -s system/oracle @timestamp.sql >> exp.log

 EXPDP
 erase SIEBEL.dmp
 sqlplus -s system/oracle @timestamp.sql > expdp.log
 expdp.exe userid=system/oracle@ORA1020
dumpfile=SIEBEL.dmp logfile=SIEBEL.log
schemas='SIEBEL' directory=test_dir
 sqlplus -s system/oracle @timestamp.sql >> expdp.log
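
The timestamp.sql script used above is not included in the deck; a minimal sketch of what it presumably contains (an assumption) is:

 -- timestamp.sql: print the current date/time so elapsed times can be calculated
 set heading off feedback off
 select to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS') from dual;
 exit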

36
Export Performance Test Results

                      exp       expdp
Average Export Time   6:02      3:18
Estimated File Size   N/A       1.201 GB
Actual File Size      965 MB    621 MB
37
Import Scripts

 IMP
 sqlplus -s system/oracle @timestamp.sql > exp.log
 imp.exe userid=system/oracle@ORA1020 file=SIEBEL.dmp
log=SIEBEL.log fromuser='SIEBEL' touser='SCOTT'
commit=y
 sqlplus -s system/oracle @timestamp.sql >> exp.log

 IMPDP
 sqlplus -s system/oracle @timestamp.sql > expdp.log
 impdp.exe userid=system/oracle@ORA1020
dumpfile=SIEBEL.dmp logfile=SIEBEL.log
schemas='SIEBEL' directory=test_dir
remap_schema=SIEBEL:SCOTT
remap_tablespace=TOOLS:SCOTT_DATA
 sqlplus -s system/oracle @timestamp.sql >> expdp.log
38
Import Performance Test Results

                                   imp       impdp
Average Import Time                27:07
Average Import Time (no indexes)   25:19
Average Import Time (no rows)      27:27

 Database was in ARCHIVELOG mode


 Destination tablespace and archived log destination
were both on ASM drives
 Machine performance was degraded much more by the impdp import
 No import tuning performed (only COMMIT=Y)

39
Test Scenario #2

 Data taken from an actual CGI internal application


 Schema objects
Tables: 22
Indexes: 26

 Schema size (from DBA_SEGMENTS)


Tables: 300MB
Indexes: 101MB

40
Export Scripts

 EXP
 sqlplus -s system/oracle @timestamp.sql > exp.log
 exp.exe userid=system/oracle@ORA1020 file=SCOTT.dmp
log=SCOTT.log owner='SCOTT'
 sqlplus -s system/oracle @timestamp.sql >> exp.log

 EXPDP
 erase SCOTT.dmp
 sqlplus -s system/oracle @timestamp.sql > expdp.log
 expdp.exe userid=system/oracle@ORA1020
dumpfile=SCOTT.dmp logfile=SCOTT.log schemas='SCOTT'
directory=test_dir
 sqlplus -s system/oracle @timestamp.sql >> expdp.log

41
Export Performance Test Results

                      exp       expdp
Average Export Time   1:24      1:32
Estimated File Size   N/A       290 MB
Actual File Size      261 MB    233 MB

42
Import Scripts

 IMP
 sqlplus -s system/oracle @timestamp.sql > exp.log
 imp.exe userid=system/oracle@ORA1020 file=SCOTT.dmp
log=SCOTT.log fromuser='SCOTT' touser='LARRY' commit=y
 sqlplus -s system/oracle @timestamp.sql >> exp.log

 IMPDP
 sqlplus -s system/oracle @timestamp.sql > expdp.log
 impdp.exe userid=system/oracle@ORA1020
dumpfile=SCOTT.dmp logfile=SCOTT.log schemas='SCOTT'
directory=test_dir remap_schema=SCOTT:LARRY
remap_tablespace=SCOTT_DATA:LARRY_DATA
 sqlplus -s system/oracle @timestamp.sql >> expdp.log

43
Import Performance Test Results

                      imp       impdp
Average Import Time   5:48      2:26

 Database was in NOARCHIVELOG mode


 Destination tablespace and archived log destination
were both on ASM drives
 No import tuning performed (only COMMIT=Y)

44
Conclusions

45
Conclusions

 Data Pump is an exciting new Oracle 10g tool that provides many benefits over the traditional export and import utilities
 Whether to use Data Pump, Transportable Tablespaces, or even the traditional exp/imp will depend on the situation
 Since the command line interface is easy to use and so similar to the traditional exp/imp, DBAs and developers should spend the time to learn how to use it

 Final thought: since Data Pump dump files and traditional export dump files are not compatible/interchangeable, should a new file extension be used??? (.dmp vs .dpd)

46
The End

Comments, Questions ???

47
