
Best Practices for Interpreting PL/SQL Hierarchical Profiles for Effective Tuning


CON2082
26 October 2015
Martin Buechi, Lead Software Architect

Avaloq | www.avaloq.com
Switzerland, UK, Luxembourg, Germany, France, Singapore, Hong Kong, Philippines, Australia

Performance Is Key
Presenter Sharing Passion for Efficient Development of High-Quality DB Applications
Avaloq: Swiss core banking system vendor
Development started 1993 (Oracle 7), 200 PL/SQL developers
Business logic in 20 M lines of PL/SQL in database
Approach also relevant for small shops with little DB code

Martin Büchi: Architect responsible for Oracle foundation


Oracle PL/SQL Developer of the Year 2009
Other Oracle interests: modularization, ILM, tools
Goals
Share passion. Help you improve your development.
No selling of consulting services or products.

Goals of Performance Profiling


Performance is a key feature of any application

Performance is what users experience:


The response time from the click
to the display of the result on the screen.
"The universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail."
Donald Knuth

Overall End-to-End Process and Profile


Design for performance, then measure performance & tune where applicable
Sample tuning process

1. Focus on most valuable business process and biggest resource consumers.


2. Explore alternatives to achieve same business outcome (requirements, usage).
3. Measure response time
4. Tune what takes most time. List improvement options. Implement best.

End-to-end performance profile


[Figure: response time broken down by tier (browser, network, middle tier, database) on a time axis from 0 ms to 200 ms]

Why Profile PL/SQL?


Often significant potential
Where to look for performance
Level                      Potential       # Presentations at OOW
Algorithm (e.g., PL/SQL)   Very high       Very few
Physical data modelling    Usually lower   Few
SQL                        Usually lower   Very many
Infrastructure, instance   Usually lower   Many, past very many

Answer questions to get actionable information on important business tasks


How long did PL/SQL take?

How can the time be reduced?

Why did it take so long?

Are we done (economically optimal)?



Not profiling PL/SQL?


Suboptimal performance because
not enough PL/SQL or
not optimal PL/SQL.

PL/SQL Profilers
Property                             dbms_profiler   dbms_hprof
Since (and mostly unchanged since)   8.1.5           11gR1
Recording by line                    yes             no
Call sequence recording              no              yes
Memory allocation recording          no              yes
Low performance overhead             no              yes
Recording                            table           file

Start with dbms_hprof, then use dbms_profiler if reporting on line-level required (e.g.,
unclear where time spent inside subprogram, code coverage reporting)

Installation of PL/SQL Hierarchical Profiler


Package
Installed on all DBs: @$ORACLE_HOME/rdbms/admin/catproc.sql
grant execute on SYS.DBMS_HPROF to <user>;
Directory
create or replace directory PLSHPROF_DIR as '/tmp';
grant read, write on directory PLSHPROF_DIR to <user>;
Optional tables (SYS.DBMSHP_%) for reporting
@?/rdbms/admin/dbmshptab.sql

Gathering of Profiles
PL/SQL API
dbms_hprof.start_profiling('PLSHPROF_DIR', 'myprofile.hpf');
/* activity to be profiled */
dbms_hprof.stop_profiling; /* optional, or just terminate session */
Built into SQL Developer, TOAD, PL/SQL Developer, etc.
SQL Developer loads the profile into tables and deletes the raw files when using built-in functionality.

Taking Useful Profiles


Scoping
Avoid unnecessary actions (PL/SQL executions) when profiling is activated
Wait events are in the profile if and only if PL/SQL is on the stack, so profile time can differ from elapsed time
In profile: time in PL/SQL, wait in SQL executed by PL/SQL, blocking wait in PL/SQL API
(dbms_aq.dequeue, dbms_lock.request, dbms_alert.waitany, etc.)
Not in profile: SQL*Net message from client, wait in SQL executed directly by client

Ideally, start and stop at the top-level call or on the same call stack level

Measure realistically
First vs. following executions
On DB: PL/SQL code loading, SQL parsing, buffer cache

In session: package initialization



Build Your Own Advanced Gathering


No out-of-the-box solutions for
Gathering for another session, or based on module as dbms_monitor does for SQL trace
No third-party tools for scoping of profiles, and no always-running recorder like ASH

Build your own

Instrumentation wrapper that calls dbms_application_info.set_* and
starts / stops the PL/SQL hierarchical profiler (and other tools). Use a unique file name,
since a restart with the same name overwrites. OK to concatenate.
Receives commands from other sessions, e.g., through a global context
Raise an SR to support Bug 12883592: ADDITIONAL FUNCTIONALITY TO INSPECT THE CURRENT STATE OF OTHER SESSION
Logon trigger as last resort
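
The wrapper idea above can also be sketched from the client side. This is a minimal Python sketch, not the presenter's implementation. It assumes a DB-API cursor with named-bind support (for example from python-oracledb); the helper name `profiled`, the file-name pattern, and the default directory are illustrative choices.

```python
import contextlib
import datetime
import itertools
import re

_sequence = itertools.count(1)  # keeps file names unique within one process


@contextlib.contextmanager
def profiled(cursor, module, action, directory="PLSHPROF_DIR"):
    """Tag the session and run the hierarchical profiler around a block.

    `cursor` is assumed to be a DB-API cursor that accepts named binds.
    The trace file name is made unique because restarting the profiler
    with an existing name overwrites the earlier trace.
    """
    stamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    safe = re.sub(r"\W+", "_", f"{module}_{action}")
    filename = f"{safe}_{stamp}_{next(_sequence)}.hpf"
    cursor.execute(
        "begin dbms_application_info.set_module(:m, :a); end;",
        {"m": module, "a": action},
    )
    cursor.execute(
        "begin dbms_hprof.start_profiling(:d, :f); end;",
        {"d": directory, "f": filename},
    )
    try:
        yield filename
    finally:
        # Stop on the same call level as the start, even after an error.
        cursor.execute("begin dbms_hprof.stop_profiling; end;")
```

Used as `with profiled(cursor, "payments", "book") as trace_file: ...`, the block yields the trace file name so that it can be logged next to the business action it covers.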


Make Gathering & Reporting Accessible to Developers & Users


Example with Hierarchy and Skew


create or replace procedure h is
  procedure sleep(seconds number) is
    l_cnt number;
  begin
    if seconds is not null then dbms_lock.sleep(seconds); end if;
    -- Cartesian join: deliberately slow SQL on every call
    select count(*) into l_cnt from dba_objects, dba_tablespaces;
  end sleep;

  procedure call_no_sleep is
  begin
    sleep(null);
  end call_no_sleep;

  procedure call_sleep(seconds number) is
  begin
    sleep(seconds);
  end call_sleep;
begin
  call_sleep(1); call_sleep(10);  -- same parent, different sleep times (skew)
  call_no_sleep;                  -- different parent, SQL only
end h;


Content: Call, Elapsed Time with PL/SQL on Stack, Return


Raw trace file content:

Header       P#V PLSHPROF Internal Version 1.0
Start        P#! PL/SQL Timer Started
             ...
Call         P#C PLSQL."SYS"."DBMS_LOCK"::11."SLEEP"#9689ba467a19cd19 #197
Execute µs   P#X 1000064
Return       P#R
             ...

Fields of the call line:

Namespace    PLSQL
Object       "SYS"."DBMS_LOCK"
Type         11
Subprogram   "SLEEP"
Sig hash     #9689ba467a19cd19
Line         #197
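
The field layout of a call line can be captured in a small parser. This is a Python sketch that assumes the layout shown in the sample line above; the trace format is Oracle internal, so the pattern and the field names (`namespace`, `owner`, `module`, and so on) are illustrative rather than a documented specification.

```python
import re

# Pattern derived from the sample call line on the slide; type, signature
# hash, and line number are treated as optional.
CALL_RE = re.compile(
    r'^P#C (?P<namespace>\w+)'
    r'\."(?P<owner>[^"]*)"\."(?P<module>[^"]*)"'
    r'(?:::(?P<type>\d+))?'
    r'\."(?P<subprogram>[^"]*)"'
    r'(?:#(?P<sig_hash>[0-9a-f]+))?'
    r'(?:\s+#(?P<line>\d+))?\s*$'
)


def parse_call(line):
    """Return the fields of a P#C line as a dict, or None for other lines."""
    match = CALL_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("type", "line"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields
```

Calling `parse_call` on the sample line yields owner `SYS`, module `DBMS_LOCK`, type `11`, subprogram `SLEEP`, and line `197`; lines such as `P#X 1000064` return `None`.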

Presentation & Reading Skills to Find Relevant Problems Quickly


May not need the most detailed format every time
Format options

Raw or processed (HTML, table, graphical) with or without navigation (drilldown)


Individual calls or aggregated (average, median, histogram)
Different perspectives
engineer: efficiently get relevant, actionable information
scientist: understand everything
presenter: nice graphics to explain
Look at profiles from multiple angles before taking action.
"A fool with a tool is still a fool"
Study & understand output on toy example first


HTML Output with plshprof


Example
oracle@srv:db > ${ORACLE_BIN}/plshprof -output myprofile myprofile.hpf
PLSHPROF: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 64bit Production
[7 symbols processed]
[Report written to 'myprofile.html']
oracle@srv:db > ls myprofile*.html
myprofile.html       myprofile_md.html    myprofile_pc.html
myprofile_2c.html    myprofile_mf.html    myprofile_tc.html
myprofile_2f.html    myprofile_ms.html    myprofile_td.html
myprofile_2n.html    myprofile_nsc.html   myprofile_tf.html
myprofile_aggr.html  myprofile_nsf.html   myprofile_ts.html
myprofile_fn.html    myprofile_nsp.html

Script to concatenate to single HTML file in appendix.



PL/SQL or SQL?

High SQL percentage?


Problem in SQL
Test system with bad I/O
Bad algorithm in PL/SQL with too many SQL calls

PL/SQL Hierarchical Profile and SQL Trace


SQL trace shows
Bad SQL, slow I/O, concurrency waits
Number of rows returned / processed in SQL

Must start SQL trace first

1. dbms_monitor.session_trace_enable;
2. dbms_hprof.start_profiling('PLSHPROF_DIR', 'myprof.hpf');
Bug 22085980: SETTING A DATABASE EVENT IN A SESSION STOPS THE PL/SQL HIERARCHICAL PROFILER


Expensive and Called Multiple Times


Duration = #Executions * AverageDurationPerExecution, so make fewer calls or make each call faster

Focus on high duration and many executions
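
Both factors of the formula can be read straight from a raw trace. This is a Python sketch, assuming that each P#X record carries the microseconds spent in the subprogram currently on top of the call stack (the model suggested by the sample trace in this deck); the function name `function_stats` is illustrative.

```python
import re
from collections import defaultdict

# Captures module and subprogram from a P#C line (layout assumed from the slides).
CALL = re.compile(r'^P#C \w+\."[^"]*"\."([^"]*)"(?:::\d+)?\."([^"]*)"')


def function_stats(lines):
    """Return {name: (executions, total_self_us, average_self_us)}."""
    calls = defaultdict(int)
    self_us = defaultdict(int)
    stack = []
    for line in lines:
        line = line.strip()
        match = CALL.match(line)
        if match:
            name = ".".join(part for part in match.groups() if part)
            stack.append(name)
            calls[name] += 1
        elif line.startswith("P#X") and stack:
            # Elapsed time belongs to the subprogram on top of the stack.
            self_us[stack[-1]] += int(line.split()[1])
        elif line.startswith("P#R") and stack:
            stack.pop()
    return {n: (c, self_us[n], self_us[n] / c) for n, c in calls.items()}
```

Sorting the result by total time and then looking at the execution count shows whether a hot spot is better attacked by calling it less often or by making one call cheaper.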



Counts: Do We Need to Call a Subprogram n Times?


May not directly yield biggest benefit, but why so many calls?


Subtree with Much Higher Ind% than Next Line


Breakdown of Children / Parent


Optionally with tool showing source code
Percentage of total run

Percentage of Descendants (not Subtree)

Look at breakdown of time in report and examine source code.


Percentage of function as part of subtree not shown.


Time by Module

Very useful if time in some packages split across very many subprograms.
Possible option to estimate tuning potential.
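
A per-module total is easy to derive from a raw trace as well. This is a Python sketch under the assumption that P#X times belong to the subprogram on top of the call stack; it counts function time only and attributes SQL to the module that executes it. The name `time_by_module` is illustrative.

```python
import re
from collections import Counter

# Captures owner and module from a P#C line (layout assumed from the slides).
CALL = re.compile(r'^P#C \w+\."([^"]*)"\."([^"]*)"')


def time_by_module(lines):
    """Return [(owner.module, self_us)] sorted by descending time."""
    totals = Counter()
    stack = []
    for line in lines:
        line = line.strip()
        match = CALL.match(line)
        if match:
            owner, module = match.groups()
            stack.append(f"{owner}.{module}" if module else "(no module)")
        elif line.startswith("P#X") and stack:
            totals[stack[-1]] += int(line.split()[1])
        elif line.startswith("P#R") and stack:
            stack.pop()
    return totals.most_common()
```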

Skew in Subtree Between Callers (Parents)


Skew in Subtree Calls from Same Parent


Own call analysis tool for individual call and histogram reporting
select * from table(single_call('PLSHPROF_DIR', 'myprofile.hpf', 'K.H.H.SLEEP'));

OCC   PARENT                SUBTREE      FUNCTION   DESCENDANTS
1     K.H.H.CALL_SLEEP      2,800,569    42         2,800,527
2     K.H.H.CALL_SLEEP      11,812,431   17         11,812,414
3     K.H.H.CALL_NO_SLEEP   1,762,206    12         1,762,194

Focused report
${ORACLE_BIN}/plshprof -trace '"K"."H"."H.SLEEP"' -skip 1 -collect 1 -output second myprofile.hpf
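
Per-invocation reporting of this kind can be approximated directly on the raw trace. This is a Python sketch, not the presenter's `single_call` function; it assumes P#X times belong to the subprogram on top of the stack, and the function name `single_calls` and the row layout are illustrative.

```python
import re

# Captures owner, module, and subprogram from a P#C line (layout assumed).
CALL = re.compile(r'^P#C \w+\."([^"]*)"\."([^"]*)"(?:::\d+)?\."([^"]*)"')


def single_calls(lines, target):
    """One row per invocation of `target`:
    (occurrence, parent, subtree_us, function_us, descendants_us)."""
    rows = []
    stack = []  # entries: [name, function_us, descendants_us]
    for line in lines:
        line = line.strip()
        match = CALL.match(line)
        if match:
            name = ".".join(part for part in match.groups() if part)
            stack.append([name, 0, 0])
        elif line.startswith("P#X") and stack:
            stack[-1][1] += int(line.split()[1])
        elif line.startswith("P#R") and stack:
            name, own, desc = stack.pop()
            subtree = own + desc
            if stack:
                # The finished subtree counts as descendant time of the caller.
                stack[-1][2] += subtree
            if name == target:
                parent = stack[-1][0] if stack else None
                rows.append((len(rows) + 1, parent, subtree, own, desc))
    return rows
```

Because every invocation keeps its own row, skew between callers and between calls from the same caller stays visible instead of disappearing into an average.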

Limit Reporting to Contiguous Set of Invocations of Subtree


When gathering cannot be properly scoped / focus without distorted averages
Subtree focus & number of invocations

plshprof -trace '"K"."H"."H.SLEEP"' -skip m -collect n ...


Always three dot-separated, double-quoted arguments. See raw trace for examples:

Standalone subprogram H in schema K   "K"."H"."H"
PL/SQL VM                             "".""."__plsql_vm"
Anonymous PL/SQL                      "".""."__anonymous_block"
Specific SQL                          "K"."P"."__dyn_sql_exec_line631"

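
The effect of -trace with -skip and -collect can be illustrated on raw trace lines. This is a Python sketch of the selection idea only, not a reimplementation of plshprof: it keeps the trace lines of a contiguous set of invocations of one subtree. The function name `collect_invocations` is illustrative, and recursive calls of the traced subprogram are not treated specially.

```python
import re

# Captures owner, module, and subprogram from a P#C line (layout assumed).
CALL = re.compile(r'^P#C \w+\."([^"]*)"\."([^"]*)"(?:::\d+)?\."([^"]*)"')


def collect_invocations(lines, target, skip=0, collect=1):
    """Return the trace lines of invocations skip+1 .. skip+collect of `target`.

    `target` uses the three-part quoted form, e.g. '"K"."H"."H.SLEEP"'.
    """
    out = []
    depth = 0
    root_depth = None  # depth of the invocation currently being collected
    seen = 0
    for line in lines:
        line = line.strip()
        match = CALL.match(line)
        if match:
            depth += 1
            symbol = '"{}"."{}"."{}"'.format(*match.groups())
            if root_depth is None and symbol == target:
                seen += 1
                if skip < seen <= skip + collect:
                    root_depth = depth
            if root_depth is not None:
                out.append(line)
        elif line.startswith("P#R"):
            if root_depth is not None:
                out.append(line)
                if depth == root_depth:
                    root_depth = None  # selected invocation has returned
            depth -= 1
        elif root_depth is not None:
            out.append(line)
    return out
```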

Export to Microsoft Excel for Annotation and Processing


Load into Tables SYS.DBMSHP_%


Example


Careful: Hierarchy and Skew Lost in Import into Tables


Sample Domain-Specific Custom Reports for Generated Code


1. Generate code from rule definitions (columns: id, type, classif, cond), e.g.
9090   [doc.bp.class(9089)=101619]
5075   [rm$spread_deriv.rel_eam(doc) = '0']

create or replace package body k.deriv# is
function cond$1(
i_doc doc_mgr#.t_doc
) return varchar2 is

2. Execute with HProf, load into tables

3. Join (columns: id, type, module, function, subtree_elapsed_time), e.g.
deriv#   cond$1   973

Graphical Presentations and Profiler for Other Languages


Interactive flame graph

(http://www.brendangregg.com/flamegraphs.html)

exec flatten('PLSHPROF_DIR', 'myprofile.hpf', 'myprofile.flat')


flamegraph.pl myprofile.flat > myprofile.svg


flamegraph.pl Input from Sample Program


Compact semicolon-separated time-aggregated call stack with duration
Execution count in brackets is our addition
H.H [1] 7
H.H [1];H.H.CALL_SLEEP [2] 4
H.H [1];H.H.CALL_SLEEP [2];H.H.SLEEP [2] 59
H.H [1];H.H.CALL_SLEEP [2];H.H.SLEEP [2];H.__static_sql_exec_line7 [2] 3609411
H.H [1];H.H.CALL_SLEEP [2];H.H.SLEEP [2];DBMS_LOCK.SLEEP [2] 11003530
H.H [1];H.H.CALL_NO_SLEEP [1] 1
H.H [1];H.H.CALL_NO_SLEEP [1];H.H.SLEEP [1] 12
H.H [1];H.H.CALL_NO_SLEEP [1];H.H.SLEEP [1];H.__static_sql_exec_line7 [1] 1762194
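
Producing this folded format from a raw trace takes little code. This is a Python sketch, not the presenter's `flatten` procedure; it assumes P#X times belong to the subprogram on top of the stack and adds the execution count in brackets in the same way as the lines above.

```python
import re

# Captures module and subprogram from a P#C line (layout assumed from the slides).
CALL = re.compile(r'^P#C \w+\."[^"]*"\."([^"]*)"(?:::\d+)?\."([^"]*)"')


def flatten(lines):
    """Return folded-stack lines 'frame [count];... microseconds'."""
    self_us = {}  # call path (tuple of names) -> summed function time
    calls = {}    # call path -> number of calls on that path
    stack = []
    for line in lines:
        line = line.strip()
        match = CALL.match(line)
        if match:
            stack.append(".".join(part for part in match.groups() if part))
            path = tuple(stack)
            calls[path] = calls.get(path, 0) + 1
            self_us.setdefault(path, 0)
        elif line.startswith("P#X") and stack:
            self_us[tuple(stack)] += int(line.split()[1])
        elif line.startswith("P#R") and stack:
            stack.pop()
    folded = []
    for path, us in self_us.items():
        frames = [f"{name} [{calls[path[:i + 1]]}]" for i, name in enumerate(path)]
        folded.append(f"{';'.join(frames)} {us}")
    return folded
```

Writing the returned lines to a file gives input that flamegraph.pl accepts, since the tool reads everything before the final number as the stack.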


flamegraph.pl Output of Sample Program

Semantics
y-axis: Call stack, x-axis: width proportional to duration, siblings sorted alphabetically
Interactive: Click to zoom / reset zoom, Search

Google Chrome Developer Tools cpuprofile


Generate cpuprofile from PL/SQL HProf trace
exec chrome_cpuprofile('PLSHPROF_DIR', 'myprofile.hpf','myprofile.cpuprofile')
Files too big? Use (sparse / adaptive) sampling


Chrome Input from Sample Program


Call stacks: "head" tree ("hitCount" for aggregated times)
Samples (stack ID, time): "samples" and "timestamps"

{"head": {
  "functionName": "(root)", "id": 1,
  "children": [
    {
      "functionName": "H.H", "hitCount": 7, "id": 2,
      "children": [
        {
          "functionName": "H.H.CALL_NO_SLEEP", "id": 3, ...
          "functionName": "H.H.SLEEP", "id": 4, ...
          "functionName": "H.__static_sql", "id": 5, ...
        }, ...
      ]
    }
  ]
},
"startTime": 0, "endTime": 16375218,
"samples": [2, 4, 5, 4, ... ],
"timestamps": [0, 1, 5, 1762199, ... ]

Google Chrome Developer Tools cpuprofile


Opening .cpuprofile files

https://developers.google.com/web/tools/chrome-devtools/profile/rendering-tools/js-execution

Flame Chart of Sample Program

Semantics
y-axis: Call stack (top down)
x-axis: time

Many more visualization tools that we don't use, e.g., Callgrind / KCachegrind, gprof, etc.

Difference Between Runs (Tuning, Input Variation, etc.)


Manual
Stare at two reports and compare manually
SQL to compare runs loaded into SYS.DBMSHP_% tables

plshprof HTML with two input files


${ORACLE_BIN}/plshprof -output comp h1.hpf h2.hpf
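
For a quick comparison outside the HTML report, per-function times of two runs can be diffed in a few lines. This is a Python sketch; it assumes the two inputs are plain dictionaries of function name to elapsed microseconds (for example exported from the SYS.DBMSHP_% tables), and the name `compare_runs` is illustrative.

```python
def compare_runs(before, after):
    """Return [(name, before_us, after_us, delta_us)], largest change first.

    Functions present in only one run are reported with 0 for the other run.
    """
    rows = []
    for name in set(before) | set(after):
        old = before.get(name, 0)
        new = after.get(name, 0)
        rows.append((name, old, new, new - old))
    # Sort by absolute change, then by name for a stable order.
    return sorted(rows, key=lambda row: (-abs(row[3]), row[0]))
```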


Sample Comparison Overview Report


Limiting Call Depth in Gathering


Smaller files, but relevant information may be lost, requiring re-gathering
Syntax

dbms_hprof.start_profiling(..., max_depth => n)
(... = omitted details)

Raw trace file content: Time summed up at last reported level


max_depth => null
P#C PLSQL."K"."H"::7."H.CALL_SLEEP"
P#X 1
P#C PLSQL."K"."H"::7."H.SLEEP"
P#X 23
P#C PLSQL."SYS"."DBMS_LOCK"::11."SLEEP"
P#X 1000064
P#R
... /* static SQL call */
P#R
P#X 0
P#R

max_depth => 2
P#C PLSQL."K"."H"::7."H.CALL_SLEEP"
P#X 2800544
P#R
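
The summing at the last reported level can be simulated on an unlimited trace. This is a Python sketch of the behaviour described above, not Oracle's implementation; the name `limit_depth` is illustrative, and depth is counted from the first call in the lines passed in.

```python
def limit_depth(lines, max_depth):
    """Rewrite raw trace lines as if gathered with a call depth limit.

    Calls deeper than `max_depth` are dropped and their time is added to
    the deepest frame that is still reported.
    """
    out = []
    depth = 0
    pending = None  # time not yet written for the deepest reported frame

    def flush():
        nonlocal pending
        if pending is not None:
            out.append(f"P#X {pending}")
            pending = None

    for line in lines:
        line = line.strip()
        if line.startswith("P#C"):
            depth += 1
            if depth <= max_depth:
                flush()
                out.append(line)
        elif line.startswith("P#X"):
            pending = (pending or 0) + int(line.split()[1])
        elif line.startswith("P#R"):
            if depth <= max_depth:
                flush()
                out.append(line)
            depth -= 1
        else:
            out.append(line)  # header and other records pass through
    return out
```

Applied with a limit of 1 to the unlimited excerpt above (which starts at H.CALL_SLEEP), and with an assumed 1800456 microseconds for the elided static SQL call, the result is a single call with `P#X 2800544`, the same shape as the max_depth => 2 trace.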


Bits & Pieces 1/2


Compilation options (plsql_%)
PL/SQL hierarchical profiler also works for wrapped and natively compiled code
With inlining (plsql_optimize_level=3 or pragma inline), calls to inlined subprograms
are not visible. Likewise when call to deterministic function is optimized away.

Usage notes
Performance overhead low enough to run on production.
Ensure enough free disk space, and do not write to a file system where running full would be critical. A maximum file size cannot be set.
No need for a fast server. Just compare relative times in PL/SQL (not SQL vs. PL/SQL).

Bits & Pieces 2/2


Other usages
Call stack until hang, since no stop is needed: an alternative to oradebug dump errorstack 1
Actual call tree (of test input only) vs. complete possible static calls of PL/Scope

References
https://docs.oracle.com/database/121/ADFNS/adfns_profiler.htm#ADFNS023


Memory Allocation Profiling: UGA or PGA


Gathering
dbms_hprof.start_profiling(..., profile_uga | profile_pga => true);
Often in combination with dbms_session.get_package_memory_utilization
What's in the trace
P#Z 1169723168
P#Z -2146643687

Oracle internal. Not documented.

Reporting
plshprof -uga or -pga
dbms_hprof.analyze(..., profile_uga | profile_pga => true)

Conclusions & Action Items


Conclusions
Performance is a key feature of any application
Oracle-provided tools are sufficient for analysis of PL/SQL hierarchical profiles,
except for skew analysis and graphical people, for whom this presentation delivers.
Understand output on toy example before analyzing real code.

Action items
Profile PL/SQL as part of end-to-end performance analysis.
Integrate gathering and reporting into your developer and key business user UI.


Thank you for your attention.

Martin Büchi
martin.buechi@avaloq.com

Sample Code Only


The KSH, PL/SQL, and Java code for analyzing and transforming PL/SQL Hierarchical
Profiler traces as shown in the presentation can be found at
https://drive.google.com/open?id=0B40PcvWHeRrrZW1oRjg1WHFfOEE.

This is sample code. Not production quality code. No warranties.
