
Automatic Shader Bounding for Efficient Global Illumination


Bruce Walter
Cornell University

Course: Compiler Techniques for Rendering


Thursday August 11, 2011
1

Credit
Based on group project:
Automatic Bounding of Programmable Shaders for Efficient Global Illumination, SIGGRAPH 2009
Edgar Velázquez-Armendáriz
Shuang Zhao
Miloš Hašan
Bruce Walter
Kavita Bala

2

Motivation
Programmable shading

The Pumpkin Factory 2008

Artistic control, flexibility


Manual lighting design

Physically-based rendering

Veach and Guibas 1997

Limited material models


Global illumination
3

About the Shaders


Shader

surface matte(color Kd) {
    return Kd * max(dot(N,L),0);
}

Simple RenderMan-like example language


Material (surface) shader that implements a BRDF
Allows arbitrary code within shader
Extensible to new material models

Challenge
Shader

surface matte(color Kd) {
    return Kd * max(dot(N,L),0);
}

G.I. needs additional related functions
shade (point, light) -> color
sample (point) -> direction and probability
bound (points, lights) -> max color

Contributions
First step towards closing the gap
Interface
Provides the functions G.I. needs
Interval arithmetic

Automated compiler
Converts the shaders to interval form

Demonstration
Photon mapping
Multidimensional lightcuts

Result Preview
Multidimensional lightcuts

Photon mapping

Environment map lighting

Indirect Illumination

Procedural varying material
Procedural varying normal
Measured material with PCA Reconstruction
Procedural Textures

Outline
Introduction
Background and related work
Interface between shaders and G.I.
Compiler
Results
Conclusions and future work

Related Work
Production shaders
Simplified G.I.
[Tabellion and Lamorlette 2004]

Automatic simplification
[Pellacini 2005]

GPU translation for relighting


[Pellacini et al. 2005, Ragan-Kelley et al. 2007]

Related Work
Interval methods
Ray Tracing
[Snyder 1992, Bala et al. 1999, Heidrich and Seidel 1998]

Tessellation
[Moule and McCool 2002]

Culling
[Hasselgren and Akenine-Möller 2007, Hasselgren et al. 2009]

Affine Arithmetic
[Comba and Stolfi 1993]

10

Related Work
Heidrich et al. 1998
High quality rasterization of textures

Hasselgren et al. 2009
Subdivision surfaces pre-tessellation culling

Clarberg et al. 2010
Compiler with type propagation
Performance optimization of interval code

[Figure: page excerpt from Clarberg et al., "An Optimizing Compiler for Automatic Shader Bounding", showing its Table 2 (instruction counts before and after optimization) and Figure 4 (implicit surfaces evaluated using interval arithmetic)]

Our focus is on enabling global illumination

11

Outline
Introduction
Background and related work
Interface between shaders and G.I.
Compiler
Results
Conclusions and future work

12

Interface
Functions needed
shade (point, viewDir, lightDir) -> color
sample (point, viewDir) -> direction
bound (points, viewDirs, lightDirs) -> max color
Covers most Monte Carlo global illumination algorithms
Need versions for both eye and light tracing
Multiple importance sampling needs a related probability function

13

Interface
Shade (point, viewDir, lightDir)
Evaluate shading for single point and light
Written by the user as shader program
Sample (point, viewDir)
Select outgoing direction with low noise
Needed by virtually all Monte Carlo rendering algorithms
Bound (points, viewDirs, lightDirs)
Compute upper bound on shade for range of input
Used to construct our sample function and by Lightcuts
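The three functions above can be sketched as a small Python interface. This is an illustration only, not the API from the talk: the class names `Material` and `Matte`, the box representation of a direction range as a `(lo, hi)` pair of vectors, and the fixed +Z normal are all assumptions made to keep the example short.

```python
import math
import random
from abc import ABC, abstractmethod

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

class Material(ABC):
    """Hypothetical interface mirroring shade / sample / bound."""

    @abstractmethod
    def shade(self, point, view_dir, light_dir):
        """Color for one point and one light direction."""

    @abstractmethod
    def sample(self, point, view_dir, rng):
        """Return (direction, probability density)."""

    @abstractmethod
    def bound(self, points, view_dirs, light_dirs):
        """Upper bound on shade over a range of inputs."""

class Matte(Material):
    """Diffuse example; the normal is assumed to be +Z everywhere."""

    def __init__(self, kd):
        self.kd = kd
        self.normal = (0.0, 0.0, 1.0)

    def shade(self, point, view_dir, light_dir):
        c = max(dot(self.normal, light_dir), 0.0)
        return tuple(k * c for k in self.kd)

    def sample(self, point, view_dir, rng):
        # Cosine-weighted direction about +Z, with density cos(theta) / pi.
        u1, u2 = rng.random(), rng.random()
        r, phi = math.sqrt(u1), 2.0 * math.pi * u2
        d = (r * math.cos(phi), r * math.sin(phi), math.sqrt(1.0 - u1))
        return d, d[2] / math.pi

    def bound(self, points, view_dirs, light_dirs):
        # light_dirs is a box (lo, hi); maximize dot(N, L) per component.
        lo, hi = light_dirs
        best = sum(max(n * a, n * b) for n, a, b in zip(self.normal, lo, hi))
        return tuple(k * max(best, 0.0) for k in self.kd)
```

In this sketch `bound` is written by hand for one material; the point of the talk is that a compiler can derive it from `shade` automatically.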
14

Importance Sampling
Importance sampling needed for:
Distribution Ray Tracing
Path tracing
Bidirectional path tracing
Metropolis
Photon Mapping
and many more

15

Importance Sampling
Generate directions with probability related to shade
Minimize weights (f/p ratio) to reduce noise
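A one-dimensional toy case shows why a lower f/p ratio matters. The functions `f`, `p_uniform`, and `p_linear` are made up for this sketch and are not from the talk; `f` integrates to 1 over [0, 1], so both estimators should return values near 1.

```python
import math
import random

def f(x):
    return 3.0 * x * x          # integrates to 1 over [0, 1]

def p_uniform(x):
    return 1.0                  # density unrelated to f

def p_linear(x):
    return 2.0 * x              # density roughly shaped like f

def draw_uniform(rng):
    return 1.0 - rng.random()   # in (0, 1], avoids x == 0

def draw_linear(rng):
    # Inverse-CDF sampling for p(x) = 2x, whose CDF is x^2.
    return math.sqrt(1.0 - rng.random())

def max_weight(f, p, n=1000):
    """Largest f/p ratio seen on a regular grid of n points."""
    return max(f(x) / p(x) for x in ((i + 0.5) / n for i in range(n)))

def estimate(f, p, draw, n, rng):
    """Monte Carlo estimate of the integral of f with sample weights f/p."""
    total = 0.0
    for _ in range(n):
        x = draw(rng)
        total += f(x) / p(x)
    return total / n
```

With these choices the largest weight drops from about 3 under uniform sampling to about 1.5 under the linear density, which is what reduces the noise of the estimate.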

16

Importance Sampling
Shader is an unknown function

17

Importance Sampling
Shader is an unknown function
Construct piecewise-constant bounding function

18

Importance Sampling
Use bound to obtain piecewise values
Subdivide regions for better approximation

19


Importance Sampling
Guaranteed sampling quality
f/p ratio <= volume of bounding function
Stored as a cubemap over directions
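The construction on the last few slides can be sketched in one dimension, with the interval [0, 1] standing in for the cubemap of directions. Everything named here is an assumption for illustration: `f` plays the role of the unknown shader, `bound` is a hand-written upper bound (the talk obtains it from the compiler), and splitting the region with the largest volume first is just one plausible subdivision rule.

```python
import heapq
import random

def f(x):
    """Stand-in for the shader: a non-negative function on [0, 1]."""
    return x * x

def bound(lo, hi):
    """Upper bound of f on [lo, hi]; valid because f is increasing there."""
    return hi * hi

def build_regions(bound, lo, hi, max_regions):
    """Split the region with the largest bound volume until the budget is used."""
    heap = [(-bound(lo, hi) * (hi - lo), lo, hi)]
    while len(heap) < max_regions:
        _, a, b = heapq.heappop(heap)
        mid = 0.5 * (a + b)
        for s, e in ((a, mid), (mid, b)):
            heapq.heappush(heap, (-bound(s, e) * (e - s), s, e))
    return sorted((a, b, bound(a, b)) for _, a, b in heap)

def volume(regions):
    """Integral of the piecewise-constant bounding function."""
    return sum((b - a) * h for a, b, h in regions)

def sample(regions, rng):
    """Pick a region in proportion to its volume, then a uniform point in it."""
    total = volume(regions)
    t = rng.random() * total
    for a, b, h in regions:
        t -= (b - a) * h
        if t <= 0.0:
            break
    x = a + (b - a) * rng.random()
    return x, h / total          # point and its probability density
```

Because the density in each region is its bound divided by the total volume, f(x)/p(x) can never exceed that volume, which is the guarantee stated on the slide.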

[Figure: cubemap faces -Z, -Y, +Z, -X, +Y, +X]
21

Multidimensional Lightcuts
Scalable rendering [Walter et al. 2006]
Works on clusters of lights and shading points
Uses bounding functions (S2 and S3)

Bound direction (S2)

Bound direction and location (S3)


22

Design
[Diagram: at compile time, the Compiler turns shade into bound; at runtime, Importance Sampling uses bound to build sample]

23

Outline
Introduction
Background and related work
Interface between shaders and G.I.
Compiler
Results
Conclusions and future work

24

Interval Arithmetic Review


1 + 4 = 5

25

Interval Arithmetic Review


[0,2] + [3,6] = [3,8]

26

Interval Arithmetic
Define interval versions of basic operations
+, -, *, /, sqrt, pow, etc.

Composable for compound operations


Such as f(x) = sqrt(x*x - x)
Replace each basic operation with interval version
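A minimal sketch of this replacement, assuming a hypothetical `Interval` class written for this example. It ignores floating-point rounding (a production version would round outward), and `interval_sqrt` clamps negative inputs to zero, which is one possible convention rather than anything stated in the talk.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        # The extremes of a product always occur at endpoint combinations.
        c = (self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi)
        return Interval(min(c), max(c))

def interval_sqrt(x):
    # sqrt is increasing, so mapping the endpoints is enough.
    return Interval(math.sqrt(max(x.lo, 0.0)), math.sqrt(max(x.hi, 0.0)))

def f(x):
    """Interval version of f(x) = sqrt(x*x - x): same code, interval operations."""
    return interval_sqrt(x * x - x)
```

For x = [2, 3] this gives [1, sqrt(7)], which contains the true range [sqrt(2), sqrt(6)] but is wider than it.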

27

Interval Arithmetic
Simple and general
Simple conversion even for complex functions
Cost typically 2-4x more than original non-interval version

Result intervals are often larger than necessary


Intervals are conservative
always contain all possible outputs but may also include other values

Does not track correlation among values


Interval for x*x can contain negative values

Worse with large starting intervals


Subdividing initial intervals can help
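The x*x case and the effect of subdivision can be checked with a few lines. Intervals are plain `(lo, hi)` tuples here and the helper names are made up for this sketch; the true range of x*x over [-1, 1] is [0, 1].

```python
def imul(a, b):
    """Interval product; treats a and b as independent even when they are not."""
    c = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(c), max(c))

def hull(a, b):
    """Smallest interval containing both a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def square_naive(x):
    return imul(x, x)

def square_subdivided(x, pieces):
    """Evaluate x*x on equal sub-intervals and merge the results."""
    width = (x[1] - x[0]) / pieces
    result = None
    for i in range(pieces):
        sub = (x[0] + i * width, x[0] + (i + 1) * width)
        r = square_naive(sub)
        result = r if result is None else hull(result, r)
    return result
```

On [-1, 1] the naive product returns [-1, 1]. Splitting into three pieces shrinks the lower end to about -1/9, and splitting at zero removes the negative part entirely.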
28

Interval Arithmetic
Higher order variants can track value correlation
Affine and Taylor Interval Methods
Tighter result intervals
More expensive to compute

Not cost effective in our tests


Your mileage may vary
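To show what tracking correlation means, here is a sketch of affine forms restricted to addition and subtraction. The `Affine` class is hypothetical and deliberately incomplete: multiplication and other non-linear operations, which introduce new noise symbols and account for much of the extra cost, are left out.

```python
from itertools import count

_fresh_symbol = count()

class Affine:
    """center + sum(coefficient * noise), each noise symbol ranging over [-1, 1]."""

    def __init__(self, center, terms=None):
        self.center = center
        self.terms = dict(terms or {})   # noise symbol id -> coefficient

    @classmethod
    def from_interval(cls, lo, hi):
        # Each input interval gets its own noise symbol.
        return cls(0.5 * (lo + hi), {next(_fresh_symbol): 0.5 * (hi - lo)})

    def __add__(self, other):
        terms = dict(self.terms)
        for key, value in other.terms.items():
            terms[key] = terms.get(key, 0.0) + value
        return Affine(self.center + other.center, terms)

    def __sub__(self, other):
        terms = dict(self.terms)
        for key, value in other.terms.items():
            terms[key] = terms.get(key, 0.0) - value
        return Affine(self.center - other.center, terms)

    def to_interval(self):
        radius = sum(abs(v) for v in self.terms.values())
        return (self.center - radius, self.center + radius)
```

With x = [0, 2], plain interval arithmetic gives x - x = [-2, 2], while the affine form cancels the shared noise symbol and returns [0, 0]. Two independent inputs with the same range still give [-2, 2].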

29

Lifting the Shaders


surface matte(color Kd) {
return Kd * max(dot(N,L),0);
}
Variable or expression   shade     bound s2    bound s3
Kd                       vector3   vector3     interval3
N                        vector3   vector3     interval3
L                        vector3   interval3   interval3
dot(N,L)                 real      interval    interval
max(dot(N,L),0)          real      interval    interval
Kd*max()                 vector3   interval3   interval3
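The "bound s2" column can be written out by hand for the matte shader. This is a sketch of what lifted code might look like, not the compiler's actual output: intervals are `(lo, hi)` tuples, an interval3 is a tuple of three of them, and all helper names are assumptions.

```python
def const(v):
    """Promote a real to a degenerate interval."""
    return (v, v)

def iadd(a, b):
    return (a[0] + b[0], a[1] + b[1])

def imul(a, b):
    c = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(c), max(c))

def imax(a, s):
    """Interval version of max(a, s) for a real s."""
    return (max(a[0], s), max(a[1], s))

def matte_shade(kd, n, l):
    """Original shader: every input is a plain vector3."""
    c = max(sum(ni * li for ni, li in zip(n, l)), 0.0)
    return tuple(k * c for k in kd)

def matte_bound_s2(kd, n, l_box):
    """S2 version: Kd and N stay vector3, L becomes an interval3."""
    d = (0.0, 0.0)
    for ni, li in zip(n, l_box):
        d = iadd(d, imul(const(ni), li))      # dot(N,L) is now an interval
    m = imax(d, 0.0)                          # max(dot(N,L),0) is an interval
    return tuple(imul(const(k), m) for k in kd)   # result is an interval3
```

The types of the intermediate values follow the table row by row: once L is an interval3, everything computed from it becomes an interval as well.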
30

Static Single Assignment Form


Standard compiler technique
[Cytron et al. 1991]
Simplifies analysis and control flow
y = 1.0;
y = 2.0;
x = y;
x = x + 1.0;

Convert to SSA

y1 = 1.0;
y2 = 2.0;
x1 = y2;
x2 = x1 + 1.0;
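For straight-line code the conversion amounts to renaming, as in the sketch below. The function `to_ssa` and its input format, a list of `(target, expression)` pairs, are assumptions for this example; branches are not handled here.

```python
import re

def to_ssa(statements):
    """Rename variables so that each one is assigned exactly once."""
    version = {}    # variable name -> latest version number
    result = []

    def rename(match):
        name = match.group(0)
        # Names that were never assigned (inputs, functions) stay unchanged.
        return f"{name}{version[name]}" if name in version else name

    for target, expr in statements:
        # Uses on the right-hand side refer to the current versions.
        new_expr = re.sub(r"[A-Za-z_]\w*", rename, expr)
        # The assignment then creates a new version of the target.
        version[target] = version.get(target, 0) + 1
        result.append((f"{target}{version[target]}", new_expr))
    return result

program = [("y", "1.0"), ("y", "2.0"), ("x", "y"), ("x", "x + 1.0")]
ssa = to_ssa(program)
```

Running it on the four statements from the slide reproduces the right-hand column: y1, y2, x1 = y2, and x2 = x1 + 1.0.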

31

Translating Branches
if (x > 0.0)
    y = 0.0;
else
    y = 1.0;
return y;


y1 = 0.0;
y2 = 1.0;
y3 = Psi(Greater(x,0.0), y1, y2);
return y3;

32

Translating Branches
y = z = 0.0;
if (x > 0.0)
    y = 1.0;
else
    z = 1.0;

y1 = 0.0;
y2 = 1.0;
z1 = 0.0;
z2 = 1.0;
y3 = Psi(Greater(x,0.0), y2, y1);
z3 = Psi(Greater(x,0.0), z1, z2);
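Once branches are expressed with Psi, an interval version only needs interval meanings for `Greater` and `Psi`. The slides do not spell these out, so the three-valued comparison and the use of the hull when the condition is undecided are assumptions in the sketch below; they are one conservative choice.

```python
def greater(x, c):
    """Interval x > real c: True, False, or None when the interval straddles c."""
    lo, hi = x
    if lo > c:
        return True
    if hi <= c:
        return False
    return None

def psi(cond, a, b):
    """Select a if cond holds, b if it fails, and cover both when undecided."""
    if cond is True:
        return a
    if cond is False:
        return b
    return (min(a[0], b[0]), max(a[1], b[1]))

def branch_example(x):
    """Interval version of: if (x > 0.0) y = 0.0; else y = 1.0; return y;"""
    y1 = (0.0, 0.0)
    y2 = (1.0, 1.0)
    y3 = psi(greater(x, 0.0), y1, y2)
    return y3
```

When the input interval lies entirely on one side of zero, the result matches the ordinary branch. When it straddles zero, both branches contribute and the result is [0, 1].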


33

Using the Interface


[Diagram: the Compiler turns the shader into Shade, Sample, and Bound; a Runtime API exposes them to Photon Mapping and Multidimensional Lightcuts]

34

Outline
Introduction
Background and related work
Interface between shaders and G.I.
Compiler
Results
Conclusions and future work

35

Results
Highlight complex shading
System
Mostly unoptimized Java implementation
Dual quad-core Xeon E5440 2.83 GHz, 16 GB RAM
Sun Java 1.6.0-11 64 bit

36

Bounds Comparison
Compare our bounds with hand-crafted ones
Use multidimensional lightcuts as reference
Analytic materials
Depth of field
Antialiasing
Environment map lighting
[Walter et al. 2006]

37

Bounds Comparison
Our automatic bounds: Time: 459 s
Hand-crafted bounds: Time: 176 s
[Figure: side-by-side images with a scale from 0 to 10,000]

640x480 images, 256 eye rays per pixel, 55,000 lights


38

Importance Sampling
Automatic importance sampling: Time: 1,830 s
Cosine-weighted (equal time): Time: 2,031 s
39

Importance Sampling
Automatic importance sampling: Time: 1,830 s
Cosine-weighted (3x time): Time: 6,134 s
40

Results: Big Buck Bunny

Übershader

Time: 648 s
Size: 852x480

Time: 516 s
Size: 512x512
32 eye rays per pixel
55,000 lights
10+ M triangles
41

Results: Big Buck Bunny

Übershader
Blender materials
Multiple branches
86 lines of code

Kajiya-Kay hair shader


Environment map and
sun lighting

Time: 648 s
Size: 852x480

Time: 516 s
Size: 512x512
32 eye rays per pixel
55,000 lights
10+ M triangles
42

Results: Elephants Dream

Übershader
Fully automated G.I.
Multidimensional lightcuts

464 s
852x480 image
32 eye rays per pixel
55,000 lights
43

Results: Pillow
Measured material shader
PCA Decompression
83 data textures
243 lines of code

Time: 2,413 s

512x512 image
4 eye rays per pixel
50,200 lights
44

Results: Pillow

45

Lightcuts vs Brute Force


Scene             Time        Speed-up
Tableau           459.2 s     601x
Big Buck Bunny    647.6 s     311x
Chinchilla        516.3 s     282x
Elephants Dream   464.0 s     834x
Pillow            2,412.5 s   8x

46

Conclusions
First step to automatically combine programmable shading and physically based rendering
Compiler generates interval versions of shaders
Prototype enables multiple Monte Carlo global illumination algorithms

47

Limitations and Future Work


Expensive importance function creation
Use known good sampling when possible
Improve tightness of the bounds
Higher level primitives in language
Generalize shading language
Arbitrary loops
Port to other languages, OpenSL?
Support other types of shaders
Light, displacement, etc.
48

Acknowledgements
NSF
CAREER 0644175, CPA 0811680, CNS 0403340

Intel
Microsoft
CONACYT-Mexico 228785
NVIDIA
Autodesk

49

The End

50
