
N.M. Patrikalakis (Ed.)

Scientific Visualization
of Physical Phenomena

With 533 Figures, Including 233 in Colour

Springer-Verlag
Tokyo Berlin Heidelberg
New York London Paris
Hong Kong Barcelona
NICHOLAS M. PATRIKALAKIS
Associate Professor of Ocean Engineering
Massachusetts Institute of Technology
Cambridge, Massachusetts, USA

About the Cover:


The cover picture shows a three-dimensional perspective view of JASON, a re-
motely operated underwater vehicle in relation to its target of interest, the USS
Scourge. The sunken ship, part of the US Great Lakes fleet during the War of
1812, was the site of a remote underwater archaeological survey during the
spring of 1990. The photograph was produced by Marquest Group, Inc. (see the
invited paper by W.K. Stewart in this volume).

ISBN-13: 978-4-431-68161-8    e-ISBN-13: 978-4-431-68159-5
DOI: 10.1007/978-4-431-68159-5

This work is subject to copyright. All rights are reserved, whether the whole or
part of the material is concerned, specifically the rights of translation, reprinting,
reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in
other ways, and storage in data banks.
© Springer-Verlag Tokyo 1991
Softcover reprint of the hardcover 1st edition 1991

The use of registered names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
Preface

This volume presents the proceedings of the 9th International Conference of the
Computer Graphics Society, CG International '91, Scientific Visualization of
Physical Phenomena, held at the Massachusetts Institute of Technology in
Cambridge, Massachusetts in the United States of America, on June 26-28,
1991. Since its foundation in 1983, this conference has continued to attract high
quality research articles in all aspects of computer graphics and its applications.
Previous conferences in this series were held in Japan (1983-1987), in Switzer-
land (1988), in the United Kingdom (1989), and in Singapore (1990). Future
CG International conferences are planned in Japan (1992), Switzerland (1993),
and in Australia (1994).
The title of this book, Scientific Visualization of Physical Phenomena, re-
flects the special emphasis of this year's CG International Conference. Visual-
ization in science and engineering is rapidly developing into a vital area because
of its potential for significantly contributing to the understanding of physical
processes and the design automation of man-made systems. With the increasing
emphasis on handling complexity in physical and artificial processes and sys-
tems, and with continuing advances in specialized graphics hardware and in pro-
cessing software and algorithms, visualization is expected to play an increasingly
dominant role in the foreseeable future.
The keynote paper, the five invited papers, and the 31 refereed papers in-
cluded in this book represent the state of the art in scientific visualization.
These papers are grouped into 11 chapters. The keynote and invited papers in
the first chapter address leading-edge research aspects of visualization from the
perspectives of computer animation; simulation and modeling; volume visual-
ization; visualization in astrophysics; computational geometry; engineering
design and analysis; and ocean sciences and engineering. The refereed papers
were selected, after peer review, from a large number of papers submitted from
around the world and are included in Chapters 2 to 11, entitled as follows: Ani-
mation; Parallel Processing; Volume Rendering; Visualization Methods; Ray
Tracing/Rendering; Picture Generation; Computational Geometry; Visualiza-
tion in Engineering; Fluid Flow Visualization; and Applications. Countries rep-
resented in this volume include Australia, Belgium, Canada, Germany, Italy,
Japan, New Zealand, Singapore, Switzerland, the United Kingdom, and the
United States of America. Thus, there is wide international coverage of research
in this important emerging area.
This volume concludes with a listing of the CG International '91 conference
committees, our staff, cooperating societies, sponsors, and the technical re-
viewers. The efforts, support, and contributions of all these individuals and
organizations are gratefully acknowledged. Special thanks are due to Drs.
Chryssostomos Chryssostomidis, Rae A. Earnshaw, Bertram Herzog, and
Tosiyasu L. Kunii for their support, advice, and organizational assistance and to
Ms. Marge Chryssostomidis for editorial assistance in compiling these proceed-
ings. Finally, special appreciation is due to Ms. Barbara Dullea, director of
the CG International '91 Secretariat, who expertly assisted us in organizing CG
International '91 and in making it a resounding success.

Nicholas M. Patrikalakis
Table of Contents

Chapter 1: Keynote and Invited Papers

Visualization: New Concepts and Techniques
to Integrate Diverse Application Areas
T.L. KUNII and Y. SHINAGAWA ........................................... 3
Introduction to Volume Synthesis
A. KAUFMAN ........................................................... 25
Computer Visualization in Spacecraft Exploration
of the Solar System
W.R. THOMPSON and C. SAGAN ........................................... 37
Computational Geometry and Visualization:
Problems at the Interface
L.J. GUIBAS .......................................................... 45
Visualization for Engineering Design
H. NOWACKI ........................................................... 61
Visualization Resources and Strategies
for Remote Subsea Exploration
W.K. STEWART ......................................................... 85

Chapter 2: Animation

A Particle-Based Computational Model
of Cloth Draping Behavior
D.E. BREEN, D.H. HOUSE, and P.H. GETTO .............................. 113
Physically-Based Interactive Camera Motion Control
Using 3D Input Devices
R. TURNER, F. BALAGUER, E. GOBBETTI, and D. THALMANN ................ 135
Aspects of Motion Design for Physically-Based Animation
D. HAUMANN, J. WEJCHERT, K. ARYA, B. BACON,
A. KHORASANI, A. NORTON, and P. SWEENEY ............................. 147

Chapter 3: Parallel Processing

Terrain Perspectives on a Massively Parallel SIMD Computer
G. VEZINA and P.K. ROBERTSON ........................................ 163
Surface Tree Caching for Rendering Patches
in a Parallel Ray Tracing System
W. LAMOTTE, K. ELENS, and E. FLERACKERS ............................. 189

Chapter 4: Volume Rendering

Context Sensitive Normal Estimation for Volume Imaging
R. YAGEL, D. COHEN, and A. KAUFMAN .................................. 211
Rapid Volume Rendering Using a Boundary-Fill Guided
Ray Cast Algorithm
P.M. HALL and A.H. WATT ............................................. 235
A 3D Surface Construction Algorithm for Volume Data
R. SHU and R.C. KRUEGER ............................................. 251

Chapter 5: Visualization Methods

Compositional Analysis and Synthesis
of Scientific Data Visualization Techniques
H. SENAY and E. IGNATIUS ............................................ 269
Precise Rendering Method for Edge Highlighting
T. TANAKA and T. TAKAHASHI .......................................... 283
Fractals Based on Regular Polygons and Polyhedra
H. JONES and A. CAMPA ............................................... 299

Chapter 6: Ray Tracing/Rendering

Ray Tracing Gradient Index Lenses
K.G. SUFFERN and P.H. GETTO ......................................... 317
Shapes and Textures for Rendering Coral
N.L. MAX and G. WYVILL .............................................. 333
A New Color Conversion Method
for Realistic Light Simulation
T. NAKA, K. NISHIMURA, F. TAGUCHI, and Y. NAKASE .................... 345

Chapter 7: Picture Generation

Two-Dimensional Vector Field Visualization
by Halftoning
R.V. KLASSEN and S.J. HARRINGTON .................................... 363
Three Plus Five Makes Eight: A Simplified Approach
to Halftoning
G. WYVILL and C. McNAUGHTON ......................................... 379

Chapter 8: Computational Geometry

A Theory of Geometric Contact for Computer
Aided Geometric Design of Parametric Curves
C. LEE, B. RAVANI, and A.T. YANG .................................... 395
Generalization of a Family of Gregory Surfaces
K. UEDA and T. HARADA ............................................... 417
A New Control Method for Free-Form Surfaces
with Tangent Continuity and its Applications
K. KONNO, T. TAKAMURA, and H. CHIYOKURA ............................. 435
Hybrid Models and Conversion Algorithms
for Solid Object Representation
L. DE FLORIANI and E. PUPPO ......................................... 457
Surface Generation Using Implicit Cubics
B. GUO .............................................................. 485

Chapter 9: Visualization in Engineering

Equilibrium and Interpolation Solutions Using Wavelet Bases
A.P. PENTLAND ....................................................... 507
Dynamic 3D Illustrations with Visibility Constraints
S.K. FEINER and D.D. SELIGMANN ...................................... 525
Piecewise Linear Approximations
of Digitized Space Curves with Applications
I. IHM and B. NAYLOR ................................................ 545

Chapter 10: Fluid Flow Visualization

Ellipsoidal Quantification of Evolving Phenomena
D. SILVER, N. ZABUSKY, V. FERNANDEZ, M. GAO,
and R. SAMTANEY ..................................................... 573
Smoothed Particle Rendering for Fluid Visualization
in Astrophysics
M. NAGASAWA and K. KUWAHARA ......................................... 589

Chapter 11: Applications

Pan-Focused Stereoscopic Display Using a Series
of Optical Microscope Images
K. KANEDA, S. ISHIDA, and E. NAKAMAE ................................ 609
Reconstructing and Visualizing Models of Neuronal Dendrites
I. CARLBOM, D. TERZOPOULOS, and K.M. HARRIS ......................... 623
A Visualization and Simulation System
for Environmental Purposes
M. GROSS and V. KUHN ................................................ 639
Synchronized Acquisition of Three-Dimensional Range
and Color Data and its Applications
Y. WATANABE and Y. SUENAGA .......................................... 655
Piecewise Planar Surface Models from Sampled Data
D.A. SOUTHARD ....................................................... 667

Conference Organization Committee .................................... 681
List of Sponsors ..................................................... 683
List of Technical Reviewers .......................................... 685
List of Contributors ................................................. 687
Keyword Index ........................................................ 689
Chapter 1
Keynote and Invited Papers
Visualization: New Concepts and Techniques
to Integrate Diverse Application Areas
Tosiyasu L. Kunii and Yoshihisa Shinagawa

ABSTRACT

Visualization models are diverse as a natural reflection of the diversity of the rapidly growing appli-
cation areas of visualization. This work tries to integrate them into a small set of
handy models based on a few crisp concepts. As one such concept, we take the Reeb graph and
show how to build an integrated visualization model named ModelVisual serving to integrate the
diversity. ModelVisual is an abstraction hierarchy of incrementally modular data structure which
includes the topological, geometrical and other layers and the operators working on the layers to
recognize, view and display them. The self-visualizing visualization model is provided for imple-
menting ModelVisual. The application models integrated are: the homotopy model, which in turn
integrates the triangulation model, the spline surface model and the loft surface model; the singular-
ity model applied to garment wrinkling; and the tree model applied to forest growth. The implication of
the integrated visualization model for the visual computer architecture through information localiza-
tion is also clarified.
Keywords: integrated visualization model, Reeb graph, self-visualizing visualization model, homotopy,
singularity, forest growth model, information locality, visual computer architecture

1. INTRODUCTION

This research is based on a belief, following scientific tradition, that diversity in the appearance of
phenomena and objects does not necessarily mean diversity in the rules governing them. The
rules are sometimes called models or theories. Usually a few basic rules suffice, as with quantum
theory in physics and the periodic table in chemistry. The term "visualization" is used very popu-
larly to signify the computer display of the appearances, which are indeed diverse. Then, what are a
few basic rules governing the diversity of visualization?

2. ModelVisual: AN INTEGRATED VISUALIZATION MODEL

As in the other established disciplines, the structure common to the appearances of the diverse
phenomena and objects visualized provides the basis of the rules. It is a type of abstract data struc-
ture, hierarchically organized for modularity.

An incrementally modular visual structure can be built by having topology at the most abstract layer
of the hierarchy, geometry at the next layer by adding the coordinate system and the measure to the
top layer, and non-structural information such as colors and mass at the bottom layer. The operators
defined are: the layer-specific operators, which consist of the topological, geometrical and other
attribute operators, and the global operators, which include the view, recognition, display and data-
base operators.
The data structure thus defined serves as the basic visualization model and is named Model-
Visual. For the topological layer, to go beyond what has already been proposed and developed
while keeping the layer modular in itself, surface topology which can go beyond graph-
theoretical and combinatorial topology is incorporated. To be more specific, in addition to the
Euler-Poincaré characteristic (see for example, do Carmo 1976) of a surface to model the visible sur-
face and its changes in terms of the number of the vertices, edges, faces, holes and rings of the sur-
face appearing in three dimensional modelers (see for example, Mäntylä and Sulonen 1982; Chi-
yokura and Kimura 1983; Chiyokura 1988), the mountaineer's equation (see for example, Griffiths
1981) as an aspect of the Morse theory (see for example, Milnor 1969) is utilized to model the sur-
face curve characteristics and changes in terms of the number of the peaks, pits and passes of the sur-
face, at the nodes of the Reeb graph (for details, see Thom 1988; Shinagawa, Kunii, Nomura,
Okuno and Young 1990), which plays the role of the generalized topological "skeleton" of the
three dimensional surface structure of the object or phenomenon being visualized.

ModelVisual as a candidate basic visualization model is tested later against the cases of visualization
applications, particularly the visualization of complex physical objects and phenomena such as
human auditory ossicles, garment wrinkling and forest growth, and it is shown that ModelVisual
does model the fundamental visual structure of all the cases and thus serves as the basic model.
At the end of the paper, the conceptual architecture of visual computers is presented and dis-
cussed in the light of the information locality made possible by ModelVisual.

3. A NOTION OF SELF-VISUALIZING VISUALIZATION MODEL: ModelVisual AS A SELF-VISUALIZER

Let us now organize the basic visualization model ModelVisual introduced previously, and briefly
sketch a small core concept for making it self-visualizing. This is not an easy job, but it is necessary,
and worth considering as one of the major targets of visualization. The basic computational
methods of self-generation are known in different areas, including the early work of Von Neumann
on the theory of self-reproducing automata (Von Neumann 1966) and the rather popular but partial
realization of compiler-compiler tools, yacc and lex (see for example, Schreiner and Friedman, Jr.
1985; Pyster 1988). To the best of our knowledge, however, a self-visualizing visualization model is
not known yet.
Let us first confirm that to visualize is for human beings. Then, the notion of the self-visualizing visu-
alization model means that the model contains the display information on all the structures, operators
and their relationships, so that the model displays itself for human beings to recognize and select, out
of what is displayed, the operators and also the structures as the operands of the operators. The human
being can, at least partially but hardly fully, delegate the recognition and/or selection operations to
the model. The model always prompts, on the display screen, the course and results of the opera-
tions for further human interaction.
For realizing the evolutionary architecture of the model, ModelVisual is designed to be incre-
mentally modular, namely any evolution in the structure can be added on without affecting the exist-
ing structure. The layered hierarchical structure is adopted to implement the architecture. The
model contains the following: the structure layers I, the operators II, and the self-visualization
mechanism III, which is the implementation of the self-visualizing visualization model. The opera-
tors are grouped into two, intra-layer operators II.A and inter-layer operators II.B, and perform 8
categories of functions: define, transform, update, delete, search, recognize, select and display. The
self-visualization mechanism III consists of the self-visualization administrator III.1, the two inter-
faces - the human interface III.2 and the model interface to interact with the rest of the model III.3
-, the self-visualizing symbol depository III.4 and the self-visualizing operators III.5, based on our
early work on a menu generator (Kunii and Shirota 1989). A provision is also made for supporting
visualization data sharing, prototyping and history management through a visualization database
management system (Krishnan and Kunii 1990).
The real power of ModelVisual to integrate diverse application areas in visualization is in the
abstraction hierarchy of the structure layer. As shown in the rest of the paper, the incremental
modularity of the abstraction hierarchy allows all the information to be attached freely to the core
information of the model which is in the topology layer. Actually, to represent the topological infor-
mation of objects, the Reeb graph is used in our approach and other information is attached to the
Reeb graph (Thom 1988; Shinagawa, Kunii, Nomura, Okuno and Young 1990). Before finishing the
section, we list the structure layers of ModelVisual:

I. Structure layers
I.1. The topology layer
Topological structures which are continuous as in the case of a topological space, discrete as in
the case of a graph in graph theory, or both.
I.2. The geometry layer
Geometrical structures which also are continuous as in the case of analytical surfaces, discrete
as in the case of volume rendered images and tessellated textures, or both.
I.3. The non-visual structure layer
The other attributes of the structures, which are visible as colors, invisible as mass, or both.
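As an illustration of this layering, the following is a minimal sketch in Python (our own; all class and field names are assumed, not the authors' implementation) of how lower-layer data might be attached incrementally to the nodes of a topological skeleton without touching the topology layer itself:

    from dataclasses import dataclass, field
    from typing import Any, Dict, List, Tuple

    @dataclass
    class ReebNode:
        node_id: int
        geometry: Any = None            # layer I.2: e.g. a contour or a CT slice
        attributes: Dict[str, Any] = field(default_factory=dict)  # layer I.3: color, mass, ...

    @dataclass
    class LayeredStructure:             # sketch of the ModelVisual structure layers
        nodes: List[ReebNode] = field(default_factory=list)        # layer I.1: topology
        arcs: List[Tuple[int, int]] = field(default_factory=list)  # Reeb graph arcs

        def attach_geometry(self, node_id: int, data: Any) -> None:
            # Incremental modularity: lower-layer data is attached to a node
            # while the topological layer remains unchanged.
            self.nodes[node_id].geometry = data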

4. REPRESENTATION OF TOPOLOGY BY A REEB GRAPH

We define the Reeb graph which ModelVisual uses to represent the topological "skeleton". Georges
Reeb first introduced this graph in his thesis (for details, see Thom 1988; Shinagawa, Kunii,
Nomura, Okuno and Young 1990). The Reeb graph is defined on a manifold. A manifold can be
regarded as a space that has the same local properties as Euclidean space; two sets of parameters
(coordinates) are continuously related wherever the ranges of two homeomorphic maps from planes
to open pieces of a parametrically represented surface overlap. The formal definition can be found,
for example, in Armstrong 1983. For instance, a plane or a sphere is a 2D (two dimensional)
manifold. In what follows, the surface of a 3D object is considered as a 2D manifold.

The definition of the Reeb graph is as follows:


DEFINITION

Let f: M → R be a real-valued function on a manifold M. The Reeb graph of f is the quotient
space of the graph of f in M × R by the equivalence relation ~ given below:
    (X₁, f(X₁)) ~ (X₂, f(X₂))
holds if and only if f(X₁) = f(X₂) and X₁, X₂ are in the same connected component of f⁻¹(f(X₁)).
The equivalence class of X is denoted by [X] in what follows. For convenience, we define a function
    π: M → M × R/~
as
    π(X₁) = [X₁].
That is, the two points on the graph (X₁, f(X₁)) and (X₂, f(X₂)) are represented as the same node
π(X₁) in the Reeb graph if the values of f are the same and they belong to the same connected
component of the inverse image of f(X₁) (or f(X₂)). All points that belong to the same equivalence
class of the original space are represented as a node in the quotient space such as the Reeb graph.
In this paper, the Reeb graph of the height function h(X) on the surface (the 2D manifold) of an
object is considered. Here, h(X) gives the height of the point on the manifold
    X = (x₁, x₂, x₃), where x₁, x₂, x₃ ∈ R,
namely,
    h(x₁, x₂, x₃) = x₃.
For simplicity, we consider the height z of each point
    X = (x, y, z)
on the surface. The Reeb graph of the height function on a surface is the quotient space of the
graph (x, y, z) which identifies (x₁, y₁, z) and (x₂, y₂, z) if these two points are in the same con-
nected component of the cross section of the surface at the height z. For example, the Reeb graph
of the height function on the torus shown in Fig. 1a is as in Fig. 1c. This is easy to see when we con-
sider cross-sectional planes perpendicular to the z axis as in Fig. 1b; each contour line on each plane
is represented as a node in the Reeb graph.
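This cross-sectional construction suggests a simple discrete procedure. The following minimal Python sketch (our own illustration, not the authors' code) builds a discrete Reeb graph of the height function from an object sampled as a stack of cross sections, each given as the connected components of occupied grid cells; applied to torus slices it yields the loop structure of Fig. 1c:

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        height: float
        cells: frozenset                      # grid cells of one connected component
        neighbors: list = field(default_factory=list)

    def build_reeb_graph(slices):
        """slices: list of (z, [set of cells per connected component]), sorted by z.
        Nodes on adjacent slices are joined when their components overlap."""
        nodes, previous = [], []
        for z, components in slices:
            current = [Node(z, frozenset(c)) for c in components]
            for node in current:
                for prev in previous:
                    if node.cells & prev.cells:   # overlapping components lie on one arc
                        node.neighbors.append(prev)
                        prev.neighbors.append(node)
            nodes.extend(current)
            previous = current
        return nodes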

4.1 Mountaineer's Equation

The most important nodes of the Reeb graph are the nodes that represent the pits, passes and peaks.
Fig. 1. A torus and the critical points (a), the cross sections (b) and the Reeb graph (c)

Fig. 2. The cylindrical coordinate system on a garment

These points are called the critical points. If the numbers of pits, passes and peaks are denoted as
#(pits), #(passes) and #(peaks) respectively, then
    #(pits) - #(passes) + #(peaks) = χ(S).
This equation is known as "the mountaineer's equation" and can be found in Griffiths (1976). χ(S)
is the Euler-Poincaré characteristic of the surface S, which is related to the number g of holes (han-
dles) through S as
    g = (2 - χ(S)) / 2.
g is also called the genus of the surface S. A torus, for example, has one peak, two passes and one
pit as shown in Fig. 1. Its Euler-Poincaré characteristic is 0 because its genus is 1. It is easy to
see that the mountaineer's equation holds for a torus because
    χ(torus) = 1 - 2 + 1 = 0.
The discussion of the Euler-Poincaré characteristic can be found, for example, in do Carmo (1976).
In the Reeb graph, g is the number of loops of the graph.
The mountaineer's equation serves to maintain topological integrity when the shape of an object is
changed by topological or geometrical shape operators. For example, when a pit is created on a sur-
face, the mountaineer's equation makes sure that a pass is generated with it so that the genus of the
surface remains unchanged.
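As a worked illustration (a minimal sketch of our own, not code from the paper), the integrity check that such a shape operator would run reads:

    def genus(chi):
        # g = (2 - chi) / 2, the number of handles (loops of the Reeb graph)
        return (2 - chi) // 2

    def mountaineers_equation_holds(pits, passes, peaks, chi):
        """Check #(pits) - #(passes) + #(peaks) = chi(S) after a shape edit."""
        return pits - passes + peaks == chi

    # Torus: one pit, two passes, one peak; chi = 0, so genus 1.
    assert mountaineers_equation_holds(pits=1, passes=2, peaks=1, chi=0)
    assert genus(0) == 1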

5. INTEGRATED MODELING OF GEOMETRY BY ModelVisual

Having the Reeb graph in the top, topological layer of the integrated visualization model Model-
Visual, we now show how versatile the model becomes in uniformly modeling diverse visualization
cases. To make the explanation clear, we look into the next lower layer of abstraction in the
abstraction hierarchy of the model. The master of the scene is the Reeb graph; the scene is the
second, geometrical layer of the model. What we are actually showing is how different types of
geometrical information can be modularly and incrementally associated with the Reeb graph in the
first layer to form the second layer. With the Reeb graph this is simple: it is done by associating any
geometrical information with the nodes of the Reeb graph.
These applications are:
1. surface reconstruction from the contours by the homotopy model which appears in various medical
and geographical visualization including computed tomography (CT) and also topographical maps;
2. volume rendering in medical visualization;
3. critical points and bifurcation in garment wrinkling modeling;
4. generating walk-through animations for human body inner trips;
5. saddle points as the critical points in forest growth visualization.

Case 1. Surface reconstruction from the contours by the homotopy model


A node p of the Reeb graph is associated with a contour (also called a contour line) π⁻¹(p) of the
object modeled by ModelVisual. The information lost when forming the equivalence classes of the
Reeb graph is completely recovered by this continuous geometrical information. When the object
represented has a complex shape, as seen commonly in natural objects, the advantage of the hierarch-
ical modular structure of ModelVisual becomes prominent: the small key information in the top
layer, the Reeb graph, is stored in the primary memory, while the very large geometrical infor-
mation attached to it in the geometry layer is stored in the secondary memory. In this case, the
surface (the 2D manifold M) is reconstructed from a finite number of contours. This corresponds
to surface reconstruction from contours.
Case 2. Volume rendering
A node of the Reeb graph is associated with the interior image of a contour line. The interior
image can be a cross-sectional image such as a CT (computed tomography) image, and thus is
associated with a node of the Reeb graph. This representation finds good application in volume
rendering. Medical imagery favors volume rendering techniques for reconstructing a solid object
from a given series of CT images (Herman and Liu 1979; Chen, Herman, Reynolds and Udupa 1985;
Frieder, Gordon and Reynolds 1985; Levoy 1988; Drebin, Carpenter and Hanrahan 1988; Upson and
Keeler 1988). Octrees are also used to this end (Meagher 1982; Mao, Kunii, Fujishiro and Noma
1987), and serve as another instance of the case.
Case 3. Singularity theory, critical points and bifurcation
A node p of the Reeb graph is associated with the critical points of the contour π⁻¹(p). First, each
contour is assumed to be represented by a differentiable shape function
    fᵢ: [0, 1] → π⁻¹(p), with
    fᵢ(s) = (x, y).
A function
    Πₓ: R² → R
that projects a point on the xy-plane to the x axis is then defined as
    Πₓ(x, y) = x.
Finally, the node p of the Reeb graph is associated with the critical points (local maxima and
minima) and the characteristic points (the p+ +c singularity) of Πₓ(fᵢ). The Reeb graph representa-
tion is used in modeling garment wrinkling based on singularity theory (Kunii and Gotoda 1990;
Kunii and Gotoda 1991). In this model, the cylindrical coordinate system (r, θ, z) and the projec-
tion function Πᵣ to the r axis are used instead of the Cartesian coordinate system, as in Fig. 2.
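A minimal numeric sketch (our own, assuming a densely sampled closed contour) of extracting the critical points of such a projected shape function:

    import numpy as np

    def projection_critical_points(contour_xy):
        """Indices where the x-projection of a closed contour attains a local
        maximum or minimum; these are the critical points attached to the
        Reeb-graph node of that contour."""
        x = contour_xy[:, 0]
        prev_x, next_x = np.roll(x, 1), np.roll(x, -1)
        is_max = (x > prev_x) & (x > next_x)
        is_min = (x < prev_x) & (x < next_x)
        return np.where(is_max | is_min)[0]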

Case 4. Walk-through animation


A node p of the Reeb graph is associated with a location inside the contour that p represents. This
finds application in a walk-through animation (Shinagawa, Kunii, Nomura, Okuno and Young 1990).
That is, when a viewpoint is located on a point associated with p and p moves along the Reeb
graph, we can walk through and observe the inside of the object we are visualizing. A walk-through
animation is useful for the simulation of a gastroscope or a needle otoscope (Nomura 1982).
Case 5. Saddle points as the critical points in modeling forest growth
For botanical tree- and forest-growth visualization as shown in Aono and Kunii (1984) and Kunii
and Enomoto (1991), the Reeb graph can represent the skeletons of the trees and also the pattern of
forest formation processes. In this application, the critical points, particularly the saddle points
(passes) of the Reeb graph, are significant. In modeling tree growth, the passes correspond to the
branching points, and the peaks and pits to the tips of the branches growing upward and downward,
respectively. The branches growing horizontally become the singular points of the Reeb graph of the
height function and should be treated separately as a special case; one immediate remedy is to rotate
the axes of the height function, for example by 90 degrees, to avoid this artefact of artificial singularity.
To improve the modeling capability of ModelVisual, such a temporary remedy is not really desir-
able. Instead of the height function, we have to use a more generalized function for the Reeb graph,
such as a function defined along an arbitrary curve with a non-uniform measure.
In modeling forest growth, the tree interaction through mutual shading results in either the further
growth or the diminution of the trees at the various locations of a forest. Such locations are the
branching points of the forest growth and become the saddle points of the Reeb graph when we use
the growth function for each tree instead of the height function. The nodes that are not critical can
be interpolated from the critical points. For this reason, they are derivative and can be neglected.
With each saddle point, branching information (for example, the branching angle) is associated and
stored in the second, geometrical layer of ModelVisual.
We have briefly sketched how diverse visualization cases can be uniformly modeled by ModelVisual
based on the Reeb graph. Sections 6, 7 and 8 are dedicated to further explanations of the three
visualization cases 1, 3 and 5, to show the extent of the diversity we have to handle.

6. THE HOMOTOPY MODEL

This section is devoted to the theme of case 1 of the previous section, the surface reconstruction
from the contours, which are stored in the second, geometrical layer of ModelVisual as the gen-
erative information and are associated with the Reeb graph in the top, topological layer of
ModelVisual.
This is a high-demand area, particularly in the medical field, where constructing the entire shape of a
human organ from a set of cross-sections is of great significance because it is difficult to envisage
the three-dimensional structure of the organ by viewing individual slices.
When a surface model is used, there are two typical ways to do this. One method approximates the
contours using linear line segments and the other uses a spline function. The triangular tile tech-
niques use the former method to generate triangular patches between contours on adjacent cross-
sectional planes (see Fig. 3) (Fuchs, Kedem and Uselton 1977; Christiansen and Sederberg 1978;
Boissonnat 1988). Wu, Abel and Greenberg (1977) used the latter method and reconstructed surfaces
with a spline approximation between the adjacent contours. The homotopy model (Shinagawa, Kunii,
Nomura, Okuno and Hara 1989; Shinagawa and Kunii 1991) uses a homotopy and includes the two
reconstruction methods described above as special cases. In this model, each contour is represented
by a shape function and the surface generated between a series of contours is represented by a
homotopy as the locus of the transformation (Fig. 4). This enables us to handle the problem con-
tinuously, not discretely. To be more precise, each contour on the plane z = zⱼ is parametrized by a
variable s ∈ [0, 1] and represented by a function
    fⱼ: [0, 1] → π⁻¹(p), with
    fⱼ(s) = (x, y).
Then, the surface reconstructed between the contours on the planes z = zⱼ and z = zⱼ₊₁ is represented
by the homotopy
    F: [0, 1] × [0, 1] → R²
from fⱼ to fⱼ₊₁ such that
    F(s, 0) = fⱼ(s) and
    F(s, 1) = fⱼ₊₁(s)
for all points s ∈ [0, 1].
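As an illustration, here is a minimal Python sketch (our own, assuming the simplest straight-line homotopy and a shared parametrization; the continuous toroidal graph described next chooses the correspondence more carefully):

    import numpy as np

    def homotopy_surface(f_lower, f_upper, n_s=64, n_t=16):
        """Sample F(s, t) = (1 - t) * f_lower(s) + t * f_upper(s).
        f_lower and f_upper map s in [0, 1] to (x, y) contour points; each
        row of the result is one intermediate contour of the locus."""
        s = np.linspace(0.0, 1.0, n_s)
        t = np.linspace(0.0, 1.0, n_t)
        lower = np.array([f_lower(u) for u in s])      # shape (n_s, 2)
        upper = np.array([f_upper(u) for u in s])
        return (1.0 - t)[:, None, None] * lower + t[:, None, None] * upper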
The homotopy model consists of the continuous toroidal graph representation and the homotopic
generation of surfaces from the representation. The continuous toroidal graph shows how each con-
tour line is parametrized as a function and shows the correspondence between the adjacent contours.
Now we are ready to generalize the discrete toroidal graph by abstracting the features essential to
reconstructing surfaces from contours. First of all, the lower and upper contours are represented by the
parameters x and y. The points on the contours are designated by the shape functions f(x) and g(y)
whose domains are the interval [0, 1]. In the continuous toroidal graph, the horizontal and vertical
distances between two vertices of the continuous toroidal graph represent the differences of the
parameter values between the two vertices. When a path passes through (x, y), where f(x) is the coor-
dinate value of a point P and g(y) is that of a point Q, it means that P and Q are connected by a
homotopy.
Generally, a surface is represented by a concatenation of monotonically increasing functions, as
shown in Fig. 5. The choice of this path is itself an interesting theme (Shinagawa and Kunii
1991). The toroidal graph that represents triangular patches is expressed as a concatenation of step
functions. This can be considered as an approximation of the monotonically increasing functions in
the discrete case. The spline approximation method is represented by the function y = x on the con-
tinuous toroidal graph. This can be considered as an approximation of the monotonically increasing
functions in the piecewise continuous interpolation case. Thus, the continuous toroidal graph abstractly
models the basic key feature of these typical approximation models.
Then, the homotopy model is more than a generalization of the typical surface approximation
models. By carefully specifying a set of toroidal graphs, such essential surface operations as taking
the first-, second- and higher-order derivatives of the surface to identify the surface properties can
be defined on the surface generated by the homotopy model as needed, qualifying the homotopy
model to take part in the basic model. The properties include the peaks, pits, and saddles of the
surface and serve as the nodes of the Reeb graph as explained in the section 4.1 on the
mountaineer's equation. The derivatives also yield the ridges and valleys, which serve as the arcs of
the Reeb graph.
This means the homotopy model allows ModelVisual to generate the Reeb graph as the core of the
top abstract layer from the lower geometrical layer.

Fig. 3. Triangular tile technique

Fig. 4. Surface generation using a homotopy

Fig. 5. A path on the continuous toroidal graph

Fig. 6. Three auditory ossicles reconstructed by the homotopy corresponding to a cardinal spline surface
Fig. 6 shows the three human auditory ossicles (malleus, incus and stapes) reconstructed by using a
homotopy that corresponds to the cardinal spline surface. The outline curves are approximated by
the cardinal spline. Fig. 7 shows the same objects reconstructed by Christiansen's triangulation
method with Gouraud shading. The shape looks ambiguous.

7. MODELING GARMENT WRINKLING

This section gives a closer look at the theme of case 3 in section 5, on singularity theory, crit-
ical points and characteristic points, through the visualization modeling of a garment wrinkling
phenomenon. Questions asked all the time once we started to talk about garment wrinkling were:
"Why wrinkling? Why not something of more scientific or industrial merit?" Indeed, to model
complexity, it serves as a good example, both scientifically and industrially. For the scientific merit,
of course, we could pick the whole universe instead. The advantage of taking garment wrinkling
as an example to study is in its handiness, coming along with complexity sufficient to understand
its nature, such as singularity.
One of the main concerns of cosmology is to model the beginning and evolution of the universe as a
whole. Singularity theory plays the central role there. Look at the initial creation of matter in
the universe from the state of homogeneous energy distribution in space (see for example Wald
1984; Weinberg 1972). Since the energy is equal to the mass of matter multiplied by the square of
the speed of light, there is a possibility to consider and model the creation of mass as the creation of a
wrinkle in the energy space, where the homogeneous energy distribution was broken and high energy
concentration takes place at the locations where matter exists. The birth of matter is modeled by
the birth of singularity. In our visualization model ModelVisual, at the highest level of abstraction,
garment wrinkle creation can possibly be modeled the same way, simply by replacing the word 'energy'
by 'cloth' and changing the scale factors. Although whether this can actually be done or not is an
open problem, one of the largest challenges and temptations is testing the wrinkling modeling of the
creation of the universe against the computer graphics four dimensional (4D) visualization of
astronomical observations. Challenging open problems is the privilege of scientists, who are
essentially volunteers to discover something new.
Let us now turn to the question of the industrial merit of studying garment wrinkling. In the fashion
industry, garment wrinkling is now considered an important key factor in garment design,
especially at the highest level of fashion design. Fashion designers try to get the most out of any-
thing that constitutes garments. Particularly, to exploit the physical characteristics of the fabric of
garments, wrinkling occupies a major position in fashion design, for example to give a relaxed
and casual atmosphere to garments when worn. The design process has a number of stages includ-
ing the initial sketch by fashion designers and the extraction of patterns from the initial sketch by
pattern making experts. The extracted patterns, when assembled, are expected to match the original
image of the fashion designers. The traditional tools of designers have long been limited to crayons
and paper. Recently several CAD systems have been proposed to assist the designers with graphical
editing and 3D previewing tools, but little has been done to fulfill the requirement of simulating the
wrinkling behavior of garments.
In 1990, Hinds and McCartney presented a system that can display the draped shape of garments.
They used sinusoidal functions defined on quadrilateral patches to create folds or wrinkles. The
resulting images, however, look rather stiff. Recently two major modeling techniques, usually called
physically-based modeling and geometrical modeling, have been devised to model soft objects.
Geometrical models (Wyvill, McPheeters and Wyvill 1986; Barr 1984) use geometrical information
to represent the shape of objects. Jacobians (coordinate transformation matrices) or scalar fields are
examples of such geometrical information. Although geometrical models are simple and easy to pro-
gram, dynamic constraints are relatively difficult to incorporate into these models, and accordingly
the resulting objects sometimes behave unrealistically. Physically-based models (Platt and Barr
1988; Terzopoulos and Fleischer 1988; Weil 1986), while computationally more complex than
geometrical models, offer realism. A set of differential equations derived from physical laws plays
a central role here. At first glance, with the advent of high-power and low-cost workstations, these
physically-based models look superior to geometrical models because they seem to be able to simulate
what is really occurring. However, physically-based models are computationally too expensive, and
they tend to be unstable under widely changing dynamical constraints. Sometimes the result is
drastically affected by a small change of the parameter values.
Thus, to model the shapes of garment wrinkles, it is not practical to simulate their behavior by
directly solving differential equations, because a garment is an object composed of extremely
many components. We prefer to use some global information to reduce the amount of computation.

Fig. 7. Three auditory ossicles reconstructed using the triangulation method with Gouraud shading

Fig. 8. Typical signs: (a) a fold, (b) crossing lines and (c) a cusp
7.1 Wrinkling as Global Information
We need to understand and model the nature of wrinkling as the global information (Kunii and
Gotoda 1990a; Kunii and Gotoda 1990b). When a garment is deformed, wrinkles are formed or
extinguished. Shape changes are mainly observed around wrinkles, while the other parts of the gar-
ment remain unchanged. In fact, the geometry of the wrinkles and of the other parts are different: at
the wrinkles both the metric and the curvature change, whereas at the other parts the metric is
preserved and only the curvature changes. This observation implies a potential key hypothesis of the
wrinkles as the indexes of global shape change.
This hypothesis is used to construct a model of garment wrinkling by employing a mathematical
method known as singularity theory (Arnold 1986). This theory provides a mathematical foundation
for handling the qualitative geometric changes of a given system. The basic idea of singularity theory
is to consider the singular set of a surface-to-surface mapping. More specifically, a series of projections
of a given surface are taken and analyzed. Fig. 8 shows three typical types of projections. In the
framework of singularity theory, our particular interest is related to contours. The theory shows that
the signs depicted in Fig. 8 are the only stable patterns in general, and the other types of signs are
unstable, namely, if we take a projection from a slightly different direction, the pattern is decom-
posed into some combination of the signs shown in Fig. 8.
An important point here is that one can distinguish general types of signs which are stable from spe-
cial types of signs which are unstable. If no a priori knowledge is assumed on the surface to be
analyzed and if a projection is taken in an arbitrary direction, then the sign to be observed is almost
always one of those in Fig. 8. The other types of signs are too rare to be observed. However, if the
surface does have a special structure, we can expect that the rare types of signs are also observed.
We conclude here that the stable signs can serve as the indexes of surface projections.
7.2 Singularity Signs in Garment Wrinkling
When a garment wrinkles, in most cases the following signs appear in the projections: cusps, folds
and crossing lines. As shown in Fig. 9, however, there are special instances where other types
of signs emerge when the direction of viewing a given shape is changed. This figure shows a situation
where (a) a cusp and a fold approach each other, (b) then merge, and (c) finally depart from each
other. Note that (a) and (c) show different configurations: the lines that form the cusp and the fold
are exchanged in the process of merging. The merged state (b) is classified as the p+ +c singularity
(Arnold 1986; Kergosien 1981), which describes the structure of branching.
The same p+ +c singularity also describes the structure of vanishing. Branching and vanishing are
complementary to each other. A point on a surface can be a branching point when it is viewed from
one side, and a vanishing point when it is viewed from the other side. Since this singularity
is very rare, the behavior of the points corresponding to this type of singularity can serve as far
greater constraints than the behavior of the other points. Such p+ +c singular points are the
characteristic points of garment wrinkles. They serve as the primary indexes of the global shape
change. In ModelVisual, the primary indexes are associated with the nodes of the Reeb graph in the
top topological layer.
7.3 Modeling Primitives of Wrinkles
As explained above, the local structures around the characteristic points remarkably influence the
global structures of garment wrinkles.
Fig. 9. A case where the p+ +c singularity emerges: (a) fold and cusp, (b) p+ +c singularity, and (c) fold and cusp

Fig. 10. Wrinkle modeling primitives

Now, having the basic sets of modeling primitives in the top layer of ModelVisual associated with
the nodes of the Reeb graph, which in turn are the positions and the types (branching or vanishing)
of characteristic points, what is the most essential information to be stored in the second, geometrical
layer of ModelVisual that is sufficient to reconstruct the original shape from the compressed data in
a satisfactory way? Our choice is the contours that are associated with the characteristic points of the
special singular configurations (see Fig. 10) in the top layer. This enables us to limit the number of
possible wrinkle shapes to be reconstructed to those which are valid. Note here that the contours are
not necessarily on planes. For specifying the contours, we choose several sample points and interpolate
the coordinate values between those points by a curve such as a spline function.

It is usually observed that several wrinkles appear on the garment cloth, and become connected with
each other or diminish at their ends. Our modeling primitives can describe such a complex situa-
tion: connections can be represented by branching points and diminishings by vanishing points. The
description yields a graph named a wrinkle graph, as shown in Fig. 11, where vertices correspond to
the characteristic points and edges to the associated contours. The wrinkle graph is another
instance of the Reeb graph, and the graph adequately abstracts the essential structure of a wrinkling
phenomenon.
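A minimal data-structure sketch of such a wrinkle graph (our own; the names are hypothetical):

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class CharacteristicPoint:
        position: Tuple[float, float, float]
        kind: str                    # 'branching' or 'vanishing'

    @dataclass
    class WrinkleEdge:
        ends: Tuple[int, int]        # indices into the vertex list
        contour: List[Tuple[float, float, float]] = field(default_factory=list)
        # sampled (not necessarily planar) contour, stored in the geometry layer

    @dataclass
    class WrinkleGraph:              # another instance of the Reeb graph
        vertices: List[CharacteristicPoint] = field(default_factory=list)
        edges: List[WrinkleEdge] = field(default_factory=list)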

7.4 Testing ModelVisual through Wrinkling Animation


Different from the case of modeling human auditory ossicles, modeling garment wrinkling is a case
of modeling a dynamic phenomenon. Hence, testing requires animation. Using ModelVisual, two
examples of wrinkle formation processes are animated as shown in Fig. 12. A case of wrinkles
formed around the arm of a jacket is used. The first basic parameter is the angle between the human
forearm and the upper arm at the elbow. Since the arm of a jacket is surface-topologically modeled
as a cylinder in ModelVisual, the cylindrical coordinate system (r, θ, z) and the projection function
Πᵣ to the r axis are used. A node p of the Reeb graph is associated with the critical points of the
contour π⁻¹(p). Each contour is assumed to be represented by a differentiable shape function
    fᵢ: [0, 1] → π⁻¹(p), with
    fᵢ(s) = (r, θ, z).
A function
    Πᵣ: R³ → R
that projects a point (r, θ, z) to the r axis is then defined as
    Πᵣ(r, θ, z) = r.
Finally, the node p of the Reeb graph is associated with the characteristic points (the p+ +c singular-
ity) of Πᵣ(fᵢ). To test the model in the simplest situation first, we assume a situation where no
characteristic points are newly created or destroyed during the process of wrinkle formation. This
means the Reeb graph in the first layer of ModelVisual is assumed fixed during the animation, and
the nodes of the Reeb graph can be utilized as the second parameter of this animation. The
animated result approximates the visual reality. Further testing is planned for garment drapery and
human facial wrinkles.

8. MODELING FOREST GROWTH

Case 5 of section 5 outlined how we could model the branching patterns of tree- and forest-
growth by the Reeb graph in the top abstraction layer of ModelVisual. This section gives a follow-
up description of tree- and forest-growth modeling, which is based on interacting tree growth model-
ing, to the extent necessary to visualize forest growth. The main emphasis here is on how to
abstract the essential topological, geometrical and other features based on ModelVisual, leaving the
details of the interaction modeling to Kunii and Enomoto (1991).

First, let us understand the nature of a forest to the extent necessary to model it properly by Model-
Visual. In a forest, a tree interacts with the other trees of the same or different species and with the
environment, including the other vegetation and animals, from the time of its start as a seed through
the growth and maturity periods until the time of its death. In forestry, the process by which vegetation
invades a large and bare area and grows into a stable state is called the primary succession. Cle-
ments (1916) called the stable state the climax. The process in which a tree cannot live out its lifetime
because of accidents such as diseases and insect damage is called the secondary succession. Forestry
considers succession as the main feature of a forest (see, for example, Shugart 1984). In a
forest, many trees, the other life and the environment interacting with each other form an ecological
17

---

,
I
I
I

--- ,
I
,,

Fig. 11 A wrinkle graph of modeling primitives as an instance of the Reeb graph

Fig. 12. Garment wrinkling animation


18

system, or a ecosystem. Then, modeling the forest ecosystem means modeling the forest succession.
In modeling the forest succession, two approaches exist: a top-down approach and a bottom-up
approach. The top-down models are called the forest models and the bottom-up models the tree
models. The top-down and bottom-up models are just two different views of the same reality
called a forest. Actually, by associating individual trees with the nodes of the Reeb graph, and their
relationships with the arcs, in the top layer of ModelVisual, we can integrate the two models into a
single model capable of representing the ecological succession of a forest as a set of interacting trees
in a given environment. A fairly elaborate and detailed tree model is built on the four dimensional
(4D) modeling of many species of trees, incorporating the interactions among individual trees (Kunii
and Enomoto 1991).

8.1 Tree Interaction Modeling


It is known that, among the internal properties of trees, the sunlight-photosynthesis relation dom-
inates the production rate of trees. Then, mutual shading is the dominating interaction among trees
that controls the forest growth and thereby decides the attributes associated with the arcs of the Reeb
graph. It means that a model appropriate for representing forest formation processes should be
able to procedurally, namely algorithmically, specify the growth of trees by utilizing the sunlight-
photosynthesis relation and then by automatically computing the amount of sunshine received after
mutual shading, using the geometrical information in the second layer of ModelVisual.
Such an algorithm for forest- and tree-growth specification takes the form of a parallel (also called
concurrent) algorithm based on the growth topology and geometry. However, to the best of the
authors' knowledge, the tree models used in visualizing tree- and forest-growth could not provide
such a parallel algorithm until we published the model in May 1984 under the name of the A-system
(Aono and Kunii 1984). It allowed interactive tree image generation, and produced three-dimensional
(3D) geometrical models of most kinds of higher-order trees from a few parameters (see Figs. 13 and
14):
(1) a divergence angle (d),
(2) branching angles (h1, h2), and
(3) contraction ratios (r1, r2).
The A-system had enough facilities for tree image generation. It also had the capabilities to compute
the total area of all the leaves of a tree, the effective total leaf area to receive sunlight, and the
production rate of the tree by the sunlight-photosynthesis relation. To efficiently compute the
effective total leaf area of a tree to receive sunlight in the second geometrical layer of ModelVisual,
a scan-line incremental method was developed.
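A toy sketch (our own, far simpler than the A-system, and restricted to 2D so the divergence angle d is ignored) of generating a branching skeleton from the branching angles and contraction ratios:

    import math

    def grow(pos, direction, length, depth, h1=25.0, h2=35.0, r1=0.8, r2=0.6):
        """Recursively emit 2D branch segments: at each node the branch splits
        into two children rotated by the branching angles h1, h2 (degrees) and
        scaled by the contraction ratios r1, r2."""
        if depth == 0:
            return []
        x, y = pos
        end = (x + length * math.cos(direction), y + length * math.sin(direction))
        segments = [(pos, end)]
        for angle, ratio in ((math.radians(h1), r1), (-math.radians(h2), r2)):
            segments += grow(end, direction + angle, length * ratio,
                             depth - 1, h1, h2, r1, r2)
        return segments

    skeleton = grow((0.0, 0.0), math.pi / 2, 1.0, depth=6)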
Enhancements were made to take the other internal properties of a tree into consideration, such as
modeling:
(1) the branching sequence,
(2) the growth and death of the branches and the trunk, and
(3) the rate of the foliage active in photosynthesis,
all depending on the species and age of the tree. The enhanced system is named the KEA-system.

There are too many leaves in a forest, and it is a waste of time to calculate the amount of sunlight
each leaf is bathed in. It is advantageous if we can change the processing unit into a larger one
without the loss of generality. Here, let us examine what happens if we choose a tree as the pro-
cessing unit. To simplify the model of a single tree, the top view of a tree is assumed to be a circle
(see Fig. 15).
Fig. 13. A divergence angle d

Fig. 14. Branching angles h1, h2 and contraction ratios r1, r2

Fig. 15. A circle model of trees

Fig. 16. A light consumption function: K is the absorption rate and D is the distance from the center of a tree of radius R

From the scan-line algorithm, the amount of sunlight received by each tree is given by the formula:
    Sum = K · πR²    (0.6 < K < 0.7)
where R is the radius of the circle of a tree, and K is a constant obtained from the scan-line algo-
rithm. The ground under the tree is not fully shaded, and there is some light streaming through the
leaves. By the scan-line algorithm, the amount of the sunlight falling through the leaves is calcu-
lated. To show the differences in the intensity of sunlight at the different parts of a tree, we define
the light consumption function (see Fig. 16). The above formula is modified, assuming that the light
intensity over the tree is I_ovr, as:
    L_consume = I_ovr · (aD² + bD + c)
where D is the distance from the tree root, R is the radius of the circle of the tree (see Fig. 15), and
a, b, and c are the constants for adjusting the sum of the sunlight to make it equal in the above two
formulae.
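A minimal sketch (our own helper functions, not the A-/KEA-system code) of the two formulae:

    import math

    def sunlight_received(R, K=0.65):
        """Total sunlight on a tree crown modeled as a circle of radius R;
        K is the scan-line constant, 0.6 < K < 0.7."""
        return K * math.pi * R * R

    def light_consumed(D, I_ovr, a, b, c):
        """Quadratic light consumption at distance D from the tree center,
        with constants a, b, c chosen to match the total above."""
        return I_ovr * (a * D * D + b * D + c)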

8.2 Visualization by Parallel Algorithmic Animation


For rendering, the numerous tree leaves were substituted by the tips of the branches
painted in the color of the leaves. By performing the parallel algorithmic animation of the tree
growth at one-year intervals, the tropical rain forest formation processes were visualized over 250
years (regarding the extended explanation of algorithmic animation, see Thalmann and Thalmann
1990). Fig. 17 shows a scene from the results of the visualization.
As mentioned before as case 5 in section 5, the tree interaction through mutual shading
results in either the further growth or the diminution of the trees at the various locations of a forest.
Such locations are the branching points of the forest growth and become the saddle points of the
Reeb graph of the growth function. The nodes that are not critical can be interpolated from the criti-
cal points. For this reason, they are derivative and can be neglected in the top topological layer of
ModelVisual. With each saddle point, branching information is associated and stored in the second
geometrical layer of ModelVisual.
Interestingly enough, the bifurcations of the forest growth observed in the tropical rain forest in Pasoh
of peninsular Malaysia were also observed in the parallel algorithmic animation. Here, the saddle
points of the Reeb graph are now the bifurcation points. This type of singularity observed on the
animation screen is not just fun to watch. It is a key to understanding the nature of the forest ecosys-
tem and then to modeling it by ModelVisual in a way common to other complex phenomena such as
the formation of the universe and garment wrinkling.

9. VISUALIZATION MODELING AND VISUAL COMPUTER ARCHITECTURE

9.1 The Visual Computer Architecture Based on the Principle of Locality


Visualization is computationally very heavy and in many cases too slow to run on currently available
hardware. Care has to be taken to overcome this situation. In hardware, besides device evolution,
computational speed gains have been achieved by taking information locality in time and space as the
basic principle (see, for example, Hennessy and Patterson 1990). Looking back at the evolution
of computer architecture, it can be discussed either from the viewpoint of the evolution of the
devices used or from the evolution of the objects processed. The former can safely be named the EE
(electrical and electronic) view of computer architecture and the latter the CS (computer science)
view.
Looking at the evolution of computers from the CS point of view, particularly from the principle of
locality, the first generation architecture, named the von Neumann architecture, was for numerical
computation with a unit of locality of one word, equivalent to a few bytes. The shift of the weight
in computing from number crunching to data processing caused the migration of many pages of
business files into computers, overflowing from the main into the secondary memory space. The
second generation computer architecture, called the virtual storage architecture, therefore assumed a
unit of locality of a few to several hundred bytes. Now we are talking about visualization.
What we have to do first is to estimate the unit of locality in visualization, aiming at the third generation
architecture named the visual computer architecture, a name seen as a journal title and theme
in 1985 (Kunii 1985). The unit of locality is in the range of a few kilobytes to megabytes.

Fig. 17. Side view and upper side view of tropical rain forest growth

9.2 Addressing Schemes


Locality has been turned into actual hardware speedup through addressing schemes
which exploit it. For numerical computation with a locality unit of one word, a word
address counter called the program counter, which pointed at a word in the main memory space,
served well, supported by an automatic counter increment mechanism. For files, the virtual storage
architecture added a file page counter, usually simply called the page counter, with an automatic
consecutive page roll-in/out mechanism between the main and secondary memories.
For visualization, ModelVisual has the potential to localize visual information using the Reeb graph in
the top layer as the addressing scheme in a node-at-a-time manner (this can be made multiple for
a parallel architecture), with the support of a graph tracer driven by the search operator of Model-
Visual, a node counter, an automatic counter incrementer, and a lower layer roll-in/out mechanism.
The size of the memory necessary to store the unit information attached to a node or a point p of
the Reeb graph and represented in the lower layers is enumerated in Section 9.3 below.
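
A minimal sketch of such a graph tracer, under assumed data structures (the class and field names below are illustrative and not taken from ModelVisual), might look as follows:

    class GraphTracer:
        # Traverses the Reeb graph in the top layer one node at a time,
        # rolling the lower-layer information of the current node into
        # memory and rolling the previous node's information out.
        def __init__(self, nodes, roll_in):
            self.nodes = nodes        # top-layer Reeb-graph nodes
            self.roll_in = roll_in    # fetches a node's lower-layer record
            self.counter = 0          # the node counter
            self.resident = None      # currently resident unit of locality

        def step(self):
            # Automatic counter increment; the assignment to self.resident
            # models the roll-in/out of the lower-layer unit of locality.
            node = self.nodes[self.counter]
            self.resident = self.roll_in(node)
            self.counter += 1
            return node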
9.3 Locality Unit in Visualization
Let us figure out the size of unit information in visualization based on the cases listed in Section 5.
Case 1. Surface reconstruction from the contours by the homotopy model
Each contour is usually approximated by linear line segments or a spline function. In both cases,
each contour is specified by giving a series of coordinate values of the points on it or control points
of the spline function. When double-precision floating point values are used for about 60 points, the
estimate is roughly 2 kilobytes for a contour.
Case 2. Volume rendering
A full-color image of the interior of a contour is about 3 megabytes in size, assuming that it is
represented by about 1024 x 1024 pixels.
Case 3. Singularity theory, critical points and bifurcation in garment wrinkling
Assuming that there are about 10 critical points on a contour and that double-precision floating point
values are used for the coordinate values, the estimate is roughly 320 bytes. For the simulation of
garment wrinkling, weave patterns and other attributes need to be added, as described in Kunii,
Amano, Arisawa and Okada (1974), and as in Case 2, 3 megabytes are required for each contour.
Case 4. Walk-through animation
By making the same assumption as above, 32 bytes are necessary for representing the location of
the view point. Adding the information on the line of sight and on the twist angle, it is about 66
bytes in size. For walk-through animation, the information of Case 1 is also necessary, and the
total is around 2 kilobytes.
Case 5. Saddle points as the critical points in modeling forest growth
Branching information is associated with each saddle point. The amount of memory required for a
point is on the order of 1 kilobyte.
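
These estimates reduce to simple arithmetic; the fragment below reproduces them, with the per-point value counts in Cases 3 and 4 being our assumptions chosen to match the byte totals quoted above:

    DOUBLE = 8                  # bytes per double-precision floating point value

    case1 = 60 * 3 * DOUBLE     # ~60 contour points, 3 coordinates each:
                                # ~1.4 KB, i.e., "roughly 2 kilobytes"
    case2 = 1024 * 1024 * 3     # full-color 1024 x 1024 interior image: ~3 MB
    case3 = 10 * 4 * DOUBLE     # ~10 critical points, 4 values per point
                                # (assumed): 320 bytes
    case4 = 4 * DOUBLE          # view-point location: 32 bytes; the line of
                                # sight and twist angle bring it to ~66 bytes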

CONCLUSIONS

Did we achieve the initial goal of finding a few, or hopefully a single, visualization model to cover
diverse application cases of visualization in an integrated way? Through a few but quite diverse case
studies, we can safely conclude that the proposed model, named ModelVisual, has survived the tests. It has
even been a thrilling experience to observe how effectively the modular incremental abstraction
hierarchy of ModelVisual works, without much breakdown, to model the diversity in a unified
manner. In particular, the Reeb graph serving as the core of the top layer has been shown to be more than
useful. It has been enlightening. The model has also revealed the potential capability to serve as the
foundation for designing the visual computer architecture. The principle of information locality
finds its best match with the Reeb graph to implement the addressing scheme of the visual
computer.

ACKNOWLEDGEMENT

The authors would like to thank all the sponsors who provided the funds and facilities for this
research. Dr. Y. Kergosien of University of Paris-Sud has been extremely helpful in giving us
valuable advice on singularity theory.

REFERENCES

Aono M, Kunii TL (1984) Botanical Tree Image Generation. IEEE Computer Graphics & Applica-
tions, Vol. 4, No. 5, pp. 10-34
Arnold VI (1986) Catastrophe Theory. 2nd ed., Springer-Verlag, Berlin, Heidelberg
Armstrong MA (1983) Basic Topology. Springer-Verlag, New York Berlin Heidelberg Tokyo, p.88,
pp.103-105, p.169
Barr AH (1984) The Global and Local Deformations of Solid Primitives. Computer Graphics (Proc.
SIGGRAPH), Vol. 18, No. 3, pp. 21-29
Chen L-S, Herman GT, Reynolds RA, Udupa JK (1985) Surface Shading in the Cuberille Environ-
ment. IEEE Computer Graphics & Applications, Vol. 5, No. 12, pp. 33-43
Chiyokura H, Kimura F (1985) A Method of Representing the Solid Design Process. IEEE Com-
puter Graphics & Applications, Vol. 5, No.4, pp.32-41
Chiyokura H (1988) "Solid modelling with Designbase." Addison-Wesley, Singapore Wokingham
Reading Menlo Park New York Don Mills Amsterdam Sydney Bonn Tokyo Madrid San Juan
Christiansen HN, Sederberg TW (1978) Conversion of Complex Contour Line Definitions into
Polygonal Element Mosaics. Computer Graphics (Proc. SIGGRAPH), Vol. 12, pp.187-192
Clements FE (1916) Plant succession: An analysis of the development of vegetation. Carnegie Insti-
tute Pub. 242, Washington, D.C
do Carmo MP (1976) Differential Geometry of Curves and Surfaces. Prentice-Hall, New Jersey
London Sydney Toronto New Delhi Tokyo Singapore, p.273
Drebin RA, Carpenter L, Hanrahan P (1988) Volume Rendering. Computer Graphics (Proc. SIG-
GRAPH), Vol. 22, No. 4, pp. 65-74
Frieder G, Gordon D, Reynolds RA (1985) Back-to-Front Display of Voxel-Based Objects. IEEE
Computer Graphics & Applications, Vol. 5, No.1, pp. 52-59
Fuchs H, Kedem ZM, Uselton SP (1977) Optimal Surface Reconstruction from Planar Contours.
Comm. ACM Vol. 20, No. 10, pp.693-702
Griffiths HB (1976) Surfaces. 2nd ed., Cambridge University Press, Cambridge London New York
New Rochelle Melbourne Sydney, p.82
Herman GT, Liu HK (1979) Three-Dimensional Display of Human Organs from Computed Tomo-
grams. Computer Graphics and Image Processing, Vol. 9, pp. 1-21
Hinds BK and McCartney J (1990) Interactive Garment Design. The Visual Computer, Vol. 6,
pp.53-61
Kergosien YL (1981) Topologie Differentielle. Comptes Rendus, 291, I, pp.929-932
Krishnan D, Kunii TL (1990) A Data Model for Engineering Design and Its Implementation Using a
Link-Oriented Database. In: Tjoa AM, Wagner R (eds.) Database and Expert Systems Applications
(Proc. of DEXA '90, Wien, Austria, August 29-31, 1990), Springer-Verlag
Kunii TL, Amano T, Arisawa H, Okada S (1974) An Interactive Fashion Design System 'INFADS.'
Computer and Graphics, Vol. 1, No.4, Pergamon Press (1975), pp.297-302 [presented at the
Conference on Computer Graphics and Interactive Techniques, July 15-17, 1974] [also reprinted
as Technical Report 86-2, Department of Information Science, Faculty of Science, the University
of Tokyo]
Kunii TL (1985) Editorial. The Visual Computer, Vol. 1, No. 1, p. A1 and p. 1
Kunii TL and Gotoda H (1990) Modeling and Animation of Garment Wrinkle Formation Processes.
In: Magnenat-Thalmann N and Thalmann D (eds.) Computer Animation' 90, pp. 131-147,
Springer-Verlag, Tokyo
Kunii TL, Enomoto H (1991) Forest: An Interacting Tree Model for Visualizing Forest Formation
Processes by Algorithmic Computer Animation - A Case Study of a Tropical Rain Forest -. To
appear in The Proceedings of Computer Animation '91, Springer-Verlag, Tokyo
Levoy M (1988) Display of Surfaces from Volume Data. IEEE Computer Graphics & Applications,
Vol. 8, No.3, pp. 29-37
Mäntylä M, Sulonen R (1982) GWB: A Solid Modeler with Euler Operations. IEEE Computer
Graphics & Applications, Vol. 2, No. 4, pp. 17-31

Mao X, Kunii TL, Fujishiro I, Noma T (Dec. 1987) Hierarchical Representations of 2D/3D Gray-
Scale Images and Their 2D/3D Two-Way Conversion. IEEE Computer Graphics & Applications
Vol. 7, No.6, pp. 37-44
Meagher DJ (1982) Geometric Modeling Using Octree-Encoding. Computer Graphics and Image
Processing, Vol. 19, pp. 129-147
Milnor J (1963) Morse Theory. Princeton University Press, Princeton
Nomura Y (1982) A Needle Otoscope. Acta Otolaryngol 93, pp.73-79
Hennessy JL, Patterson DA (1990) Computer Architecture: A Quantitative Approach. Morgan Kauf-
mann Publishers, Inc., Palo Alto
Pyster AB (1988) "Compiler Design and Construction." Van Nostrand Reinhold Company Inc., New
York Wokingham Melbourne Agincourt
Schreiner A, Friedman H (1985) Introduction to Compiler Construction with UNIX. Prentice-Hall,
London Sydney Rio de Janeiro Toronto Mexico New Delhi Tokyo Singapore Wellington
Shinagawa Y, Kunii TL, Nomura Y, Okuno T, Hara M (1989) Reconstructing Smooth Surfaces from
a Series of Contour Lines Using a Homotopy. In: Earnshaw RA, Wyvill B (ed) New Advances in
Computer Graphics. Springer-Verlag, New York Berlin Heidelberg Tokyo, pp.147-161
Shinagawa Y, Kunii TL, Nomura Y, Okuno T and Young Y-H (1990) Automating View Function
Generation for Walk-through Animation Using a Reeb Graph. In: Magnenat-Thalmann N, Thal-
mann D (eds.) Computer Animation '90. Springer-Verlag, Tokyo Berlin Heidelberg New York
London Paris Hong Kong, pp. 227-237
Shinagawa Y, Kunii TL (1991) The Homotopy Model: A Generalized Model for Smooth Surface
Generation from Cross Sectional Data. To appear in The Visual Computer, Vol.7
Shirota Y, Kunii TL (1989) Automatic Generator for Enhanced Menu Based Software -Program-
Specification-by-Examples-. In: Salvendy G, Smith MJ (eds.) Designing and Using Human
-Computer Interfaces and Knowledge Based Systems (Proc. of the Third International Confer-
ence on Human-Computer Interface). pp. 829-836, Elsevier, Amsterdam
Shugart HH (1984) A theory of forest dynamics. Springer-Verlag, New York
Terzopoulos D, Platt JC, Barr AH, Fleischer K (1987) Elastically Deformable Models. Computer
Graphics (Proc. SIGGRAPH), Vol. 21, No. 4, pp. 205-214
Thalmann NM, Thalmann D (1990) Synthetic Actors. Springer-Verlag, Heidelberg
Thom R (1988) Esquisse d'une Sémiophysique. InterEditions, Paris, p. 57
Upson C, Keeler M (1988) V-BUFFER: Visible Volume Rendering. Computer Graphics (Proc.
SIGGRAPH), Vol. 22, No.4, pp.59-64
Von Neumann J (1966) Theory of Self-Reproducing Automata. University of Illinois Press, Urbana
London
Wald RM (1984) General Relativity. The University of Chicago Press, Chicago London
Weil J (1986) The Synthesis of Cloth Objects. Computer Graphics (Proc. SIGGRAPH), Vol.20,
No.4, pp.49-54
Weinberg S (1972) Gravitation and Cosmology. John Wiley & Sons
Wu S, Abel JF, Greenberg DP (1977) An Interactive Computer Graphics Approach to Surface
Representation. Comm. ACM Vol. 20, No. 10, pp. 703-712
Wyvill G, McPheeters C, and Wyvill B (1986) Data Structure for Soft Objects. The Visual Com-
puter, Vol.2, pp.227-242

AUTHORS' BIOGRAPHIES

Tosiyasu L. Kunii is currently Professor of Information and Computer Science, the University of
Tokyo. At the University of Tokyo, he started his work in raster computer graphics in 1968, which
led to the Tokyo Raster Technology Project. His research interests include computer graphics,
database systems, and software engineering. He has authored and edited over 30 computer science
books, and published over 120 refereed academic/technical papers in computer science and
applications areas.
Dr. Kunii is Honorary President and Founder of the Computer Graphics Society, Editor-in-Chief
of The Visual Computer: An International Journal of Computer Graphics (Springer-Verlag),
Associate Editor-in-Chief of The Journal of Visualization and Computer Animation (John Wiley
& Sons), and on the Editorial Board of IEEE Transactions on Knowledge and Data Engineering,
VLDB Journal and IEEE Computer Graphics and Applications. He is on the IFIP Modeling and
Simulation Working Group, the IFIP Data Base Working Group and the IFIP Computer Graphics
Working Group. He organized and chaired the Technical Committee on Software Engineering
of the Information Processing Society of Japan from 1976 to 1981.
He is on the board of directors of the Japan Society of Sports Industry and also of the Japan Society
of Simulation and Gaming. He served as general and program chairman of various international
conferences.

He received the B.Sc., M.Sc., and D.Sc. degrees in chemistry all from the University of Tokyo in
1962, 1964, and 1967, respectively. He is a fellow of IEEE and a member of ACM, BCS, IPSJ
and IEICE.

Address: Department of Information Science, Faculty of Science, the University of Tokyo, 7-3-1
Hongo, Bunkyo-ku, Tokyo, 113 Japan

Yoshihisa Shinagawa is currently a Research Associate of the Department of Information Science
of the University of Tokyo. His research interests include computer graphics and its applications.
He received the B.Sc. and M.Sc. degrees in information science from the University of Tokyo in
1987 and 1990, respectively. He is a member of the IEEE Computer Society, ACM, IPSJ and IEICE.
Address: Department of Information Science, Faculty of Science, the University of Tokyo, 7-3-1
Hongo, Bunkyo-ku, Tokyo, 113 Japan
Introduction to Volume Synthesis
Arie Kaufman

ABSTRACT
A recently increasing trend in the use of discrete voxel representation for a variety of
geometry-based applications is apparent. These applications include CAD, simulation, and
animation, as well as those that intermix geometric objects with 3D sampled or computed
datasets. In these applications, the inherently continuous 3D geometric scene is sampled
employing voxelization (3D scan conversion) algorithms, which generate a 3D raster of
voxels. The voxelized objects have to conform to some 3D discrete topological
requirements such as connectivity and absence of tunnels. During the voxelization process,
also termed the volume synthesis process, each voxel is assigned precomputed numeric
values that represent some measurable viewing-independent properties of a tiny cube of the
real object. These values are then readily accessible for speeding up the rendering process.
The voxelization algorithms are the counterparts of the 2D scan conversion algorithms, and
the 3D raster generated by them is the 3D counterpart of the conventional 2D raster.

Key Words: volume visualization, volumetric graphics, 3D discrete space, voxel,
voxelization, 3D scan conversion, 3D raster, tunnel-free surface.

1. INTRODUCTION
Volumetric graphics is the subfield of computer graphics concerned with volume synthesis,
volume modeling, and volume visualization. Typically, volumetric graphics employs a
cubic frame buffer, a 3D raster of voxels (volume elements), to store the volumetric dataset
in discrete voxel space. Volumetric graphics is thus the 3D conceptual counterpart of 2D
raster graphics. Volume synthesis, which is the subject of this paper, is the process of
converting a geometric representation of a synthetic volumetric model into a set of voxels
that "best" represents that model within the discrete voxel space. The field of volume
modeling encompasses the synthesis as well as the analysis and manipulation of sampled
and/or synthetic objects contained within a volumetric dataset. Volume visualization, on
the other hand, is a visualization method concerned with the representation, manipulation,
and rendering of volumetric data. Once a geometric scene is discretized and stored in a 3D
raster, possibly intermixed with sampled data, volume modeling and visualization can be
used to manipulate, observe, and interpret the scene.

Common applications of volume visualization handle sampled or measured data, as in, for
example, medical imaging (e.g., [Bakalash and Kaufman 1989, Brewster et al. 1984, Farrell,
Yang, and Zappulla 1985, Fuchs, Levoy, and Pizer 1989, Herman and Udupa 1983, Ney et
al. 1990, Rhodes 1990, Tiede et al. 1990, Vannier, Marsh, and Warren 1983]), biology (e.g.,
confocal microscopy [Kaufman et al. 1990, Leith, Marko, and Parsons 1989]), geoscience
(e.g., seismic measurements [Sabella 1988, Wolfe and Liu 1988]), industry (e.g., inspection
[Kruger and Cannon 1979]), meteorology [Grotjahn and Chervin 1984, Hibbard and
Santek 1989, Upson and Keeler 1988], and molecular systems (e.g., electron density maps
[Goodsell, Mian, and Olson 1989]). Although the voxel representation is more effective for
sampled imagery, it also has a significant utility in synthetic volumetric graphics, such as
computer-aided design (e.g., solid modeling [Kaufman 1989], finite elements, material
stress patterns), simulation and animation (e.g., Hughes Aircraft's RealScene flight
simulator), and scientific visualization (e.g., astrophysical simulation, fluid flow [Shirley and
Neeman 1989, Upson and Keeler 1988]). Furthermore, in many applications like surgical
planning and radiation therapy planning, the sampled data need to be visualized along
with objects that may not be available in digital form, such as osteotomy surfaces, surgical
cuts, prosthetic devices, grafts, scalpels, injection needles, isodose surfaces, and radiation
beams. These objects can be voxelized and intermixed with the sampled organ and
rendered using a variety of approaches [Goodsell, Mian, and Olson 1989, Kaufman, Yagel,
and Cohen 1990, Levoy 1990]. Other uses of synthetic voxel objects have included
Greene's voxel space automata for simulating plant growth [Greene 1989], Kajiya and
Kay's 3D texture for fur rendering [Kajiya and Kay 1989], and hypertextures [Perlin and
Hoffert 1989].

2. 3D DISCRETE TOPOLOGY
A nomenclature and a theory of 3D discrete topology have been developed, including topics
such as 6-, 18-, and 26-neighbors, connectivity, paths, and metrics. The terms used here
for 3D discrete topology are generalizations of those used in 2D discrete topology [Kong
and Rosenfeld 1989, Pavlidis 1982, Rosenfeld 1981, Srihari 1981]. Let us denote the
discrete finite 3D voxel-image space, which is a 3D array of grid points, by Z³. We shall
term a "voxel at coordinates (x, y, z)" as the continuous region (u, v, w) such that
x-0.5 < u ≤ x+0.5, y-0.5 < v ≤ y+0.5, and z-0.5 < w ≤ z+0.5. Thus, a voxel "occupies"
a unit cube centered at the grid point (x, y, z). The voxel is usually assumed to be
homogeneous, that is, it is small enough to contain information regarding one single object
only. The 3D array of voxels, also termed the 3D raster, tessellates the 3D grid of voxels.

Each voxel (x, y, z) ∈ Z³ has three kinds of neighbors:

(1) six face neighbors: (x±1, y, z), (x, y±1, z), and (x, y, z±1);
(2) twelve edge neighbors: (x±1, y±1, z), (x±1, y, z±1), and (x, y±1, z±1); and
(3) eight corner (vertex) neighbors: (x±1, y±1, z±1).
We further define the six face neighbors as 6-neighbors. The combination of the six face
and twelve edge neighbors is defined as 18-neighbors, while that of all three kinds of
neighbors is defined as 26-neighbors. A 6-connected path is a sequence of voxels where
consecutive pairs are from the same class (e.g., black, opaque, bone in CT scan) and are 6-
neighbors. The 18- and 26-connected paths are similarly defined. A 6-connected tunnel is
a path of 6-connected transparent voxels through a surface or a volume. An 18-connected
tunnel and a 26-connected tunnel are similarly defined [Cohen and Kaufman 1990].
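
As an illustrative sketch (not drawn from any particular voxelization system), the three neighbor classes can be enumerated directly from the definitions above:

    from itertools import product

    def neighbors(x, y, z):
        # Classify the 26 grid neighbors of voxel (x, y, z) by how many
        # coordinates differ: 1 -> face, 2 -> edge, 3 -> corner.
        face, edge, corner = [], [], []
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            k = (dx != 0) + (dy != 0) + (dz != 0)
            p = (x + dx, y + dy, z + dz)
            if k == 1:
                face.append(p)      # 6 face neighbors
            elif k == 2:
                edge.append(p)      # 12 edge neighbors
            elif k == 3:
                corner.append(p)    # 8 corner (vertex) neighbors
        return face, edge, corner
    # 6-neighbors = face; 18-neighbors = face + edge; 26-neighbors = all three.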
Synthetic objects in 3D discrete space are classified based on their connectivity (for 1D
objects, i.e., curves), absence of tunnels (for 2D objects, i.e., surfaces), and absence of
cavities (for 3D objects, i.e., solids). The next section provides more details on the
connectivity and the absence of tunnels of synthetic objects (see also [Cohen and
Kaufman 1990]).
29

3. VOXELIZATION
Voxelization converts the scene into a discrete voxel representation by 3D scan-converting
(voxelizing) each of the geometric objects comprising the scene. A 3D voxelization
algorithm for a given geometric object generates the set of voxels that best approximates
the continuous specification of the object and stores the set in a 3D array of unit voxels.

In the past, digitization of solids was performed by spatial enumeration algorithms which
employ point or cell classification methods in an exhaustive fashion, or preferably by
recursive subdivision [Lee and Requicha 1982]. However, subdivision techniques for model
decomposition into rectangular subspaces are computationally expensive and thus
inappropriate for medium or high resolution grids. The 3D scan-conversion algorithms we
have developed, on the other hand, follow the same paradigm as the 2D scan-conversion
algorithms commonly used in 2D raster graphics: they are incremental, use simple
arithmetic (preferably integer-based), and have a complexity that is not more than linear
with the number of voxels generated.
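
To make the paradigm concrete, the following is a generic integer 3D line voxelizer in the same incremental spirit; it is a textbook 3D Bresenham variant, not the specific algorithm of [Kaufman and Shimony 1986], and it emits a 26-connected run of voxels:

    def voxelize_line(p0, p1):
        # Integer-only Bresenham-style stepping along the dominant axis;
        # consecutive output voxels are 26-adjacent, endpoints included.
        x, y, z = p0
        x1, y1, z1 = p1
        dx, dy, dz = abs(x1 - x), abs(y1 - y), abs(z1 - z)
        sx = 1 if x1 >= x else -1
        sy = 1 if y1 >= y else -1
        sz = 1 if z1 >= z else -1
        dm = max(dx, dy, dz)          # length along the driving axis
        ex = ey = ez = dm // 2        # per-axis error terms
        voxels = []
        for _ in range(dm + 1):
            voxels.append((x, y, z))
            ex -= dx
            if ex < 0:
                ex += dm
                x += sx
            ey -= dy
            if ey < 0:
                ey += dm
                y += sy
            ez -= dz
            if ez < 0:
                ez += dm
                z += sz
        return voxels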

The literature of 3D scan conversion is relatively small. Danielsson [Danielsson 1970]


developed a 3D curve algorithm where the curve is defined by the intersection of two
implicit surfaces. Mokrzycki [Mokrzycki 1988] elaborated on Danielsson's method and
developed a similar algorithm. We have introduced elsewhere [Kaufman and
Shimony 1986] voxelization algorithms for 3D lines, 3D circles, and a variety of surfaces
and solids, including polygons, polyhedra, and quadrics. We have also described efficient
algorithms for voxelizing polygons using an integer-based decision mechanism embedded
within a scan-line filling algorithm [Kaufman 1987a]; for voxelizing parametric curves,
surfaces, and volumes using an integer-based forward differencing technique
[Kaufman 1987b]; and for voxelizing quadrics and quartics such as cylinders, spheres,
cones, and tori using "weaving" algorithms in which a discrete circle/line sweeps along a
discrete circle/line [Cohen and Kaufman 1990]. All these algorithms preserve the topology
and allow control over the connectivity of the voxelized objects.

As defined in Section 2, every two consecutive voxels are 26-adjacent along a 26-connected
curve or line. Similarly, every two consecutive voxels are 18-adjacent along an 18-
connected curve or line, and every two consecutive voxels are 6-adjacent along a 6-
connected curve or line. Voxelization algorithms should generate curves and lines that are
either 6-, 18-, or 26-connected. The decision of which type to use depends on the application.

A voxelized surface approximating a connected continuous surface has to be connected


(e.g., 6-, 18-, 26-connected). However, connectivity does not fully characterize the surface
because the voxelized surface may contain local discrete holes, termed tunnels, that are not
present in the continuous surface. A tunnel is a passage by a discrete line through a
voxel-based surface. The line crosses from one side of the surface to the other. A
requirement of our surface voxelization algorithms is that the volumetric representation of
a (continuous) surface must be "thick enough" not to allow the discrete rays to penetrate
it. A voxelized surface through which 6-connected rays (3D lines) do not penetrate is a
6-tunnel-free surface, a thicker surface through which 18-connected rays cannot pass is an
18-tunnel-free surface, while an even thicker surface through which 26-connected rays
cannot pass is a 26-tunnel-free surface. The voxelization algorithms should generate any of
these types of surfaces; the decision which one to use depends primarily on the application
and on the connectivity of the discrete rays employed during the rendering.
30

4. WHY VOXELIZE?
An appealing characteristic of the volumetric representation is its insensitivity to the
complexity of the scene. Volume rendering performance depends almost solely on the
(constant) resolution of the 3D raster. On the other hand, a typical 3D raster occupies a
large amount of memory; for example, for a moderate resolution of 512³, one byte per
voxel, the 3D raster consumes 128 megabytes. Since computer memories are significantly
decreasing in price and increasing in their compactness, such large memories are becoming
more and more feasible. Yet, the extremely large throughput that has to be handled
requires a special architecture and processing attention [Goldwasser 1984, Jackel 1985,
Kaufman and Bakalash 1988, Ohashi, Uchiki, and Tokoro 1985]. Volume engines,
analogous to the currently available geometry engines, will prevail in the future, with the
capability to synthesize, load, store, manipulate, and display volumetric datasets in real
time.

Unlike conventional surface graphics, the voxel-based image is in discrete form. This is the
cause for most of the difficulties of the voxel-based representation, similar to those of 2D
rasters [Eastman 1990]. Manipulation and transformation of the discrete volume are
difficult to achieve without degrading the image quality or losing some information.
Rotation of rasters by angles other than multiples of 90 degrees is especially problematic
since a sequence of consecutive rotations will completely distort the image. Another
problem is apparent when the camera zooms in on the 3D raster. Voxels appear to be
parted from each other and may cause holes and high aliasing. Nevertheless, these
problems can be alleviated to some extent by methods similar to those adopted by 2D
raster graphics, such as antialiasing, supersampling, and smoothed shading.

A main difference between voxel-based graphics and surface graphics is that in
the former the scene is created (voxelized) once for multiple viewpoints and lighting
conditions, while the latter must repeatedly scan convert the scene after every slight
change in the viewing parameters. Consequently, synthesizing complex scenes is especially
attractive. For example, adding texture to an object or applying smoothing methods
[Wang, Cohen, and Kaufman 1991] are one-time computations that improve the realism of
the image with no cost at the rendering time.

Moreover, in anticipation of repeated access to the volume data (e.g., animation), various
view-independent attributes can be precomputed as part of the voxelization, stored with
the voxel, and be made readily accessible to speed up the rendering. The voxelization
algorithm generates for each object voxel its color as well as its texture color, the normal
vector, and precomputed information concerning the light sources' interaction with the
voxel. The latter information may include, for example, two bits per light source to denote
whether that light source is visible, occluded, or visible through a translucent material.
Actually, the view-independent parts of the illumination equation, that is, the ambient
illumination and the sum of the attenuated diffuse illumination of all the visible light
sources, can also be precomputed.
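
An illustrative record layout for such precomputed attributes might look as follows; the field names and the 2-bit packing are our assumptions, not the layout of any particular system:

    from dataclasses import dataclass

    @dataclass
    class VoxelAttributes:
        color: tuple            # (r, g, b) object color
        texture_color: tuple    # (r, g, b) precomputed texture sample
        normal: tuple           # (nx, ny, nz) normal vector
        light_bits: int         # 2 bits per light source: visible,
                                # occluded, or seen through translucency
        diffuse: tuple          # precomputed ambient plus attenuated
                                # diffuse contribution of visible lights

    def light_state(v, i):
        # Unpack the 2-bit state of light source i.
        return (v.light_bits >> (2 * i)) & 0b11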

A central feature of volumetric representation is that, unlike surface representation, it


contains the inner structures of objects which can be revealed and explored with the
appropriate rendering techniques. Natural objects as well as synthetic objects are likely to
be solid rather than hollow. Moreover, features such as translucency and amorphous
phenomena (e.g., clouds, fire or smoke) are volumetric in nature [Ebert and Parent 1990,
Upson 1986]. The inner structures provide another dimension of image complexity which
is inherently supported by volumetric graphics.

Rasters, in general, lend themselves to block operations such as bitblt or the 3D


counterpart, voxblt [Kaufman 1991]. In particular, the 3D raster lends itself to Boolean
operations that can be performed on a voxel-by-voxel basis during the voxelization stage.
This makes ray tracing of constructive solid models straightforward. Block operations add
a variety of modeling capabilities which aid in the task of model synthesis by supporting
the manipulation of true solids. Once a model has been constructed in voxel
representation, it is rendered the same way any other 3D raster is. Figure 1 shows a
volumetric ray tracing of a 256³ reconstructed MRI head with an oblique cut and a mirror
that has been generated by a CSG subtraction operation of voxelized disks from a
voxelized polygon during the voxelization phase.

Volume synthesis is particularly useful for ray casting or ray tracing. Once a 3D raster
with precomputed view-independent attributes is available, a discrete ray tracing/casting
algorithm is engaged in which 3D discrete rays are traversed through the 3D raster.
Encountering a non-transparent voxel by the ray traversal algorithm indicates a ray-
surface hit, thereby eliminating the need for the computationally expensive ray-object
intersection calculations. This volumetric ray tracing approach is especially attractive for
ray tracing complex geometric scenes and constructive solid models, as well as 3D sampled
and computed datasets. We have implemented the voxelization and volumetric ray tracing
algorithms [Yagel et al. 1991] on Sun 4, Silicon Graphics Iris 4D, and HP Turbo SRX
workstations, as part of the Cube system [Kaufman 1988], which is a general-purpose
voxel-based system for 3D graphics. Figures 1 and 2 show examples of objects voxelized
and then ray traced in discrete voxel space. In spite of the complexity of these scenes, ray
tracing time was approximately the same as for much simpler scenes, providing a
substantial improvement in computational speed over existing algorithms.
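
The core of such a discrete ray traversal is very simple; a minimal sketch, reusing a line voxelizer like the voxelize_line sketch given in Section 3 and assuming that the value 0 marks a transparent voxel, is:

    def cast_discrete_ray(raster, p0, p1, transparent=0):
        # Walk the voxelized ray through the 3D raster; the first
        # non-transparent voxel encountered is the ray-surface hit,
        # so no ray-object intersection computation is needed.
        for (x, y, z) in voxelize_line(p0, p1):
            if raster[x][y][z] != transparent:
                return (x, y, z)
        return None   # the ray leaves the volume without a hit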

5. CONCLUDING NOTE
The progress so far in volume visualization, in computer hardware, and in software
development, coupled with the desire to reveal the inner structures of volumetric datasets,
guarantees that volume visualization will develop into a dominant field in computer
graphics and its associated applications, using either sampled, computed, or synthesized
datasets. Just as raster graphics in the seventies superseded vector graphics for visualizing
surfaces, volume (3D raster) visualization is now superseding surface graphics for handling
and visualizing volumes. Furthermore, just as raster graphics has been dominant for
rendering both surfaces and lines, volume visualization, coupled with progress in hardware,
will liberate computer graphics from the limitations of 2D raster graphics, and will prevail
in the nineties and beyond for rendering volumes as well as surfaces and lines.

Acknowledgment

This project has been supported by the National Science Foundation under grants MIP-
8805130 and IRI-9008109, and grants from Hughes Aircraft Company, Hewlett Packard,
and Silicon Graphics. This paper is based on a technical report [Cohen, Kaufman, and
Yagel 1991J written with Dany Cohen and Roni Yagel.
32

Figure 1: A 256³ resolution MRI head mirrored on a back plane. An oblique cut uncovers a
portion of the brain. The edges of the mirror were subtracted by disks in a CSG operation.
Volumetric ray tracing time is only 23.4 seconds running on a 20 MIPS workstation.

Figure 2: Newell's teapot over a textured floor, mirrored in two mirror-walls. Volumetric
ray tracing time for this 256³ resolution image is only 30.7 seconds running on a 20 MIPS
workstation.

6. REFERENCES

Bakalash, R. and Kaufman, A., "MediCube: a 3D Medical Imaging Architecture",


Computers & Graphics, 13, 2, 151-157, (1989).

Brewster, L. J., Trivedi, S. S., Tuy, H. K., and Udupa, J. K., "Interactive Surgical
Planning", IEEE Computer Graphics and Applications, ., 3, 31-40, (March 1984).

Cohen, D. and Kaufman, A., "Scan Conversion Algorithms for Linear and Quadratic
Objects", in Volume Visualization, A. Kaufman, (ed.), IEEE Computer Society Press, Los
Alamitos, CA, 280-301, 1990.

Danielsson, P. E., "Incremental Curve Generation", IEEE Trans. on Computers, C-19,


783-793, (1970).

Eastman, C. M., "Vector versus Raster: A Functional Comparison of Drawing


Technologies", IEEE Computer Graphics & Applications, 10, 5, 68-80, (September 1990).

Ebert, D. S. and Parent, R. E., "Rendering and Animation of Gaseous Phenomena by


Combining Fast Volume and Scanline A-buffer Techniques", Computer Graphics, 24, 4,
367-376, (August 1990).

Farrell, E. J., Yang, W. C., and Zappulla, R. A., "Animated 3D CT Imaging", IEEE
Computer Graphics and Applications, 5, 12, 26-32, (December 1985).

Fuchs, H., Levoy, M., and Pizer, S. M., "Interactive Visualization of 3D Medical Data",
IEEE Computer, 22, 8, 46-51, (August 1989).

Goldwasser, S. M., "A Generalized Object Display Processor Architecture", IEEE


Computer Graphics and Applications, 4, 10, 43-55, (October 1984).

Goodsell, D. S., Mian, S., and Olson, A. J., "Rendering of Volumetric Data in Molecular
Systems", Journal of Molecular Graphics, 7, 41-47, (March 1989).

Greene, N., "Voxel Space Automata: Modeling with Stochastic Growth Processes in Voxel
Space", Computer Graphics, 23, 3, 175-184, (July 1989).

Grotjahn, R. and Chervin, R., "Animated Graphics in Meteorological Research and


Presentation", Bulletin of American Meteorological Society, 66, 1201-1208, (1984).

Herman, G. T. and Udupa, J. K., "Display of 3D Digital Images: Computational


Foundations and Medical Applications", IEEE Computer Graphics and Applications, 3, 5,
39-46, (August 1983).

Hibbard, W. and Santek, D., "Visualizing Large Data Sets in the Earth Sciences", IEEE
Computer, 22, 8, 53-57, (August 1989).

Jackel, D., "The Graphics PARCUM System: A 3D Memory Based Computer Architecture
for Processing and Display of Solid Models", Computer Graphics Forum, 4, 1, 21-32,

(January 1985).

Kajiya, J. T. and Kay, T. L., "Rendering Fur with Three Dimensional Textures",
Computer Graphics, 23, 3, 271-280, (July 1989).

Kaufman, A. and Shimony, E., "3D Scan-Conversion Algorithms for Voxel-Based


Graphics", Proc. ACM Workshop on Interactive 9D Graphics, Chapel Hill, NC, 45-76,
October 1986.

Kaufman, A., "An Algorithm for 3D Scan-Conversion of Polygons", Proc.


EUROGRAPHICS'87, Amsterdam, Netherlands, 197-208, August 1987.

Kaufman, A., "Efficient Algorithms for 3D Scan-Conversion of Parametric Curves,


Surfaces, and Volumes", Computer Graphics, 21, 4, 171-179, (July 1987).

Kaufman, A. and Bakalash, R., "Memory and Processing Architecture for 3-D Voxel-Based
Imagery", IEEE Computer Graphics 8 Applications, 8, 6, 10-23, (November 1988), Also
translated into Japanese, Nikkei Computer Graphics, 3, No. 30, March 1989, pp. 148-160.

Kaufman, A., "The CUBE Workstation - A 3D Voxel-Based Graphics Environment", The


Visual Computer, 4, 4, 210-221, (1988).

Kaufman, A., "Voxel-Based Solid Modeling", Proc. International Conference on


CAD/CAM and AMT, Jerusalem, Israel, 1.1.3-1-3, December 1989.

Kaufman, A., Yagel, R., and Cohen, D., "Intermixing Surface and Volume Rendering", in
3D Imaging in Medicine: Algorithms, Systems, Applications, K. H. Hoehne, H. Fuchs, and
S. M. Pizer, (eds.), 217-227, June 1990.

Kaufman, A., Yagel, R., Bakalash, R., and Spector, I., "Volume Visualization in Cell
Biology", Proceedings Visualization '90, San Francisco, CA, 160-167, October 1990.

Kaufman, A., "The voxblt Engine: A Voxel Frame Buffer Processor", in Advances in
Graphics Hardware III, A.A.M. Kuijk and W. Strasser, (eds.), Springer-Verlag, Berlin,
1991.

Kong, T. Y. and Rosenfeld, A., "Digital Topology: Introduction and Survey", Computer
Vision, Graphics and Image Processing, 48, 3, 357-393, (December 1989).

Kruger, R. P. and Cannon, T. M., "The Application of Computerized Tomography,


Boundary Detection, and Shaded Graphics Reconstruction to Industrial Inspection",
Materials Evaluation, 36, 75-80, (April 1978).

Lee, Y. T. and Requicha, A. A. G., "Algorithms for Computing the Volume and Other
Integral Properties of Solids: I-Known Methods and Open Issues; II-A Family of Algorithms
Based on Representation Conversion and Cellular Approximation", Communications of the
ACM, 25,9, 635-650, (September 1982).

Leith, A., Marko, M., and Parsons, D., "Computer Graphics for Cellular Reconstruction",
IEEE Computer Graphics and Applications, 9, 5, 16-23, (September 1989).

Arie Kaufman is a Professor of Computer Science at the State University of
New York at Stony Brook. He is the director of the Cube project for volume
visualization supported by the National Science Foundation, Hughes Aircraft
Company, Hewlett-Packard Company, Silicon Graphics Company, and the State
of New York. Kaufman has held positions as a Senior Lecturer and the Director
of the Center of Computer Graphics of the Ben-Gurion University in Beer-Sheva,
Israel, and as an Associate and Assistant Professor of Computer Science at FIU in
Miami, Florida. His research interests include volume visualization, computer
graphics architectures, algorithms, and languages, user interfaces, and scientific
visualization. Professor Kaufman has lectured widely and published numerous
technical papers in these areas. He has been the Papers Chair and Program co-
Chair for the Visualization '90 and Visualization '91 Conferences, respectively, co-
Chair for several EUROGRAPHICS Graphics Hardware Workshops, and a
member of the IEEE CS Technical Committee on Computer Graphics. He
received a BS in Mathematics and Physics from the Hebrew University of
Jerusalem in 1969, an MS in Computer Science from the Weizmann Institute of
Science, Rehovot, in 1973, and a PhD in Computer Science from the Ben-Gurion
University in 1977.
Address: Department of Computer Science, State University of New York at
Stony Brook, Stony Brook, New York 11794-4400, USA.
Electronic mail: ari@sbcs.sunysb.edu
Computer Visualization in Spacecraft Exploration
of the Solar System
W. Reid Thompson and Carl Sagan

Computer visualization, including graphical, imaging, and animation methods, is an essential


tool for spacecraft exploration of the solar system. Its applications range from planning tra-
jectories and orbital tours, and animation of spacecraft encounters, to a variety of applications
involving the visual display and scientific analysis of returned data. Scientific visualization can
take forms as varied as the suite of instruments carried by the spacecraft, but includes represen-
tation of motion and fields, animation of imaging sequences, and the effective communication
of scientific results by the use of novel computer-generated graphics, images, and animated se-
quences. We present a few examples drawn from spacecraft flybys of the outer planets and the
ongoing scientific analysis of the wealth of data they have returned.

Keywords: spacecraft, image processing, visualization, planetary science

INTRODUCTION.

From the late 1950's to today, the planning, execution, and data analysis phases of spacecraft
exploration, planetary science, and Earth remote sensing have stimulated development and novel
applications of computer hardware and software. There are many areas within the broad scope
of human and robotic activities in space for which the development and application of computa-
tional tools could be discussed. Here we concentrate on the voyages of exploration and scientific
discovery in our solar system: a quest that started 30 years ago with early reconnaissance of the
Moon and Mars, and continues today with such spacecraft as Galileo and Magellan.

VISUALIZATION IN SPACECRAFT MISSION PLANNING.

Techniques of computer modeling and visualization are used in all phases of spacecraft explo-
ration, from design and planning to mission operations to scientific analysis of the data returned.
As one example of applications in the design phase, we mention the challenge of orbital tour
design. The formulation of interplanetary trajectories and orbits is sometimes relatively simple,
but today spacecraft can make use of close encounters with planets and/or moons to accomplish
much more complex missions than would otherwise be feasible. Each "gravity assist" or "orbital
pumping" event constrains all future events, so that the visualization· and optimization of an
entire mission rapidly becomes formidable. We do not discuss the techniques here, but simply
point to Galileo as a current mission with an unprecedentedly complex orbital character. Galileo
is using close flybys of Venus and Earth to provide much of the energy required to reach Jupiter,
and will use close encounters with Jupiter's moons to help achieve Jupiter orbit and to execute
a coordinated series of close flybys of the Galilean satellites. [For an explanation of the basic
physics of gravity assist and some examples of Jupiter orbital tour designs see Bartlett and Hord
(1985) and references therein.] The planned tour of the Cassini spacecraft, a Saturn orbiter and
Titan radar mapper, represents another example of a complex trajectory and orbital planning
exercise (D'Amario et al. 1989). It is now common for thousands of candidate trajectories to
be rejected before an optimum mission design is settled upon.


VISUALIZATION OF SPACECRAFT ENVIRONMENT AND OPERATION.

Solar system exploration is unique in that the spacecraft's remote sensing instrumentation
projects the human presence to a real physical environment far removed from our direct experi-
ence. For orientation of project personnel, for general educational purposes, and to demonstrate
to the public the nature of national and international voyages of exploration undertaken by their
governments, it is now possible to illustrate the essential elements of the spacecraft, its activities,
and its environment in a direct way - as if a human observer were traveling with the spacecraft.
The best and longest-running example of this kind of product is a series of computer generated
videos illustrating the encounters of the Voyager spacecraft with the outer planets. These were
all produced at the Jet Propulsion Laboratory (JPL), the facility responsible for the planning
and flight management of most planetary exploration missions by the United States, and were
pioneered by James Blinn and Charles Kohlhase, among others. These videos put the viewer
in the perspective of an observer flying along with the spacecraft during a planetary encounter.
Since the first release illustrating the Voyager 1 flyby of Jupiter in 1979, they have increased in
quality and complexity, but all are remarkable for their perspective and accuracy, in that the size
and rotation of the planet, revolution of its moons, and the star background together comprise
the true and changing perspective that would actually be seen near the spacecraft through the
period of the flyby. These videos are available on tape and, in many cases, video disk formats,
the latter being included in some commercially available archives. In Fig. 1, we present a few
representative frames from videos produced for the Jupiter and Saturn flybys of both Voyagers 1
and 2, and the Uranus and Neptune flybys made possible by Voyager 2's extended mission. JPL,
through its Digital Image Animation Laboratory, continues to be a leader in the production of
computer-generated video animation of this kind (Holzman and Blinn 1988).

VISUALIZATION FOR OBSERVATIONAL ENHANCEMENT.

We give the above title to the often computer-intensive activity of taking digital imaging or
other remote sensing data and transforming it into a form that greatly enhances the ability of
the scientist to see, measure, and understand the physical processes occurring on other worlds.

Fig. 1. (Following page.) Selected frames from video animations of planetary flybys by the
Voyager 1 and 2 spacecraft. (a) Voyager 1 approaches Jupiter. (b) Computer-generated image
of Jupiter near closest approach, showing lightning flashes. (c) Voyager 1 approaches Saturn
and Titan (left center). (d) Voyager 2 observes Saturn with its ultraviolet spectrometer (UVS).
The UVS field of view and pointing are indicated by the graphic overlay. (e) Computed view
above Saturn's rings at close range. Small moons at outer edge and dark "spokes" in ring are
seen. (f) Voyager 1 at the moment of Saturn ring plane crossing. (g) Voyager 2 maps Miranda,
a moon of Uranus. (h) The Sun reappears from occultation as Voyager 2 departs Uranus against
the backdrop of the constellations. Source: Jet Propulsion Laboratory.

The operational steps include corrections for camera geometric distortions, conversion to abso-
lute spectral reflectance, and a variety of perspective transformations and/or map projections.
They also include brightness transformations ranging from simple contrast enhancements and
clever use of grayscale/color intensity mapping, to removal of photometric gradients (shading
functions) from planetary objects - the latter process is the inverse of the problem of generating
realistic reflection properties for objects in computer-generated images. Often these operations
are carried out either to enhance the scientists' ability to discern subtle morphological or spec-
tral properties of planets or their natural satellites, or to prepare a calibrated digital data set
to be used as the input for various types of scientific analysis or modeling. Much of this work
falls under the domain of the field traditionally called "image processing" - in this case image
processing for generalized remote sensing applications. Selecting from the multitude of possi-
ble examples of visualization enhancements, we show in Fig. 2 an image revealing the diffuse
components of Jupiter's ring (Showalter et al. 1985).

Fig. 2. Grayscale and color are cleverly utilized to show Jupiter's ring (gray/white), a halo of
material above and below the ring plane (dark gray/red/yellow/blue), and the faint "gossamer"
ring (blue, extending outward from the main ring). Combined histogram and color map is shown
at bottom. Source: Mark R. Showalter, Stanford University.
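
By way of illustration, a simple linear contrast stretch, one of the brightness transformations mentioned above, can be written as follows (the percentile clipping bounds are arbitrary choices of ours, and the image is assumed to be a 2D numpy array):

    import numpy as np

    def contrast_stretch(img, lo_pct=2.0, hi_pct=98.0):
        # Map the chosen percentile range of the image onto [0, 1],
        # clipping the tails; a common way to bring out faint detail.
        lo, hi = np.percentile(img, (lo_pct, hi_pct))
        out = (img.astype(float) - lo) / max(hi - lo, 1e-9)
        return np.clip(out, 0.0, 1.0)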

As an example of photometric calibration and mapping, we show in Fig. 3 a global map of


Jupiter as it appeared to the Voyager 2 spacecraft. The color map is a visual rendition of a
digital data set of Jupiter's absolute cloud reflectance in four spectral bands (three of which
are used as red, green, and blue here), at 0.5° resolution in latitude and longitude. It was con-
structed by an automated computer mapping algorithm that takes many input images and, for
a given latitude and longitude grid point, ranks them with preference to the most nearly normal
(perpendicular) viewing geometry, both to maximize resolution and to minimize shading correc-
tions. The shading (photometric function) of the object is removed and a weighting/averaging
function applied to data sampled from the highest ranked images to construct a uniform global
data base of normal reflectance (the absolute reflectance the object would have if every element
of the surface were illuminated and observed from directly above). This product can then be
used as the starting point for several kinds of further scientific analysis (see examples below).
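
A hedged sketch of the ranking-and-weighting idea is given below; the cosine weighting by emission angle is our stand-in assumption for the actual ranking and photometric correction used:

    import math

    def normal_reflectance(samples):
        # samples: (reflectance, emission_angle_in_radians) pairs gathered
        # from the input images at one latitude/longitude grid point.
        # Weight toward the most nearly perpendicular viewing geometry
        # (cos 0 = 1 for a perfectly normal view).
        weights = [math.cos(a) for (_, a) in samples]
        total = sum(weights)
        return sum(r * w for (r, _), w in zip(samples, weights)) / total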
Fig. 3. Global maps of Jupiter showing 3 of 4 bands of a data base constructed by computer mosaicing of forty-four Voyager 2
images. (a) approximately natural color; (b) a gaussian contrast stretch. Source: W. Reid Thompson, Cornell University.

Another kind of visualization for observational enhancement is the production of motion (movie)
sequences from imaging frames despite perspective changes, planetary rotation and pointing off-
sets. The geometrical registrations and transformations are similar to those mentioned above,
but in this case a time series of images is produced, most often to visualize details of meteoro-
logical dynamics impossible to discern from individual frames. Sometimes interpolation schemes
are applied to create a smoother sequence. Again, the Voyager encounters provide an excellent
example. Movies of Jupiter's atmospheric dynamics, evanescent dark "spokes" in Saturn's rings,
and Neptune's Great Dark Spot are prime examples of products produced for the purpose of
visualizing the motion and evolution of dynamic physical phenomena.

VISUALIZATION FOR ILLUSTRATING SCIENTIFIC RESULTS.

As the required hardware and software tools become more widely available and the complexity
of the phenomena studied increases, the use of computer graphics, images and videos to convey
the results of scientific investigations in planetary science is becoming more common. Direct
evidence of this trend is provided by national conferences such as the annual meeting of the
Division of Planetary Sciences of the American Astronomical Society, where more and more oral
and poster presentations include a video component, and many other presentations include a
complement of complex color graphics or digital images generated to better convey the scientific
results. One example drawn from our own work is shown in Fig. 4. The texturally complex
scene of Jupiter's clouds shown in the photometric data base of Fig. 3 makes quantitative studies
of cloud reflection properties, and how they change with illumination geometry, very difficult.
As a next step toward quantitatively investigating the differences in cloud height, thickness,
and chemical composition between different areas on Jupiter, we addressed the problem of
how to separate Jupiter into a manageable number of spectrally distinct units. Applying a
cluster analysis method of classification to the 4-band normal albedo data base, we found about
25 statistically distinct spectral cloud units (Thompson 1990). The 20 units that comprise
most of Jupiter's photometric surface are shown in Fig. 4. Both the global data base and the
spectral units are more thoroughly illustrated in a video published as part of a special issue of
the International Journal of Supercomputer Applications (Thompson 1990). Examination of the
four-dimensional parameter space (ibid.) shows that most of the units form a regular progression,
but certain areas like the Great Red Spot and brown ovals deviate from the trend in spectral
properties of the rest of the planet, suggesting different compositions and/or unique vertical
structure in the clouds. Much more remains to be done in assessing the spatial correlations in
Fig. 4 and their relationship to dynamics, modeling the cloud vertical structure and composition,
and determining the physical basis of the spectral unit boundaries.

Video has also been used very effectively to illustrate time-dependent computer modeling results.
In planetary science, a superb example is the illustration of the evolution of vortices in the
atmosphere of Jupiter. The largest and most organized vortices occur on Jupiter as white ovals
and the Great Red Spot (GRS). Using a shallow water hydrodynamic model, Dowling and
Ingersoll (1989) have addressed such questions as these: If the wind field of Jupiter determined
by the Voyager spacecraft is taken as a boundary condition imposed on a fluid with small random
vorticity perturbations, how will the fluid evolve with time? Do large structures form, and what
is the nature and timescale of their evolution? Their video animation shows the time-evolution
predicted by the governing hydrodynamic equations, and reveals that large vortices do form and
over a short period of time (a few years) large, "proto-GRS's" interact and coalesce until only
one remains. A frame from this video is presented as Fig. 5.

SUMMARY.

In this report we have presented just a few examples of the many ways in which computer visu-
alization techniques are used in spacecraft exploration of the solar system and in the illustration
of research results in planetary science. The use of multidimensional color and video animation
is best developed in the "scientific documentary" medium, but is rapidly becoming more
common as an important tool for illustrating and understanding the results of scientific analysis

Fig. 4. Results of a global spectral (color/albedo) classification of Jupiter's clouds. All cloud
material belonging to a given spectral class is identified by a specific color code on the rectangular
map projection. To reduce confusion, about one-third of the spectral units are shown in each
of the three panels. (a) Boundaries of the bright planet-girdling North Temperate Zone (NTZ)
(unit 4; bright pink horizontal segments) are spectrally similar to boundaries of white ovals in
the southern hemisphere. (b) The interiors of the oval to oblong "brown barges" just southward
of the NTZ are spectrally unique (unit 20; turquoise), while parts of the Great Red Spot (unit
11; yellow) are similar to a disturbed wake in the South Equatorial Belt (SEB). (c) An unusual
oval at high northern latitudes (unit 12; yellow) is spectrally similar to bright areas on the
northern edge of the SEB, while the interior of the Great Red Spot (unit 18; cyan) is spectrally
unique, indicating a unique combination of composition, height, and/or thickness of its clouds.

Fig. 5. A frame from the video ani-


mation of the evolution of large vortices
on Jupiter. Several large vortices ini-
tially form, but soon merge, leaving only
one GRS-sized vortex after a few years
of elapsed time. The free-surface height
in this "shallow water" dynamical model
is shown topographically for the upper
and lower layers, while potential vortic-
ity is coded in color on the upper surface.
See Dowling and Ingersoll (1989) for de-
tails of the model; Fig. 17 of that paper
presents these results in simple graphical
form. Source: Tim Dowling, MIT.

and computer modeling. The increasing availability of high performance workstations with high
level graphics and imaging capabilities will facilitate the production of complex illustrations and
video segments for increasing numbers of scientific investigations in the near future. Scientific
journals have not yet adequately responded to the increasing need for affordable high quality
color reproduction and video accompaniment in scientific papers.

REFERENCES.

Bartlett AA, Hord CW (1985) The slingshot effect: Explanation and analogies. Phys. Teach.
23: 466-473.
D'Amario LA, Byrnes DV, Diehl RE, Bright LE, Wolf AA (1989) Preliminary design for a
proposed Saturn mission with a second Galileo spacecraft. J. Astronaut. Sci. 37: 307-331.
Dowling TE, Ingersoll AP (1989) Jupiter's Great Red Spot as a shallow water system. J. Atmos.
Sci. 46: 3256-3278.
Holzman RE, Blinn JF (1988) Computer graphics techniques and computer-generated movies.
Compo Phys. Commun. 49: 229-233.
Showalter MR, Burns JA, Cuzzi JN, Pollack JB (1985) Discovery of Jupiter's "gossamer" ring.
Nature 316: 526-528.
Thompson WR (1990) Global four-band spectral classification of Jupiter's clouds: Color/albedo
units and trends. Int. J. Supercomp. Appl. 4: 48-65.

W. Reid Thompson is Senior Research Associate in Astronomy and
Space Sciences at Cornell University, and received graduate degrees in
chemistry and in planetary studies from Cornell. While pursuing grad-
chemistry and in planetary studies from Cornell. While pursuing grad-
uate work as an NSF fellow, he received the DuPont Teaching Prize
and Cornell's Clark Award for Distinguished Teaching. Dr. Thompson
worked with the Voyager imaging team at the Jet Propulsion Labora-
tory during the Voyager 2 Uranus and Neptune encounters, and received
the NASA Group Achievement Award for Voyager activities. He is cur-
rently active in scientific analysis of data from the Galileo spacecraft
as well as ongoing laboratory and computational projects. His research
interests include modeling the spectral and compositional properties of
planetary atmospheres and surfaces, including the application of super-
computing and visualization techniques.
Address: Laboratory for Planetary Studies, 307 Space Sciences Bldg.,
Cornell University, Ithaca, NY 14853 USA.

Carl Sagan is the David Duncan Professor of Astronomy and Space
Sciences and Director of the Laboratory for Planetary Studies at Cor-
nell University. He has played a leading role in the Mariner, Viking,
and Voyager expeditions to the planets, for which he received the NASA
Medals for Exceptional Scientific Achievement and (twice) for Distin-
guished Public Service. He also serves as President of The Planetary
Society and as Distinguished Visiting Scientist at Caltech's Jet Propul-
sion Laboratory. A Pulitzer Prize winning author, he is also the creator
of the Cosmos television series.
Address: Laboratory for Planetary Studies, Space Sciences Bldg., Cor-
nell University, Ithaca, NY 14853 USA.
Computational Geometry and Visualization:
Problems at the Interface
Leonidas J. Guibas

Abstract

In this paper we survey certain geometric problems that arise in volume visualization
and discuss how techniques from computational geometry can be applied to them. Such
problems include depth-sorting of polyhedral complexes, point-location, ray shooting and
tracing, and others. We give a few worked-out illustrative examples, as well as references to
the extant literature.

Keywords: computational geometry, volume visualization, point-location, ray tracing, Delaunay triangulations

1 INTRODUCTION

Computational geometry is, in its broadest sense, the study of geometrical problems from a com-
putational point of view. At the core of the field is a set of techniques for the design and analysis of
geometric algorithms. As such, computational geometry is normally viewed as part of theoretical
computer science.¹ The area had its origins in the early seventies - one usually thinks of Michael
Shamos' Ph.D. thesis at Yale (which later became the Preparata-Shamos textbook [35]) as the
original document defining the field. Computational geometry has undergone very rapid growth
in the last ten years and is now a mature discipline.
Some of the current excitement of the field is due to a combination of factors: deep connections
with classical mathematics and theoretical computer science on the one hand, and many ties with
applications on the other. Indeed, the origins of the discipline clearly lie in geometric questions
that arose in areas such as computer graphics and solid modeling, computer-aided design, robotics,
computer vision, etc. Not only have these more applied areas been a source of problems and
inspiration for computational geometry but, conversely, several techniques from computational
geometry have been found useful in applications as well.
In this survey paper we explore some of the connections between computational geometry and an
area that has recently become very prominent in computer graphics, namely volume visualization.
This is the area of graphics that deals with techniques for modeling, manipulating, and rendering
volume data. Here volume data is to be contrasted with models of three dimensional scenes
¹However, the same term was also used in the late sixties to denote what is now called CAGD - computer-aided
geometric design [17].


based on geometrically described objects-such as, say, glass balls, with homogeneous interiors.
In volume visualization it is the data "interior" to the volume that carries the information of
interest. Such data arises naturally in many medical and scientific applications (e.g., tomography,
computational fluid mechanics, etc.).
We should remark at the outset that volume data often comes in regular three dimensional arrays
of volume cells, or "voxels". Some of the geometric aspects of volume visualization are most
appropriately treated within the context of a theory of 3-d discrete geometric space. Several
attempts have been made to develop the mathematical foundations of such spaces-see Rosen-
feld [38]. Computational geometry, on the other hand, mostly deals with what in graphics is
known as "object space" methods, i.e., with objects defined by real coordinates with no under-
lying discretization assumed. In this survey we will therefore ignore geometric issues in volume
visualization that are most appropriately treated in a quantized context. Issues like rendering
analytically defined curves, surfaces, and volumes on a 3-d raster (the "voxelization of geometric
models") will not be treated; references [8, 26] address those issues.
A first interaction between computational geometry and volume visualization, basic to all that
follows, occurs in the area of data structures and data representation. Volume data is often in the
form of a scalar or vector field, sampled at a set of points in space. Sometimes this set of points
comes with an associated polyhedral cell complex whose vertices are the original points. A very
common situation is one where sample points are spaced on a regular 3-d grid and the associated
cells (voxels) are small cubes based on the grid points. As another example, finite element grids,
though often irregular, always come with an associated complex. And even if the complex is not
given by the application, many visualization algorithms need to provide such a complex on top of
the sampled data for the sake of interpolating field values at non-grid points.
So how does one represent a polyhedral complex describing a partition of space? What are the
basic geometric and topological operations one needs to do with such a complex? Even in two
dimensions (i.e. describing a polygonal or just a triangular subdivision of the plane), this represents
a challenging problem and a lot has been written about it. See, for example, the discussion of
the quad-edge data structure in [23]. A good survey of the current literature on structures for
storing partitions of space (in two, three, and higher dimensions) can be found in the thesis of
Erik Brisson [3]. The best data structures separate the topology from the geometry of a complex.
In three dimensions the topological structure is the ordering of the facets around each edge, and
the incidence relationships between cells, facets, edges, and vertices. A 3-d structure has been
described by Dobkin and Laszlo [10], based on incident facet-edge pairs. This structure allows one
to walk in a connected fashion from each cell to its neighbors-a useful capability in visualization
algorithms that need to trace rays, planar sections, or isosurfaces through the volume data.
Geometric questions arise in several parts of the volume visualization pipeline. Most of the ones
of interest to us will be in the volume viewing process. The viewing process has to transform
3-d data to 2-d data by projection and sampling. Viewing algorithms can, at the outermost
level, either iterate over all voxels (object-order algorithms) and compute for each voxel the pixels
of the 2-d image it affects (and its contribution to each), or iterate over all pixels (image-order
algorithms) and calculate for each pixel all the voxels that contribute to it in the image. In both
cases geometric questions arise. For instance, in the former case, computing a good order in
which to iterate over the voxels is a challenging problem; in the latter, determining the set of
voxels affecting a particular pixel is non-trivial. Additional geometric questions arise in volume
shading and in the manipulation of volume data.

In what follows we present just a sample of problems in volume visualization that give rise to
interesting questions in computational geometry (such as the two mentioned above)-and report
on what is known about these questions. Examples include depth ordering polyhedral complexes,
three dimensional point-location, ray shooting and tracing, isosurface and cross section compu-
tations, and hierarchical structures. As the reader will see, sometimes not much is known about
these areas. We hope that the present paper will motivate some computational geometry researchers
to look at these problems more closely, and also familiarize researchers in volume visualization
with some of the applicable computational geometry literature.

2 COMPUTING A DEPTH ORDERING FOR A POLYHEDRAL COMPLEX

2.1 TESTING FOR THE PRESENCE OF CYCLES

As discussed in the introduction, in object-order volume visualization methods, we traverse the
list of elements or cells defining our volume, and for each of them compute its contribution to
screen pixels. If our volume cells can have partial transparency, then the accumulation of their
contributions over each particular pixel becomes easier if we process the cells in a particular order.
Specifically, if we can find a sorted sequence of all the volume cells so that, from the point of view
of the eye, whenever cell A occludes cell B, then cell A is in front of cell B in that sequence, then
we can reduce all transparency calculations to standard compositing operations [34] by processing
our cells according to that sequence.
Such good sequences have a long history in computer graphics, for they correspond to the ordered
object lists used in list priority algorithms for hidden-surface elimination-see for example [39].
The use of a z-buffer by modern graphics hardware obviates the need for priority lists if all
objects are opaque. However, the need for an ordering remains, if the objects to be rendered
have overlapping projections and can have partial transparency, as is the case with our volume
cells. When such an ordering is available, our cells can be rendered in either front-to-back or
back-to-front order (there are trade-offs in efficiency), and their contributions to each pixel
can be composited in the right order. The way this compositing is to be done is standard in the
graphics literature and will not be further discussed here.
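As a concrete illustration, the following Python sketch (names ours) composites the contributions to a single pixel in back-to-front order with the "over" operator of Porter and Duff [34]; the depth ordering discussed above is exactly what makes this loop correct:

```python
def composite_back_to_front(layers):
    """layers: ((r, g, b), alpha) cell contributions for one pixel, sorted back to front."""
    r = g = b = 0.0
    for (cr, cg, cb), a in layers:           # later entries are nearer the eye
        r = cr * a + r * (1.0 - a)           # the "over" operator, per channel
        g = cg * a + g * (1.0 - a)
        b = cb * a + b * (1.0 - a)
    return (r, g, b)

# Example: a dim red cell seen through a half-transparent blue one.
print(composite_back_to_front([((0.8, 0.0, 0.0), 1.0), ((0.0, 0.0, 0.9), 0.5)]))
```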
As was pointed out already in [39], the first difficulty with this approach is that a good ordering
need not always exist. Figure 1 shows the standard example of three triangles A, B, and C,
where A occludes B, B occludes C, and C occludes A. It is easy to see how to hang a polyhedral
subdivision around those three triangles so that there would be three cells that form a cycle in
the occlusion relation. Obviously such a subdivision cannot be sorted.
It is an interesting problem in computational geometry to decide if a polyhedral subdivision admits
of a valid ordering in a given direction (say the z-direction), and if so to compute such an ordering.
Unfortunately, not much is known about this problem. Let us simplify the problem a little and
consider a collection of n pairwise disjoint non-vertical line segments, or rods, in space. We wish
to sort the rods in z-direction, meaning that we wish to find an ordering of the rods so that if
rod a's projection on the xy-plane intersects that of rod b, and rod a passes above (in +z sense)
b at that point, then a occurs in front of b in our list. We wish to find such a sorted list if it
exists, or otherwise to report a cycle in the "above" relation-demonstrating that no ordering is

Figure 1: Three triangles A, B, and C form a cycle in the occlusion relation.

possible. The fastest known method for this problem works in time O(n log n + k), where k is
the number of pairs of rods whose projections on the xy-plane intersect. In effect this is the time
needed to compute the line arrangement formed by the projections of our rods on the xy-plane, by
the method of Chazelle and Edelsbrunner [5]. This computes all the "above" relations between
pairs of rods, so what remains is to attempt a topological sort of the "above" relation [28] in
time O(n + k). The number k of pairwise intersections can of course be as high as Θ(n²), so the
important open question is whether one can solve the rod (and cell) sorting problem in better
than O(n²) time.
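For illustration, here is a brute-force Python sketch of the whole test (names ours): an O(n²) pair scan rather than the O(n log n + k) sweep of [5], followed by a topological sort [28] that either outputs a front-to-back order or reports that a cycle exists:

```python
from collections import defaultdict, deque

def xy_cross(p, q, r):
    """Orientation of r relative to the directed line p->q, in the xy-plane."""
    return (q[0]-p[0]) * (r[1]-p[1]) - (q[1]-p[1]) * (r[0]-p[0])

def projections_cross(a, b):
    """Proper crossing of the xy-projections of rods a, b (degenerate contacts ignored)."""
    (p1, p2), (q1, q2) = a, b
    return (xy_cross(p1, p2, q1) * xy_cross(p1, p2, q2) < 0 and
            xy_cross(q1, q2, p1) * xy_cross(q1, q2, p2) < 0)

def z_at_crossing(seg, other):
    """z-coordinate of rod `seg` directly above/below the projected crossing point."""
    (p1, p2), (q1, q2) = seg, other
    d1, d2 = xy_cross(q1, q2, p1), xy_cross(q1, q2, p2)
    t = d1 / (d1 - d2)                       # crossing parameter along seg
    return p1[2] + t * (p2[2] - p1[2])

def depth_order(rods):
    """Front-to-back order of rod indices, or None if the 'above' relation has a cycle."""
    above, indeg = defaultdict(set), defaultdict(int)
    n = len(rods)
    for i in range(n):
        for j in range(i + 1, n):
            if not projections_cross(rods[i], rods[j]):
                continue
            za = z_at_crossing(rods[i], rods[j])
            zb = z_at_crossing(rods[j], rods[i])
            hi, lo = (i, j) if za > zb else (j, i)
            above[hi].add(lo)                # "hi passes above lo"
            indeg[lo] += 1
    queue = deque(k for k in range(n) if indeg[k] == 0)
    order = []
    while queue:                             # Kahn's topological sort
        k = queue.popleft()
        order.append(k)
        for m in above[k]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    return order if len(order) == n else None   # None: a cycle was found
```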
In the special case when the rods are infinite straight lines a positive answer has been given by
Chazelle, Edelsbrunner, Guibas, Sharir, and Stolfi [6]. They give a randomized algorithm whose
complexity is O(n^{4/3+ε}). Their idea is to apply a standard sorting algorithm to the lines and then
check the correctness of the proposed ordering. For this they need a subroutine that efficiently
checks if a group of lines passes above another group of lines. They accomplish this by first
mapping the lines to points or hyperplanes in 5-dimensional Euclidean space via their Plücker
coordinates [42], and then using standard randomized partitioning techniques. Unfortunately
these techniques do not seem to extend to the case of rods or cells.
If a cycle is present, one approach we can take is that of subdividing our rods or cells into smaller
pieces, so that cycles are eliminated. Fuchs, Kedem, and Naylor [18] propose a scheme based on
cutting the objects by planes and storing these planes in the nodes of a binary tree. This technique
of binary space partition trees is well-known in computer graphics and actually it simultaneously
destroys cycles from any point of view. It is likely that if we want to destroy cycles only along a
particular direction, then we can get by with a more economical number of cuts. Of course, for
the situation with rods, a quadratic number of cuts always suffices: it is enough to break each pair
of rods whose projections intersect over their intersection point. It is an interesting open question
in combinatorial geometry whether one can do better. In a special bipartite situation, where
the n rods can be partitioned into two groups, red and blue, so that all intersecting projections
correspond to red-blue pairs (and some additional consistency conditions hold), [7] shows
that O(n^{9/5}) cuts suffice to eliminate all cycles in the "above" relation.

2.2 COMPLEXES KNOWN TO BE SORTABLE A PRIORI

Since the problem of checking a polyhedral subdivision of space for the presence of cycles (and of
fixing it if there are cycles) seems to be quite difficult, we might ask if there are some common
special situations in volume visualization where we can establish the absence of cycles by a priori
reasoning.
Perhaps the most common subdivision used in volume visualization is one where volume elements
are equal size cubes aligned with the axes. This subdivision arises naturally out of sampling along
a regular three dimensional grid. For this common situation, Herman and Liu [25] have shown
that, if all faces of such a cubical subdivision are sorted according to the (Euclidean) distance of
their centers from the eye, then the resulting ordering is consistent with all occlusion relationships
between the faces (they were interested in displaying approximations to organ surfaces formed by
collections of faces in the cubical lattice). An easy variation of their argument shows that the
same conclusion holds if we order the cubical cells themselves according to the distance of their
center to the eye.
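In code, the Herman-Liu criterion is a one-line sort key; a minimal sketch (names ours), with `cells` any iterable of cube-center coordinates:

```python
def herman_liu_order(cells, eye):
    """Front-to-back ordering of cube centers by (squared) distance from the eye."""
    return sorted(cells, key=lambda c: sum((ci - ei) ** 2 for ci, ei in zip(c, eye)))
```

Reversing the result gives the back-to-front order needed for compositing.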
What if our sampled data comes from an irregular grid - some arbitrary collection of n points
in three dimensional Euclidean space? Then, of course, there is no canonical tetrahedralization
(or polyhedralization) of the grid, so we can ask if we can always find some tetrahedralization
with the property that it contains no cycles in the occlusion relation. Here there is a very nice
result due to Edelsbrunner which states that the Delaunay tetrahedralization of our point set has
this acyclicity property from any viewpoint [13]. The Delaunay tetrahedralization is a very useful
triangulation (we will use the term "triangulation" because it is shorter than "tetrahedralization")
with numerous applications in science and engineering. If our grid points satisfy some mild non-
degeneracy assumptions, namely that no three points are collinear, no four coplanar, and no
five cospherical, then the Delaunay triangulation is uniquely defined. It consists of exactly those
tetrahedra with vertices at grid points whose circumsphere contains
no other grid point in its interior. For a discussion of properties and algorithms of Delaunay
triangulations see [14].
There is an elegant way to prove the Delaunay acyclicity result that we briefly mention here,
because it implies also a sorting algorithm. It is due to an old paper of Delaunay [9]. Suppose
that ABCD and BCDE are two Delaunay tetrahedra sharing a common face BCD, and so that
a ray ℓ from the eye O intersects ABCD first and BCDE second. In other words, ABCD occludes
BCDE. Recall also from elementary geometry the concept of the power of a point O with respect
to a sphere S: it is defined to be the square of the distance from O to the center of S, minus the
square of the radius of S. If ℓ is any line through O intersecting S at points P1 and P2, then the
power equals the product of the lengths OP1 × OP2; in particular, if O is outside S, the power
equals the square of the length of the tangent from O to S. Now Delaunay uses a little elementary
geometry to show that the points of intersection of the circumsphere surrounding BCDE must
be further along ℓ than those of the circumsphere of ABCD. See figure 2 for an analog in two
dimensions. Therefore the power of O with respect to the circumsphere of ABCD is less than
that with respect to the circumsphere of BCDE. This argument extends to any pair of (non-adjacent)
occluding tetrahedra by just considering all the intermediate tetrahedra the ray ℓ intersects. In
other words, the power of the eye O with respect to the circumspheres of the Delaunay tetrahedra
gives us a good function by which to sort the tetrahedra, so as to be consistent with all occlusion
relationships.
For additional discussion of this material, see [31].

Figure 2: The power of point O with respect to the circumcircle of triangle ABC is less than the
power of O with respect to the circumcircle of BCD.
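The sorting criterion translates directly into code; a small sketch (assuming numpy, general position, and tetrahedra given as 4x3 vertex arrays; names ours):

```python
import numpy as np

def circumsphere(p):
    """Center and squared radius of the sphere through four points (a 4x3 array)."""
    a = 2.0 * (p[1:] - p[0])                 # three linear equations for the center
    b = (p[1:] ** 2).sum(axis=1) - (p[0] ** 2).sum()
    c = np.linalg.solve(a, b)
    return c, ((c - p[0]) ** 2).sum()

def power_sorted(tetrahedra, eye):
    """Front-to-back order of Delaunay tetrahedra by the power of the eye point."""
    eye = np.asarray(eye, float)
    def key(t):
        c, r2 = circumsphere(np.asarray(t, float))
        return ((eye - c) ** 2).sum() - r2   # power of the eye wrt the circumsphere
    return sorted(tetrahedra, key=key)
```

No pairwise occlusion tests are needed: the single scalar key orders all tetrahedra consistently with every occlusion relationship.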

3 POINT-LOCATION IN GENERAL POLYGONAL / POLYHEDRAL SUBDIVISIONS

Many geometric operations that are immediate for regular grids become algorithmic challenges
when we have to deal with polyhedral subdivisions that have been fitted to irregular grids. One
of the most important of those is point-location, the operation of finding the cell in a subdivision
containing a particular query point. This arises in several uses of volume data sets for visualization.
For instance, one way to visualize fluid flows in computational fluid dynamics is to introduce
particles at arbitrary locations and follow them through the flow. This is called particle advection
[24]. In other applications, we may wish to compute arbitrary polygonal cross sections of a volume
data set. In both of these situations an initial point-location step is required to localize where
we are in the volume data set. The need for good point location algorithms in visualization is
discussed by Neeman [32], who develops a practical scheme based on k-d trees that works well in
many cases.
From a more theoretical point of view, point-location is a well studied problem for two dimensional
polygonal subdivisions, where several optimal algorithms are known. For a planar polygonal
subdivision of n edges, several algorithms are known that in O(n) time build a data structure
of size O(n); this structure can then be used to answer point-location queries in time O(log n).
Kirkpatrick [27] has shown how to construct a hierarchy of coarser and coarser subdivisions that
can be used for a point location method with these bounds. A different construction, based on
binary search ideas, was given by Edelsbrunner, Guibas, and Stolfi [15] attaining the same bounds.
To give a flavor of these methods, we present here an adaptation of Kirkpatrick's method to
perform point location on subdivisions consisting entirely of rectangles aligned with the axes.
The rectangles need not all be the same size, or have exactly one neighbor across each side.
Such irregular rectangular networks arise in some finite element applications. The method builds
a hierarchy of subdivisions that are all of the same type as the original: edges are vertical or
horizontal and regions rectangular.
So let us assume we are given an arrangement of n rectangles that is a partition of one large outer
rectangle R (if this is not true, then it can be enforced in O(n) time by a simple strategy). Our
goal will be to construct in linear time a sequence of arrangements A_0, A_1, ..., A_k such that:


Figure 3: The sticks of an arrangement, and the corresponding adjacency graph on the sticks.

• A_0 is the given arrangement,

• each A_i is a partition of R,

• A_k consists of the single rectangle R,

• each A_i consists of at most nα^i rectangles for some constant α < 1,

• each rectangle in A_{i+1} is covered by at most two rectangles in A_i for 0 ≤ i < k (and furthermore
we have pointers from each rectangle in A_{i+1} to the one or two covering rectangles in
A_i).

Note these conditions imply that k = O(log n), and that the entire sequence can be stored in
space O(n + αn + α²n + ...) = O(n). Given such a sequence, we can do point-location in O(log n)
time as follows. For query point p, first we check that it is contained in the bounding rectangle R,
i.e. locate p in A_k. Then for i from k - 1 down to 0, we can find which rectangle in A_i contains
p in constant time, assuming that we know which rectangle of A_{i+1} contains p.
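The query phase is then only a few lines; a sketch (with an assumed Rect record carrying the covering pointers built during preprocessing; names ours):

```python
from dataclasses import dataclass, field

@dataclass
class Rect:
    x0: float
    y0: float
    x1: float
    y1: float
    covering: list = field(default_factory=list)  # its 1-2 covering rects, next-finer level

    def contains(self, x, y):
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1

def locate(x, y, outer):
    """Walk from the single coarsest rectangle R down to a rectangle of A_0."""
    if not outer.contains(x, y):
        return None                               # query point lies outside R
    rect = outer
    while rect.covering:                          # O(log n) levels in all
        rect = next(r for r in rect.covering if r.contains(x, y))
    return rect
```

Since each level offers at most two candidates and there are O(log n) levels, the whole walk takes O(log n) time.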

Given that we have an arrangement A of r rectangles (say represented by a quad-edge data
structure), it suffices to show how to construct the next arrangement A' of αr rectangles in O(r)
time. The sides and corners of R will stay fixed; we consider only changes to internal edges and
vertices.
There are five types of vertices (besides the corners of R): ⊢, ⊣, ⊤, ⊥, and +. In the arrangement
A, define a vertical stick to be a vertical line segment between vertices with two horizontal edges.
That is, a vertical stick is a sequence of vertical edges, terminated by (but not containing) vertices
of type ⊤, ⊥, or +. Define horizontal sticks similarly, extending between vertices of type ⊢, ⊣, or
+.
Say that two sticks are adjacent iff they border on a common rectangle; in particular a horizontal
stick is adjacent to a vertical stick iff they share a vertex, and two vertical (horizontal) sticks are
adjacent iff they are horizontally (vertically) visible from each other-see figure 3 for an example.
All the sticks and their visibility relations can be computed in O(r) time by a simple sweep and
mark strategy - we omit details for this.

Lemma 3.1 Given arrangement A with r rectangles and p vertices of type +, then A has s =
r - 1 + p sticks. Furthermore, if we consider the adjacency relation on sticks defined above, there
are at most 2(s + r - 1) adjacent pairs of sticks.

Figure 4: The four corners charged for a given stick.

Figure 5: When we remove a vertical stick s (bold line), we extend all the old horizontal sticks
that were adjacent to s to the other side to get a new partition of the rectangles that were to the
left and right of s.

Proof: To prove the first part, we charge four rectangle corners for each stick as in figure 4. Then
every corner except the four corners of R gets charged at least once. At + vertices the four corners
get charged twice; corners at other types of interior vertices get charged exactly once. Now we
use the fact that each rectangle has four corners, and solve for s.
To prove the second part, first we claim there are exactly 2s horizontal-vertical pairs of adjacent
sticks. Charge each stick twice, once at each end vertex. Then each + gets charged four times,
and every other interior vertex gets charged once. But each + contributes four pairs of adjacent
sticks, and each of the other types of vertices contributes one pair of adjacent sticks, so our 2s
charges exactly count the number of horizontal-vertical pairs of adjacent sticks.
To bound the remaining adjacent pairs, we note that each rectangle contributes at most one
horizontal-horizontal pair and one vertical-vertical pair, yielding at most 2r such pairs. This can
be improved to 2(r - 1) by considering the boundaries of R. □
It follows that:

Theorem 3.2 The adjacency graph on sticks has average degree at most 6.

From the lemma it follows that to reduce the number of rectangles r by a constant fraction, it
suffices to reduce the number of sticks s by a constant fraction. By the theorem it follows that at
least half the sticks have degree at most twelve, thus by a greedy strategy (pick an untouched stick
of degree at most twelve and then mark all its neighbors as touched) we can find an independent
set S of at least s/26 sticks in linear time.
Now for each stick s in S, we want to remove s. If s is vertical, then we remove s and extend all
the horizontal sticks that were to the left or right of s to the other side (see figure 5). Note that
if s was bordered by r' rectangles on the left or right before deletion, then they are replaced by
r' - 1 new rectangles, and furthermore we can delete s in O(r') time. Finally note that every new
rectangle is covered by two old rectangles, as we required.
Since the sticks in S are independent, all these repartitioning steps may be carried out indepen-
dently for each s in S, for a total time of O(r) to compute A' from A. As argued above, this is
sufficient to solve the rectangular point-location problem.

This completes our discussion of two dimensional point location. Unfortunately the three dimen-
sional point location problem is significantly more difficult. For example, the two dimensional
Edelsbrunner-Guibas-Stolfi method referenced above relies for its correctness on the fact that the
two dimensional analog of the occlusion relation discussed in Section 2 is always acyclic. For 3-d
complexes of convex cells that are acyclic in some direction, Chazelle gave a point-location algo-
rithm with polylogarithmic query time [4]. Specifically, using a data structure called canal trees,
he is able to preprocess an n-face complex into an O(n) space structure so that a point location
query can be answered in time O(log² n). So, for example, this data structure can be used for the
Delaunay tetrahedralizations discussed in Section 2 - as they have the acyclicity property.
Significant algorithmic progress for point-location in three dimensions for general polyhedral com-
plexes (possibly containing occlusion cycles) had to wait until two dimensional point location algo-
rithms were developed that also allowed dynamic updates to the subdivision. Such methods yield
a way to build a point-location structure for a (static) three dimensional subdivision by using the
persistence-addition techniques of Driscoll, Sarnak, Sleator, and Tarjan [12]. In an abstract set-
ting, their technique allows one to add persistence to a dynamic data structure, that is the ability
to query it about its state in past versions. We can apply persistence to our problem by viewing
the third (z) dimension as the "time" variable while we sweep our three dimensional subdivision
by a horizontal plane. Specifically, we maintain a two dimensional subdivision that is the cross
section of our three dimensional one with the sweeping plane. When the sweeping plane goes
over a vertex, we regard that event as a dynamic update to the two dimensional structure. Using
these ideas and their dynamic two dimensional point-location structure, Preparata and Tamassia
[36] were able to formulate a three dimensional point-location structure for a subdivision of size
n that has size O(n log² n) and allows query time O(log² n). Using a very recent new method of
Goodrich and Tamassia [22], the size has been improved to O(n log n).
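To make the sweep idea concrete, here is a deliberately naive sketch (all names ours): store a separate 2-d point-location structure for each slab between consecutive vertex z-values and answer a 3-d query by a binary search on z followed by a 2-d query. Stored this way the copies can cost Θ(n²) space in total; persistence is precisely what shares structure among the successive versions and brings the size down to near-linear.

```python
import bisect

def locate_3d(x, y, z, slab_z, slab_locators):
    """slab_z: sorted z-coordinates of the subdivision's vertices.
    slab_locators[i]: a callable answering 2-d point-location queries in the
    cross-section valid for slab_z[i] <= z < slab_z[i+1]."""
    i = bisect.bisect_right(slab_z, z) - 1    # binary search for the slab
    if i < 0 or i >= len(slab_locators):
        return None                           # outside the subdivision in z
    return slab_locators[i](x, y)             # e.g. a 2-d method from Section 3
```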
In dimensions higher than three, point location remains a very difficult problem.

4 RAY SHOOTING AND TRACING

In Section 2 we discussed some of the geometric problems that arise out of object-order volume
viewing algorithms. The other main class of methods for viewing volumes is image-order algo-
rithms, in which we scan the display screen in raster order and determine for each pixel what
volume elements affect it. This is typically done in a manner akin to ray-tracing, that is by cast-
ing rays or beams from the eye through each pixel and computing the intersected voxels. This
approach gives rise to new geometric problems, many of which have been discussed in the context
of ray-tracing [20].
In the volume rendering situation we typically do not worry about reflected or refracted rays, so
we can make the simplifying assumption that all rays originate in the eye. On the other hand,
in our situation it is not sufficient to simply compute the first volume element intersected by a
ray-we have to trace a ray through all the volume elements intersecting it. However, if we have
an appropriate spatial subdivision structure representing our volume cells, then the computation
of tracing a ray through a connected sequence of simplices is fairly straightforward [19].
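A sketch of such a walk (Python with numpy; the `vertices_xyz` and `neighbors` fields are our assumed representation, e.g. as in the adjacency sketch of the introduction, and degeneracies are ignored):

```python
import numpy as np

def exit_facet(cell, origin, direction, t_enter):
    """Index of the facet through which the ray leaves `cell`, and the exit parameter."""
    verts = np.asarray(cell.vertices_xyz, float)        # 4x3 corner coordinates
    best_i, best_t = None, np.inf
    for i in range(4):
        tri = np.delete(verts, i, axis=0)               # facet opposite vertex i
        n = np.cross(tri[1] - tri[0], tri[2] - tri[0])  # (unoriented) facet normal
        denom = n.dot(direction)
        if abs(denom) < 1e-12:
            continue                                    # ray parallel to this facet
        t = n.dot(tri[0] - np.asarray(origin, float)) / denom
        if t_enter < t < best_t:                        # first plane hit strictly ahead
            best_i, best_t = i, t
    return best_i, best_t

def trace(start_cell, origin, direction):
    """Yield the cells pierced by the ray, walking across shared facets."""
    cell, t = start_cell, 0.0
    while cell is not None:
        yield cell                                      # visit (accumulate, shade, ...)
        i, t = exit_facet(cell, origin, direction, t)
        if i is None:
            return
        cell = cell.neighbors[i]                        # None on the complex boundary
```

Because a tetrahedron is convex, the first facet plane hit strictly ahead of the entry parameter is necessarily the exit facet, so each step costs only four plane tests.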
The basic geometric problem that remains is to compute the first volume element intersected
by a ray. We can model this as in the case of ray-tracing surfaces: we have a scene defined
by n non-penetrating triangles in space and a bunch of rays (say m of them) emanating from

a single source. For each ray we wish to compute the first triangle it intersects. The best
asymptotic solution to this problem is an algorithm of Overmars and Sharir [33] that solves it
in time O(n^{2/3} m^{2/3} log^γ n), for some small γ. In the non-batched case (i.e., we do not know a
priori all the rays), the best solution is based on the implicit point location method of Agarwal
and takes query time O(√n log n) per ray using O(n log n) space and O(n^{3/2} log^γ n) preprocessing
[1, 2]. These methods are quite intricate mathematically and probably do not yield practical
algorithms.
There are some less complex techniques for the case where we are shooting rays into a collection of
n aligned rectangles lying on parallel planes. Schmitt, Müller, and Leister [40] give an algorithm
that takes O(m^{0.82} n^{0.59} log^c n) time and O(m log^c m + n) storage, for some small constant c. For
the non-batched case they get query time of the form O(n^γ log^c n) with nearly linear preprocessing
and storage. Here γ is an exponent arising out of an implementation of partition trees - a
geometric data structure for solving range query problems [44]. A γ of about 0.69 can be obtained
with reasonably practical methods, such as the conjugation trees of Edelsbrunner and Welzl [16].
So much for finding the first intersection of a ray with our volume data. We remarked earlier
that tracing a ray through subsequent voxels is relatively straightforward. This tracing, however,
raises a question similar to the one discussed in Section 2. Namely, if we know the values of some
scalar field at some irregular grid of points in space, how should we tetrahedralize this grid so as
to make this tracing as efficient as possible? To be specific, if we have a grid on n points in space,
how do we choose a tetrahedralization of the points on the grid so that a ray (line) intersects as
few cells as possible in the worst case? What can we expect this worst-case number to be?
Nothing algorithmic is known about this question in three dimensions. In two dimensions it is
known that there exist configurations of n points such that, for any spanning tree connecting
them, there always exists a line that cuts Ω(√n) edges of the tree [44]. A fortiori, the same lower
bound applies to any triangulation of that point set. Recently Guibas and Sharir [21] have given
a construction of a triangulation such that no line crosses more than O(√n log n) triangles. This
triangulation uses Steiner vertices (additional points), but still takes only linear size. It would be
nice to establish that for any collection of n points in space there is always a polyhedral subdivision of
linear size based on these points so that no straight line intersects more than a sublinear number
of cells.

5 SURFACE EXTRACTION; CROSS-SECTION COMPUTATIONS

As most traditional graphics pipelines are tailored for surface display, many of the early volume
visualization attempts were based on extracting surfaces from the volume data. These were
typically isosurfaces, that is surfaces where the underlying scalar field takes on a particular value.
By varying the value of this constant, sections of the volume at different depths could be obtained.
In other applications, the ability to extract and display sections of the volume along various planes
is also important. We lump these two situations together, because in both we are concerned with
following a two dimensional surface within a volume data set.
The most important surface extraction algorithm is the marching cubes algorithm of Lorensen
and Cline [29]. In this method we assume a regular sampling grid based on cubic voxels. We
know the value of the underlying scalar field at the vertices of the little voxel cubes. We can
classify each vertex as positive or negative, according to whether its value is greater or smaller

than the desired surface value. Then, for each pattern of vertex signs, not all the same, associated
with a voxel, we hypothesize a particular way that the isosurface cuts through the voxel and
approximate the surface by a bunch of triangles whose vertices are on the edges of the cube. By
exploiting symmetries we can reduce the total number of cases to only fifteen. As we process the
voxel elements slice by slice, we can generate the actual triangles by interpolating their vertices
using the field values at the end-points of the corresponding edge. A normal gradient vector
is also computed by interpolation and then the triangle and its normal (to be used for shading
calculations) is passed to a depth buffer for rendering.
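The classification and interpolation steps are compactly expressible; a minimal sketch (names ours; the 256-entry triangle table that maps each case index to its triangles is omitted):

```python
def cube_case_index(corner_values, iso):
    """8-bit case index from the signs of the eight corner samples of one voxel."""
    index = 0
    for bit, v in enumerate(corner_values):
        if v > iso:                          # a "positive" corner
            index |= 1 << bit
    return index                             # one of 256 patterns, 15 up to symmetry

def edge_vertex(p0, p1, v0, v1, iso):
    """Isosurface vertex on the cube edge p0-p1, by linear interpolation."""
    t = (iso - v0) / (v1 - v0)               # v0, v1 straddle iso on a cut edge
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))
```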
This simple method has some difficulties, however. The reason is that the signs at the vertices of
the voxel cube do not always uniquely characterize how the surface must pass through a voxel. If
we make an arbitrary choice, we may generate little topological inconsistencies that become visible
and annoying when we want to zoom in on that part of the image (with most volume data sets we
do not have the luxury of resampling to generate additional data in the region of ambiguity). Some
of the issues that arise on how to resolve these ambiguities based on the situation in neighboring
cells are described in [45].
The ideas of the marching cubes algorithm are not restricted to a regular cubical grid. For a
discussion of tracing curves and surfaces through other regular and irregular grids see [11] and
the references cited therein. The somewhat easier problem of computing planar or other sections
with known surfaces is discussed in [41], where the issue of how to get started when only finite
sections are desired is also addressed. Levoy [30] gave a way to render multiple isosurfaces with
partial transparency, but without explicit surface extraction.

6 HIERARCHICAL STRUCTURES

Suppose we are given a scalar field at the n points of an irregular grid in three dimensions and we
have fitted a tetrahedralization T to these points. Or perhaps we are given the tetrahedralization
T to start with, with scalar values at its vertices. Our goal is to construct a coarsened version
of T, say T', that takes only a fraction of the space that T takes to describe, and such that by
rendering T' we get a good approximation to T. Coarsened approximations can be useful in many
applications where we need to display T at lower resolution, or speed-up the processing of groups
of voxels with nearly the same value. In other words, we are trying to come up with an analog of
the "mip maps" of Williams [43], but for irregular grids. (For regular grids oct-trees do the trick,
but again we choose to omit a discussion of quantized space methods).
If our model is that all the original information is contained in the values associated with the
sample points, then we can forget about the tetrahedralization (after all, it was added only to
make trilinear interpolation possible) and think of the approximation in filtering terms. We
imagine that a filter kernel is centered on each of the data points. The value associated with a
ray traversing space is then the integral along the ray of the intersected kernels, each weighted
by the corresponding scalar value. In geometric terms we can think of each filter kernel as a
simple geometric figure centered at the corresponding point. Let us go to two dimensions to make
our discussion easier. We have a collection of n points, and on each point we have centered an
axis-aligned rectangle. To simplify further, suppose that the filter function is constant over each
rectangle. Then our goal is to come up with (say) n/2 points (some possibly new) and rectangles
centered on them so that these new rectangles approximate well the original ones. We mean
this in the sense that for any line the amount of "weighted rectangle mass" that it cuts in the

coarsened grid is reasonably close to the one it cut in the original.


Not much is known about such approximation problems. Intuitively it seems clear that if the grid
points are well distributed and if the scalar function is smooth, then taking a random sample of
half the original points and doubling the size of their rectangles should work well.
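That heuristic is only a few lines; a sketch (names ours) for the two-dimensional constant-kernel setting just described:

```python
import random

def coarsen(kernels):
    """kernels: (center, halfwidths, value) triples of constant rectangular kernels."""
    kept = random.sample(kernels, len(kernels) // 2)   # a random half of the points
    return [(c, tuple(2 * h for h in w), v) for (c, w, v) in kept]
```

Doubling the kernels compensates, in expectation, for the halved point count, though no guarantee of this kind has been established.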
In the case where it is important to maintain the tetrahedralization structure during the coars-
ening, the problem becomes more difficult. Even coming up with a reasonable definition of what
the problem is forms a challenge. For triangulations in two dimensions the point location method
of Kirkpatrick [27] mentioned in Section 3 does define a useful notion of coarsening a subdivi-
sion. Kirkpatrick deletes a large set of low degree independent vertices and re-triangulates the
star-shaped holes thus formed. Because of planarity he can guarantee that (1) each deleted vertex
has bounded degree and (2) a fixed fraction of all vertices can be deleted. The bounded degree
condition guarantees that in a sense the coarsened triangulation is similar to the original. But in
the scalar field setting, it is not clear how this combinatorial coarsening interacts with the bilinear
interpolation, nor it obvious how to extend these ideas to three dimensions.

7 CONCLUSIONS

We hope that the previous examples have given the reader some ideas about how geometric algo-
rithms can be useful in volume visualization, as well as some pointers to the relevant literature. We
also hope that they will motivate further research on geometric problems relevant to visualization.

Acknowledgements: The author wishes to acknowledge valuable discussions with Marc Levoy
during the preparation of this paper. Michelangelo Grigni contributed material used in Section 3.

References

[1] P. Agarwal, A deterministic algorithm for partitioning arrangements of lines and its applica-
tions, Proc. 5th ACM Geometry Symp., 1989, 11-22.

[2] P. Agarwal, Ray shooting and other applications of spanning trees with low stabbing number,
Proc. 5th ACM Geometry Symp., 1989, 315-325.

[3] E. Brisson, Representation of d-dimensional geometric objects, Technical Report 90-08-03,
Department of Comp. Sc. and Eng., University of Washington, 1990.

[4] B. Chazelle, How to search in history, Inf. and Control, 64 (1985), 77-99.

[5] B. Chazelle and H. Edelsbrunner, An optimal algorithm for intersecting line segments in the
plane, to appear in J. Assoc. Comp. Mach.

[6] B. Chazelle, H. Edelsbrunner, L. Guibas, M. Sharir, and J. Stolfi, Lines in space: combina-
torics and algorithms, Rep. UIUCDCS-R-90-1569, Dept. Comp. Sc., Univ. of Illinois, Urbana,
1990; also to appear in Algorithmica.

[7] B. Chazelle, H. Edelsbrunner, L. Guibas, R. Pollack, R. Seidel, M. Sharir, and J. Snoeyink,
Counting and cutting cycles of lines and rods in space, 31st Annual FOCS Conference, (1990),
242-251.

[8] D. Cohen and A. Kaufman, Scan-conversion algorithms for linear and quadratic objects,
Volume Visualization, IEEE Press, 1991, 280-301.

[9] B. Delaunay, Sur la sphère vide, Izv. Akad. Nauk SSSR, Otdelenie Matematicheskikh i
Estestvennykh Nauk, 7 (1934), 793-800.

[10] D. Dobkin and M. Laszlo, Primitives for the manipulation of three-dimensional subdivisions,
Proc. 3rd ACM Comp. Geom. Symp., 1987, 86-99.

[11] D. Dobkin, S. Levy, W. Thurston, and A. Wilks, Contour tracing by piecewise linear approx-
imations, ACM Trans. on Graphics, 9 (1990), 389-423.

[12] J. Driscoll, N. Sarnak, D. Sleator, and R. Tarjan, Making data structures persistent,
J. Comp. and System Sc., 38 (1989), 86-124.

[13] H. Edelsbrunner, An acyclicity theorem for cell complexes in n dimensions, Proc. 4th Annual
ACM Geom. Conf., 1989, 145-151.

[14] H. Edelsbrunner, Algorithms in Combinatorial Geometry, Springer-Verlag, 1987.

[15] H. Edelsbrunner, L. Guibas, and J. Stolfi, Optimal point-location in a monotone subdivision,
SIAM J. Comp., vol. 15, 2, 1986, 317-340.

[16] H. Edelsbrunner and E. Welzl, Halfplanar range search in linear space and O(n^{0.695}) query
time, Inf. Proc. Letters, 23 (1986), 289-293.

[17] R. Forrest, Computational geometry, Proc. Royal Soc. London, 321 series A (1971), 187-195.

[18] H. Fuchs, Z. Kedem, and B. Naylor, On visible surface generation by a priori tree structures,
Comp. Graphics, 1980, 124-133 (SIGGRAPH '80).

[19] M. Garrity, Ray tracing irregular volume data, Proc. 1990 San Diego Workshop on Vol. Vis., 1990, 35-40.

[20] A. Glassner, An Introduction to Ray-Tracing, Academic Press, 1989.

[21] L. Guibas and M. Sharir, Triangulations with low crossing number, in preparation.

[22] M. Goodrich and R. Tamassia, Dynamic trees and dynamic point location, to appear in
STOC '91.

[23] L. Guibas and J. Stolfi, Primitives for the manipulation of general subdivisions and the
computation of Voronoi diagrams, ACM Trans. on Graphics, 4 (1985), 74-123.

[24] J. Helman and L. Hesselink, Surface representations of two- and three-dimensional flow topol-
ogy, IEEE Visualization '90, 1990, 6-13.

[25] G. Herman and H.K. Liu, Three-dimensional display of human organs from computed tomo-
grams, Comp. Graphics and Image Proc., 1979, 1-21.

[26] A. Kaufman, Efficient algorithms for 3D scan-conversion of parametric curves, surfaces, and
volumes, Computer Graphics, 21 (1987), 171-179 (SIGGRAPH '87).

[27] D. Kirkpatrick, Optimal search in a planar subdivision, SIAM J. Comp., vol. 12, 1, 1983,
28-35.

[28] D. Knuth, The Art of Computer Programming, Vol. I: Fundamental Algorithms,


Addison-Wesley (1973).

[29] W. Lorensen and H. Cline, Marching cubes: a high resolution 3D surface reconstruction
algorithm, Comp. Graphics, 21 (1987), 163-169 (SIGGRAPH '87).

[30] M. Levoy, Display of surfaces from volume data, IEEE CG&A, 8 (1988), 29-37.

[31] N. Max, P. Hanrahan, and R. Crawfis, Area and volume coherence for efficient visualization
of 3D scalar functions, Proc. 1990 San Diego Workshop on Vol. Vis., 1990, 27-33.

[32] H. Neeman, A decomposition algorithm for visualizing irregular grids, Computer Graphics,
vol. 24, 5, 1990, 49-56.

[33] M. Overmars and M. Sharir, Output-sensitive hidden surface removal, Proc. 30th IEEE FOCS
Symp., 1989.

[34] T. Porter and T. Duff, Compositing digital images, Computer Graphics, vol. 18, 3, 1984,
253-259.

[35] F. Preparata and M. Shamos, Computational Geometry: An Introduction, Springer-
Verlag, 1985.

[36] F. Preparata and R. Tamassia, Fully dynamic point location in a monotone subdivision,
SIAM J. on Comp., 18 (1989), 811-830.

[37] F. Preparata and R. Tamassia, Efficient spatial point location, Proc. WADS '89, Lect. Notes
in Comp. Sc. 382, 3-11.

[38] A. Rosenfeld, Three-dimensional digital topology, Inform. and Control, 50 (1981), 119-127.

[39] R. Schumacker, B. Brand, M. Gilliland, and W. Sharp, Study for applying computer
generated images to visual simulation, U.S. Air Force Human Resources Laboratory,
Tech. Rep. AFHRL-TR-69-14 (1969).

[40] A. Schmitt, H. Müller, and W. Leister, Ray-tracing algorithms - theory and practice, The-
oretical Foundations of Computer Graphics and CAD, Springer-Verlag, 1988, 997-1030.

[41] D. Speray and S. Kennon, Volume probes: interactive data exploration of arbitrary grids,
Proc. 1990 San Diego Workshop on Vol. Vis., 1990, 5-12.

[42] J. Stolfi, Primitives for computational geometry, DEC/SRC Research Report 36, 1989.

[43] L. Williams, Pyramidal parametrics, Comp. Graphics, 17 (1983), 1-11 (SIGGRAPH '83).

[44] E. Welzl, Partition trees for triangle counting and other range searching problems, Proc. 4th
ACM Geom. Conf., (1988), 23-33.

[45] J. Wilhelms and A. Van Gelder, Topological considerations in isosurface generation,
Proc. 1990 San Diego Workshop on Vol. Vis., 1990, 79-86.

Leonidas J. Guibas received his BS and MS in Mathematics from Caltech in 1971. After completing his PhD in Computer Science at Stanford in 1976 under Donald Knuth, he worked at the Xerox Palo Alto Research Center for nine years. In 1985 he joined the faculty at the Stanford Computer Science Department and the DEC Systems Research Center, in Palo Alto, CA. He has been a Professor of Computer Science and Engineering at MIT since 1990. His interests include computational geometry, combinatorial algorithms, graphics, and robotics.
Address: Massachusetts Institute of Technology, Department of Electri-
cal Engineering and Computer Science, Room NE43-307, 545 Technology
Square, Cambridge, MA 02139, USA.
Visualization for Engineering Design
Horst Nowacki

ABSTRACT

This paper examines the special requirements for visualization to play an effective role in
engineering design, postulating interactive, open and modular visualization systems with
intelligent communication capabilities and fast response times for each model and image
update. It illustrates an approach to meeting such requirements by describing a prototype
system, INVISCID, for Interactive Visual Computing in Design.

KEYWORDS

Visualization, Interactive Design, Scientific Visualization in Surfaces and Volumes, Computer-Aided Design.

1. THE ROLE OF VISUALIZATION IN DESIGN

The art and science of visualization is making rapid advances in many areas of application,
driven by the dynamic developments in high performance graphical workstations and
supporting visualization software and firmware. Visualization in its modern form to a certain
extent also benefits advanced engineering design, particularly by the progress made in
Visualization in Scientific Computing (ViSC). Yet, in many cases the success of visualization is
judged too narrowly only by the quality and perhaps realism of the pictures it produces rather
than by how effectively it supports the designer's work. As a result many visualization systems
fall far short of realizing their full potential for engineering design. This paper will examine the
essential question of how visualization tools might be used more effectively in engineering
design systems.

Fig. 1 shows a scenario where functions of product modeling, state analysis and visualization
are merged together in a design environment. Design is the primary goal in this scenario; the
processes of modeling, analysis, and visualization are subservient to this end. Product
modeling defines geometric and non-geometric product data and is usually performed by
CAD systems. The analysis of state variables in the product and in its environment is the usual
task of CAE systems and components, while the display of product shape characteristics and
associated analysis results may be performed by a visualization module or system. Fig. 1 thus
suggests the integration of modeling, analysis, and visualization functions, either in a single
workstation or in a distributed computing environment. From this perspective the design
process requires the concerted action of these three supporting processes. Fig. 2 illustrates
how these processes are combined within an interactive design cycle. Modeling is performed
at the generation and correction stages, analysis results are visualized, then evaluated.


Fig. 1: Modeling, Visualization, and State Analysis in the Design Process

Fig. 2: Process of Modeling, Visualization, Evaluation and Updating (a cycle through Generation, Analysis, Visualization, Evaluation, and Correction, leading to the Result)

From an engineering design viewpoint the integration of visualization functions in a design and
analysis system must thus be guided by the objective of improving design effectiveness and
the quality of design results. The success depends in large measure on design methodology,
for which there are no trivial rules. However, it is possible to state some of the basic
requirements for visualization in a design context:

Visualization, i.e. illustration by vector and raster graphical images, of engineering objects,
their state variables and their environment, frequently in a two- or three-dimensional
continuum. The images should be capable of conveying information of high semantic
content in condensed form (such as derived attributes, complex states and attribute
patterns).

Realism of the engineering object and its states, not so much realism of its images only.
This requires a model of the object with sufficient accuracy and a faithful display of its
relevant design characteristics. For example, certain physical processes require accurate
modeling of dynamic as well as kinematic performance in a Simulation, modeling of
deformable objects and of three-dimensional phenomena.

Interactivity, i.e. user access to and control of information about the model, at high
semantic level. Interaction only with features in the image of the model is often inefficient
and not adequate in design.

Intelligent communication at the human-computer interface. This can be enhanced by


object-oriented user interfaces and flexible forms of dialogue.

Fast response times, based on rapid updating of the model and the picture after design
changes. The quick update of the model for complex physical situations is by far more
difficult and often beyond the capabilities even of current super-workstations.

Beyond these basic requirements for visualization in engineering design, modern software
engineering objectives must also be taken into account. These include:

Modularity of the main components, i.e., of the design, analysis, and visualization
processes.

Network transparency of the functional modules in order to achieve site independence in


distributed systems.

Standardization of the major interfaces to ensure device independence and data


exchange between components in heterogeneous systems.

Based on these requirements and objectives a modern visualization capability for engineering
design can be conceived. The following major benefits are expected from the intensive use of
visualization in design :

The ability to display engineering information in high density and with much semantic
substance, using combined color raster and vector graphics.

Time savings in design by visual simulation and judgement of complex phenomena.

Better communication and documentation of ideas and facts.

Enhancement of new insights.



Integration of CAD and CAE systems with visualization.

Developments in design methodology by means of interactive visualization.

The interactive element in visualization is of crucial importance to effective design work. It also
constitutes the main new ingredient in visualization for design beyond the scope of the well-
established discipline of Visualization in Scientific Computing (ViSC).

This is why we have coined a new trademark for this brand of applications: "Interactive Visual
Computing in Design" (INVISCID). The following sections will describe current developments
in this field and will report on our own experiences in developing such a system.

2. VISUALIZATION FUNCTIONS FOR DESIGN

2.1 Requirements

A visualization system intended to support effective design work must take into account the
interactive, spontaneous and often approximate nature of the designer's working process.
This is why the requirements for visualization in design are substantially different from those in
photorealistic rendering or also in non-interactive scientific visualization. Some of the essential
requirements for visualization systems in design are the following:

Application independence:

Design related visual displays are needed in multitudinous options and tend to vary from
case to case. An application independent visualization system covering a wide range of
different applications can best provide the flexibility needed.

Interactive capability:

Interactions with the design object, i.e., product model, are a necessity in genuine design
tasks.

Modularity of the user interface (UIF):

The functions for user interaction with the system, which include dialogue support for
visualization tasks, should be a separate module in the system, which needs a clearly
defined interface to modeling and rendering functions.

Object Orientation in UIF:

Modern user interfaces are often built on an object oriented basis, i.e., they support the
notion of human interpretable information carrying objects communicating with each other
by messages. These objects are the building blocks for an intelligent man-computer
interaction.

Windowing support:

The support of windowing functions (as in X WINDOWS) has become a standard
requirement, sometimes also combined with open solutions for network transparency
(PEX (Rost 1989)).

Three-dimensional functions:

Advanced visualization applications require surface rendering of 3D objects and


sometimes volume rendering. In addition, interaction with the three-dimensional product
model may be desired.

Scientific orientation:

Scientific visualization deals with the presentation of state variables in and around
products, hence it is not limited to realistic images. The association of color attributes with
variable states in the product geometry is the typical requirement for these applications.
Scalar, vector and tensor variable states need to be displayed for a full range of options.

Incremental updating:

For fast response in interactive work the incremental, local updating of the image and, if
possible, of the design model is of crucial significance.

2.2 Some Current Systems

Many scientific visualization and animation packages exist for high quality realistic or scientific
rendering applications. A few examples are: CA-DISSPLA, DI-3000, DORE, GRAPH KIT,
MOVIE.BYU, PATRAN-G, RENDERMAN, UNIMAP, etc. They will not be reviewed here, since
their purpose is not primarily in visualization for interactive design.

Table I: Comparison of Features in Visualization Systems for Design

    Feature                                 AVS   GROW   PEX   SASSAFRAS   INVISCID
    Application independence                 X      X     X        X           X
    Interactive capability                   X      X     X        X           X
    Modularity of user interface             X      X     X        X           X
    Object orientation in user interface     X      X              X           X
    Windowing support                        X            X        X           X
    Three-dimensional functionality          X            X                    X
    Scientific orientation                                                     X
    Incremental updating                                                       X

Visualization systems intended for interactive design are much less common. Table I gives a
comparison of three such systems, AVS (Upson 1989), GROW (Barth 1986), and PEX (Rost
1989), as well as the User Interface Development System SASSAFRAS (Hill 1986). The system
INVISCID was developed at TU Berlin and will be described in Section 3.

Table I shows that these systems possess a modern orientation toward design and meet
most of the requirements of Section 2.1. Each system emphasizes different key capabilities.
AVS is an application-independent, open interactive visualization system that serves as a
toolkit for developing scientific applications particularly in continuum mechanics (scalar fields).
GROW, the Graphical Object Workbench, is an application-independent, interactive interface
based on object-based graphics and inter-object relationships. PEX is a network transparent
3D graphics system based on extensions of the X Window system and PHIGS+. SASSAFRAS
is a good example of an application-independent user interface development system (UIDS)
which supports a high-level paradigm of human-computer interaction. It uses a new language
for specifying the syntax of man-computer dialogues.

2.3 Transformation Process

The process of visualization in interactive design is subdivided into several model
transformation stages (Fig. 3). Let us first consider the output process. The product model
stems from some data source, usually from some product modeling system. For design
purposes this state of the model is evaluated by physical, geometric and other criteria. This
analysis provides certain state variables associated with the product and its surrounding
environment. The state variables may belong to the entire product and its components (global
attributes) or to the continuum of product geometry and surrounding field (distributed state
variables) or to discrete locations (discrete state variables). Most variables are displayable in
some f?rm in association with the product geometry.

A selection process takes place in order to extract those features of product geometry that
can be displayed in association with product data. The result of this transformation is a
displayable geometry model with associated state attributes.

The rendering process converts the state variables into some displayable form and binds the
display attribute set with the displayable primitives of the geometric object. A graphical object
model is thus produced which consists entirely of individually displayable primitives. Usually
there is no simple one to one relationship between primitives in the original geometric model
and the displayable primitives in the graphical object model.

The display process transforms the displayable graphical object into a picture on the display
device.

During the input process these transformations must be performed in reverse order and in the
opposite sense. Most systems do not support the complete inverse set of mappings. In some
cases interaction is confined to the graphical object model; in others the application
programmer is responsible for maintaining consistency throughout the model hierarchy. This
makes it very difficult to support direct interaction, say, with the product model. However, this
ability is of essential interest for visualization in interactive design.

DATA SOURCE: PRODUCT MODEL
  -> MODEL EVALUATION AND EXTRACTION OF VIEW RELEV. GEOMETRY
  -> DISPLAYABLE GEOMETRY MODEL
  -> RENDERING: EVALUATION AND BINDING OF DISPLAY ATTRIBUTES
  -> GRAPHICAL OBJECT MODEL
  -> DISPLAY PROCESS: HANDLING OF IMAGE INFORMATION
  -> PICTURE

Fig. 3: Transformations in Visualization Process
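
To make the stage boundaries concrete, the following minimal sketch walks a model through the forward chain of Fig. 3; every name here is hypothetical and stands in for the models described in the text, not code from any of the systems discussed:

    # Minimal sketch of the Fig. 3 transformation chain (hypothetical names).

    def extract(product_model, is_view_relevant):
        """Model evaluation: select view-relevant geometry together with
        its associated state variables (displayable geometry model)."""
        return [(geom, state) for geom, state in product_model
                if is_view_relevant(geom)]

    def render(displayable_model, attribute_for):
        """Bind display attributes to displayable primitives according to
        the states to be visualized (graphical object model)."""
        return [(geom, attribute_for(state))
                for geom, state in displayable_model]

    def display(graphical_objects, device):
        """Display process: turn the graphical object model into a picture."""
        for geom, attributes in graphical_objects:
            device.draw(geom, **attributes)

Supporting interaction then amounts to supplying inverse mappings for these stages, which, as noted above, is where most systems fall short.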



2.4 The Role of Standards


In the areas of photorealistic rendering, animation and scientific visualization, solutions on
high-performance workstations have played a prevalent role, and standardization has had
only limited influence on them. In practice, no software standard for scientific visualization
exists yet.

In visualization for design, available hardware platforms are not necessarily in the high
performance end so that software standards can be of more benefit here. It is also possible in
this context to make use of graphical standards like GKS, PHIGS, and PHIGS+ and of product
modeling standards like the evolving ISO Standard STEP (Anon 1988). The idea is then to
build modular, open systems in which the representations of the product model, displayable
geometry model, and graphical object model are each based on neutral standards. An
approach of this kind is described in the draft documents for the Presentation Information
Model in STEP, Version 1.0, Part 46 (Klement 1990).

In particular, the following standards are assumed to be used there for the transformation
process in visualization (Fig. 3):

STEP Presentation defines an attribute set for describing the geometric, display-relevant
properties of a STEP product model as well as the display-relevant attributes (camera
setting, light sources, color attributes etc.). This information is sufficient to perform the
extraction transformation (Fig. 3).

The resulting displayable geometry model can be rendered by means of PHIGS+. The
main difficulty in this approach was until recently that PHIGS+ rendering supports lighting
models, but not scientific visualization. Recent discussions between the PHIGS+ and
STEP Presentation Groups have led to an understanding that the PHIGS+ data structure
and attribute set will be redefined in such a way that it will permit the rendering of scientific
data given over a surface grid in a style similar to Gouraud and Phong shading. That is,
the PHIGS data structure and the PHIGS+ rendering methods will be made accessible to
scientific visualization.

This discussion shows how product modelling standards like STEP and graphics standards
like PHIGS and PHIGS+ may be applied to visualization processes in the future.

3. AN INTERACTIVE VISUALIZATION SYSTEM FOR DESIGN

3.1 Functionality and System Architecture

This subsection describes the concept and organization of an interactive visualization system
for engineering design, realized as a prototype development at the Technical University of
Berlin. It was named INVISCID (Interactive Visual Computing in Design). The next subsection
will give a few examples of results obtained in design applications with this system.
will give a few examples of results obtained in design applications with this system.

The system supports functions for modeling, analysis, and for the visualization of engineering
objects and their associated physical states. The objects possess line, surface or volume
geometry (product shape) and are usually embedded within a physical continuum
(surrounding environment). The states to be displayed are thus related to features in the
object itself, i.e., its edges, surface or volume, or in its surrounding field. The main emphasis in
the applications lies in design tasks related to problems in continuum mechanics. But within
such a scope it was required to keep the system as application independent as possible.

The visualization component, in particular, is kept strictly modular and is designed to generate
vector and raster graphical images of products in their environment and their states for two-
and three-dimensional continua.

The system has a strong object-oriented flavor. Objects are introduced conceptually as
primitives at the various levels of the product model, displayable geometry model, and
graphical object model (Fig. 3). Objects also exist in the User Interface as contact objects.
Objects possess their own attribute sets and communicate with each other by messages.
Objects maintain a self-contained attribute set, which is thus hidden from other objects unless
it is accessed by appropriate messages (information hiding). Objects also serve as the
building blocks in establishing attribute inheritance flows. All objects in the system are
administered in a uniform way by the database management component, at least while they
are active in main memory. The central goal of the interaction functions in the system is to
support direct user access to all levels of the model, not just a manipulation of the image.

Other details in the requirements for the system correspond to those stated in subsection 2.1.

A large measure of application independence in the visualization process is achieved by
dealing with the rendering and display tasks as two separate transformation stages (Fig. 3).
Rendering prepares a graphical object model for display by generating the displayable
primitives and associating them with display attributes according to the states to be visualized.
The functionality of the visualization process can best be explained in terms of application
area, model data and displayable data.

Application area:

Table II gives a few examples of application areas related to geometry, structural and fluid
dynamics fields, and hence usually to fields in one-, two- and three-dimensional continua. The
table illustrates that the state variables needed to describe the condition of the object are of
scalar, vector or tensor types. This would also be true in many other design applications.

TYPES OF GEOMETRY:        TYPES OF STATE VARIABLES:

LINE                      SCALAR
SURFACE        <-->       VECTOR
VOLUME                    TENSOR

Fig. 4: Types of Model Data



TABLE II: APPLICATION AREAS AND STATE VARIABLE TYPES

                              APPLICATION AREA
VARIABLE
TYPE       GEOMETRY                  STRUCTURES         FLUIDS

SCALAR     COMPONENTS, CURVATURE,    STRESS, STRAIN     PRESSURE, POTENTIAL,
           TORSION                   COMPONENTS         STREAM FUNCTION

VECTOR     TANGENT, NORMAL,          DISPLACEMENTS      VELOCITY
           BINORMAL

TENSOR     FUNDAMENTAL FORM OF       STRESS, STRAIN     STRESS,
           DIFF. GEOMETRY            TENSORS            DISPLACEMENTS

Model data:

After performing an analysis, geometric elements in the model data are associated with state
variables. The geometric primitives in the model are of types line, surface and volume. The
state variables may be defined at discrete points (knots) of the geometry or continuously.
Each type of geometry may be associated with state variables of type scalar, vector or tensor
(Fig. 4).

Displayable data:

When the rendering process has produced displayable geometric primitives and bound them
with their desired appearance attributes (display attributes), then the display process must
generate the corresponding images. The display system has only a finite supply of displayable
primitives. In a standard graphics system, e.g., they would correspond to the available output
primitive set with its display attributes (POLYLINE, POLYMARKER, TEXT, AREA FILL etc.). In
PHIGS+ and in modern vendor systems, a continuous tone shading primitive is also available.
Table III gives an overview of how state variable attributes of scalar, vector or tensor types can
be mapped onto display attributes of the corresponding display primitive. This mapping offers
many options because each component of the state variable can be associated with any
chosen display attribute of that primitive. E.g., a scalar value at a discrete point can be
expressed by polymarker type, size, color or any combination thereof. In addition, the
association of state variable ranges with any color spectrum in principle is arbitrary and is
often performed indirectly by pseudo-color tables. These numerous options result in a rich
variety of possible rendering styles for any given application.

Note in passing that symmetric tensors can be reduced to three orthogonal vectors by an
eigenvalue transformation and hence can be displayed as three vectors.
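
As a small illustration of both mappings, the sketch below (Python with NumPy; hypothetical helper names, not part of any system described here) maps a scalar through a pseudo-color table and reduces a symmetric tensor to three displayable vectors:

    import numpy as np

    def pseudo_color(value, vmin, vmax, table):
        """Map a scalar state value to an RGB triple through a
        pseudo-color table (a list of RGB rows)."""
        t = np.clip((value - vmin) / (vmax - vmin), 0.0, 1.0)
        return table[int(t * (len(table) - 1))]

    def tensor_to_vectors(sym_tensor):
        """Eigenvalue transformation of a symmetric 3x3 tensor: returns
        three orthogonal eigenvectors (columns), each scaled by its
        eigenvalue, ready to be drawn as three vectors."""
        magnitudes, directions = np.linalg.eigh(sym_tensor)
        return directions * magnitudes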

Representation styles:

For any application area and distribution of state variables in some space, a great diversity of
representation styles exists for displaying the information to the designer. Engineers are
usually interested in quantitative as well as qualitative information about some state of the
object. Numerous forms of diagrams, plots, and color representations are familiar or intuitively
comprehensible to the engineer. A good visualization often consists of a combination of
vector and raster graphical images.

Meissner (1990) has systematized many of the possible options which are used in current
visualization systems. Figs. 5 and 6 demonstrate the variety of choices for visualizations in
surfaces and volumes.

The interactive visualization system INVISCID contains a visualization module VIP
(Visualization Pipeline for Scientific Data) (Meissner 1991a) which supports the display of a
large class of geometric objects with their surrounding field states in many different
representation styles within the range shown in Figs. 5 and 6. The associations between
geometric primitives, display primitive attributes for state variables, and representation styles
can be flexibly selected.

TABLE III: TYPES OF DISPLAY DATA FOR STATE VARIABLES

STATE VARIABLE             AVAILABLE DISPLAY      DISPLAY PRIMITIVE
(ATTRIBUTES)               PRIMITIVES             ATTRIBUTES

SCALAR                     MARKER                 TYPE, SIZE, COLOR
(SINGLE VALUE)             POLYLINE               TYPE, THICKNESS, COLOR
                           PATTERN                STYLE, SIZE, COLOR
                           AREA FILL              SHADED ATTRIBUTES

VECTOR                     POLYLINE               TYPE, THICKNESS, COLOR,
(MAGNITUDE + DIRECTION)    (POSSIBLY WITH         LENGTH, DIRECTION
                           MESH LINES)            SYMBOLS

TENSOR                     SAME AS FOR            LIKE VECTORS
(3 MAGNITUDES              THREE VECTORS
+ DIRECTIONS)



Fig. 5: Visualization in Surfaces

Fig. 6: Visualization in Volumes



System architecture:

Fig. 7 shows the basic architecture of INVISCID. Its main modules follow the classical
organization into User Interface (UIF), Core of Methods, and Data Base Management System
(DBMS).

The User Interface is built around a basic UIF system AIDA (Advanced Interactive Design
Architecture) (Ziegler 1988). AIDA supports an object-oriented, graphical dialogue with any
contact object displayed on the device. Objects can be from any level of the transformation
hierarchy (Fig. 3) and can be pointer connected if they belong to more than one level. Objects
have their own attribute sets and follow given rules in interaction. The application specific UIF
for each application is built on top of this basic layer.

USER / DEVICE

I/O HANDLING: DRIVER SOFTWARE AND GRAPHICS PACKAGE

UIF:
  APPLICATION-DEPENDENT UIF (FOR IDEA)
  BASIC UIF SYSTEM: AIDA

CORE OF METHODS:
  MODELING AND ANALYSIS SYSTEM: IDEA
  VISUALIZATION SYSTEM: VIP

DBMS:
  DATA BASE MANAGEMENT SYSTEM: COSMOS
  DB

Fig. 7: Architecture of Interactive Visualization System for Design: INVISCID



The core of methods contains the modelling and analysis system IDEA (Interactive Design of
Streamlined Shapes) (Meissner 1991b), which is implemented for some specific application
area, as well as the visualization module VIP (Meissner 1991a). Currently IDEA mainly
addresses applications in fluid dynamics and in geometric design.

The database management system was also developed at TU Berlin. It is the software module
COSMOS (Case Oriented Management of Structures) (Ziegler 1988), written in FORTRAN 77,
which is responsible for the data structure administration only in main memory during a
design session. Other tools are needed for external data base administration. COSMOS is
application-independent and supports all predefined and abstract, programmer-defined data
types needed at the various levels of the module. Objects reside in COSMOS capsules and
may have a complex, pointer connected internal structure.

More details about the organization of the INVISCID system and its modules can be found in
(Meissner 1990), (Meissner 1991a), (Meissner 1991b), (Ziegler 1988).

3.2 Application Examples

Figs. 8 and 9 show two examples from an application of aerodynamic or hydrodynamic foil
design with flow visualization from the Interactive Visualization System in Design, INVISCID.

The foil is to be designed for low or minimal drag coefficient CD in parallel two-dimensional
onset flow in incompressible fluid at medium and high Reynolds numbers of Re = 10^6 or
above (turbulent flow). Constraints may be imposed with regard to laminar and turbulent
separation, e.g., with respect to the forward most permissible location of the separation point.
For a submerged foil in water, a cavitation safety constraint may also be applied. The potential
flow about the foil is calculated by panel methods; induced velocities and pressures can be
evaluated at any location in the surrounding field. The viscous drag is calculated by boundary
layer theory using an integral method by Eppler (1980) and by an estimate for separation drag
based on the Squire-Young formula.

This application is admittedly a straightforward design task, which might also be solved
without any interaction and visualization. It is chosen here as an illustration precisely because
its simplicity demonstrates the use of interactive design styles with visualization. It is easy to see
how the same approach can be extended to more complex design tasks.

The design task is composed of three main stages:

o Geometric design

o Flow simulation

o Visualization

The interaction style is object-oriented, based on AIDA, IDEA and COSMOS, and supports
different contact objects at each stage of this procedure. Contact objects are any partner
objects displayed on the device that interact by messages whenever a contact situation is
triggered. Objects are either tools (operators) like PENCIL, STREAMLINE, DRAG etc. or parts
(operands) like CURVE, CONTROL POINT, FLOW REGIME etc.

At the geometric design stage the foil shape is defined by a succession of contact object
operations such as:

o PENCIL (creates a defining polygon object, called LINE.2D, Fig. 9 top left).

Fig. 8: Foil Design and Flow Visualization, First Example

Fig. 9: Foil Design and Flow Visualization, Second Example


76

o OPERATOR.CURVE (creates a curve when applied to a LINE.2D object, with options of
  Bezier and B-Spline approximations and interpolations. This generates a new object
  called RATIONAL.CURVE).

o SERVICE.CURVE (modifies the RATIONAL.CURVE, e.g. the degree of the curve).

o CURVE.2-ORDER (creates curves of second order, e.g. circles, ellipses etc.).

The flow simulation is performed in response to the following contact operations, when
applied to the geometric objects:

o STREAM.CALCUL (calculates the potential flow singularity distribution on panels and
  velocity and pressure results on the surface of the foil if the contact took place on the
  surface of the profile; generates a flow mesh in the field (H-type or O-type) and calculates
  potential flow state variables throughout the field at mesh nodes if the contact took place
  in the area around the profile. A default output will be generated).

o STREAM.LINES (traces an arbitrary number of streamlines through the flow field).

o BOUNDARY.LAYER (calculates different boundary-layer thicknesses with the methods
  described above. The momentum thickness will be visualized per default).

Certain results of these calculations are displayed automatically as echoes, others must be
invoked by visualization contact operations as follows:

o OFFSET.NORMAL (an offset curve of scalar results is plotted against the body axis).

o OFFSET.VECTORS (plots scalar results as an offset curve, with vector spikes, against
  the body contour, Fig. 9 top right).

o BOUNDLAY.PICTURE (boundary layer thickness results, form factors etc. are plotted
  against the body contour as in OFFSET.VECTORS, Fig. 9 top left).

o POINT.PICTURE (marks special points like separation points by markers, Fig. 9 top left).

o RESISTANCE.RESULTS (displays drag results, Fig. 9 top left).

o MESH.PICTURE (displays the mesh, Fig. 8 bottom, upper half).

o MARKER.PICTURE (displays the state variable at the mesh nodes represented by
  polymarkers).

o CONSTANT.SHADING (provides constant tone shading of the mesh area).

o CONTINUOUS.SHADING (provides continuous tone shading of the mesh area, Fig. 9
  bottom right).

o CONTOUR.FILLING (provides a contour fill map of the state variable field, Fig. 8 bottom,
  lower half).

o ISO.LINES (displays isolines of scalar quantities, Fig. 8 top, lower half).

o VECTOR.PICTURE (draws the vector field, Fig. 8 bottom, upper half).

o STREAMLINE.PICTURE (displays streamlines traced by an initial value solver, as in Fig. 8
  top, upper half).

Every contact operation controls what is to happen (sometimes with several options), how it is
to be performed (method, mesh etc.) and how it is to be visualized (representation style,
display attribute etc.). The options are controlled by the style and place of interaction and by
object-specific rules. This is why the user interface is referred to as object-oriented.

A special feature of the INVISCID system, at least for this application, is the incremental
updating capability for model as well as visualization image. This local updating method was
developed in the interest of rapid response times in interactive work.

The basic idea is derived from the observation that in many physical phenomena a small
perturbation in a boundary condition or shape feature results in a similar small perturbation of
the state variables in the physical system, concentrated mainly in the local vicinity of the
primary perturbation and decaying rapidly with distance. This certainly holds for potential
flows due to the asymptotically vanishing character of induced effects with increasing
distance. It also holds for boundary layer flows in the sense that changes mainly propagate
downstream, of course, except for instability phenomena where small causes may trigger
great effects. However, in most situations, it is safe to assume at the design stage that local
changes will have mainly local effects and hence to limit the size of the regime where updating
of the flow needs to be performed.

In addition to this, there are certain formal procedures, based on perturbation theory, whereby
linearized (or other low-order) approximations can be used to calculate incrementally, by
means of a linear transfer function, which effects changes in shape will have on state
variables like velocity. For the case of the current foil design application, a corresponding
perturbation scheme was developed by Kraus (1989).

Once the incremental update of the model is performed by these computational methods, the
updating of the image can quickly follow by assigning new state variables to the nodes which
have changed and locally regenerating the picture. A close resemblance between the model
data structure (mesh) and the rendering data structure simplifies this task, but they do not
have to match fully. In fact, maintaining two independent data structures is most advisable in
this context.

The contact operation for updating in INVISCID is as follows:

UPDATE (this operator incrementally updates that part of the model which was previously in
existence, i.e., not necessarily all flow calculations, and regenerates the local elements of the
image which are affected by the model update. It thus updates the actual situation in the
flow regime).
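
A minimal sketch of this incremental scheme, assuming a precomputed linear transfer matrix J (the perturbation-theory Jacobian) and hypothetical helper names:

    import numpy as np

    def update(states, J, delta_shape, affected, redraw_node):
        """Incremental update: apply the linearized transfer function to a
        small shape perturbation, then regenerate only the affected part
        of the picture."""
        delta_states = J @ delta_shape          # first-order state change
        states[affected] += delta_states[affected]
        for node in affected:                   # local picture regeneration
            redraw_node(node, states[node])
        return states

Restricting 'affected' to the vicinity of the perturbation is exactly the locality assumption justified in the text.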

A second example of the use of INVISCID is shown in Figs. 10 to 15. This sequence of
pictures illustrates different styles of representation for the flow calculations for a ship hull in
potential flow (pressure distributions) without free surface waves and in viscous flow at high
Reynolds number (Re = 10^9). The design goal is to reduce resistance and to avoid stern
separation of any great extent. Velocity and pressure fields in potential flow are calculated by
panel methods; drag and separation are predicted by three-dimensional integral boundary
layer techniques.

Fig. 10: Pressure Coefficient in Ship Forebody, Flat Shading

Fig. 11: Pressure Coefficient in Ship Forebody, Continuous Tone Shading



Fig. 12: Pressure Coefficient in Ship Forebody, Contour Fill Map and Mesh Lines

Fig. 13: Pressure Coefficient in Ship Forebody, Constant Tone Shading and Isolines

Fig. 14: Velocity Field, Ship Forebody, Contour Fill Map

Fig. 15: Momentum Thickness, Superimposed in Ship Afterbody Surface (Offset Surface Plot)

The interaction is of more limited scope here. The geometry of the shape can be changed.
The style of representation and visualization is controlled by contact objects but the local
updating feature is more difficult to implement and is not available yet, though it would be of
great interest to designers.

The designer would use this model in INVISCID by examining the flow pattern, the viscous
drag prediction and the separation tendencies. Fig. 15 shows a certain dent in the
momentum thickness in the upper part of a section in the afterbody of a ship. This would
suggest that one would like to alter the shape along the streamline ahead of this point. In this
way such a visualization provides at least qualitative guidance for design decisions.

3.3 Visualization, Interaction, and Design Methodology

The experience gained in developing and using the INVISCID system might be generalized
and expressed in the following concise form:

"The conceptual elements of visualization, interaction, and design methodology


are a necessary common platform for any successful approach to visualization
for engineering design".

These methodical instruments must be combined in a concerted fashion. Design
methodology provides an approach to formulating design objectives and defining how they
can be achieved.
can be achieved. Visualization provides direct feedback to the designer of quantities which are
relevant to his decision. Modern interaction techniques support an effective working style.
They form a bridge between the product model and its state visualization and enable the
designer to work in direct contact with model and derived picture.

All three elements of this approach benefit from a pervasive object-oriented perspective at all
modeling levels: Object-oriented user interfaces with intelligent contact objects as primitives,
object-oriented data types for the design model as well as for the visualization model, and
finally an object-oriented definition of design objects and operations. This unified perspective
greatly simplifies the development of open, application-independent system architectures for
engineering design and visualization systems.

4. SUMMARY

An effective approach to visualization for engineering design results from the concerted
application of modern developments in modeling, state analysis, and visualization. This paper
has provided a unified perspective for some of the mainstreams in this development which are
now converging toward new systems for "Interactive Visual Computing in Design".

Requirements for visualization in design include such postulates as:

Design relevance of the displayed pictures,

High semantic level interaction between designer and product model,

Fast enough updating for interactive design.

Systems for interactive visualization in design should be based on a modular architecture,
preferably with an application-independent visualization module, and on a unified object-
oriented approach with regard to the user interface primitives, the modeling data types, and
the data base structure.

The prototype system INVISCID realized at TU Berlin illustrates many of the features and
functional capabilities of the intended type of interactive visualization system for design. It
supports a high level style of interaction, design relevant representation styles for applications
in geometric design and continuum mechanics, incremental updating of model states and
picture regeneration. It is based on an object-oriented perspective throughout, although it had
to be realized in FORTRAN 77 on Tektronix Workstations. A reimplementation in modern
object-oriented language and database environments would probably further improve its
architecture and performance.

Continuing developments in this field will undoubtedly further benefit from rapid progress in
hardware and software platforms. To reap the full harvest of these improvements engineering
design oriented systems require their own strategic goals. These can be found in many
modern developments in engineering design and computer science. Fig. 16 takes a final
broader view of the essential elements that will jointly contribute toward future integrated,
visually interactive design systems.

DESIGN METHODOLOGY
ENGINEERING ANALYSIS
OBJ. ORIENTED INFORMATION MODELING
AI TOOLS

Fig. 16: Elements for Integrated, Visually Interactive Design Systems

REFERENCES

Anon (1988) Standard for the Exchange of Product Model Data (STEP). ISO TC 184/SC 4
Document N 284, ISO STEP Baseline Requirements Document (Tokyo IPIM)
Barth PS (1986) An Object-Oriented Approach to Graphical Interfaces, Describing GROW
(Graphical Object Workbench). ACM Transactions on Graphics 5(2) pp 142-172
Eppler R, Somers DM (1980) A Computer Program for the Design and Analysis of Low-Speed
Airfoils. NASA Techn. Mem. 80210
Frenkel KA (1988) The Art and Science of Visualizing Data. Comm. ACM 31(2) pp 110-121
Frenkel KA (1989) The Next Generation of Interactive Technologies. Comm. ACM 32(7) pp
872-881
Greenberg DP (1988) Coons Award Lecture. Comm. ACM 31(2) pp 123-134
Hartson HR, Hix D (1989) Human-Computer Interface Development: Concepts and Systems.
Comp. Surveys 21(1) pp 5-92
Hill RD (1986) Supporting Concurrency, Communication, and Synchronization in Human-
Computer Interaction - The Sassafras UIMS. ACM Transactions on Graphics 5(3) pp
179-210
Klement K (1990) STEP Presentation, Committee Draft for Part 46 of STEP (General
Resources: Presentation Information Model). ISO TC 184/SC 4/WG 1 Document
Kraus A (1989) Perturbation Methods as a Means for the Interactive Design of Fluid Dynamic
Body Shapes. In German, Ph.D. Thesis, TU Berlin, D 83
McCormick BH et al. (1987) Visualization in Scientific Computing. ACM SIGGRAPH Computer
Graphics 21(6)
Meissner GA (1990) The Use of Fast Visualization Methods for the Visually Interactive Design
Process. In German, Ph.D. Thesis, TU Berlin, D 83
Meissner GA (1991a) User Instructions for the Visualization System VIP. In German, internal
report, Institut für Schiffs- und Meerestechnik, TU Berlin
Meissner GA (1991b) User Instructions for the Design System IDEA. In German, internal
report, Institut für Schiffs- und Meerestechnik, TU Berlin
Nowacki H (1989) Integration, Interaction, and Visualization for Engineering Design. Invited
lecture, unpublished notes, Eurographics '89, Hamburg
Quarendon P (1987) A System for Displaying Three-Dimensional Fields. IBM UK Scientific
Centre, Winchester
Rost RJ et al. (1989) PEX: A Network-Transparent 3D Graphics System. IEEE Comp.
Graphics and Applic. 9(4) pp 14-26
Upstill S (1990) The RenderMan Companion, A Programmer's Guide to Realistic Computer
Graphics. Addison-Wesley Publ. Co., Reading, Mass
Upson C et al. (1989) The Application Visualization System: A Computational Environment for
Scientific Visualization. IEEE Comp. Graph. and Applic. 9(4) pp 30-42
Ziegler M (1988) Conception and Development of Graphical User Interfaces for Geometric
Design. In German, Ph.D. Thesis, TU Berlin, D 83

Horst Nowacki is professor of ship design at the Technical University of Berlin. His interests
and responsibilities also include computer aided design, ship hydrodynamics, fluid dynamics
and flow calculations, and hydrodynamic and geometric design methods. He currently directs
two research projects on surface modeling using object-oriented methods and on interactive
visualization for scientific applications, both sponsored by the German Science Foundation,
DFG. He is also engaged in international product definition modeling standards (STEP) and in
high-speed computer networking developments.

Prof. Nowacki received his diploma and doctorate in naval architecture from TU Berlin in 1958
and 1963. He taught at the University of Michigan between 1964 and 1974.

Address: Techn. Univ. Berlin, Schiffsentwurf, Salzufer 17-19, D 1000 Berlin 10, Germany.
Visualization Resources and Strategies for
Remote Subsea Exploration
W. Kenneth Stewart

ABSTRACT

Common resources and strategies are described for graphics and imaging applications in
remote subsea exploration. What is meant by resources are the hardware, software, and
human assets that constitute sea-going and shore-based systems; strategies encompass
the architectural, engineering, and practical aspects of making such a visualization
environment operational and productive. Emphasis is placed on current applications
within the oceanographic community for search/survey/mapping (towed, unmanned
systems), remotely operated vehicles and submersibles (man-in-the-loop systems), and
autonomous underwater robots (intelligent systems). For these applications, a common
goal is the acquisition and processing of underwater remote-sensor data to create a model
of the subsea terrain. Visualization tools offer an important means of conveying the
information contained in such a model.

Dominant requirements within this context are the management, processing, and
presentation of high-bandwidth, multisensor data including optical and acoustic imagery,
laser and sonar bathymetry, and other physical data sets. Specific visualization tools are
used for image processing, volumetric modeling, terrain visualization, real-time operator
displays, and mapping and geographic information systems, as well as for scientific and
engineering research and development. An overview and selected examples reflect a
sampling of state-of-the-art approaches within the oceanographic community.

KEYWORDS

oceanography, remote sensing, sonar, underwater photography, underwater robotics,
image processing, computer graphics, visualization

INTRODUCTION

Rapidly evolving computational, graphics, and video technologies are having a profound
effect on the way marine science and engineering are being conducted. Such
technologies are a perfect companion for advanced, high-resolution sensors and remote
undersea vehicles that project our human senses to the deeper ocean regions. The new
visualization tools expand and accelerate our research efforts, afford more productive
time at sea, and enhance the way results are communicated to peers and public alike
(Stewart 1987).

This article offers an overview of visualization tools and techniques now being used for
remote subsea exploration. This includes applications in mapping, search, and survey
using mainly unmanned underwater vehicles (UUV's) equipped with remote sensing
devices. Within this broad category are examples derived from shipboard systems, towed
instrument platforms, and remotely operated vehicles (ROV's). Autonomous underwater
vehicles (AUV's), now coming into operation, promise to extend these capabilities and
provide lower-cost alternatives to those existing.

My aim is to offer the computer graphics and imaging community some insight into the
special problems and significant opportunities in visualization for undersea exploration.
As in other science and engineering domains, the main objective of such techniques is to
extract and convey information by taking advantage of our highest bandwidth sense.
Within this context, common resources and strategies for graphics and imaging are
described. By resources, I mean the hardware, software, and human assets that constitute
sea-going and shore-based systems; the strategies encompass architectural, engineering,
and practical aspects of making such a visualization environment operational and
productive.

The discussion begins with a brief description of remote sensing underwater, in an effort
to encapsulate a few of the more germane issues. Although there is considerable overlap
among the three categories, the main text deals separately with visualization in: research
and development; at-sea operations; and presentation and interpretation of scientific
results. A final section attempts to summarize trends and issues, and presents a few
thoughts on future directions.

REMOTE SENSING UNDERWATER

It has become a truism within the marine science community that the surface of the
moon, and now of other celestial bodies, has been better explored than the ocean floor of
our own planet. Perhaps less than a tenth of one percent of this vast undersea domain has
ever been seen by human eyes. In contrast with remote sensing through the atmosphere,
the relatively opaque and inhospitable medium presents formidable challenges to those
who seek to probe its depths. Impenetrable to most forms of electromagnetic radiation,
the ocean yields a picture of the seafloor mainly through acoustic and optical means
(Stewart 1991b).

Whether using sound or light, the fundamental trade-off is range for resolution (Fig. 1).
Low-frequency acoustic systems survey a swath tens of kilometers wide, but can only
resolve features larger than several tens of meters; higher frequency sonars have better
resolution, but coverage is limited. Cameras and other optical systems see greater detail,
but image yet smaller areas. Special problems with both methods include attenuation,
scattering noise, and distortion. These are compounded by the difficulty in measuring
position underwater and deployment costs that rise sharply with depth.

[Fig. 1 chart: swath width (1 m to 10 km) versus image resolution (1 mm to 100 m); existing
and evolving systems, including the SeaMARC II sidelook sonar, lie along the optical/acoustic
transition]

Fig. 1 A comparison of underwater remote-sensing systems shows the trade-off between range
and resolution. Swath width is a measure of total lateral coverage (twice the range to either side)
for linear survey systems. Total areal coverage depends on speed through the water.

Remote-sensing platforms are under continuous development, promising cheaper, more
flexible sensor deployment. The sensors themselves are increasing in range, resolution,
and bandwidth, pushing physical limits in many cases. However, the ability of these
combined systems to conduct successful and efficient operations demands an acute
capability not only to sense and model the undersea environment but to do so in real
time. Further, as our understanding of subsea processes is refined and as our questions
become more subtle, the limitations of individual sensors become more apparent
(Stewart 1988).

Considering the full scope of a detailed site survey, for example, a gamut of sensors
spanning different scales of range, resolution, and raw data types must be
accommodated. Such a mission is represented by Fig. 2, which shows an underwater
vehicle equipped with a suite of remote sensors. These might include different sonars
(obstacle avoidance, down-look, side-scan), cameras (video, film and digital still), a
scanning laser, and sensors to measure gravity, magnetic fields, temperature, salinity, and
so on. Though a tethered ROV is represented, the intended scenario also applies to a
free-swimming AUV or towed instrument sled.


Fig. 2 A generic multisensor vehicle exploring the undersea terrain. Modern platforms typically
carry a number of instruments including sonars, cameras, and different sensors to measure
temperature, salinity, and other parameters.

In all cases, this generic exploratory probe is capable of collecting an enormous amount
of multisensor data as it moves through the undersea terrain. The technology to generate
this information flow is here today; the challenge lies in developing new methods to
integrate the data and to construct high-level models of the environment that can be used
by man and machine alike. Though there are basic differences between sonar, video, and
laser scanning, there is still much common ground in data acquisition, signal processing,
digital representations, data storage and retrieval, and visual display. What we need to
take advantage of this commonality for the integration and visualization of multisensor
data is a consistent framework for information management.

RESEARCH AND DEVELOPMENT

As in many other fields, the need for visualization in undersea exploration begins with
the first stages of research and development. Aside from the advantages of good displays
in an analytical context, visual presentations usually make conceptual and programming
errors stand out during development. In many cases, visualization techniques offer the
only reasonable means of digesting the enormous amount of data being generated by
modern high-performance processors in numerically intensive applications. As an
example, consider the volumetric rendering shown in Fig. 3.
Because of the error, ambiguity, and uncertainty associated with remote sensing underwater,
an effective way to process and represent survey data is to use a three-dimensional
probability distribution (Stewart 1990). Figure 3 shows a translucent, ray-cast rendering
through a section of such a stochastic model, which was derived from a survey of the USS
Monitor, the Civil War ironclad that capsized and sank during a storm in 1862. During a 1987
survey conducted by the National Oceanic and Atmospheric Administration, a high-frequency
scanning sonar was mounted on the Navy's Deep Drone ROV and used to explore the sunken
wreck (Arnold et al. 1988; Stewart 1991a).

Fig. 3 Translucent volumetric rendering of a probability model shows a cross-section through
the USS Monitor sonar-survey data. The three-dimensional numeric model explicitly
incorporates sensor resolution and ambiguity, position and attitude error, and processing
uncertainty.

Figure 4 shows how the data returned from a single sonar pulse (ping) are represented.
The conical probability distribution (exaggerated for illustration) is determined by the
sonar's beam pattern and angular measurement error. Position error and ranging
uncertainty further smear the sensing envelope. Volume elements of the numeric model
are clearly visible in the figure. The representation indicates that a single ping only
defines a region containing one or more point scatterers (warm colors) whose position
cannot be precisely determined. The volume through which the signal has passed without
causing a return (cool colors) likely contains only water, with a probability that depends
mainly on the acoustic signal-to-noise ratio (Stewart 1988).

By accumulating and combining multiple returns taken over time (as in Fig. 3), the shape of
the wreck emerges. The stochastic representation is useful to an engineer developing such
remote-sensing techniques, or to an archaeologist who wants to understand how good the
sonar measurements really are. From the volumetric model can be extracted one-dimensional
hull profiles with error bounds, two-dimensional contour maps, or three-dimensional
perspective views (Fig. 5) showing the most probable shape of the wreck (Stewart 1991a).

Fig. 4 Volumetric scattering probability for a single sonar pulse. The warm colors represent a
high target probability at a specific range. The cool colors represent a volume through which
the signal has passed without a return.
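
The accumulation can be pictured as a voxel-grid update along each beam. The sketch below uses log-odds bookkeeping, which is one common choice for such evidence fusion; it is illustrative only and not the paper's actual stochastic backprojection, and all names and probabilities are assumptions:

    import numpy as np

    def accumulate_ping(log_odds, traversed, returned,
                        p_water=0.4, p_target=0.7):
        """Fuse one sonar ping into a 3-D probability grid. 'traversed'
        indexes voxels the signal passed through without a return (cool
        colors); 'returned' indexes voxels in the echoing range cell
        (warm colors)."""
        log_odds[traversed] += np.log(p_water / (1.0 - p_water))
        log_odds[returned] += np.log(p_target / (1.0 - p_target))
        return log_odds

    def target_probability(log_odds):
        """Convert accumulated evidence back to probability for rendering."""
        return 1.0 / (1.0 + np.exp(-log_odds))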

An animated "swim-around"
has also been produced from
the data to give a better
overall picture of the
Monitor's condition. Such
visualization tools are
becoming more common
within the oceanographic
community as a means of
interpreting seafloor
morphology in geological and
geophysical research. Within
this category are animated
films produced by Dr.
Fig. 5 Three-dimensional estimate of most-probable
William Ryan and colleagues shipwreck form, extracted from USS Monitor probability
at the Lamont-Doherty model. The color-coded perspective offers redundant
Geological Observatory and information about shipwreck relief: reds indicate highest
by Dr. Robert Tyee at the elevation, blues lowest.
University of Rhode Island.
At the Woods Hole Oceanographic Institution, Dr. Ralph Stephen has assembled an
animation production facility for visualizing simulations of acoustic propagation through
the ocean sub-bottom. In this case, the dimension added to scientific perception is time.
At the University of Tokyo, more realistic renderings of seafloor exploration using the
91

ALVIN submersible have been animated as an educational tool showing the "big
picture."

As in the fine-scale research illustrated by the Monitor work, navigation is also a problem for
surface vessels carrying remote survey systems such as Sea Beam, a multibeam bathymetric
sonar (Tyce 1987; de Moustier 1988). Figure 6a shows four overlapping swaths of contoured
Sea Beam bathymetry with significant misregistration caused by limitations of satellite and
dead-reckoned navigation. In a collaborative effort among researchers at the University of
Maryland and the Naval Research Laboratory, computer vision techniques were applied to the
development of a semi-automated system for contour matching and registration
(Kamgar-Parsi et al. 1989).

Fig. 6 Contoured swaths of Sea Beam bathymetry. Absolute depth information is encoded by
color; contour intervals give direct cues about seafloor slope: (a) uncorrected swaths; (b) a
second-order chain code is used to locally align contours before global matching
(Kamgar-Parsi et al. 1989).

First, individual swath pairs are aligned using a modified chain-code method (Ballard
and Brown 1982), in which the curvature of line segments is represented as a rotation-
invariant difference of successive elements (second-order chain code). This local
matching finds the best registration of overlapping swaths using several encoded contour
lines. Then, a global match over multiple swaths is estimated from a cost function
determined by violations of local matching and by constraints on compression and
bending of swaths, physical limits imposed by a ship's dynamics. Figure 6b shows the
corrected swaths. An alternative approach to the Sea Beam registration problem is given
by Nishimura and Forsyth (1988).
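
A compact sketch of the rotation-invariant encoding itself (the local matching and global cost function of Kamgar-Parsi et al. are omitted; names are illustrative):

    import numpy as np

    def second_order_chain_code(contour):
        """Encode an 8-connected contour (N x 2 grid points) as differences
        of successive direction codes; the result is invariant under a
        rotation of the whole contour."""
        steps = np.diff(contour, axis=0)                     # unit grid steps
        angles = np.arctan2(steps[:, 1], steps[:, 0])
        codes = np.round(angles / (np.pi / 4)).astype(int) % 8
        return np.diff(codes) % 8                            # second order

Two overlapping contour segments can then be registered by sliding one code sequence along the other and scoring the mismatches.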

Vehicle and system simulation is another area in which visualization of undersea missions has
an important role. In part because of the relatively large investment associated with in-water
testing of undersea systems, simulation offers a potentially lower-cost alternative. Figure 7 is
taken from a Mitre Corporation simulation of AUV obstacle-avoidance. The conical features
represent the beam patterns of a 14-channel sonar configured in this instance to look in
directions defined by the faces and vertices of a cube, like omnidirectional "cat's whiskers."
The 3-D simulation permits robotics researchers to interactively evaluate system performance
under various combinations of terrain features, vehicle dynamics, and sonar configurations.
The simulation's purpose is to help design an intelligent controller for autonomous operation.
Added benefits in this case are that configurations can be varied more quickly than for a real
system, and that such testing can be conducted without risk to a valuable AUV.

Fig. 7 Simulation of obstacle avoidance for an autonomous underwater vehicle. The conical
features represent multiple sonar beams used in a "cat-whisker" configuration (Mitre
Corporation, unpublished).

AT-SEA OPERATIONS

The real proof of the pudding for undersea visualization is when the systems get out of
the laboratory and head for the open (or under) sea. In terms of hardware, such obvious
issues as ruggedness, reliability, tolerance to a corrosive environment, and so on arise.
And there are other factors associated with software and the human interface. These are
not to be overlooked, but there are options and these are addressed briefly in a later
section of this paper. In terms of the benefits derived from taking such systems to sea,
however, a central issue is real-time performance. This is important in two main areas:
real-time piloting and engineering displays, and operational science feedback.

The pilot of an ROV, for example, usually relies on a view offered by one or more video
cameras, sometimes augmented by a scanning sonar display (Stewart 1988). Under good
conditions, low-light-level cameras can have a range of about 10 m, less for a color
image. Commonly, though, visibility can be restricted to less than a meter, especially
when working near the bottom or in strong currents. Under all conditions, the operator's
perception of distance is degraded by optical distortion and monocular vision. These
factors, along with a camera's narrow field of view and the apparent "sameness" of
underwater scenes, can quickly disorient a person at the controls.

SC"lar systems extend the range of perception, give a direct measure of distance, and add
another dimension under low-visibility conditions. Sonar, however, lacks the spatial
resolution of a camera and is less easily interpreted by a human pilot. In the absence of
strong acoustic reflectors with distinctive geometric properties, a vehicle's position can
be hard to judge from the sonar display alone. The problem is compounded by motion
artifacts introduced by a dynamic platform.

A drawback to both sonar and visual techniques is the transience of information
presented to the operator. Though recorded for later review, from the pilot's perspective
the data are continuously discarded. It is the human's burden to assimilate the
information and to form his own internal model of the surroundings. In a terrestrial
environment rich in sensory information, visual, tactile, aural, and other cues arrive in a
form readily integrable by a human processing system evolved to match the task. But
with already degraded sensor data collapsed to a two-dimensional form for video or
sonar display, the information-assimilation problem is formidable and worsened by the
need for attention to a complex system and to the immediate task at hand.

The navigation and status display of the JASON ROV shown in Fig. 8 is an example of a
graphical representation useful to a vehicle pilot or system engineer (Yoerger and Newman
1989). Typical of many industrial trends, mechanical and electronic gages can be replaced
with graphic analogs. When superimposed or chroma-keyed on live video, a "head-up"
display can be even more useful to a pilot faced with the constant hands-on, eyes-on duties of
maneuvering in tight quarters. Beyond this, however, new approaches to underwater modeling
are needed in which remote-sensing data are combined to generate real-time, 3-D operator
displays, techniques that furnish enhanced sensory cues for more efficient human piloting. A
cumulative sonar model, as in Fig. 5, for example, could be used to generate a synthetic view
of the underwater terrain with a representation of the vehicle superimposed.

Fig. 8 Real-time pilot and engineering display for the JASON remotely-operated underwater
vehicle. Iconic representations replace gages and hard-wired instrument displays.

A step in that direction is illustrated by Fig. 9, which shows a 3-D perspective view of the
JASON ROV in relation to its target of interest, the USS Scourge. The sunken ship, part of the
U.S. Great Lakes fleet during the War of 1812, was the site of a remote archaeological survey
during the spring of 1990 (Stewart 1991a). Although the displayed data were taken from
existing plans of the ship and the ROV, real-time views were driven by navigation and attitude
data from the vehicle. A step closer to the goal of real-time modeling and visualization is
illustrated by Fig. 10, a 3-D wire-frame view of Scourge generated from scanning-sonar data
in near-real time during the survey. With the rapid rise in computational and graphics
performance, that final goal of combining the two developments into a cost-effective piloting
aid is nearly within reach.
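
Driving such a view from telemetry amounts to rebuilding the vehicle transform every frame from position and attitude data; a minimal sketch, assuming a Z-Y-X (yaw-pitch-roll) convention and hypothetical units:

    import numpy as np

    def vehicle_transform(x, y, z, roll, pitch, yaw):
        """Compose a 4x4 transform from navigation (meters) and attitude
        (radians) telemetry; apply it to the ROV's CAD model, or invert
        it for a vehicle-eye view of the wreck."""
        cr, sr = np.cos(roll), np.sin(roll)
        cp, sp = np.cos(pitch), np.sin(pitch)
        cy, sy = np.cos(yaw), np.sin(yaw)
        R = np.array([[cy*cp, cy*sp*sr - sy*cr, cy*sp*cr + sy*sr],
                      [sy*cp, sy*sp*sr + cy*cr, sy*sp*cr - cy*sr],
                      [-sp,   cp*sr,            cp*cr]])
        T = np.eye(4)
        T[:3, :3] = R        # attitude (rotation)
        T[:3, 3] = [x, y, z] # position (translation)
        return T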

Though hard real-time requirements are less stringent for science (or other "customer")
displays, a capability for visualizing data aboard ship has several benefits. In the first place,
operational feedback affords an opportunity to modify search and survey strategy in
accordance with results. In view of the high cost of at-sea operations, this is an important
factor in making more efficient use of costly resources. In particular, a viable approach is to
conduct a "coarse-to-fine" multiscale survey in which wide-swath, low-resolution sensors first
identify features of interest, which are then investigated at increasing resolutions in more
tightly-focused regions. In the second place, more processing can now be undertaken at sea,
reducing cost and delay in the post-processing tedium and allowing results to be
disseminated more quickly. A goal of many is to provide the scientist a finished product to
carry off the ship.

Fig. 9 Real-time piloting display showing vehicle position with respect to a shipwreck being
explored. Perspective viewing parameters for the CAD models of the ship and ROV are
automatically generated from telemetry data (Marquest Group, Inc., unpublished).

Fig. 10 Three-dimensional scanning-sonar view of USS Scourge created in near-real time. As
processing and graphics technologies evolve, real-time sensing will be used to generate
piloting displays.

A side-scan sonar, often used in a search phase to locate targets, is a relatively low-
resolution sensor that generates an acoustic image similar to a photograph. Like its
optical analog, a side-scan image comprises a grid of intensity values. Variations in
intensity are determined by acoustic "lighting" geometry, angle of incidence, and
scattering properties of the target material; all are characteristic of an optical image. Also
like a photograph, a side-scan image is a two-dimensional projection of a three-
dimensional world. Apparent shape is extracted by our human brain, with half its mass
devoted to complex visual processing. Figure 11 shows a side-scan sonar image of
Scourge, generated in real time during the archaeological survey. In this image, the
prominent feature is an acoustic shadow; the bright region at bottom is the ship itself,
saturated by strong returns from a normal incidence near nadir. Outlined by the acoustic
source, Scourge's two masts, bowsprit, and a dangling spar are visible; bright rectangles
along the top rail are acoustic returns through the open gun ports.

The stereo pair of Fig. 12 illustrates how side-scan-sonar imagery can be texture mapped on
a shape description to provide even further viewing realism. The example is taken from a
survey of a sunken Liberty ship off the coast of Florida using a Klein high-frequency side-scan
sonar. Shape "estimates" were derived from a simple modulation by intensity values, then
combined with the acoustic imagery. The stereo representation is helpful to an archaeologist
interested in the overall shape of the wreck, but could also be used by the pilot of an ROV if
such a representation were produced during field operations. Like the perspective views of
Figs. 9 and 10, a real-time stereo view could enhance the operator's sense of presence during
periods of low visibility.

Fig. 11 Real-time, two-dimensional side-scan-sonar image of USS Scourge showing acoustic
shadow. The acoustic image is similar to an optical photograph because it contains no explicit
shape information.
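
The modulation idea fits in a few lines; the constants and names below are illustrative, not those of the actual survey processing:

    import numpy as np

    def relief_from_intensity(image, gain=0.5):
        """Crude shape 'estimate': treat normalized side-scan intensity
        as height (bright returns raised, shadows depressed)."""
        img = image.astype(float)
        return gain * (img - img.mean()) / (img.std() + 1e-9)

    def stereo_pair(image, relief, eye_sep=2.0):
        """Synthesize left/right views by shifting each pixel horizontally
        in proportion to its estimated height (parallax)."""
        rows, cols = image.shape
        left, right = np.zeros_like(image), np.zeros_like(image)
        for r in range(rows):
            for c in range(cols):
                d = int(round(eye_sep * relief[r, c]))
                if 0 <= c + d < cols:
                    left[r, c + d] = image[r, c]
                if 0 <= c - d < cols:
                    right[r, c - d] = image[r, c]
        return left, right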

For a true definition of three-dimensional shape, though, a scanning sonar gives the
quantitative results needed by archaeologists. Though less refined at this stage of processing
than the model of the Monitor (Fig. 5), which used the same sonar, the wire-frame projection
(no hidden-line removal) in Fig. 10 was produced on site in near-real time, with only 20
minutes of survey data. The view was generated during repeated traversals along Scourge's
starboard side, clearly delineating the forward mast, bowsprit, and spar seen in the side-scan
image (Fig. 11). Such real-time visualization techniques should become an important part of
an underwater archaeologist's toolkit, allowing scientists and engineers to develop strategy
on-site based on remote-sensing results, and to make more efficient use of costly survey time.

Fig. 12 Stereo side-scan-sonar image of a shipwreck. The acoustic imagery is augmented by a
shape "estimate" derived from modulation by the image values (Matthias and Newton 1990).

A photomosaic of the USS Hamilton, Scourge's sister ship, was digitally produced
aboard ship during the course of the same 2-week archaeological survey (Fig. 13). This
detail of the bow section and figurehead of the goddess Diana is part of a larger
composite of the entire ship comprising more than 100 electronic snapshots. The detailed
images used for the mosaic come from an electronic still camera with direct digital
output (Harris et al. 1987). The CCD device is cryogenically cooled to reduce thermal
noise, resulting in high sensitivity and wide dynamic range. The advantages of the digital
format are that there is no loss of information that would be caused by a separate
conversion from an analog signal or photograph, and that the information can be
immediately transferred to a computer for digital manipulation. In contrast with the
digital product of Fig. 13, a 1974 mosaic of the Monitor was produced using
photographic techniques, which required six man months of effort, many trips to the
darkroom, and great skill with an enlarger to remove distortion and equalize brightness
levels.

To generate the digital photomosaic, a custom application was developed on a Pixar image
computer to allow interactive drag, rotate, scale, blend, and tie-point warping, in addition to
more standard image enhancement (Stewart 1991a). As if it were on a light table, a new
image is selected by pointing with a mouse, dragged smoothly around the display, panned,
zoomed, blended with the underlying mosaic, and flickered to check registration. Multiple tie
points are selected and warp coefficients automatically generated with a least-squares
algorithm. The process is iterative and nondestructive until the operator is satisfied and
"commits" the final transformation. A virtual imaging technique manages large, high-resolution
mosaics that exceed display boundaries. Adaptive histogram equalization helps compensate
for uneven underwater illumination.

Fig. 13 Electronic still-camera photomosaic of USS Hamilton, created with interactive digital
techniques. This section of the bow, showing the figurehead of goddess Diana, is part of a larger
composite of the entire ship.
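
One way to realize the tie-point step described above is a least-squares fit of warp coefficients; this sketch uses an affine model for brevity (the actual tool may fit a higher-order warp), and the function name is hypothetical:

    import numpy as np

    def warp_from_tie_points(src, dst):
        """Fit dst ~ A @ [x, y, 1] in the least-squares sense from N >= 3
        tie points (src, dst are N x 2 pixel coordinates); returns the
        2 x 3 warp coefficients."""
        design = np.hstack([src, np.ones((len(src), 1))])   # N x 3
        coeffs, *_ = np.linalg.lstsq(design, dst, rcond=None)
        return coeffs.T

The incoming image is then resampled through this transform and blended with the mosaic; because the fit can be recomputed as tie points change, the process stays nondestructive until the operator commits.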

PRESENTATION AND INTERPRETATION OF RESULTS

As in other fields, ocean scientists and engineers have much the same needs for
visualizing and interpreting data sets. The issues surrounding such "scientific
visualization" have been repeatedly addressed in various forums (for example, see
CG&A 1987), and will not be treated in detail here. In the context of subsea exploration,
a few examples are offered as representative of current "state of the art." In terms of data
interpretation, there is considerable overlap with the two preceding sections, which dealt
mainly with high-resolution sensors. In this section, examples are slanted toward coarse-
resolution, wide-area surveys and more polished products.

The widest scale acoustic survey system in use today is the GLORIA (Geological Long
Range Inclined Asdic) side-scan sonar (Chavez 1986). Capable of survey widths of up to
60 km (though rarely achieved in practice), GLORIA can economically generate
acoustic imagery covering large areas of the seafloor, and is used by the US Geological
Survey for mapping the US Exclusive Economic Zone. Figure 14 is a pseudocolor
acoustic backscatter image produced from GLORIA survey data (Mitchell and Somers
1989). Similar to optical and radar imaging, the level of backscattered acoustic intensity
is a function of material types and angle of ensonification (acoustic "illumination"). By
compensating for the effects of seafloor slope (derived from registered Sea Beam
bathymetric data), the acoustic map more closely represents seafloor material properties
than the raw, view-dependent imagery. A related approach to combining GLORIA and
Sea Beam data has been demonstrated by Twitchell (1988).
Fig. 14 GLORIA acoustic backscatter image. The wide-swath sonar is used by the Geological Survey for mapping the US Exclusive Economic Zone (Mitchell and Somers 1989).

Figure 15 shows completed acoustic maps derived from a SeaMARC II survey of the East Pacific Rise, a portion of the mid-ocean ridge system and a site of active spreading between tectonic plates (Edwards et al. 1991). SeaMARC II is a wide-swath sonar (refer to Fig. 1) that produces detailed acoustic imagery and coarser shape information (Hussong and Fryer 1983; Davis et al. 1986). The acoustic imagery offers more information about the texture, or fine-scale fabric, of the seafloor. The acoustic bathymetry, though coarser in resolution, provides explicit shape information. These examples, produced at the Lamont-Doherty Geological Observatory, illustrate how geometric corrections for vehicle navigation and heading can be applied to create an acoustic "mosaic" similar to the optical example of Fig. 13, and facilitate comparison of long- and short-wavelength geomorphology. Other approaches to SeaMARC II processing and display are given by Reed (1987) and Stewart (1988).


Fig. 15 Completed acoustic maps derived from a wide-swath (10-km), side-scan sonar survey: (a)
acoustic imagery provides detail of the finer scale seafloor fabric; (b) acoustic bathymetry
provides explicit shape information for morphological analysis (Edwards et al. 1991).
Unless side-scan imagery is corrected for inherent geometric distortions (similar to the layover distortion in radar imagery), the shape and structure of the seafloor can be misinterpreted by geologists. Figure 16 illustrates how researchers at the Scripps Institution of Oceanography use registered Sea Beam bathymetry to relocate image pixels (Cervenka et al. 1990). The center stripe in each image represents a portion of the along-track data that is gated out by the sonar hardware because of the poor imaging geometry near nadir. Although not obvious in the raw data (Fig. 16a), the layover-corrected image in Fig. 16b more accurately reflects the true seafloor structure. Radiometric enhancement also brings out more image contrast. Similar radiometric and geometric enhancements of side-scan-sonar data are reported by Chavez (1986), Stewart (1988), and Reed and Hussong (1989).
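A minimal sketch of the cross-track relocation, assuming the simplest slant-range geometry (straight-ray acoustics, known towfish depth, and a registered seafloor depth for each pixel); the published processing chains are considerably more elaborate, and the names here are illustrative.

    import numpy as np

    def cross_track_distance(slant_range, sensor_depth, seafloor_depth):
        # With the local seafloor depth taken from registered Sea Beam
        # bathymetry, the horizontal (cross-track) distance of a pixel
        # follows from its slant range and the sonar's height above the
        # bottom; pixels are then shifted perpendicular to the track.
        height = seafloor_depth - sensor_depth
        return np.sqrt(np.maximum(slant_range**2 - height**2, 0.0))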


Fig. 16 SeaMARC II pixel relocation in a region of the Fieberling Guyot: (a) an uncorrected
image; (b) the radiometrically enhanced image has been layover corrected by shifting image
pixels perpendicular to the vehicle's track (Cervenka et al. 1990).

Figure 17 illustrates how the two kinds of information can be combined by texture
mapping the acoustic imagery onto bathymetric shape data. In this case, higher
resolution SeaMARC I (Kosalos and Chayes 1983) side-scan-sonar imagery has been
combined with lower resolution Sea Beam bathymetry from the Clipperton Transform
area (Gallo et al. 1986; Kastens and Ryan 1986). Taken together, the two kinds of
information offer a geologist a picture of the whole that is, in a sense, greater than the
sum of its parts. In addition, this image results from a real-time process in which the data
are geometrically and radiometrically corrected, then texture mapped and displayed in
perspective view as the data are sequentially acquired (Stewart 1988).
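At its core the combination is a texture-mapping assignment: each bathymetry grid vertex receives a texture coordinate locating it in the georeferenced side-scan image. A sketch under the assumption that both products are gridded in a common map frame (all names illustrative):

    import numpy as np

    def drape_uv(nx, ny, grid_extent, image_extent):
        # Texture coordinates (u, v in [0, 1]) for an nx-by-ny bathymetry
        # grid draped with a co-registered image; extents are given as
        # (x0, x1, y0, y1) in common map coordinates.
        gx0, gx1, gy0, gy1 = grid_extent
        ix0, ix1, iy0, iy1 = image_extent
        u = (np.linspace(gx0, gx1, nx) - ix0) / (ix1 - ix0)
        v = (np.linspace(gy0, gy1, ny) - iy0) / (iy1 - iy0)
        return np.meshgrid(u, v)  # one (u, v) pair per grid vertex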

Fig. 17 Perspective view of Sea Beam bathymetry with texture-mapped SeaMARC I side-scan-sonar imagery. The image results from a real-time approach to processing that can be carried out aboard ship.

In Fig. 18, even finer-scale bathymetry collected by the deep-submersible ALVIN has been compiled by scientists at the University of Washington into a descriptive illustration of a hydrothermal vent field in the northeastern Pacific. The color-coded, 3-D perspective view offers geologists and geophysicists a "gestalt" impression of the morphology within the geologically active area. The twin peaks are large sulfide structures (10-20 m high) where black smokers spew forth high-temperature water rich in minerals, which precipitate to form the sulfide mounds. The overlying contour map and accompanying scale give a more quantitative description of the seafloor morphology.

Fig. 18 Fine-scale bathymetry from ALVIN submersible survey with color-coded perspective view and overlying, scaled contour map (Sempere JC, U. Washington, unpublished).

Geographic information system (GIS) technology is also being used to manage and visualize spatial databases for undersea applications. The advantages are being recognized not only by oceanographers but by the defense community as well (Breckenridge 1989). On the left in Fig. 19 is shown an area of the Hood Canal near Seattle, WA, which has been coarsely categorized by bottom type (sand, mud, etc.). On the right in the figure, cultural features (piers, buoys, etc.) are shown in the context of the shoreline and bathymetric contours. In contrast with gridded images, such databases can be interrogated in terms of point features, linear boundaries, or polygonal areas, and displayed as multicolored layers on a graphics console or chart.

Fig. 19 GIS example showing areal classification of bottom types (left), and cultural features
within the context of the shoreline and bathymetric contours. Newer spatial databases combine
such features with raster imagery and bathymetry.
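The polygonal-area interrogation just mentioned reduces, at bottom, to point-in-polygon tests against stored boundaries. A standard ray-casting sketch (illustrative only, not any particular GIS package's implementation):

    def point_in_polygon(px, py, polygon):
        # Cast a horizontal ray from (px, py); an odd number of crossings
        # with the polygon's edges means the point lies inside.
        inside = False
        n = len(polygon)
        for i in range(n):
            x1, y1 = polygon[i]
            x2, y2 = polygon[(i + 1) % n]
            if (y1 > py) != (y2 > py):
                x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
                if px < x_cross:
                    inside = not inside
        return inside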

With the growing complexity and sophistication of marine geophysical research, three-
dimensional representations are being used more often as an aid to understanding the
Earth's complex internal structure. Figure 20 shows the results of an experiment by
researchers at the Woods Hole Oceanographic Institution, Massachusetts Institute of
Technology, and University of Durham, England, to image the subsurface composition
of Iceland's Hengill-Grensdalur volcano field (Toomey and Foulger 1989). By
measuring the relative delays in seismic arrival times of earthquake events at 20
distributed monitoring stations, researchers compiled a data set that was inverted (similar
to computer-aided tomography in medical imaging) to estimate the subsurface velocity
structure. The three distinct volumes in the figure are anomalously high-velocity bodies
that correspond to extinct volcanoes observed on the surface, and are believed to be
solidified intrusions of upwelling volcanic magma.
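Schematically, such an inversion is a large, regularized linear least-squares problem: travel-time residuals d are related to slowness perturbations m in the constant-velocity blocks through a matrix G of ray path lengths. A minimal damped least-squares sketch (the cited studies use far more elaborate formulations):

    import numpy as np

    def damped_least_squares(G, d, damping=1.0):
        # Solve (G^T G + damping * I) m = G^T d for the block slowness
        # perturbations m; the damping term trades data fit against the
        # size of the model perturbation.
        n = G.shape[1]
        return np.linalg.solve(G.T @ G + damping * np.eye(n), G.T @ d)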

Fig. 20 Three-dimensional tomographic inversion of earthquake arrival times. The model shows constant-velocity cubic blocks of dimension 0.25 km. The color scale represents percentage difference in seismic P-wave velocity (Toomey and Foulger 1989).

Beyond the utility of visualization techniques for data interpretation are the opportunities for conveying interesting results to peers and public alike. Similar to the results just described, Fig. 21 conveys a picture of the velocity structure and overlying morphology of the East Pacific Rise, a site of active seafloor spreading in the Pacific Ocean. The lower portion of the figure is a horizontal, subsurface cross-section through the three-dimensional seismic-velocity structure imaged by tomographic inversion of arrival times from explosive sources. In the top of the figure, a wire-frame representation of the overlying morphology shows the relationship of the central ridge axis to subsurface features. The "hot spot," a low-velocity region illustrated by warm colors in the seismic contour, corresponds to observed high temperatures and an upward injection of magma centered on the ridge crest.

The East Pacific Rise, like other parts of the 75,000-km-long oceanic-ridge system that winds around the globe (from the Arctic Ocean to the Atlantic Ocean, around Africa, Asia, and Australia, under the Pacific Ocean and to the west coast of North America), is being intensively studied by marine geologists and geophysicists in an effort to understand the underlying mechanisms of plate tectonics (Macdonald and Fox 1990). Over the last decade, considerable effort has been expended in mapping this complex system with such wide-swath survey tools as Sea Beam and SeaMARC II. Figure 22 comes from a Scientific American article by marine scientists at the University of California, Santa Barbara, and the University of Rhode Island, who have compiled bathymetric maps of the region from multiple surveys (Macdonald and Fox 1990). Such detailed maps offer researchers new insights into the evolution and formation of ridge segments.

Fig. 21 Combined depiction of seismic-velocity structure (lower) with overlying seafloor morphology (upper). The velocity "hot spot" corresponds to high temperatures and magma injection at a site of seafloor spreading on the East Pacific Rise (Toomey et al. 1990).

Fig. 22 Color depth-coded, perspective view of a 1,000-km section of the East Pacific Rise. This segment is part of a 75,000-km system wrapping around the globe (Macdonald and Fox 1990).

Figure 23 shows a more detailed, color depth-coded, wire-frame perspective view of an interesting geological feature from the region, recently identified and explained by Macdonald et al. (1987). The central basin, about 500 m deep, arises from a discontinuity in an overlapping spreading center, a complex feature associated with the movement of the tectonic plates that make up the Earth's dynamic crust.

Fig. 23 Color-coded, wire-frame view of an overlapping spreading center. The feature is associated with complex motions at the intersection of plate boundaries (Macdonald et al. 1987).

RESOURCES AND STRATEGIES

The preceding text offers an overview of some issues associated with visualization in
undersea exploration, and examples that illustrate techniques now coming into common
practice within the community. The organization is that of the "classical" research-and-
development-to-operation-to-product stages of a pipeline. What remains is to extract and
summarize overlapping issues in terms of resources and strategies. While I hope to
capture elements of importance to the broader subsea community, I point out here that
my perspective is partly a product of the last eight years' experience at the Deep
Submergence Laboratory, which encompasses the pipeline just mentioned.

In terms of resources, visualization hardware is increasingly less a problem than in years past. Because of their small market size, undersea applications are not a significant driver in this area of technological development, but they share the benefits accrued from the more global emphasis increasingly placed on visualization. As the performance of both general-purpose computers and specialized visual engines rises, more applications come within reach. The bottom line really is cost. Such issues as ruggedization can be addressed for sea-going systems, but are often not a problem in modern, climate-controlled research vessels. This is not to say that reliability is not carefully considered, but that a somewhat higher failure rate at sea can be offset by the lower prices of off-the-shelf or commodity equipment.

Software is much more of an issue, often representing the major investment in visualization capabilities. As in other fields, the trade-offs are among: purchase price and
support costs of commercial packages; hidden costs of a steep learning curve for
complex packages (usually proportional to flexibility and number of features), whether
commercial or public-domain; and the explicit costs of custom development. These are
complex issues that cannot be treated fully here. The positive outlook, though, is that the
same trends driving hardware development will benefit the subsea community in the
software arena; the lower cost, enhanced features, and ease of use associated with
graphical user interfaces are all welcome. The downside is that certain more specialized
needs of the subsea community will not be represented in the general market. For this
reason, custom software development will likely remain the preferred alternative for a
significant number of applications.

In terms of human resources, again, the needs largely parallel those of other fields. I
share the view that, to the extent possible, visualization tools should be put into the
hands of the end users themselves. At the beginning and end of the pipeline, scientists
and engineers can benefit from having more direct control in shaping the products of
their data and of their imagination. I leave aside the (often valid) counter-arguments for
centralization of resources and focus of expertise; the subtleties have been better
addressed elsewhere.

With respect to operations, the foregoing applies as well to the needs of sea-going
scientists and engineers. An added item on the needs list is that sea-going and shore-
based environments be as similar as possible; the benefits in efficiency and productivity
are apparent. For certain real-time aspects, particularly those involving the hands-on/eyes-on responsibilities of an ROV pilot, there are additional human-factors
considerations. To a large extent, these are being addressed in other fields, such as
aircraft cockpit design and teleoperation for space applications. All the accoutrements of
"virtual reality" systems may become useful in easing the load on an ROV pilot as well.

At the Deep Submergence Laboratory our architectural and engineering strategies for
incorporating visualization capabilities focus on standards, distributed network access,
and the use of off-the-shelf building blocks. In the first case, problems often arise not
because "standards" are lacking but because there are many from which to choose. To
facilitate the pursuit of distributed network access, the X Window System is a standard
of increasing importance in our laboratory. Although this provides a much-needed and
timely focus for efforts, many choices remain.

At the user interface level, the "wars" among the various camps (window managers, graphical user interfaces, different flavors of look and feel) have not produced a
decisive victor, and likely will not. From a software engineering perspective, there are
additional choices among the different toolkits, application programmer interfaces, data
formats, and so on. In terms of graphics- and imaging-specific tools, the options include
the Graphical Kernel System (GKS), Programmer's Hierarchical Interactive Graphics
System (PHIGS), PHIGS Extension to X (PEX), Programmer's Imaging Kernel (PIK),
and the list goes on (for further detail, see CG&A 1986). Hardware options are as
numerous. Our only solution, if it can be called that, is to support multiple standards: a select few that best serve our wide range of needs.

The many elements touched on in the preceding discussion will ultimately determine the
constitution of a practical system for our purposes. Because of our special needs, the best
approach appears to be that of developing specialized "vertical" applications with
"horizontal" building blocks. The technological advances encompassing the various
hardware and software components are important to our work, but that work does not
include advancing the state of the art in visualization technology. In our world, visualization
is a tool, not an end in itself. But it will play an important role in reaching our goals of
extending the frontiers in remote subsea exploration.

THE FUTURE

The outlook for marine scientists and engineers, like other consumers of underwater
remote-sensing and visualization technologies, is rosy. New platforms, autonomous
underwater vehicles as well as teleoperated systems, are under continuous development, promising cheaper, more flexible sensor deployment. The sensors themselves are increasing in range, resolution, and bandwidth, pushing physical limits in many cases. Scanning lasers are now available for underwater applications (Fig. 24). Capable of
higher angular resolution and faster scanning than sonars, lasers can also generate
monochromatic imagery combined with precision range maps (Coles 1988; Klepsvik et
al. 1990).


Fig. 24 Scanning laser test using a Chinese mask: (a) contoured range map; (b) gray-scale laser
imagery. These new devices for underwater use will allow much higher resolution for survey
work and extend the range of optical imaging (Klepsvik et al. 1990).

At the same time, rapidly evolving computational, graphics, and video technologies are
having a profound effect on the way underwater science and engineering are being
conducted. Analytically and conceptually, they extend our reach. Applied to work at sea,
they give more timely and more complete feedback, reducing the cost, delay, and tedium of post-processing. And with the new information technologies, interesting results and
techniques will be communicated more quickly, widely, and effectively.

ACKNOWLEDGMENTS

Although credit has been given in the text for a number of contributed images, these
were made available through the courtesy of many individuals: Larry Rosenblum, Naval
Research Laboratory (Fig. 6); Pete Bonasso, Mitre Corp. (Fig. 7); Dave Mindell, DSL
(Fig. 8); Martin Bowen, Marquest Group, Inc. (Fig. 9); Fred Newton, Triton Technology,
Inc. (Fig. 12); Neil Mitchell, Lamont-Doherty Geological Observatory (Fig. 14); Margo
Edwards, Lamont-Doherty Geological Observatory (Fig. 15); Pierre Cervenka, Scripps
Institution of Oceanography (Fig. 16); Jean-Christophe Sempere, U. Washington (Fig. 18);
Doug Toomey, U. Oregon (Figs. 20 & 21); Steve Miller, U. California, Santa Barbara
(Figs. 22 & 23); and Svein Winther, Seatex A/S (Fig. 24). All other images were created
by the Deep Submergence Laboratory (DSL) of the Woods Hole Oceanographic
Institution, and were made possible by contributions from the DSL imaging team: Jon
Howland, Marty Marra, Steve Gegg, and Steve Lerner. Hamilton and Scourge data came
from a field survey conducted under the auspices of the JASON Foundation for Education
and the Corporation of the City of Hamilton, Ontario. This is contribution 7620 of the
Woods Hole Oceanographic Institution.

REFERENCES

Arnold JB III, Jenkins JF, Miller EM, Peterkin EW, Peterson CE, Stewart WK (1988) USS MONITOR Project: Preliminary Report on 1987 Field Work, in Proc. Conf. Historical Archaeology, Society for
Historical Archaeology
Ballard DH, Brown CM (1982) Computer Vision. Prentice-Hall, Englewood Cliffs, NJ, pp 235-237
Breckenridge J (1989) U.S. Navy Applications for Geographic Information Systems, GIS World, Nov/
Dec, pp 38-39
Cervenka P, Moustier C de, Lonsdale PF (1990) Pixel Relocation in SeaMARC II Sidescan Sonar
Images Based on Gridded Sea Beam Bathymetry, EOS Trans. American Geophysical Union 71(43):
1407-1408
CG&A (1986) special issue on graphics standards, IEEE Computer Graphics and Applications 6(8)
CG&A (1987) Visualization in Scientific Computing-A Synopsis, IEEE Computer Graphics and
Applications 7(7): 61-70
Chavez PS (1986) Processing Techniques for Digital Sonar Images from GLORIA, Photogrammetric
Engineering and Remote Sensing 52(8): 1133-1145
Coles BW (1988) Recent Developments in Underwater Laser Scanning Systems, SPIE Underwater
Imaging 980: 42-52
Davis EE, Currie RG, Sawyer BS, Kosalos JG (1986) The Use of Swath Bathymetric and Acoustic
Image Mapping Tools in Marine Geoscience, Marine Technology Soc. J. 20(4): 17-27
Edwards MH, Fornari DJ, Madsen JA, Malinverno A, Ryan WBF (1991) The Regional Tectonic Fabric
of the East Pacific Rise from 12°50'N to 15°10'N, J. Geophysical Research, in press
Farre JA, Ryan WBF (1985) 3-D View of Erosional Scars on U.S. Mid-Atlantic Continental Margin,
American Association of Petroleum Geologists Bulletin 69(6): 923-932
Farre JA, Ryan WBF (1987) Surficial Geology of the Continental Margin Offshore New Jersey in the
Vicinity of Deep Sea Drilling Project Sites 612 and 613. In: Poag CW, Watts AB, et al. (eds) Initial
Reports of the Deep Sea Drilling Project, Vol. XCV. U.S. Government Printing Office, Washington,
DC
Gallo DG, Fox PJ, Macdonald KC (1986) A Sea Beam Investigation of the Clipperton Transform Fault: The Morphotectonic Expression of a Fast Slipping Transform Boundary, J. Geophysical Research
91(B3): 3455-3467
Harris SE, Squires RH, Bergeron EM (1987) Underwater Imagery Using an Electronic Still Camera, in
Proc. IEEE Oceans '87, pp 1242-1245
Hussong DM, Fryer P (1983) Back-Arc Seamounts and the SeaMARC II Seafloor Mapping System,
EOS Trans. American Geophysical Union 64(45): 627-632
Kamgar-Parsi B, Rosenblum LJ, Pipitone FJ, Davis LS, Jones JL (1989) Toward an Automated System
for a Correctly Registered Bathymetric Chart, IEEE J. Oceanic Engineering 14(4): 314-325
Kastens KA, Ryan WBF (1986) Structural and Volcanic Expression of a Fast Slipping Ridge-Transform-Ridge-Plate Boundary: SeaMARC I and Photographic Surveys at the Clipperton Transform Fault, J. Geophysical Research 91(B3): 3469-3488
Klepsvik JO, Torsen HO, Thoresen K (1990) Laser Imaging for Subsea Inspection: Principles and
Applications, in Proc. MTS ROV '90
Kosalos JG, Chayes DN (1983) A Portable System for Ocean Bottom Imaging and Charting, in Proc.
IEEE Oceans '83, pp 649-656
Macdonald KC, Sempere JC, Fox PJ, Tyce R (1987) Tectonic Evolution of Ridge-Axis Discontinuities
by the Meeting, Linking, or Self-Decapitation of Neighboring Ridge Segments, Geology 15(11):
993-997
Macdonald KC, Fox PJ (1990) The Mid-Ocean Ridge, Scientific American 262(6): 72-79
Malinverno A, Edwards MH, Ryan WBF (1990) Processing of SeaMARC Swath Sonar Data, IEEE J.
Oceanic Engineering 15(1): 14-23
Matthias PK, Newton FL (1990) A Practical 3-D Seafloor and Sub-Bottom Mapping System, in Proc.
Offshore Technology Conference '90, pp 307-313
Mitchell NC, Somers ML (1989) Quantitative Backscatter Measurements with a Long-Range Side-
Scan Sonar, IEEE J. Oceanic Engineering 14(4): 368-374
Moustier C de (1988) State of the Art in Swath Bathymetry Survey Systems. In: Wolfe OK, Chang PY
(eds) Current Practices and New Technology in Ocean Engineering, OED-13. ASME, New York,
NY, pp 29-38
Nishimura CE, Forsyth DW (1988) Improvements in Navigation using Sea Beam Crossing Errors,
Marine Geophysical Researches 9: 333-352
Reed TB (1987) Digital Image Processing and Analysis Techniques for SeaMARC II Side-Scan Sonar
Imagery, PhD Thesis, U. Hawaii
Reed TB, Hussong DM (1989) Quantitative Analysis of SeaMARC II Side-Scan Sonar Imagery, J.
Geophysical Research 94(B6): 7469-7490
Stewart WK (1987) Computer Modeling and Imaging Underwater, Computers in Science 1(3): 22-32
Stewart WK (1988) Multisensor Modeling Underwater with Uncertain Information, PhD Thesis, Massachusetts Institute of Technology and Woods Hole Oceanographic Institution Joint Program in Oceanographic Engineering, WHOI-89-5
Stewart WK (1990) A Model-Based Approach to 3-D Imaging and Mapping Underwater, ASME J.
Offshore Mechanics and Arctic Engineering 112: 352-356
Stewart WK (1991a) Multisensor Visualization for Underwater Archaeology, IEEE Computer Graphics
and Applications 11(2): 13-18
Stewart WK (1991b) High-Resolution Optical and Acoustic Remote Sensing for Underwater
Exploration, Oceanus 34(1)
Toomey DR, Foulger GR (1989) Tomographic Inversion of Local Earthquake Data from the Hengill-Grensdalur Volcano Complex, Iceland, J. Geophysical Research 94(B12): 17,497-17,510
Toomey DR, Purdy GM, Solomon SC, Wilcock WSD (1990) The Three-Dimensional Seismic Velocity
Structure of the East Pacific Rise Near Latitude 9° 30' N, Nature 347(6294): 639-644
Twitchell DC (1988) Erosion of the Florida Escarpment: Eastern Gulf of Mexico, PhD Thesis, U.
Rhode Island
Tyce RC (1987) Deep Seafloor Mapping Systems-A Review, Marine Technology Soc. J. 20(4): 4-16
Yoerger DR, Newman JB (1989) Control of Remotely Operated Vehicles for Precise Survey, in Proc.
MTS ROV '89, pp 123-127
W. Kenneth Stewart is an Assistant Scientist at the Deep Submergence Laboratory of the Woods Hole Oceanographic
Institution. His research interests include underwater robotics,
autonomous vehicles and smart ROV's, multisensor modeling,
real-time acoustic and optical imaging, and precision underwater
surveying. Stewart has been going to sea on oceanographic
research vessels for 19 years, has developed acoustic sensors and
remotely-operated vehicles for 6000-m depths, and has made
several deep dives in manned submersibles, including a 4000-m
excursion to the Titanic in 1986. He is a member of the Marine
Technology Society, Oceanography Society, IEEE Computer
Society, ACM SIGGRAPH, and NCGA. Stewart received a PhD
in Oceanographic Engineering from the Massachusetts Institute of
Technology and Woods Hole Oceanographic Institution Joint
Program in 1988, a BS in Ocean Engineering from Florida
Atlantic University in 1982, and an AAS in Marine Technology
from Cape Fear Technical Institute in 1972.
Address: Deep Submergence Laboratory, Woods Hole
Oceanographic Institution, Woods Hole, MA 02543, email:
kens@jargon.whoi.edu
Chapter 2
Animation
A Particle-Based Computational Model of Cloth
Draping Behavior
David E. Breen, Donald H. House, and Phillip H. Getto

ABSTRACT

We report on a particle-based model that we have used to reproduce the draping behavior of cloth.
The model utilizes a microscopic representation that directly models the interactions between the
yarns in the weave of the material, rather than using a macroscopic continuum approximation to
the material. Because the model incorporates the micro-structure of the material, it can be easily
extended to incorporate important material nonlinearities such as the frictionally-based mechanical
interactions between fibers that give cloth its ability to be shaped, pressed, and formed.

Every time a tablecloth is draped over a table it will fold and pleat in unique ways, but nevertheless,
each tablecloth will have its own characteristic "signature". Since our model exhibits this same
type of behavior, visualization was our primary means for experimental verification and evaluation.
We provide a description of how visualization was used in this research, and include sample visualizations.

Key Words: cloth modeling, Metropolis algorithm, particle-based modeling, physically-based


modeling, visualization.

1 INTRODUCTION

If modeling and visualization are to become standard engineering design tools we must develop models for treating the full range of materials commonly encountered in construction, manufacturing, and fabrication. Materials that have been successfully modeled so far exhibit mechanical behaviors that, over a wide range of stresses and strains, are adequately characterized by continuum representations. Such "engineering materials" as steel, glass, and many plastics fit this criterion. Their behavior can be simulated very successfully using finite-element or finite-difference methods, and a variety of readily available visualization tools can be used to view results of these simulations. Unfortunately, many other commonly used materials do not admit to this form of analysis, and thus lie outside of the range of current design automation technology. Such materials include plastic foams, certain composites, and process materials that undergo phase changes. One of the most commonly used materials that has evaded precise engineering analysis is woven fabric. Perhaps it is because of the lack of a good engineering model of cloth that most of the design work for materials-handling in the garment manufacturing industry has been a "seat-of-the-pants" art.


Figure 1: Mapping a plain weave to a particle grid ((a) plain weave; (b) particle network). Thread crossings map into particles.

Over the past year, as part of a team involved in an automated garment handling project, we have
been developing a model of woven fabric that ultimately is to be capable of predicting the dynamic
draping behavior of woven cloth. Our project goal can be summarized by a question. Given a
piece of fabric and a geometric environment with which it interacts, how will the fabric fall on the
geometry, and what will its final equilibrium configuration be? Besides this immediate pragmatic
goal, we are also intrigued by the potential for developing a physically-based model of cloth useful
for computer graphics and animation. This technology could have possible application in apparel
design, where currently available CAD tools allow the designer to work only in two dimensions.

Our model of cloth is based on the fundamental idea that the macroscopic behavior of complex
materials arises out of the interactions between its microscopic structures. In some materials it is
sufficient to statistically aggregate these microscopic interactions and then to treat the aggregate
as a continuum that can be described by partial differential equations. In the case of cloth,
however, the microstructure consists of threads and yarns interlaced in a particular weave pattern.
Much of cloth's unique character comes from this underlying structure, with its various highly
nonlinear geometric constraints, frictional interactions, and anisotropy. Therefore, modeling the
basic structure of threads and their interactions is essential to simulating cloth's true macroscopic
behavior. Modeling the full detail of the underlying micro-structure of woven fabric is clearly
beyond current computational capacities, but there is an intermediate position that can capture
the most important interactions.

A piece of woven cloth consists of two sets of parallel threads, called the warp and the weft, running
perpendicular to each other. If as the weft threads travel through the weave, they alternately cross
over, then under the warp threads as diagrammed in figure 1a, they form a plain weave. Many of
the important interactions that determine the behavior of plain woven fabric occur at the point
where the warp and weft threads cross. For example, usually the tension is so great at the thread crossing points that the threads are effectively pinned together, providing an axis around which
bending can occur in the plane of the cloth. The pinned thread crossings also hold the threads in
place when they are pulled along the direction of the thread.

Given that thread crossings play such an important role in influencing the local behavior of cloth,
our model treats the thread crossings as the fundamental modeling unit. We call each such unit
a "particle", and it is at the level of these particles that we maintain constraints, in the form
of potential energy functions, on the relationships between the threads. Figure 1b shows the
topologically two-dimensional particle network we use to represent a plain weave in our model.
The thread-relationship constraints maintained in the particle grid embody four basic mechanical
interactions occurring in cloth at the thread level. They are thread collision, thread stretching,
thread bending, and thread trellising, with trellising being in-plane bending of a thread around a
crossing point. If these simple interactions are properly accounted for in the grid, local interactions
acting on the microscopic level aggregate to produce a macroscopic behavior that is convincingly
close to cloth.

Simulations of draping cloth from our particle grid are obtained using a two-step process. At each discrete time step in a simulation, we first remove the inter-particle constraints and allow the particles to free-fall. Collisions with and partially elastic reflections from solid models are calculated at this point. After this, the inter-particle constraints and relationships are re-enforced
by applying an energy minimization process to the inter-particle energy functions. This pulls
the particle grid together and produces the complex buckling and folding that is characteristic of
draped fabric.

2 PREVIOUS WORK

The initial pioneering work in fabric modeling was conducted by Peirce (1937) over 50 years ago.
He developed and analyzed a basic modeling cell of fabric geometry. The modeling cell detailed the
geometric relationships between threads at a thread crossing. The model consists of two thread
cross-sections constrained by a third thread segment running perpendicular to the cross-sections.
Using simplifying assumptions, Peirce derived a set of equations that define the relationships
between the geometric parameters of the modeling cell. Over the years the Peirce model has
been extended and enhanced (Olofsson 1961; Leaf 1985). Graphical methods for evaluating the
equations defining the parameters have also been developed (Hearle 1969).

Another significant area of research in fabric modeling has been to apply the theory of elasticity,
continuum mechanics and finite element techniques to modeling the mechanical properties of cloth
(Clapp 1990a; Clapp 1990b; Kilby 1963; Lloyd 1978; Mechanics 1980; Phan-Thien 1980; Shanahan
1978a). This research has modeled threads as elastic rods and fabrics as continuous plates and
shells. These approaches have produced limited successes. They have been used to predict in-
plane deformation of fabric and the associated stress-strain relationships. Recent work has begun
to look at buckling out of the plane. This approach unfortunately has severe limitations. Firstly,
applying conventional analysis techniques to in-plane deformations is of little practical value since
cloth quickly buckles out of the plane when stressed. Methods for keeping cloth in a plane during
measurement and analysis simply produce mechanical data for cloth in an unnatural state with
questionable value. Secondly, by assuming that fabrics are continuous sheets of material, these
methods ignore the very basic fact that fabric is a complex mechanism. They are limited in that
they usually assume and are only applicable for small deformations and linear behavior and have
difficulty in capturing the inherent anisotropy of woven fabrics.

Another approach to modeling the behavior of cloth involves describing and minimizing the strain
energy defined for basic modeling cells (Amirbayat 1989; de Jong 1977a; de Jong 1977b; de Jong
1978; Hearle 1978; Ly 1985; Shanahan 1978b). In this approach, energy functions are based on
the structure and deformation of a single modeling cell. One or more partial differential equations
are derived and must be minimized in an iterative fashion. The approach is promising, but in
most cases energy methods have been used only to calculate a few of the conventional mechanical
parameters of woven cloth and some of the geometric parameters of just the modeling cell.

Most of the fabric modeling work of the textile community, with a few exceptions (Amirbayat
1989; Ly 1985), has been devoted to relating the behavior of materials to traditional mechanical
parameters, such as Young's modulus and Poisson's ratio. The work focuses on calculating stress-
strain curves, load-extension relationships, the relationships among geometric parameters, and
the dependence of bending moment on curvature. Very little of the work has actually used this
mechanical information to predict the overall shape of a piece of draped fabric.

Surprisingly, it is mainly in the the computer graphics community where the problem of simulating
the complex shapes and deformations of fabric in three dimensions has been tackled. Weil (1986)
defined a geometrical approach that approximates the folds in a constrained piece of square cloth.
Terzopoulos and Fleischer (1988), and Aono (1990) have developed constitutive equations for cloth
and have applied finite element methods to create 3-D cloth-like structures that bend, fold, wrinkle, interact with solid geometry, and even tear. Haumann and Parent (1988) present a method similar to ours in their Behavioral Test-Bed. They define simple behavioral actors and hook them together
to produce complex flexible objects, such as a waving flag. However, the primitive actors they
propose are not sufficient for the modeling of complex drape and buckling in fabric, because their
hinged triangular meshes do not accurately represent the true structure of cloth. Bassett and
Postle (1990), in a recently published paper in the textile literature, propose a method similar
to Haumann and Parent's that models fabric as a collection of simple geometric elements. This
work, once again, does not address the issue of overall fabric shape, but rather the stress-strain
relationships at thread crossings for a fabric stretched over a sphere. Feynman (1986) describes a
technique for modeling the appearance of cloth. His computational framework is the same as ours,
minimizing energy functions defined over a grid of points. His and our approaches differ in the
basic assumptions used to derive the energy functions. Feynman derives his functions from the
theory of elasticity and the assumption that cloth is a flexible shell, whereas the interactions in
our model are derived from our view of cloth as a complex mechanism, rather than a continuous
material.

3 THE MODEL

Our model represents a piece of woven cloth as a topologically 2-D network of 3-D points arranged
in a rectangular grid, as shown in figure 1, where each point represents a thread crossing in a
plain weave fabric. We map all of the interactions occurring between threads into a set of energy
functions whose independent variables are the geometric interrelationships between points in the
2-D grid. Each point is effectively connected to and influenced by a small set of neighboring
points through the definition of four energy functions. These functions embody local mechanical
relationships and geometric constraints that exist at the thread crossings. The four thread-level
features of woven fabrics that we attempt to capture in these functions are non-interpenetration,
thread stretching, thread bending, and thread trellising. Each is represented by a term in the energy equation,

$U_{total} = U_{repel} + U_{stretch} + U_{bend} + U_{trellis}.$    (3.1)

$U_{total}$ is the total energy of a particle. $U_{repel}$ is the energy due to repulsion; every particle effectively repels every other particle, helping to prevent interpenetration. $U_{stretch}$ is the energy function that
connects each particle with four of its neighboring particles. $U_{bend}$ is the energy due to threads bending over one another out of the plane. $U_{trellis}$ is the energy due to bending around a pinned thread crossing in the plane of the cloth.

Figure 2: Combined repelling and stretching function

Our approach to modeling cloth is an application of the ideas presented by Witkin et al. (1987).
They describe a technique for enforcing geometric relationships on parameterized models, where
geometric constraints are embodied in an "energy" function whose variables are the geometric
parameters of the objects being constrained. The function is formulated in such a way that the
constraints are met when the ensemble of the parameter values minimizes the function. Finding
the parameter values which minimize the "energy" function imposes the geometric constraint. In
our approach, we attempt to capture the geometric and mechanical relationships occurring at
thread crossings.

Our energy functions were developed by first analyzing the relationships that we wished to enforce,
in order to determine the general shape of the energy function. Next a function that matched the
desired shape was derived. In the case of the repelling potential Urepe/ we needed a function that
would not allow one particle to pass through another, but that would have no effect on particles
distant from each other. Any energy function that rises from zero at a prescribed distance and
goes to infinity at zero meets these needs. The positive portion of the Lennard-Jones potential,
which is used in molecular modeling, is one such energy function and is given by

$R_i = 4\epsilon\left[\left(\sigma/r_i\right)^{12} - \left(\sigma/r_i\right)^{6} + \tfrac{1}{4}\right], \quad r_i \le 2^{1/6}\sigma,$    (3.2)

where $\epsilon$ effectively acts as a scale factor that regulates the strength of repulsion $R_i$ and stretching $S_j$, $\sigma$ may be used to assign the equilibrium distance between particles, and $r_i$ is the distance between the currently evaluated particle and some other particle $i$, as seen in figure 2. The function $U_{repel}$ effectively provides collision avoidance between particles and helps to prevent self-intersection of the fabric grid as a whole. It is calculated by summing over all particles as given
by

$U_{repel} = \sum_{i=1}^{n} R_i.$    (3.3)

Figure 3: The bending potential

In order for connected particles to maintain a standard distance between each other, they must sit
in a steep energy well. Such a well was created by simply evaluating the Lennard-Jones potential,
reflecting these values about the zero point, and fitting a polynomial through them using regression
techniques. The resulting polynomial is

(3.4)

Figure 2b presents the general shape of the combined functions. Both Urepel and Ustretch were
defined to produce an energy curve of this shape. The well is produced by directly connecting
each particle with the stretching potential to four neighboring particles, except along the edges of
the grid. It is calculated by summing Sj for each neighbor as given by

Ustretch = L Sj. (3.5)


j=l

The combined repelling and stretching functions enforce the distance constraint between neigh-
boring particles in the grid. The equations that we present here are our first pass at representing
the true energies acting at the thread level. They most certainly will evolve as we continue our
studies, refine our model, and make them more accurate and realistic. At this point it is their
general shape and the role they play in maintaining geometric constraints that is important.
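To make the shapes concrete, here is a small numerical sketch of the repelling term and a stand-in for the stretching term. The repulsion follows the truncated, shifted Lennard-Jones form of equation 3.2; because the fitted polynomial of equation 3.4 is not reproduced here, the stretch well is approximated by mirroring the repulsion about the equilibrium distance, which has the right qualitative shape but is not the authors' polynomial.

    import numpy as np

    EPS = 1.0                          # epsilon: energy scale (illustrative)
    SIGMA = 1.0                        # sigma: sets equilibrium spacing (illustrative)
    R_EQ = 2.0 ** (1.0 / 6.0) * SIGMA  # distance at which the energy well sits

    def repel(r):
        # Positive portion of the Lennard-Jones potential: zero beyond
        # R_EQ, rising steeply to infinity as r -> 0.
        if r >= R_EQ:
            return 0.0
        s6 = (SIGMA / r) ** 6
        return 4.0 * EPS * (s6 * s6 - s6 + 0.25)

    def stretch(r):
        # Stand-in for the fitted polynomial S_j: the repulsion curve
        # mirrored about R_EQ forms the outer wall of a steep well at the
        # equilibrium spacing between connected particles.
        mirrored = 2.0 * R_EQ - r
        return repel(mirrored) if mirrored > 0.0 else float("inf")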

As seen in figure 3b, a single thread can bend out of the plane around crossing threads. We have represented this phenomenon by modeling the angle formed between each set of three particles. Figure 3a shows that moving one particle, which represents a thread crossing, changes three bending angles along a "thread" line. The energy associated with this angle should be zero when the angle is $\pi$ and should go to infinity as the angle goes to 0. This defines the equilibrium state of the thread as flat, and does not allow the thread to bend into itself. $\tan((\pi - \theta)/2)$ has the general shape necessary to enforce this constraint and is shown in figure 3c. The function that represents the energy of thread bending is based on the positions of eight of a particle's neighbors, two in each of the horizontal and vertical directions. These eight particles define six bending angles. We define the energy of bending as a function of the angle formed by three particles along a horizontal or vertical "thread line". The complete bending energy function for each of these angles is

$B_k = C_0 \tan\left(\frac{\pi - \theta_k}{2}\right)$    (3.6)

$U_{bend} = \sum_{k=1}^{6} B_k,$    (3.7)

where $C_0$ is the scale factor for bending energy $B_k$ that determines its relative importance to the other energy functions. For each particle, $B_k$ is calculated 6 times for the six bending angles produced at each thread crossing.
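A sketch of one bending-angle evaluation, assuming particle positions are numpy 3-vectors; p0, p1, p2 are three consecutive particles along a thread line, with the angle measured at p1.

    import numpy as np

    def bend_energy(p0, p1, p2, c0=1.0):
        # B_k = C0 * tan((pi - theta) / 2): zero for a flat thread
        # (theta = pi), unbounded as the thread folds back on itself
        # (theta -> 0).
        a, b = p0 - p1, p2 - p1
        cos_t = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        theta = np.arccos(np.clip(cos_t, -1.0, 1.0))
        return c0 * np.tan((np.pi - theta) / 2.0)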

Figure 4: The trellising potential

Figure 4b presents the phenomenon that we call trellising. It occurs when threads are held fast at their crossings and bend to create an "S-curve" in the local plane of the cloth. The overall behavior produced by this kind of deformation is related to shearing in a continuous sheet of material. Since our model treats cloth as an interwoven grid of threads rather than a continuous sheet, trellising is a more accurate descriptive term than shearing. We have mapped this phenomenon into the trellising modeling cell defined in figure 4a. In the cell, two line segments are formed by connecting the four neighboring particles surrounding the central particle. The two segments cross to define an angle. An equilibrium angle for the cell is predefined. Currently, we assume that the equilibrium of the cell is at $\pi/2$. This may change over the course of a simulation as we model friction-related slippage. The trellis angle is then defined as the angle formed as one of the line segments moves away from the equilibrium configuration, as shown in figure 4a. In order to enforce the trellising constraint, i.e. keeping the crossed segments at $\pi/2$, a function is needed which is zero at the equilibrium angle $\pi/2$, but that goes to infinity as the angle between the crossed segments approaches zero. This prevents the two crossed threads from crossing over each other. The general shape of the curve is given in figure 4c. The complete function for our energy of trellising is

$T_l = C_1 \tan(\theta_l)$    (3.8)
$U_{trellis} = \sum_{l=1}^{4} T_l,$    (3.9)

where $C_1$ is the scale factor for trellising energy $T_l$ and $\theta_l$ is the trellis angle of cell $l$. $T_l$ is calculated 4 times for the four trellising model cells associated with each thread crossing. Again, the tangent function does not necessarily accurately represent the true strain energy due to trellising. It does, however, give the correct characteristic shape.
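A matching sketch for one trellising cell, again assuming numpy 3-vectors for the four neighbors, and taking the trellis angle as the deviation of the crossed segments from π/2; since equation 3.8 is reconstructed above, treat the exact expression as indicative of the shape only.

    import numpy as np

    def trellis_energy(left, right, bottom, top, c1=1.0):
        # Angle between the two segments connecting opposite neighbors of
        # the central particle; the energy grows without bound as the
        # segments collapse onto each other.
        s1, s2 = right - left, top - bottom
        cos_t = np.dot(s1, s2) / (np.linalg.norm(s1) * np.linalg.norm(s2))
        angle = np.arccos(np.clip(cos_t, -1.0, 1.0))
        trellis_angle = abs(np.pi / 2.0 - angle)  # deviation from equilibrium
        return c1 * np.tan(trellis_angle)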

Figure 5: Particles needed to calculate energy change: ◇ stretching, + bending, □ trellising, • repelling

Figure 5 summarizes the dependencies that exist between a particle and its neighboring particles. As a particle moves, it changes the energy of the system as a whole. The energy change is based on that particle's position relative to its neighbors' positions. Each component of the total energy function depends on a different set of neighboring particles. These sets are presented in figure 5. The particle being moved is represented by the large black dot in the center. It repels every other particle in the grid, here represented by the small black dots (•). The stretching potential is defined between the current particle and its four immediate neighbors, signified by a diamond shape (◇). Calculating the change in the energy of bending after a particle is moved requires the positions of the eight nearest neighbors along the "thread" lines. These eight neighbors are marked with crosses (+). Every particle affects the trellising energy of four neighboring trellis modeling cells. The particles of those cells are marked with boxes (□). The positions of all of
these particles are needed when calculating the change of energy of the particle system as a whole, when one particle is moved.

Figure 6: Two-phase evaluation method: (a) particles in free-fall; (b) inter-particle constraints enforced

4 EVALUATION METHOD

The four inter-particle energy relationships defined in our model embody the mechanical and
geometric relationships between threads at their crossing points in a woven fabric. Structuring
these thread particles in a 2-D grid, placing them into a simulated gravitational field, and allowing
the grid of particles to interact with solid models, produces a simulation of buckling and draping
cloth.

We have implemented our cloth simulation as a two-phase process operating over a series of small discrete time steps. The first phase of the simulation for a single time step models the effect of gravity, and accounts for collisions between the cloth model and a geometric model that defines the objects with which the cloth is interacting. The second phase enforces interparticle constraints and moves the configuration into a local energy minimum before continuing with the next time
step. One step of this process is schematized in figure 6.

Phase one is simply an expedient way to produce large-scale motions of the grid in the direction of
gravity, thus quickly placing the cloth in an approximately-correct draping configuration. During
this phase, the inter-particle constraints are removed and each particle is allowed to drop as if in
free fall. This fall is modeled by the differential equation
$M\ddot{x} + C\dot{x} = Mg,$    (4.10)

that takes into account the particle's acceleration $\ddot{x}$, velocity $\dot{x}$, mass $M$, gravity $g$, and viscous drag resistance due to air $C$. Our simulations use the analytical solution to equation 4.10, but
there is no reason that a more complex equation could not be used and solved numerically.
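For reference, the closed-form update follows from integrating equation 4.10 directly: the velocity relaxes exponentially toward the terminal velocity Mg/C. A sketch with illustrative parameter values, using numpy 3-vectors for position, velocity, and gravity:

    import numpy as np

    def free_fall_step(x0, v0, dt, m=1.0, c=0.1, g=np.array([0.0, -9.8, 0.0])):
        # Analytic solution of M x'' + C x' = M g over one time step.
        v_term = (m / c) * g                  # terminal velocity
        decay = np.exp(-(c / m) * dt)
        v1 = v_term + (v0 - v_term) * decay
        x1 = x0 + v_term * dt + (m / c) * (v0 - v_term) * (1.0 - decay)
        return x1, v1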

During this phase, we pass the ray-segment formed from the starting and final positions of each
particle to the intersection routine of our ray tracer (Getto 1989). If the ray-segment intersects
any of the geometric objects, the particle most probably collides with that object during the time
step. However, as shown in figure 7, the ray intersection point is only an approximation to the
actual curved path that the particle might have followed during the time step. We assure an
accurate intersection point by applying an iterative process similar to those proposed by Moore
and Wilhelms (1988), and Hahn (1988). First, we reduce the time step to the time of collision
estimated by the distance along the ray-segment at which intersection occurs, and then repeat
the free-fall calculation for the particle in question. This process is repeated until we have an
intersection point that lies exactly at the end of the ray-segment, or we have determined that
no collision actually occurred. When an accurate collision point is determined, a partially elastic
reflection of the particle is calculated and used to adjust the particle's velocity, and the "bouncing"
particle is allowed to continue its fall for the remainder of the time step. If another collision is
detected, the process is repeated until the time step is exhausted.

Once the first phase is completed, and every particle has been moved one time step according to
equation 4.10, the inter-particle constraints, as defined by our energy functions, are enforced by
subjecting the entire particle system to an energy minimization process, in which each particle's
position is adjusted so as to achieve a local energy minimum across the model. When the entire
process is complete, particle velocities are adjusted to account for each particle's position change.
This minimization process pulls the grid together and maintains the geometric constraints and
relationships (Witkin 1987). The first phase quickly moves the grid through space and into contact
with the objects in the environment, giving the grid its general draping shape. The energy mini-
mization phase fine-tunes the model, producing its more complex and detailed folding structures.

We have implemented a stochastic technique for the minimization step in our simulations that
approximates the process modeled by the Metropolis algorithm (1953). We call this technique
stochastic gradient descent (SGD). The Metropolis algorithm randomly perturbs each state in a
system. For each perturbation, the change of energy in the system is calculated. If the energy
change is negative, the new perturbed point is accepted, but if the energy change is positive,
the point is probabilistically accepted based on the Boltzmann factor associated with that en-
ergy change. Our experiments with this method showed that it was much too inefficient for our
purposes, requiring very large numbers of iterations to bring the particle grid into a satisfactory
minimum energy configuration.

SGD has proven to be more effective. In this method, the gradient of the energy function is cal-
culated numerically and is used to construct a box around the region of space that approximates
the volume into which new positions that would be accepted by the Metropolis algorithm would
fall. This construction is shown in figure 8. Once this box is constructed, we use a uniform distribution to stochastically select a new particle position in this region of probable minimum energy configuration. This technique will generate position changes that are drawn from a probability distribution quite similar to the Boltzmann distribution that the Metropolis algorithm generates.

Figure 7: Intersection of particle path with geometry

Figure 8: New position selection in SGD. Small arrows designate possible new particle locations.

Our intention in developing SGD was to produce an algorithm with the same characteristics as the Metropolis algorithm, but with improved efficiency. The Metropolis algorithm produces a distribution of energy states by accepting all lower energy states and a percentage of the higher energy states based on the Boltzmann distribution. It randomly generates new potential states with no regard for the underlying energy functions in a system and then picks appropriate states and throws away the rest. SGD stochastically generates new states in a region determined by the negative gradient, increasing the likelihood that these states will lower the energy of the system, and accepts all of them. By adjusting the parameters of the algorithms, both can produce the same ratio of higher energy states to lower energy states. Our experiments have shown that SGD is at least 10 times more efficient than Metropolis for minimizing our energy functions, but still gives a natural looking stochastic variation to the resulting configurations.
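A sketch of one SGD move for a single particle, with the gradient estimated by central differences and the acceptance box simply centered a short step along the negative gradient; the box construction and step sizes are illustrative, not the authors' exact scheme.

    import numpy as np

    def sgd_move(energy, x, h=1e-4, step=0.05, rng=np.random.default_rng()):
        # Numerical gradient of the scalar energy function at position x.
        grad = np.zeros(3)
        for k in range(3):
            e = np.zeros(3)
            e[k] = h
            grad[k] = (energy(x + e) - energy(x - e)) / (2.0 * h)
        direction = -grad / (np.linalg.norm(grad) + 1e-12)
        # Sample the new position uniformly from a box biased downhill.
        center = x + 0.5 * step * direction
        return center + rng.uniform(-0.5 * step, 0.5 * step, size=3)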

During the energy minimization phase, it is again necessary to prohibit particles from moving
through the surface of geometric objects. We do this in two ways. First, when estimating the
energy gradient about a particle we associate a very high energy penalty with movement into a
geometric object (Platt 1988). This makes it unlikely that we will generate a new particle position
inside a geometric object. Second, we do another ray-segment intersection test on each potential
new particle position and simply throw away any position change that crosses the boundary of an
object.

5 THE ROLE OF VISUALIZATION

Visualization is an essential part of the experimentation with and evaluation of our cloth model.
Our model has been implemented in a testbed visualization and animation system, The Clockworks
(Getto 1990). By conducting our simulations within this system, we have access to a variety of
visualization tools. Since the particles in our model are arranged in a topological grid, it is easy
to construct a polygonal surface for visualization. The Clockworks provides interactive wireframe
and Gouraud shaded polygon viewing capabilities, allowing us to quickly inspect the details of the
particle grid. Our models are fairly large and we are currently unable to follow the progression
of a simulation interactively on our Silicon Graphics Iris 4D/60T. Therefore, the system has
been interfaced to a single-frame videotape system. This allows us to record animations of our
simulations progressing through time directly from the SGI. These animations have given us insight
into the workings of the model and how it interacts with solid objects. Finally, in order to fully
inspect the detailed behavior of our model we generate high quality ray traced images from the
simulation data. These images give us useful information concerning the shape, continuity and
curvature of the folds generated by our model.

We rely heavily on visualization for evaluation of the model. This has given us important insights
into the various components of the particle energy functions that would have been difficult to
obtain in any other way. For example, the first attempts at minimizing the energy of the particle
grid with the Metropolis algorithm produced an amusing "mozzarella cheese" model where inter-
particle distances were not maintained. As particles struck a solid cubic object, they stopped,
while those not striking the object continued to move toward the floor, stretching out the model
and making it look like melting cheese over a block. Once the Metropolis algorithm parameters
were tuned and the stretching function adjusted, we could create a grid that maintained the correct
equilibrium distance between all connected particles. At this point, the importance of trellising
became apparent. At the corner of the cube where one would expect folds and buckles, we found
high trellis angles, which in essence were decreasing the surface area at that point. Instead of the
cloth buckling and folding to adjust to the excess surface area collecting at the corners, it was
"trellising" out to produce a tongue of cloth that simply drooped to the floor. Once stochastic
gradient descent was implemented and applied to the same simulations, we found that the trellising
constraint was properly enforced.

Two successful simulations have been completely calculated and visualized. In the first simulation
we drape a 50x50 square particle grid over a cube. In the second simulation, the 50x50 grid is
trimmed to produce a grid with a circular outline and draped over a cylinder. Figure 9 presents four
instances in time from the first simulation and figure 10 presents four instances from the second
simulation. All of these images are ray-traced and vividly demonstrate the folding produced by
the model.

In the second simulation, the mesh becomes highly perturbed in its early stages. These perturba-
tions disappear as the mesh strikes the cylinder. Producing line drawn images of the simulation
data allowed us to see that the perturbations were not completely random but contained certain

Figure 9: Simulation of a square cloth draping



Figure 9: (cont.) Simulation of a square cloth draping



Figure 10: Simulation of a circular cloth draping



Figure 10: (cont.) Simulation of a circular cloth draping



sawtooth patterns. The visualization of these patterns has led us to re-examine the trellising
potential, and it may be reformulated at a later time.

Animations demonstrating the complete simulation for both cases have been recorded. For the
circular case, the animation shows the stiffness of the cloth in the direction of the threads. The
grid remains fairly flat during most of the simulation in those outer regions where the threads run
approximately radially out from the center. In contrast, the cloth folds quite easily at the 45
degree angles from the thread directions. The animation shows that once the cloth comes to rest
the folds become fairly evenly distributed.

We are also developing other visualization tools to assist us in our research. One will generate a
pseudo-color texture map on the polygons that approximate our cloth model. The coloring will
display how the energy functions vary over the grid surface. This capability will allow us to isolate
each component of our energy function and to determine the relative importance of each of them
at different locations on the grid. Another texture map feature is being developed to allow us to
map an arbitrary surface texture onto the polygons defined by the grid in order to enhance visual
realism.

The importance of all of these visualization tools cannot be overstated. Currently, we are
evaluating the correctness of our model by examining the previously described visualizations. In
the future we intend to recreate experiments conducted in the 1960's on draping cloth (Cusick
1962; Hearle 1969). We will need to "see" our results and visually compare them to the results
obtained in these earlier experiments.

6 DISCUSSION

We have described the interactions between threads with energy functions defined over a topologi-
cally 2-D grid of 3-D points. Minimizing these energy functions produces the draping and buckling
we are attempting to model. At this point in our work, we are not claiming that the interaction
functions that we have defined accurately model the energy of threads, although they certainly
have reasonable boundary values. What we are currently doing is testing our basic assumption
that by representing a material's microstructure, and by applying qualitatively correct geometric
constraints, characteristic macroscopic behavior will emerge. Our visual results show that our
assumptions are correct and encourage us to continue our research in this area.

Once the model is refined, extensive simulations will be conducted in order to recreate studies by
Cusick (1962). His numerous experiments on the draping behavior of cloth produced quantitative
characterizations of this behavior. We will consider our models correct once they give results
sufficiently similar to those that Cusick produced through experimentation. Once the functional
relationships of our model have stabilized, it will be necessary to tune the parameters of
the energy functions, so that the generic models may mimic specific types of cloth.

Of previous studies of cloth, the model that is most similar to ours is Feynman's (1986). He presents
a method for modeling the appearance of cloth that uses the same computational framework as
our technique. He simulates some of the mechanical properties of cloth by defining a set of energy
functions over a 2-D grid of 3-D points. He minimizes the energy of the grid with a stochastic
technique somewhat related to ours, and a multigrid method. This is where the similarities end.
Feynman assumes that cloth is a continuous flexible material and derives his energy functions
from the theory of elastic shells. We do not believe that this is the correct starting point for
modeling cloth. Feynman's energy equations also do not take into consideration a phenomenon
that we see as essential for modeling the mechanical behavior of cloth, the trellising of threads.
By modeling cloth's thread structure and its trellising, our model naturally buckles, removing the
need for a special and questionable energy of buckling defined by Feynman. Two additional issues
presented in our work that are not present in Feynman's are self-intersection and interaction with
constructive solid geometric objects.

One significant component missing from our approach is the modeling of frictional interactions
between threads. Our model already incorporates parameters that should allow us to represent
the slippage and sticking of crossed threads due to trellising. It needs to be extended to do the
same for frictional slippage during thread bending. Once frictional slippage is implemented, the
model should be able to represent the ability of cloth to be shaped and formed.

An important issue that has not been addressed so far is the issue of computation time. Both
the square and circular simulations required on the order of one CPU-week of computation on
a DECstation 3100 class workstation. Clearly these are not computation times conducive to
design automation. Simulating the mechanical behavior of complex materials requires enormous
numbers of computations. However, if we assign each particle to a processor, our particle-based
approach maps naturally onto the architecture of a massively-parallel computer. It was this fact
that motivated us to explore the concept of particle-based modeling in earlier work (House 1989),
where we investigated the task of modeling human cartilage for a surgical simulator. The future
implementation of particle-based models on a massively-parallel machine holds out the hope for
real-time simulations of complex materials. We have begun to explore this idea by implementing a
simplified version of our model on the Connection Machine (Hillis 1985). So far, our investigations
have revolved around experimenting with different implementations of the Metropolis algorithm
on the CM-2, and have produced encouraging results (House 1990).

7 CONCLUSION

We have proposed a model that simulates the draping behavior of woven cloth. The model
attempts to capture the thread structure of cloth, and assumes that cloth is a mechanism, not a
continuous flexible material. The model consists of particles (points) placed in a 2-D grid, where
each particle represents a thread crossing, and energy functions embody the interactions of the
threads. By minimizing these functions while allowing the particles to interact with solid objects,
the draping and buckling behavior of cloth may be simulated.

Our current results are extremely encouraging and provide a convincing demonstration of the
correctness of the assumptions upon which our model is based. There is still much work to
be done. The most obvious addition that needs to be made to the model is frictional slippage.
Second, we need to take advantage of the considerable literature on the experimentally determined
mechanical properties of threads and yarns to refine our energy functions. Finally, in order to
achieve reasonable simulation speed, and to take full advantage of the architecture of the model,
we need to implement a complete version of the model on a massively parallel computer.

8 ACKNOWLEDGMENTS

This study was conducted at the Rensselaer Design Research Center, Dr. Michael J. Wozny,
director, as part of a team project led by Dr. Leo E. Hanifin. It is partially supported by
DLA contract No. DLA900-87-D-0016, NSF grant No. CDR-8818826, and the RDRC's Industrial
Associates Program.

9 REFERENCES
Amirbayat J, Hearle JWS (1989) The Anatomy of Buckling of Textile Fabrics: Drape and
Conformability. Journal of the Textile Institute 80: 51-69

Aono M (1990) A Wrinkle Propagation Model for Cloth. In: Chua TS, Kunii TL (eds) Computer
Graphics Around the World (CG International '90 Proceedings). Springer, Tokyo Berlin
Heidelberg New York, pp 95-115

Bassett RJ, Postle R (1990) Fabric Mechanical and Physical Properties, Part 4: The Fitting
of Woven Fabrics to a Three-Dimensional Surface. International Journal of Clothing Science
and Technology 2(1): 26-31

Clapp TG, Peng H (1990a) Buckling of Woven Fabrics, Part I: Effect of Fabric Weight. Textile
Research Journal 60: 228-234

Clapp TG, Peng H (1990b) Buckling of Woven Fabrics, Part II: Effect of Weight and Frictional
Couple. Textile Research Journal 60: 285-292

Cusick GE (1962) A Study of Fabric Drape. PhD Thesis, University of Manchester

de Jong S, Postle R (1977a) An Energy Analysis of Woven-Fabric Mechanics by Means of
Optimal-Control Theory, Part I: Tensile Properties. Journal of the Textile Institute 68: 350-
361

de Jong S, Postle R (1977b) An Energy Analysis of Woven-Fabric Mechanics by Means of
Optimal-Control Theory, Part II: Pure-Bending Properties. Journal of the Textile Institute
68: 362-369

de Jong S, Postle R (1978) A General Energy Analysis of Fabric Mechanics Using Optimal
Control Theory. Textile Research Journal 48: 127-135

Feynman CR (1986) Modeling the Appearance of Cloth. Master's Thesis, Massachusetts Insti-
tute of Technology

Getto PH (1989) Fast Ray Tracing of Unevaluated Constructive Solid Geometry Models. In:
Earnshaw RA, Wyvill B (eds) New Advances in Computer Graphics (CG International '89
Conference Proceedings). Springer, Tokyo Berlin Heidelberg New York, pp 563-578

Getto PH, Breen DE (1990) An Object-Oriented Architecture for a Computer Animation Sys-
tem. The Visual Computer 6(2): 79-92

Hahn JK (1988) Realistic Animation of Rigid Bodies. Computer Graphics (SIGGRAPH '88
Proceedings) 22(4): 299-308

Haumann DR, Parent RE (1988) The Behavioral Test-bed: Obtaining Complex Behavior From
Simple Rules. The Visual Computer 4: 332-347

Hearle JWS, Grosberg P, Backer S (1969) Structural Mechanics of Fibers, Yarns, and Fabrics,
Volume 1. Wiley-Interscience, New York

Hearle JWS, Shanahan WJ (1978) An Energy Method for Calculations in Fabric Mechanics,
Part I: Principles of the Method. Journal of the Textile Institute 69: 81-91

Hillis WD (1985) The Connection Machine. The MIT Press, Cambridge, MA

House DH, Breen DE (1989) Particles As Modeling Primitives For Surgical Simulation. 11th
Annual International IEEE Engineering in Medicine and Biology Conference Proceedings, pp
831-832

House DH, Breen DE (1990) Particles: A Naturally Parallel Approach to Modeling. 3rd Sym-
posium on the Frontiers of Massively Parallel Computation Proceedings, pp 150-153

Kilby WF (1963) Planar Stress-Strain Relationships in Woven Fabrics. Journal of the Textile
Institute 54: T9-T27

Leaf GAV, Anandjiwala RD (1985) A Generalized Model of Plain Woven Fabric. Textile Re-
search Journal 55: 92-99

Lloyd DW, Shanahan WJ, Konopasek M (1978) The Folding of Heavy Fabric Sheets. Interna-
tional Journal of Mechanical Science 20: 521-527

Ly NG (1985) A Model for Fabric Buckling in Shear. Textile Research Journal 55: 744-749

Hearle JWS, Thwaites JJ, Amirbayat J (eds) (1980) Mechanics of Flexible Fibre Assemblies.
Sijthoff & Noordhoff, Alphen aan den Rijn, The Netherlands

Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E (1953) Equation of State
Calculations by Fast Computing Machines. Journal of Chemical Physics 21(6): 1087-1092

Moore M, Wilhelms J (1988) Collision Detection and Response for Computer Animation. Com-
puter Graphics (SIGGRAPH '88 Proceedings) 22(4): 289-298

Olofsson B (1964) A General Model of Fabric as a Geometric-Mechanical Structure. Journal of
the Textile Institute 55: T541-T557

Peirce FT (1937) The Geometry of Cloth Structure. Journal of the Textile Institute 28: T45-T97

Phan-Thien N (1980) A Constitutive Equation for Fabrics. Textile Research Journal 50: 543-547

Platt JC, Barr AH, (1988) Constraint Methods for Flexible Models. Computer Graphics (SIG-
GRAPH '88 Proceedings) 22(4): 279-288

Shanahan WJ, Lloyd DW, Hearle JWS (1978a) Characterizing the Elastic Behavior of Textile
Fabrics in Complex Deformation. Textile Research Journal 48: 495-505

Shanahan WJ, Hearle JWS (1978b) An Energy Method for Calculations in Fabric Mechanics,
Part II: Examples of Application of the Method to Woven Fabrics. Journal of the Textile
Institute 69: 81-91

Terzopoulos D, Fleischer K (1988) Deformable Models. The Visual Computer 4: 306-331



Weil J (1986) The Synthesis of Cloth Objects. Computer Graphics (SIGGRAPH '86 Proceed-
ings) 20(4): 359-376

Witkin A, Fleischer K, Barr A (1987) Energy Constraints on Parameterized Models. Computer
Graphics (SIGGRAPH '87 Proceedings) 21(4): 225-232

David E. Breen is a research engineer at the Rensselaer Design Research
Center (formerly the Center for Interactive Computer Graphics). He has
been on the full-time staff of the RDRC since 1985. From August 1987
to July 1988 he was a visiting research engineer at Zentrum für Graphische
Datenverarbeitung in Darmstadt, Germany. His research interests
include particle-based modeling, dynamic simulation, computer animation,
geometric modeling and object-oriented programming. Breen has exhibit-
ed several computer-generated images in the SIGGRAPH Art Show and
technical slide set, along with animations in the SIGGRAPH Animation
Screening Room, over the past five years. His papers and images have
been published in various proceedings, books and calendars in the United
States, Europe and Japan. He is a member of ACM SIGGRAPH and the
IEEE Computer Society.
Breen received his AB in Physics from Colgate University in 1982. He
received his MS in Electrical, Computer and Systems Engineering
(ECSE) from Rensselaer Polytechnic Institute in 1985 and is currently
pursuing his PhD in ECSE at RPI.
Address: Rensselaer Design Research Center, CII 7015, Rensselaer Polytechnic
Institute, Troy, NY, 12180, USA, david@rdrc.rpi.edu.

Donald H. House is an Associate Professor of Computer Science at Williams
College, and a Visiting Research Scientist at the Rensselaer Design
Research Center. He holds a Bachelors Degree in Mathematics from Un-
ion College, a Masters in Electrical Engineering from Rensselaer, and a
PhD in Computer Science from the University of Massachusetts. Dr.
House was with the General Electric Company for ten years, working
mostly in process and industrial automation. His early research interests
were in Computational Neuroscience, investigating the computational stra-
tegies of depth perception in frogs and toads. His current research in-
terests are in computer graphics and animation, focusing on particle-based
physical models of flexible materials. He is a member of ACM SIG-
GRAPH.
Address: Department of Computer Science, Williams College, Williams-
town, MA, 01267, USA, house@cs.williams.edu.

Phillip H. Getto is a research engineer with Rasna Corp., where he is a
member of the Tools and Applications Group. His interests focus on
computational geometry, sampling theory, realistic image synthesis,
object-oriented computer graphics, computer animation and user interface
design.
Prior to joining Rasna, he was co-leader of the Visual Technologies Pro-
gram at Rensselaer Polytechnic Institute's Design Research Center.
While there he produced several computer generated animations, which
have been shown at the SIGGRAPH conference. He was also responsible
for the SIGGRAPH '89 poster image and print logo.
Getto is also a PhD candidate in the Department of Electrical, Computer
and Systems Engineering at Rensselaer. He has received a BS and ME in
computer and systems engineering from Rensselaer. Getto is a member
of the IEEE, IEEE Computer Society, the ACM and ACM SIGGRAPH,
and Computer Professionals for Social Responsibility.
Address: Rasna Corp., 2590 North First St., Suite 200, San Jose, CA,
95131, USA, phil@rasna.com (..!uunet!rasna!phil).
Physically-Based Interactive Camera Motion
Control Using 3D Input Devices
Russell Turner, Francis Balaguer, Enrico Gobbetti, and Daniel Thalmann

ABSTRACT

The newest three-dimensional input devices, together with high speed graphics workstations, make it
possible to interactively specify virtual camera motions for animation in real time. In this paper, we
describe how naturalistic interaction and realistic-looking motion can be achieved by using a physically-
based model of the camera's behavior. Our approach is to create an abstract physical model of the camera,
using the laws of classical mechanics, which is used to simulate the virtual camera motion in real time in
response to force data from the various 3D input devices (e.g. the Spaceball, Polhemus and DataGlove).
The behavior of the model is determined by several physical parameters such as mass, moment of inertia,
and various friction coefficients which can all be varied interactively, and by constraints on the camera's
degrees of freedom which can be simulated by setting certain friction parameters to very high values. This
allows us to explore a continuous range of physically-based metaphors for controlling the camera motion.
We present the results of experiments with several of these metaphors and contrast them with existing ones.

Keywords: 3D Interaction, Motion Control, Dynamics, Virtual Cameras

1. INTRODUCTION
Specifying virtual camera motion is an important problem in a number of different computer graphics
areas. Animation, scientific visualization, CAD and virtual environments all make considerable use of
virtual camera motion in a three-dimensional environment (Brooks et al (1986), Baum et al (1990),
Magnenat-Thalmann and Thalmann (1986), Shinagawa et al (1990)). Computer animation, in particular,
was quick to take advantage of the visual impact (and ease of programming) of complicated camera motions
through relatively static scenes. This was, and still is, the basis for much of the commercial computer
animation produced. In scientific visualization, CAD, and virtual environments applications, virtual
camera motion is often the most important form of three-dimensional interaction (Watson (1989)).
Now, with the existence of graphics workstations able to display complex scenes containing several
thousands of polygons at interactive speed, and with the advent of such new interactive devices as the
Spaceball, Polhemus 3Space, and DataGlove, it is possible to create applications based on a full 3D
interaction metaphor in which the specification of the camera motion is given in real-time. For example, in
a virtual environment application the camera becomes the virtual "eyeball" with which the user inspects the
virtual reality; in architectural CAD applications, the user has the ability to walk through virtual buildings
and inspect them from any angle; in computer animation systems, the animator can specify the camera
motion for a scene interactively in real time; for scientific visualization, large multi-dimensional data sets
can be inspected by walking through 3D projections.
In such systems, camera control is a key technique which must be as natural and intuitive as possible so
that the user is no longer conscious of it and can concentrate on the application task. In fact, interactive
camera control might be thought of as a critical component in establishing a new 3D user-interface metaphor
which will do for the 3D workstation what the desktop metaphor did for the 2D workstation.
However, the relationship between device input and virtual camera motion is not as straightforward as
one might think. Usually, some sort of mathematical function or "filter" has to be placed between the raw
3D input device data and the virtual camera viewing parameters. This filter is usually associated with some
sort of real-world metaphor, such as the "flying vehicle" metaphor or the "eyeball-in-hand" metaphor (Ware and
Osborne 1990). Several recent papers have proposed and compared different metaphors for virtual camera


motion control in virtual environments using input devices with six degrees of freedom (Ware and Osborne
1990, Mackinlay et al 1990). These metaphors are usually based on a kinematic model of control, where
the virtual camera position, orientation, or velocity is set as a direct function of an input device coordinate.

One way to obtain more appealing motions in computer animation is to shift from a kinematic to a
dynamic model. Recent publications describe how very naturalistic motions for animation can be
obtained using physics-based models of rigid and flexible bodies (Terzopoulos and Platt (1989), Hahn
(1988)). Although these models are usually extremely CPU-intensive, it is possible to use certain simplified
models which can be calculated in real-time for 3D interaction. This can lead not only to improved
interaction techniques, but may also shed light on the intuitiveness of physically based models. One such
physical model that we propose is an interactive camera control metaphor based on physical modeling of the
virtual camera, using forward dynamics for motion specification. This model is motivated by several
hypotheses:

• the interaction is natural because humans are used to physical "Newtonian" behavior
• the movements have a natural look for the same reason
• it is a general parametric model so that a continuously variable set of behaviors can be obtained by
varying the parameters
• the parameters have physical meaning and are easy to understand

In this paper we present a physical description and mathematical derivation of the physical camera
model. We then give our subjective impressions of the interactive "look and feel" of the model with
different parameter settings using the Spaceball as an input device. We then give some examples of how
this model could be used to aid interactive camera motion in a specific domain. Finally, a description of the
numerical technique and implementation on a Silicon Graphics Iris workstation is given.

2. THE PHYSICAL MODEL


Like all physically-based modelling in computer graphics, the physical camera model is motivated by the
assumption that human beings are best-equipped to deal with environments that resemble the natural world.
In our case, since we are using video display terminals for visual input, the "natural world" is, we propose,
the world of film and video imagery. The natural virtual camera model is therefore most appropriately a
real movie camera held by hand or mounted on some type of device for controlling camera motion.

Although there exist a variety of machines and vehicles used by the film industry for controlling camera
movement, and it would obviously be possible to physically model all of these, most camera
movements in the "real" film/video world are created by one or more cameramen moving a mounted camera
by hand. Therefore, we have chosen to model an idealized real-world camera which is manipulated by a
human being who exerts forces and torques on it. In this case, the behavior of the camera--the virtual
camera metaphor--is determined by the mechanical properties of the camera. These properties can then be
considered the parameters of our parametric virtual camera model.

2.1 The Parametric Camera Model


Although real cameras and mounts are complicated mechanical devices, their motions are determined for
the most part by a few simple gross physical properties. Therefore, we have constructed an idealized
physical model consisting of a single rigid body attached to a massless camera mount. The camera mount
consists of three gimbals and three rails, one in each of the x, y, and z local coordinates, resulting in three
Cartesian and three rotational degrees of freedom. Each of the gimbals and rails exerts friction and elastic
forces on the camera.

The important mechanical properties of this model which affect its motion are its mass, its moments of
inertia, and the coefficients of friction and elastic forces imposed by the camera mount. The mass
parameter specifies the amount by which the various forces will change the camera's linear velocity over
time. The moment of inertia parameters affect, in an analogous way, the camera's response to the various
torques. Usually a real camera's degrees of freedom are constrained in some geometrical way. For
example, it may be placed on a dolly or a railway or tripod, constraining its linear motion to two, one and
zero dimensions respectively. Likewise, the angular degrees of freedom may be restricted in one, two, or
three axes by locking the gimbal bearings on the camera mount. Friction forces tend to reduce the linear
and angular velocity of the camera over time and to oppose the applied forces and torques. There are
several types of phenomenological friction forces used by physicists to model the dissipation of kinetic
energy. We have found two, viscous friction and static friction, to be useful in our model. Viscous
friction is proportional and opposite to the direction of motion, bringing the camera eventually to rest.
Static friction is a constant force opposing the applied force, active only when the camera is at rest or below
a threshold velocity. Although real camera mounts usually try to minimize vibrations, we have found it
useful to add an elastic parameter, in the form of a Hookian spring, to each degree of freedom. This
elasticity is only important when the camera is at rest (i.e. static friction is active) and results in a more
naturalistic transition from the static state to the dynamic state and back.

2.2 Motion Under External Driving Forces and Torques


Fortunately, the simplified camera model is quite easy to analyze physically and, from the point of view
of classical mechanics, is a well understood problem (Feynman et al (1963)). The general motion of a rigid
body such as a camera can be decomposed into a linear motion of its center of mass under the control of an
external net force and a rotational motion about the center of mass under the control of an external net
torque.

Fig 1: The virtual camera driven by a force and a torque

2.2.1 Linear Motion


The linear motion under a total net force F can be computed by solving the differential equation

\ddot{X} = F / m    (1)

where m is the mass of the camera and X the position vector of its center of mass.

To compute the total net force F we have to take into account the driving force F_d and the forces F_f that
are generated by all the various frictions.

When the camera is moving at a speed below a given threshold velocity V_s and the driving force is
smaller than a specified limit F_s, we consider that the camera is in a static situation and that the frictions are
caused by the springs that are present in the camera mount. Therefore we have

F_f = -k_{ss} (X - X_0) - k_{vs} \dot{X}    (2)

where X_0 is the position where the camera first entered the static situation, k_{vs} is the damping factor of
the springs and k_{ss} the spring constant.

When we are at speeds higher than V_s or the driving force is bigger than F_s, we consider the dynamic
situation, where the frictions are mainly viscous. In that case we have

F_f = -k_{vd} \dot{X}    (3)

where k_{vd} is the viscous friction coefficient.

It is useful to control the behavior of the camera separately for each of its principal axes. For this
reason, the equations will be solved by projecting them onto the body-fixed reference frame, and a different
value of the friction parameters will be specified for each one of the local axes.
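
The regime switch implied by equations (2) and (3) can be sketched per axis as follows; the function and parameter names are illustrative, not the actual implementation:

```c
/* Per-axis friction force, switching between the static regime of
 * equation (2) and the dynamic regime of equation (3).  Names and
 * the scalar (per-axis) projection are illustrative. */
#include <math.h>

double friction_force(double v, double x, double x0, double f_drive,
                      double v_s, double f_s,     /* thresholds */
                      double k_vs, double k_ss,   /* static: damping, spring */
                      double k_vd)                /* dynamic: viscosity */
{
    if (fabs(v) < v_s && fabs(f_drive) < f_s)
        return -k_ss * (x - x0) - k_vs * v;   /* equation (2) */
    return -k_vd * v;                          /* equation (3) */
}
```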

2.2.2 Angular Motion


The rotational dynamics is expressed in a body-fixed reference system by the equation

I \ddot{\Theta} + \dot{\Theta} \times (I \dot{\Theta}) = T_d    (4)

where \Theta is the orientation of the camera expressed using Euler angles, I is the moment of inertia tensor
of the body and is constant in the body-fixed reference frame, and T_d is the total net torque applied to the
camera.

The modeling of the friction is analogous to the translational case. We can define \omega_s to represent the
threshold angular velocity and T_s to represent the threshold torque. In a static situation the friction torque
T_f is given by

T_f = -k_{ss-rot} (\Theta - \Theta_0) - k_{vs-rot} \dot{\Theta}    (5)

where \Theta_0 is the orientation where the camera first entered the static situation, k_{vs-rot} is the damping
factor of the springs and k_{ss-rot} the spring constant.

In the dynamic case we have

T_f = -k_{vd-rot} \dot{\Theta}    (6)

where k_{vd-rot} is the viscous friction coefficient.

As for the translational case, it is useful to control the behavior of the camera separately for each of its
principal axes, and a different value of the friction parameters will be specified for each one of the local
axes.

3. THE PARAMETERS
The behavior of the virtual camera in response to user actions is completely specified by the parameters
of the physical model of the camera. A different value of the friction parameters can be specified for every
dimension in the reference frame attached to the camera. By this means we provide an easy and intuitive
way for controlling the behavior of the model separately for each of the six degrees of freedom of the
camera and several interesting camera motions or "camera metaphors" can be specified.

3.1 Mass and Inertia Tensor


The camera mass parameter determines the degree of acceleration control by the user. If all other
parameters are set to zero, the camera maintains a steady velocity and changes its velocity only in response
to user input, resulting in a pure acceleration control metaphor. Higher mass results in smoother, more
continuous motion and a higher degree of acceleration control, although it also makes it more difficult to
bring the camera to rest at a given location. A high mass parameter is useful in situations where a smooth
camera motion is desired, or where continuous motion is wanted with minimal input from the user, for
example, tasks such as surveying a large scene or moving along a straight path.

In an analogous way, the inertia tensor determines the degree of torque control by the user over the
camera orientation. A large inertia tensor results in smooth panning and tilting motions. This is often
desirable because jerky camera rotation can be disorienting. Without a correspondingly high rotational
friction parameter, however, it can be difficult to stop the camera from rotating. This is usually much more
disorienting than the analogous translational situation. High inertia tensors are useful for the same sorts of
tasks that high mass parameters are useful for: slow steady examination of a large scene. For more precise
control up close to objects, a lower inertia tensor is preferable.

3.2 Viscous Friction Coefficient


The viscous friction parameter specifies the degree of velocity control. If the other parameters are set to
zero, the camera metaphor becomes a pure velocity control one. Velocity control is useful for tasks where
quick stopping and changes of direction are necessary, such as avoiding obstacles or inspecting objects up
close. A typical application where a high viscous friction coefficient would be useful is a three-dimensional
modeler. One problem with high viscous friction camera metaphors is that the camera motions are not
usually very smooth and the user must give continuous input while the camera is moving. For many
interactive tasks, this is not a problem. For other tasks, for example an architecture walk-through
application, a suitable combination of velocity and acceleration metaphors can be formed by adjusting
various amounts of the mass and viscous friction parameters.

The rotational analog to the viscous friction coefficient is the rotational viscous friction coefficient,
which determines the amount of angular velocity control. Angular velocity control is particularly useful
because of the need to stop camera rotation quickly. The balance between the moment of inertia and the
rotational viscous friction coefficient determines the smoothness of camera panning motions, and many
real-world camera mounts have adjustable angular viscous friction controls.

3.3 Threshold Force and Velocity


A small static friction parameter establishes a threshold force below which the camera will stay
stationary. A small threshold velocity provides a braking force that brings the camera to rest more rapidly
than the viscous friction forces. This can be useful for tasks in which the user wants to alternate between
motion along different degrees of freedom. For example, moving the camera along only one axis at a time
is easy as long as the threshold force is not exceeded in the other axes. Likewise, a task that involves
hopping to a fixed location, looking around in different directions without moving, then hopping to the next
location, is much easier with a small amount of static friction.

By setting the static friction parameter extremely high, it's possible to, in effect, lock a particular degree
of freedom, resulting in a constrained motion. For example, by locking one of the rotational degrees of
freedom, the camera is forced to always maintain the same "up" direction, and by locking one of the
translational degrees of freedom, the camera is forced to move in a plane. This could be useful in an
architectural walk-through application to simulate a walking person's point of view. Locking the vertical
and horizontal translational degrees of freedom results in a flying-vehicle camera metaphor.

3.4 Static Behavior: Damping Factor and Spring Constant


The spring constant and damping factor parameters control the vibrational behavior of the camera mount
when it is in the static state. A small amount of damped vibration smooths out the jerkiness in the transition
between dynamic and static states. It also provides a small degree of position control feedback while the
camera is in the static state. In this way, a small applied force will move the camera slightly, but it will pop
back to its rest position. A larger force, above the static friction threshold, will set the camera in motion.
This allows the user to get an idea of what direction an applied force will act in before actually moving the
camera's position. If the static friction parameter is set extremely high, then the camera becomes locked in
the static state and a position control camera metaphor results. Finally, if the damping factor is set low, a
genuinely bouncy camera motion can be created. This can be used to simulate a hand-held camera motion.

4. CONTINUOUS VARIATION BETWEEN DIFFERENT BEHAVIORS


It is generally accepted that no one particular type of camera motion or camera control is appropriate for
all tasks (Ware and Osborne 1990). For example, in a scene editor application, the user might require
acceleration control for moving rapidly and smoothly as he surveys the overall organization of the scene,
while the task of inspecting and moving an individual object in detail, where the camera viewpoint is close
to the object, might require more precise velocity control.

Unfortunately, genuinely useful tasks often require a combination of metaphors or a sequence of
alternating metaphors. A common way to deal with this is to allow the user to continually change camera
control metaphors by swapping through the different interaction modes. Modes can be changed by selecting
menus, striking keys or mouse buttons. There are two problems with this technique. First, it is obviously
inconvenient and unnatural to continually change modes. The user has to stop performing his task to swap
computer and mental modes. After a while, this can be distracting and will often inhibit the user from
changing metaphors until the current one becomes really impracticable. Secondly, it is not necessarily true
that there are only a finite number of discrete useful metaphors. There are, in fact, many in-between
situations where no pure metaphor suits the task best (Mackinlay et al 1990).

The parameterized physical camera model provides a solution to this problem by giving us a way to
control the camera behavior through its parameter values. For a given task, the user can experiment with
the camera metaphor and tune the parameters interactively, through valuators, until a subjectively "best" set
of parameters is found. These parameters can then be saved and restored, either automatically or by the
user, whenever that particular task is encountered. Alternately, the user can interactively control certain
camera parameters while he is performing his task. For example, a mouse button or foot pedal can be set to
momentarily increase the amount of viscous friction, acting as a kind of brake.

Ideally, however, the user should only be required to use a minimum of input and should not have to be
overly aware of the camera metaphor at all. This can be achieved by having the application adjust the
camera control parameters algorithmically as a function of position or some aspect of the task at hand. This
is potentially the most powerful use of the parametric camera model.

For example, we have experimented with creating a scalar viscosity field within a scene such that the
camera viscous friction parameter increases in the vicinity of objects. This results in a camera metaphor that
continuously varies from mainly acceleration control when the camera is far away from objects, to mainly
velocity control when the camera is close to an object. In this way, the camera's behavior varies as a
function of its distance from objects.
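
One possible form of such a field, sketched in C under an assumed linear falloff law (our assumption, not the paper's exact field):

```c
/* Sketch of a scalar viscosity field: viscous friction grows as the
 * camera nears an object, blending acceleration control (far away)
 * into velocity control (up close).  The linear falloff is an
 * assumption; any monotone law would do. */
double viscosity_at(double dist, double k_far, double k_near, double radius)
{
    double t;
    if (dist >= radius)
        return k_far;                    /* far from all objects */
    t = dist / radius;                   /* 0 at contact, 1 at radius */
    return k_near + (k_far - k_near) * t;
}
```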

Fig 2: Exploring a scene



Fig. 3: Inspecting an object

5. USE OF 3D DEVICES
Our dynamic camera control system is implemented on a Silicon Graphics Iris workstation in C using an
object-oriented style of programming, on top of the Fifth Dimension 3D interaction toolkit (Turner et al
1990). In addition to the typical sorts of 3D classes, such as lights, hierarchical models and cameras, the
toolkit abstracts every input device as an instance of an input device class. These input device objects
communicate their data through a uniform event message protocol.

[Figure 4 diagram: raw input from the Spaceball and the Polhemus digitizer flows through virtual device
objects to the camera controller, which also receives ticks from a clock object and drives the physical
model and the virtual camera transformation.]

Fig. 4: Event communication diagram for camera controller



The interface between the Input Device objects and the Camera object is implemented by a Camera
Controller object. This object receives events from input devices, interprets the data according to a
particular camera metaphor, and updates the camera object's position and orientation accordingly. Since
our Camera Controller involves a dynamic simulation, it also receives tick events from a Clock object at
regular time-step intervals.

One of the advantages of this kind of software architecture is that various other devices, such as the
Polhemus 3D, can be interchanged with the Spaceball as input to the Camera Controller. The use of such
devices is essential for providing an intuitive way of controlling the dynamic camera. Obviously,
pressure-sensitive input devices are usually more appropriate because they provide a passive form of
"force-feedback". In our case, the device that gave the best results is the Spaceball. Also, different types of
Camera Controllers with different behaviors can be swapped in and out, and the same controllers can be
used to control Light objects or hierarchical models.

6. CONCLUSIONS AND FUTURE WORK

We believe that the physically-based camera control model provides a powerful, general-purpose
metaphor for controlling virtual cameras in interactive 3D environments. Because it is based on a real
camera model, it is natural for the user to control. Its parameters are physically-based and, therefore, easy
to understand and intuitive for the user to manipulate. Its generality and control parameters make it
configurable to emulate a continuum of camera behaviors ranging from pure position control to pure
acceleration control. As it is fully described by its physical parameters, it is possible to construct more
sophisticated virtual camera control metaphors by varying the parameters as a function of space, time,
application data or other user input. Also, when used with force-calibrated input devices, the camera
metaphor can be reproduced exactly on different hardware and software platforms, providing a predictable
standard interactive "feel".

We are currently working on extending our model to specify camera paths for computer animation.
Currently, the interactive camera metaphor permits the generation of position, orientation and acceleration
information simultaneously along a path. For animation purposes, it is possible to record this camera
motion and then edit it interactively by replaying and selectively re-recording parts of the input data. For
example, on the first pass we might be happy with the recorded position path, but not with the camera
orientation. Therefore, on the second pass we can re-record only the orientation information.

We are also continuing to explore different kinds of algorithmic control of the camera parameters by
creating various types of parameter fields within the scene space, and we are planning to use the dynamic
model approach for other kinds of interactive tasks such as modeling, assembling, and specifying the
animation of different kinds of objects.

ACKNOWLEDGEMENTS

The authors are grateful to Angelo Mangili for his technical help.
The research was partly sponsored by "Le Fonds National pour la Recherche Scientifique".

REFERENCES

Baum D.R., Winget J.M. (1990) Real Time Radiosity Through Parallel Processing and Hardware
Acceleration, Proceedings 1990 Workshop on Interactive 3D Graphics ACM: 67-75
Brooks F.P. Jr (1986) Walkthrough - A Dynamic Graphics System for Simulating Virtual Buildings
Proceedings 1986 Workshop on Interactive 3D Graphics ACM : 9-22
Feynman R.P., Leighton R.B., Sands M. (1963) The Feynman Lectures on Physics Addison-Wesley
Reading, Massachusetts.
Hahn J.K. (1988) Realistic Animation of Rigid Bodies Computer Graphics 22(4) : 299-308
Mackinlay J.D., Card S.K., Robertson G. (1990) Rapid Controlled Movement Through a Virtual 3D
Workspace, Computer Graphics 24(4): 171-176
Magnenat-Thalmann N., Thalmann D. (1986) Special Cinematographic Effects Using Multiple Virtual
Movie Cameras, IEEE Computer Graphics & Applications 6(4): 43-50
Shinagawa Y., Kunii T.L., Nomura Y., Okuno T., Young Y. (1990) Automating View Function
Generation for Walkthrough Animation Using a Reeb Graph, Proceedings Computer Animation 90
Springer, Tokyo: 227-238
143

Terzopoulos D., Platt J. (1989) Physically-Based Modeling: Past, Present and Future SIGGRAPH 1989
Panel Proceedings ACM: 191-209
Turner R., Gobbetti E., Balaguer F., Mangili A., Thalmann D., Magnenat-Thalmann N. (1990) An Object
Oriented Methodology Using Dynamic Variables for Animation and Scientific Visualization Proceedings
Computer Graphics International '90 Springer-Verlag: 317-328
Ware C., Osborne S. (1990) Exploration and Virtual Camera Control in Virtual Three Dimensional
Environments Proceedings 1990 Workshop on Interactive 3D Graphics ACM: 175-183
Watson V. (1989) A Breakthrough for Experiencing and Understanding Simulated Physics ACM
SIGGRAPH Course Notes on State of the Art in Data Visualization, IV-26 - IV-32

APPENDICES

A. MOMENT OF INERTIA TENSOR

The rotational equivalent of the mass is the moment of inertia tensor I. It can be represented by a three-
dimensional symmetric matrix whose elements are given by (Feynman, 1963):

I_{xx} = \int (y^2 + z^2) \, dm
I_{yy} = \int (x^2 + z^2) \, dm
I_{zz} = \int (x^2 + y^2) \, dm
I_{xy} = I_{yx} = -\int xy \, dm
I_{yz} = I_{zy} = -\int yz \, dm
I_{xz} = I_{zx} = -\int xz \, dm    (A1)

The diagonal elements of this matrix represent the moment of inertia of the body, and the non-diagonal
elements represent the products of inertia of the axes. When the reference frame where the moment of
inertia is specified coincides with the principal inertia frame, all the products of inertia are null, and I
becomes a diagonal matrix. In such conditions the equation of rotational motion (number (4) in the text)
simplifies to the well known Euler equations (Feynman, 1963):

I_{xx} \ddot{\theta}_x + (I_{zz} - I_{yy}) \dot{\theta}_z \dot{\theta}_y = T_x
I_{yy} \ddot{\theta}_y + (I_{xx} - I_{zz}) \dot{\theta}_x \dot{\theta}_z = T_y
I_{zz} \ddot{\theta}_z + (I_{yy} - I_{xx}) \dot{\theta}_y \dot{\theta}_x = T_z    (A2)

We approximate the camera model with a homogeneous rectangular box with dimensions d_x, d_y, and d_z
to obtain the values of the moments of inertia:

I_{xx} = \frac{m}{12}(d_y^2 + d_z^2), \quad I_{yy} = \frac{m}{12}(d_x^2 + d_z^2), \quad I_{zz} = \frac{m}{12}(d_x^2 + d_y^2)    (A3)

If we consider that the box has nearly the same dimension on the three axes, then equation (A2) can be
linearized.
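
For completeness, a direct C transcription of the reconstructed (A3), a standard rigid-body result:

```c
/* Principal moments of inertia of a homogeneous box of mass m and
 * dimensions dx, dy, dz, as in equation (A3). */
void box_inertia(double m, double dx, double dy, double dz, double I[3])
{
    I[0] = m * (dy * dy + dz * dz) / 12.0;   /* Ixx */
    I[1] = m * (dx * dx + dz * dz) / 12.0;   /* Iyy */
    I[2] = m * (dx * dx + dy * dy) / 12.0;   /* Izz */
}
```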

B. NUMERICAL INTEGRATION OF THE EQUATIONS OF MOTION


To simulate the behavior of our virtual camera, the equations describing its motion in response to the
external driving forces and torques have to be integrated through time. If we project these equations on the
local axes, we obtain six ordinary scalar differential equations of the second order, under the assumption
that equations A2 can be linearized by considering their second term piecewise constant. These equations
are of the form:

a\ddot{x} + b\dot{x} + cx + d = f(t)    (B1)

Although an analytical solution exists for many of the forms of this equation, the driving function in this
case is not analytic but rather strictly data-driven, being the instantaneous user force or torque input over
time. Therefore we must use a numerical method to find the solution.

To solve the equation, time is subdivided into equal time steps \Delta t. We use the second order accurate
approximations

\dot{x}(t) \approx \frac{x(t+\Delta t) - x(t-\Delta t)}{2\Delta t}, \qquad \ddot{x}(t) \approx \frac{x(t+\Delta t) - 2x(t) + x(t-\Delta t)}{\Delta t^2}    (B2)

that we substitute into equation (B1) to find the explicit integrator:

x(t+\Delta t) = \frac{f(t) - d - (c - 2a/\Delta t^2)\, x(t) + (b/(2\Delta t) - a/\Delta t^2)\, x(t-\Delta t)}{a/\Delta t^2 + b/(2\Delta t)}    (B3)

This explicit procedure evolves the dynamic solution from given initial conditions x_0 and x_{-1}. The
current and the previous value of x are used to solve for the value at a small time \Delta t later. No oversampling
is necessary, because the precision required is not high (the user interacts with the solver), and \Delta t
represents the time increment between displayed frames. Our current implementation allows us to have an
interactive display rate (more than 10 Hz) on a Silicon Graphics Iris 4D/80 with fully shaded scenes
containing up to two thousand polygons. The time spent for the dynamic computations is negligible
with respect to the redraw time.
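
A minimal C sketch of one step of this integrator, following directly from the reconstructed (B2) and (B3); the function name is ours:

```c
/* One step of the explicit integrator (B3): given x at times t and
 * t - dt and the driving value f(t), return x at t + dt. */
double integrate_b3(double x_cur, double x_prev, double f_t,
                    double a, double b, double c, double d, double dt)
{
    double A = a / (dt * dt);       /* coefficient from the 2nd difference */
    double B = b / (2.0 * dt);      /* coefficient from the 1st difference */
    return (f_t - d - (c - 2.0 * A) * x_cur + (B - A) * x_prev) / (A + B);
}
```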

Russell Turner is a researcher at the Computer Graphics Laboratory of
the Swiss Federal Institute of Technology in Lausanne, Switzerland. He
received his B.S. in Physics and his M.S. in Computer and Information
Science from the University of Massachusetts at Amherst. He has also
worked as a software engineer for V.I. Corporation of Amherst,
Massachusetts. His research interests include computer animation,
physical modeling, user-interfaces and object-oriented programming. He
is a member of IEEE and ACM.
E-mail: turner@elma.epfl.CH

Francis Balaguer is a researcher at the Computer Graphics Laboratory
of the Swiss Federal Institute of Technology in Lausanne, Switzerland.
He received his diplôme d'ingénieur informaticien from the Institut
National des Sciences Appliquées (INSA) in Lyon, France. His research
interests include 3D interaction, computer animation, user-interfaces, and
object-oriented programming.
E-mail: balaguer@ligsg2.epfl.CH

Enrico Gobbetti is a researcher at the Computer Graphics Laboratory of
the Swiss Federal Institute of Technology in Lausanne, Switzerland. He
received his diplôme d'ingénieur informaticien from the same institute. His
research interests include visualization, computer animation, human-
computer interaction and object-oriented programming.
E-mail: gobbetti@elma.epfl.CH

Daniel Thalmann is currently full Professor and Director of the
Computer Graphics Laboratory at the Swiss Federal Institute of
Technology in Lausanne, Switzerland. From 1977, he was Professor at
the University of Montreal and codirector of the MIRALab research
laboratory. He received his diploma in nuclear physics and Ph.D. in
Computer Science from the University of Geneva. He is coeditor-in-chief
of the Journal of Visualization and Computer Animation, member of the
editorial board of the Visual Computer, and cochairs the
EUROGRAPHICS Working Group on Computer Simulation and
Animation. Daniel Thalmann's research interests include 3D computer
animation, image synthesis, and scientific visualization. He has published
more than 100 papers in these areas and is coauthor of several books
including: Computer Animation: Theory and Practice and Image
Synthesis: Theory and Practice. He is also codirector of several computer-
generated films: Dream Flight, Eglantine, Rendez-vous à Montréal,
Galaxy Sweetheart, IAD, Flashback.
E-mail: thalmann@eldi.epfl.CH

The authors may be contacted at: Computer Graphics Laboratory
Swiss Federal Institute of Technology
CH-1015 Lausanne, Switzerland
Tel: 41.21.693.52.14
Fax: 41.21.693.39.09
Aspects of Motion Design for Physically-Based
Animation
David Haumann, Jakub Wejchert, Kavi Arya, Bob Bacon, Al Khorasani,
Alan Norton, and Paula Sweeney

Abstract
We explore ways in which physically-based simulation can generate the motion of objects
for computer animation. We model flexible and brittle objects and their interaction with
wind fields using classical mechanics. We have enhanced our animation environment with
a capability that allows quick visual preview of simulations. This is an essential tool for
prototyping motion and previewing scenes before they are converted into high quality images
with a ray tracer. Complex motion can be both specified and controlled by designing wind
fields and by making use of the preview capability. Examples are given of how fields and
preview are used to create animated scenes that involve hundreds of objects in wind fields.
Keywords: animation, physically-based modeling, flexible objects, vector fields, simulation,
dynamics.
CR Categories and Subject Descriptors: I.3.7-Three dimensional graphics and realism
(Animation); I.6.3-Simulation and modeling (Applications).

1 INTRODUCTION
Physically-based computer animation uses physical principles to describe objects and their
motion in a simulated world. The two main advantages of such an approach are realism and
automated motion generation. This latter advantage allows the motion of many interacting
objects to be generated by simulation as opposed to the traditional method of keyframing which
requires the animator to specify explicitly the object positions at each key frame. The current
research in physically-based modeling for animation explores ways to control the simulations
to produce the motion desired by the animator.
A good physically-based animation model should incorporate a wide range of objects and
motions, being general enough to simulate ordinary objects in typical everyday situations.
The models should give a plausible visible presentation of objects responding to forces in their
environment. Although physics is the basis of such models, physical accuracy is only measured
at a visual and phenomenological level. Furthermore, empirically defined physical laws may
be approximated for motion control or exaggerated for visual effects. Thus, the objective of


physically-based animation is to simulate a wide range of phenomena at a coarse level within
one environment, and also to incorporate user controls over such simulations.
A variety of methods have been used to simulate the motion of rigid bodies, flexible bodies
and fluid-like behaviors. Particle systems have been used to simulate fluid-like activity of fire
(Reeves 83), ocean foam (Fournier 86) and to model streams and fountains (Sims 90). Miller
(Miller 89) uses a "globular" particle based method to simulate fluid flow. The behavior of
collections of animate objects such as birds (Reynolds 87, Amkraut 89), individual motion
of worms (Miller 88), and articulated figures (Girard 86, Hahn 88, Wilhelms 87, and Isaacs
87) have all drawn upon physical models to some extent. Hahn (Hahn 88), Barzel (Barzel
88) and Witkin (Witkin 88) have all been concerned with realistic simulation and control of
rigid bodies. Platt (Platt 88) and Terzopoulos (Terzopoulos 87,88) have been concerned with
modelling and controlling the motion of flexible objects. We have attempted to incorporate
simulation of flexibility, fracture (Norton 90,91), and fluid effects into one general purpose
animation environment.

2 PIPELINE DESCRIPTION

[Figure 1 diagram: the pipeline stages modeling, simulation, rendering, and recording/editing,
with a previewing loop attached to the simulation stage.]

Figure 1: Animation Pipeline

Creating a physically-based animation requires several diverse stages: the creation of a story-
board, computer simulation, image rendering, video editing and the making of a sound track.
Typically these stages are interconnected and do not occur sequentially. Figure 1 shows a
diagram of the animation system pipeline we employ encompassing the stages of modelling,
simulation, previewing, rendering, and recording. The modeling phase assigns geometric and
physical properties (such as mass, stiffness and damping) to objects to be animated. The
models are input to the simulator along with initial and environmental conditions (such as
position, velocity, and wind fields). The simulator generates the motion of the objects over
time. The previewer is used to obtain fast visual playback from the simulation so that the
simulation may be modified. Once motion is acceptable, it is rendered using a ray-tracer and
recorded onto one-inch video.
As an example of the animation pipeline in practice, consider a scene that consists of a col-
lection of leaves blowing in the wind. A first step is to build a leaf shape out of masses and
springs. Then one has to design a set of wind fields, decide how many leaves to use and set their
initial positions. A test simulation can be run and the results quickly previewed. Changes to
some of the free parameters (such as wind velocities, position of wind fields, number of leaves)
can then be made. Finally, this cycle is repeated until a satisfactory motion results. Only
then is the polygonal description rendered with the ray-tracer and recorded. We restrict the
following discussion to the simulation and previewing stages.

3 SIMULATION

3.1 Flexible objects, breakage and wind fields


In our simulations, flexible and brittle objects and their interactions with wind fields have been
modeled using Newtonian mechanics. To model a flexible material a 3D mesh of interconnected
masses and springs is used. The masses accelerate according to the sum of internal and external
forces applied to them. The internal forces in the sheet are modeled by the stretching of the
springs. External forces acting on objects can include gravity, friction, or wind. This material
can be molded into many geometrical shapes such as cylinders or teapots. To simulate the
breaking or tearing of objects, a threshold is associated with every spring. A spring breaks
if it is stretched beyond a threshold, and on a macroscopic scale the breaking of many bonds
causes a fracture or tear.
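
The breakage rule can be sketched as follows; the strain-based form of the threshold test is our assumption about its exact formulation:

```c
/* Sketch of the breakage test: once a spring's length exceeds its
 * threshold it is marked broken and no longer exerts force.  The
 * threshold-as-stretch-factor formulation is an assumption. */
typedef struct {
    double rest_len;     /* natural length */
    double threshold;    /* maximum allowed stretch factor */
    int    broken;
} Spring;

void check_break(Spring *s, double current_len)
{
    if (!s->broken && current_len > s->threshold * s->rest_len)
        s->broken = 1;   /* many broken bonds -> macroscopic tear */
}
```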

"-...L/ "L/
/1"
@
/!"-...
SINK SOURCE

VORTEX

1 1 1 t t t
UNIFORM BOUNDED UNIFORM

Figure 2: Field Primitives

The introduction of general vector fields into the existing model allows us to simulate the
motion of objects in wind. We define a wind field as a function that maps position in space
to the velocity of the wind at that point. Given the complexity of solving the non-linear
Navier-Stokes equations we make several simplifying assumptions. First, we assume that the
wind fields are not affected by the objects that are placed in them. Thus, our simulations will
not exhibit wind "shadowing" effects where an upwind object shields objects downwind. In
addition, by assuming inviscid, incompressible, and irrotational steady flow, we can linearly
150

superimpose a set of simple analytic solutions of the Navier-Stokes equations. We use vortices,
sinks, sources, uniform and bounded uniform flows as our primitive wind fields (figure 2). A
bounded uniform field has a zero velocity in one half space and a non-zero velocity in the
other. By linear addition of these basic types, complicated flows can be easily constructed and
used to affect the motion of objects in an animated scene.
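A minimal C sketch of these primitives and their superposition follows; the falloff laws and the parameterisation (centre, direction, strength) are illustrative assumptions, not the exact analytic solutions used, and the bounded uniform primitive is sketched later in the surprise/fear scene example.

    #include <math.h>

    typedef struct { double x, y, z; } Vec3;
    typedef enum { UNIFORM, SOURCE, SINK, VORTEX } FieldType;
    typedef struct { FieldType type; Vec3 c; Vec3 dir; double strength; } Field;

    /* Wind velocity of one primitive at point p. */
    static Vec3 eval_field(const Field *f, Vec3 p)
    {
        Vec3 g = {0.0, 0.0, 0.0};
        double dx = p.x - f->c.x, dy = p.y - f->c.y, dz = p.z - f->c.z;
        double r2   = dx*dx + dy*dy + dz*dz + 1e-9;   /* avoid singularity */
        double r3   = r2 * sqrt(r2);
        double rho2 = dx*dx + dy*dy + 1e-9;           /* distance to axis  */
        switch (f->type) {
        case UNIFORM:                     /* constant velocity everywhere  */
            g = f->dir; break;
        case SOURCE:                      /* radial outflow from centre c  */
            g.x = f->strength * dx / r3;
            g.y = f->strength * dy / r3;
            g.z = f->strength * dz / r3; break;
        case SINK:                        /* radial inflow toward centre c */
            g.x = -f->strength * dx / r3;
            g.y = -f->strength * dy / r3;
            g.z = -f->strength * dz / r3; break;
        case VORTEX:                      /* swirl about a vertical axis   */
            g.x = -f->strength * dy / rho2;
            g.y =  f->strength * dx / rho2; break;
        }
        return g;
    }

    /* Linear superposition: the wind at p is the sum of all primitives. */
    Vec3 wind_at(const Field *fld, int n, Vec3 p)
    {
        Vec3 g = {0.0, 0.0, 0.0};
        for (int i = 0; i < n; i++) {
            Vec3 gi = eval_field(&fld[i], p);
            g.x += gi.x; g.y += gi.y; g.z += gi.z;
        }
        return g;
    }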
In summary, our simulation consists of masses, springs, and wind fields, all obeying classical
mechanics. These basic building blocks can be assembled together to create a sophisticated
animation environment.

3.2 Dynamics
The evolution of the system through time is carried out by integrating Newton's second law F = ma. At each time step Δt, and for each mass point m_i, the total force F_i acting on the point is computed. This determines the acceleration a_i of that mass point and, using a first order difference approximation (Euler's method), the extrapolated velocity v_i and position r_i at time t + Δt are given as follows:

    a_i = F_i / m_i,                     (1)

    v_i(t + Δt) = v_i(t) + a_i Δt,       (2)

    r_i(t + Δt) = r_i(t) + v_i(t) Δt.    (3)

Once all the positions have been updated the same cycle is repeated for the next time step.
A full simulation repeats this procedure for N time steps. As fields are the latest addition
to our simulation environment, we discuss these in more detail here. A description of other
contributions to the sum of forces, such as springs and gravity, may be found in (Norton 90).
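One step of this integration might look as follows in C (using the Mass structure from the earlier spring sketch); note that equation (3) uses the velocity from time t, so positions are advanced before velocities.

    /* One explicit Euler step (equations 1-3) over all mass points; the
       force accumulators f[] are assumed filled for this time step. */
    void euler_step(Mass *m, int n, double dt)
    {
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < 3; k++) {
                double a = m[i].f[k] / m[i].mass;  /* (1) a_i = F_i / m_i */
                m[i].x[k] += m[i].v[k] * dt;       /* (3) uses v_i(t)     */
                m[i].v[k] += a * dt;               /* (2)                 */
                m[i].f[k] = 0.0;                   /* clear for next step */
            }
        }
    }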

3.3 Field-Object Interaction


The use of fields for motion control is not new to computer animation. Karl Sims has used
velocity and acceleration operators to control the motion of particles (Sims 90). Terzopoulos
suggests using potential energy fields to prevent interpenetration of flexible models (Terzopou-
los 87). In the film Eurhythmy (Amkraut 89), force fields are used to direct the motion of a
flock of birds and to simulate collision avoidance behavior between individual birds. Pintado
(1989) uses non-physical fields to interactively control the motion of objects. Our approach is an extension to the method presented in (Haumann 88) and differs from those described
above by having fields interact with the surfaces of flexible objects to simulate the effects of
wind.
We have experimented with two different types of fields: force fields and velocity fields. Force fields apply forces directly to the mass points in an object. These may be used to simulate magnetic or electrostatic effects. Velocity fields are used to represent fluid velocities for simulating the effects of wind on either particles or surfaces. We have concentrated our efforts on velocity fields wherein the magnitude of the force on a particle is related to the difference between the particle velocity and the field velocity. Given a velocity field G(x, y, z), the relative velocity of the mass particle with respect to the field velocity is:

    v_i^r = G_i - v_i                    (4)

where G_i refers to the field at the x, y, z position of the particle, and v_i is the particle velocity. The force of the wind acting on that particle is then:

    F_i = α v_i^r                        (5)

where α is a chosen constant relating force to relative velocity.

[Diagram omitted: resolution of the wind force F on a triangular surface, shown edge on.]

Figure 3: Wind Force Resolution Diagram

Representing an object as a single mass particle will not result in any rotational effects due to the wind. To achieve this effect, a more sophisticated (and computationally more expensive) surface model is used. Wind forces acting on a surface depend on the surface area and the orientation of a triangular surface with respect to the relative velocity (see figure 3). Given a mass particle whose position defines one corner of a triangular surface of area A, we resolve the relative velocity of the particle into the normal and tangential components with respect to the surface. Thus:

    v_i^r = v_i^n + v_i^t                (6)

where v_i^n is the normal component and v_i^t is the tangential component. We use these to compute normal and tangential forces as follows:

    F_i^n = α_n A v_i^n                  (7)

    F_i^t = α_t A v_i^t                  (8)

F_i^n is the force experienced by a surface facing into the wind, while F_i^t is due to the viscous effects of fluid flowing across the surface; A is the area of the triangular element. Normally, α_n is chosen to be much larger than α_t because surfaces facing into the wind experience much larger forces than surfaces parallel to the wind. Note that a particle in a triangulated mesh usually forms a corner of several adjacent triangles, and hence will receive force contributions from each.
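The per-corner force computation of equations (4)-(8) can be sketched in C as follows (Vec3 and Mass as in the earlier sketches); the triangle's unit normal and area are assumed to be computed elsewhere, and the function is called once for each triangle adjacent to the particle.

    /* Wind force on one corner particle of a triangle of area A, given the
       field velocity g at the particle and the triangle's unit normal. */
    void wind_force(Mass *p, Vec3 g, Vec3 normal, double area,
                    double alpha_n, double alpha_t)  /* alpha_n >> alpha_t */
    {
        double vr[3] = { g.x - p->v[0], g.y - p->v[1], g.z - p->v[2] };  /* (4) */
        double nn[3] = { normal.x, normal.y, normal.z };
        double vn_mag = vr[0]*nn[0] + vr[1]*nn[1] + vr[2]*nn[2];
        for (int k = 0; k < 3; k++) {
            double vn = vn_mag * nn[k];        /* normal component     (6) */
            double vt = vr[k] - vn;            /* tangential component (6) */
            p->f[k] += alpha_n * area * vn     /* F^n = a_n A v^n      (7) */
                     + alpha_t * area * vt;    /* F^t = a_t A v^t      (8) */
        }
    }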

4 PREVIEW
Interactive previewing of simulation output is an important capability that was added to
the simulation pipeline. Circumventing rendering and recording saves time, and allows for
a greater level of interactivity between the user and the simulation output. The previewing
program allows the motion produced by the simulator to be animated at acceptable playback
speeds, and the viewpoint can be interactively changed so that the entire space of simulation
data can be explored in detail.
The previewer was developed on a Silicon Graphics 4D/20G workstation which has graphics hardware capable of rendering several thousand polygons per second. By keeping the entire animation sequence resident in memory and limiting the scene complexity to a few hundred triangles per frame, we achieve an acceptable five frames per second playback speed. The
amount of physical memory limits the length of an animation sequence: typically between one
and two seconds of animation can be stored per megabyte of memory.
The previewer was first used to make visual verifications that the prototype field models were
working correctly. Subsequently, the previewer was used to help design complex wind fields
which were made up of linear combinations of the field primitives. Since the shape of an object
affects its motion, part of the motion design process included changing object geometry. The
previewer quickly allowed us to relate changes in geometry to changes in motion.
Once final simulation data was produced, it was merged with static (non-simulated) back-
ground objects for final rendering. The previewer was used to ensure correct alignment between
the dynamic simulation data and the static background, and for choosing viewpoint positions
for the final animation. Viewpoints were chosen, in part, to convey the most information
about the motion.

5 EXAMPLES OF MOTION DESIGN


5.1 Designing Geometry for Motion
We designed our animation to show leaves blowing in the wind. An initial prototype leaf was made with seven triangles (in the final animations these were covered with texture maps). We used the simulator experimentally to refine this prototype structure into a leaf that exhibited realistic yet interesting and controllable motion. The initial leaf prototype was duplicated 100 times with slight random variations in geometry, topology, and physical properties such as mass distribution and stiffness. Tests were performed on this varied collection by dropping it in a zero velocity uniform field. The resulting motion was examined with the previewer and leaves with desirable motion properties were identified for use in successive tests. Those leaves which exhibited a lilting, fluttering motion (i.e., appeared aerodynamically unstable) and fell at an average speed of 1 meter per second were selected for use in the final simulations.

5.2 Designing Fields for a Specific Scene


The animated story for this project was concerned with leaves being chased about a playground
by a yard bin. In one scene it was required to have the leaves rise up in surprise from a playful
hovering configuration, and then fall "dead" to the ground. This sequence is depicted in
figures 5-7. The hovering configuration (figure 5) was achieved by using a uniform field acting
vertically to exactly cancel the normal rate of fall of the leaves under gravity. This field
was linearly combined with a weak "volumetric" sink field to keep the leaves hovering within
a desired volume. This "volumetric" sink is constructed from six bounded uniform fields,
forming the sides of a confining cube.
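A possible construction of this "volumetric" sink, reusing the Vec3 type of the earlier field sketch, is shown below; the cube placement and parameter names are illustrative.

    /* A bounded uniform field: zero velocity in one half-space, constant
       velocity in the other (half-space given by plane point q, normal n). */
    typedef struct { Vec3 q, n, vel; } BoundedUniform;

    Vec3 bounded_uniform(const BoundedUniform *b, Vec3 p)
    {
        double side = (p.x - b->q.x) * b->n.x + (p.y - b->q.y) * b->n.y
                    + (p.z - b->q.z) * b->n.z;
        Vec3 zero = {0.0, 0.0, 0.0};
        return (side > 0.0) ? b->vel : zero;   /* blows on one side only */
    }

    /* "Volumetric" sink: six bounded uniform fields forming the faces of a
       confining cube (half-width h, centred at the origin), each blowing
       inward with speed s outside its face.  Inside the cube all six
       contribute zero, so hovering leaves are undisturbed until they
       drift past a face. */
    void make_volumetric_sink(BoundedUniform f[6], double h, double s)
    {
        static const double ax[3][3] = {{1,0,0},{0,1,0},{0,0,1}};
        for (int a = 0; a < 3; a++)
            for (int sgn = -1; sgn <= 1; sgn += 2) {
                BoundedUniform *b = &f[a * 2 + (sgn + 1) / 2];
                b->q   = (Vec3){  sgn*h*ax[a][0],  sgn*h*ax[a][1],  sgn*h*ax[a][2] };
                b->n   = (Vec3){  sgn*ax[a][0],    sgn*ax[a][1],    sgn*ax[a][2]   };
                b->vel = (Vec3){ -s*sgn*ax[a][0], -s*sgn*ax[a][1], -s*sgn*ax[a][2] };
            }
    }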

[Diagram omitted: hovering leaves held by a UNIFORM field; two BOUNDED UNIFORM fields, one with the lower half blowing up and one with the lower half blowing down, whose boundaries travel upward through the leaves.]

Figure 4: Field Combination Used in Surprise/Fear Scene

The surprise/fear sequence was designed to appear as a "shockwave" moving rapidly upward
through the hovering leaves. This effect was achieved by successively moving the boundaries
of two uniform fields up through the collection of leaves (see figure 4). The first field had a
strong upward component causing the leaves to rise momentarily and to collect near its upper
boundary. The second field contained a strong downward component designed to pull the
leaves rapidly down to the ground. Figure 6 shows the collecting effect caused by the first
field's boundary as well as a few leaves that have begun to fall as the result of the second.
The combination of these fields causes successive layers of leaves to first rear up, then hurtle
towards the ground. Once the leaves have all landed on the ground, a third uniform horizontal
field causes the leaves to be blown towards the viewer. Figure 7 shows these leaves rolling
along the ground (to escape the pursuit of the yard bin in the background).

Figure 5: An upward vertical field keeps leaves hovering; leaves are not flat but slightly
distorted, giving rise to their interesting fluttering motion.

Figure 6: Successive uniform bounded fields moving up through the leaves give the appearance of a "shock wave" moving up through the leaves, spreading them out and then pulling them to the ground.

Figure 7: Leaves tumbling along the ground (towards viewer) to escape the pursuing garbage bin (background). Leaves are blown by a horizontal uniform field.

As mentioned, it is through the linear combination of fields that complex motion paths can be set up for multiple objects. In the following example the addition of a vortex, sink and uniform flow was used to create a scene with a garbage bin inhaling some freely floating
leaves. Figures 8, 9, and 10 show snapshots from this sequence. In the first shot (figure 8) the
leaves have just begun to be inhaled by the garbage bin. For dramatic effect they were made
to swirl around before being funneled in and down (figure 9). At the end of the scene the
vortex is so strong that the main component of motion is downwards into the open mouth of
the can (figure 10). The previewer was an essential ingredient in setting up the above motion,
because the adjustment of the field strengths affects the overall movement of the leaves.

5.3 Technical Details

The graphics lab contains many interconnected components operating under a UNIX/AIX standard. Modelling, simulation and rendering were done on IBM 3084, 3090 and the new RS/6000 workstations. Rendering was carried out on a network of RS/6000s so as to reduce the computational bottleneck for this process. Rendering time was usually 1/2 hour per frame. Previewing was done on a Silicon Graphics 4D/20G. Recording was done via a Vista Frames frame buffer and Sony BVH 2500 1" recorder. A Matrix Instruments camera was used to make slides.

Figure 8: Leaves begin to be inhaled by the garbage bin. This motion was generated by the
addition of vortex, sink and uniform flows.

Figure 9: Inhalation proceeds; the swirling funnel shape caused by the vortex can be clearly
seen.

Figure 10: Inhalation finale: the leaves converge into the bin.

6 CONCLUSIONS

In this paper we have described ways in which physically-based animation can be used to generate the motion of objects for computer animation. We defined it as a discipline and differentiated it from other simulation approaches. We discussed the simulation stages involved in making animations and gave a pipeline description of modelling, simulation, previewing, rendering and recording. We described how masses, springs and wind fields, coupled with Newtonian mechanics, can be successfully used to simulate flexible objects, brittle objects and the interaction of these with wind forces.
Fields have proven to be an extremely useful way to control multiple objects in the physically-based paradigm. This, coupled with a simulation preview facility, enabled us to design the motion of hundreds of leaves blowing in the wind for scenes in the animation "Leaf Magic" (Norton 90). Stages in this work included prototyping and simulation verification using the preview capability in an iterative fashion.
Given the practical success of using fields we are confident that there are many more physically-based methods that could be integrated into our existing simulations, such as the description and behavior of fluids. We also believe that more tools such as the previewer would further enhance our animation environment. These tools should be workstation based, designed to alleviate the rendering and recording bottleneck. Investigations into parallelizing and distributing the ray tracing processes are obvious candidates. Other enhancements include field visualization and interactive field design, as suggested by the interesting work of Pintado (1989).

7 ACKNOWLEDGEMENTS
We wish to thank all the members of the Animation Systems Group, especially Jane Jung, without whom this work would not have been possible. Thanks are also due to Tim Kay, Greg Turk, John Snyder, John Hart and Mike Henderson for software support and consulting.

8 REFERENCES
Amkraut S, (1989) "Flock: A Behavioral Model for Computer Animation", Masters Thesis, Art Education, The Ohio State University.
Amkraut S, Girard M, (1989) "EURHYTHMY", Siggraph Video Review, Issue 52 (SIGGRAPH 89 Film and Video Show), selection 8. (Video supplement to Computer Graphics).
Barzel R, Barr A, (1988) "A Modeling System Based on Dynamic Constraints", Computer Graphics (SIGGRAPH 88 Proceedings) 22 (4) 179.
Fournier A, Reeves W, (1986) "A Simple Model of Ocean Waves", Computer Graphics (SIGGRAPH 86 Proceedings) 20 (4) 75.
Girard M, Maciejewski A, (1985) "Computational Modeling for the Computer Animation of Legged Figures", Computer Graphics (SIGGRAPH 85 Proceedings) 19 (3) 263.
Hahn J, (1988) "Realistic Animation of Rigid Bodies", Computer Graphics (SIGGRAPH 88 Proceedings) 22 (4) 299.
Haumann D, Parent R, (1988) "The Behavioral Test-Bed: Obtaining Complex Behavior from Simple Rules", The Visual Computer 4 (6) 332.
Isaacs P, Cohen M, (1987) "Controlling Dynamic Simulation with Kinematic Constraints, Behavior Functions and Inverse Dynamics", Computer Graphics (SIGGRAPH 87 Proceedings) 21 (4) 215.
Miller G, (1988) "The Motion Dynamics of Snakes and Worms", Computer Graphics (SIGGRAPH 88 Proceedings) 22 (4) 169.
Miller G, Pearce A, (1989) "Globular Dynamics: A Connected Particle System for Animating Viscous Fluids", Siggraph 89 Course Notes, Topics in Physically-Based Modeling.
Norton A, et al., (1990) "Leaf Magic", Computer Generated Film, IBM Research, Yorktown, N.Y.
Norton A, Turk G, Bacon R, (1991) "Animation and Fracture by Physical Modeling", The Visual Computer (to appear). See also RC 15371 (#68412) 1/11/90 (IBM Computer Science Research Report).
Pintado X, Fiume E, (1989) "Grafields: Field-Directed Dynamic Splines for Interactive Motion Control", Computers and Graphics 13 (1) 77.
Platt J, Barr A, (1988) "Constraint Methods for Flexible Models", Computer Graphics (SIGGRAPH 88 Proceedings) 22 (4) 279.
Reeves W, (1983) "Particle Systems - A Technique for Modeling a Class of Fuzzy Objects", Computer Graphics (SIGGRAPH 83 Proceedings) 17 (3) 359.
Reynolds C, (1987) "Flocks, Herds, and Schools: A Distributed Behavioral Model", Computer Graphics (SIGGRAPH 87 Proceedings) 21 (4) 25.
Sims K, (1990) "Particle Animation and Rendering Using Data Parallel Computation", Computer Graphics (SIGGRAPH 90 Proceedings) 24 (4) 405.
Terzopoulos D, Platt J, Barr A, Fleischer K, (1987) "Elastically Deformable Models", Computer Graphics (SIGGRAPH 87 Proceedings) 21 (4) 205.
Terzopoulos D, Fleischer K, (1988) "Deformable Models", The Visual Computer 4 306.
Witkin A, Kass M, (1988) "Spacetime Constraints", Computer Graphics (SIGGRAPH 88 Proceedings) 22 (4) 159.
Wilhelms J, (1987) "Using Dynamic Analysis for Realistic Animation of Articulated Bodies", IEEE Computer Graphics and Applications 7 (6) 12.

The Authors

The Authors from left to right: (standing) Bob Bacon, Alan Norton, David Haumann, Paula Sweeney, Kavi Arya. (seated) Jakub Wejchert, Al Khorasani.
David Haumann is currently a Research Staff Member at IBM T. J. Watson Research Center in Yorktown Heights, N.Y. He received his Ph.D. in Computer Science at The Ohio State University in 1989, and his BS in Applied Mathematics from Brown University in 1977. His experience in computer graphics spans the fields of radiation treatment planning, flight simulation and commercial computer animation production. His research interests include computer graphics, animation, and physically-based modeling. He co-produced the award winning short films "Dynamic Simulations of Flexible Objects" and "Balloon Guy", and contributed to several others, including "Bragger Boppin' in Bean Town", "Broken Heart" and "Dirty Power". David is a member of Phi Kappa Phi Honor Society, ACM (SIGGRAPH) and IEEE. David can be reached at: IBM T. J. Watson Research Center, Yorktown Heights, New York 10598. (Email: haumann@ibm.com)

Jakub Wejchert is currently a Research Fellow working with the European Visualization
Group at the IBM Scientific Centre, Winchester, England. He obtained his BA in theoretical
physics from Trinity College, Dublin, Ireland (1983). He then went to University College
Dublin to do an MSc (by research) in simulational physics. He then returned to Trinity to do
his PhD in simulational physics, which he received in 1988. He has worked at Centro Comune di
Ricerca, Italy (86-88) and at the Computer Sciences department, IBM T. J. Watson Research
Center, New York (88-90). Dr. Wejchert's current interests are scientific visualization and
computer animation.
Kavi Arya joined IBM in a post-doctoral position after receiving his MS and PhD from
Oxford in 1988 and his BS from the Imperial College in London in 1983. His main interest is
the use of functional programming techniques together with formal methods to make tools for
designing graphics and animation systems. He has applied this to rapid prototyping of scenes
and animated sequences. Kavi's other interests include AI and parallel processing techniques
in graphics.
Bob Bacon only recently discovered that he has been working on visualization throughout
most of his professional career :) Following graduate studies at the University of Chicago, Mr.
Bacon began working with computers in such diverse endeavors as process control, machine
tool control, computer typesetting, image synthesis, and computer generated animation. He
has been a frequent contributor to animation exhibits since 1985. He is a member of IEEE
Computer Society and SID.
Al Khorasani has been a programmer at IBM for the past six years. He received his BS in
computer science from Pace University, and has worked on various projects ranging from office
automation to graphics and visualization systems. Prior to joining research, he was employed
at IBM Poughkeepsie doing hardware simulation and design.
Alan Norton was born in Salt Lake City, UT on August 20, 1947. He received the BA degree
from the University of Utah in 1968, and the PhD from Princeton University in 1976, both in
mathematics. He was an instructor at the University of Utah, 1976-79, and Assistant Professor at
Hamilton College 1979-80, before joining IBM Research. He first worked with B. Mandelbrot,
developing algorithms for generating fractals and making images of them. Then, from 1982 to
1987 he worked on the RP3 project, doing research on parallel algorithms, architectures and
performance analysis. Currently, he manages the project in computer animation and rendering.
His research interests include computer graphics and animation, parallel architectures, and
fractals. He is a member of the IEEE computer society, ACM and the American Math Society.
Paula Sweeney has been working at IBM since 1984. She has worked on the design and
implementation of operating systems and animation systems. Currently her interest is in
realistic animation using physics and control theory. Paula has a BA in Mathematics from
Manhattanville College and an MS in computer science from New York University.
Chapter 3
Parallel Processing
Terrain Perspectives on a Massively Parallel
SIMD Computer
Guy Vezina and Philip K. Robertson

Abstract

Massively parallel single instruction multiple data (SIMD) algorithms for rendering
perspective views of terrain surfaces, and their implementations on a 1024-processor
MasPar MP-1 computer, are presented. The algorithms generate views of regularly
gridded digital terrain models, and more generally height field surfaces, at rates
approaching that required for interaction. Algorithms for rendering and perspective
projection of single, multiple and intersecting surfaces are introduced and incorporated
within a broader SIMD parallel approach based on separable operations. Correctness of
resampling filters for interpolation is also addressed. Integration with graphically
modelled objects using depth buffers and visibility maps is demonstrated. The approach
is applicable to height fields, empirical digital terrain models or range-finder data.
Examples showing the perspective visualisation of multiple and intersecting surfaces
from physically-based empirical and fractal surface modelling are given.

Keywords: SIMD parallel algorithm, interactive visualisation, perspective viewing, image processing, computer graphics.

1. INTRODUCTION

Digital terrain models (DTM) can comprise very large amounts of data. Rendered
perspective viewing of terrain surfaces forms a compact representation that allows
intuitive interpretation in analysis and planning. For effective appreciation of data
characteristics, interactive manipulation of terrain surface views is also desirable.
Surface representations also arise in many other fields, such as stochastic or physically-
based scientific data modelling and visualisation, and may in addition involve varying
data sets as a computational simulation or model progresses. In each case, there is a
demand for fast generation of rendered surface views in perspective or orthogonal
representation, ideally at interaction rates, of data sets that can be as large as
100 MBytes. This paper addresses this demand with a massively parallel single
instruction multiple data (SIMD) image-space algorithm and its implementation.


Small (screen) sized surface perspectives can be generated at interaction rates using
surface tessellation (polygonisation) and pipelined transformation hardware found in
many graphics workstations. Problems arise, however, with very large data sets and
with artifacts introduced in tessellation, particularly if interaction with the surface
model generation parameters is required. Pixel-based, or image-space, algorithms need
not suffer these constraints.

Two basic operations underlie image-space approaches: the first is data processing,
comprising projection, resampling and associated filtering; the second is data handling,
comprising data formatting and access according to algorithm and data-dependent
requirements.

Surface views can be generated using forward or reverse projection techniques, as shown in Fig. 1. In forward projection the mapping takes points in the input image (2.5D height field) and projects them into the 2D output view plane. In reverse
projection every point in the output view plane is determined by following rays into the
input domain, traversing all required objects or elements to test for intersection. For
large data sets, access to elements can dominate processing times for determining
visibility using this approach (Oubayah 86; Robertson 87). Resampling techniques,
usually based on interpolation, are in general required in any geometrical
transformation of data. Reverse projection allows correct resampling and associated
filtering within the limits of physical constraints; forward projection, on the other hand,
poses interpolation problems that may require approximations. This problem is
described in greater detail in Section 2.

[Diagram omitted: forward projection maps the 2.5D height field to the 2D view plane; reverse projection follows rays from the view plane back into the input domain.]

Fig. 1 Forward and reverse projection approaches.

Efficiency in data handling with large terrain or surface models relies heavily on two
factors: regularising data access requirements, and reducing to predictable levels the
data-value dependence of access requirements. Achieving the latter factor substantially
eases the requirements for providing the former. Data access has long been a bottleneck
in image processing and viewing algorithms, and early image display hardware design
has reflected this problem. Graphics workstation architectures still have internal
bandwidth constraints despite their provision of many geometrically inspired capabilities local to the display processor.

The massively parallel approach described in this paper follows the principles of
regularisation of data access and localisation of data-value dependent computation; these
are increasingly common in generic approaches to parallelising complex problems. This
is to guarantee that the computation and memory complexity scale gracefully, if not
linearly, with data size. The limits to linearity are imposed only by processor
parallelism limits, and not by data dependent factors.

The core algorithms on which the perspective parallelisation is based are described by
Robertson (87; 89a; 89b) for uniprocessor implementation. These algorithms exploit
scan-line parallelism, which is a broad class that has seen a number of applications and
designs for real-time systems (Fisher 85; Evemy 89; Schmitt 88). Section 2 summarises
the pertinent aspects of this algorithm, discusses its potential parallelisations, and in
particular describes the scan-line approach chosen for this implementation. It also
discusses a SIMD massive parallelisation of rendering to complement these algorithms.
Section 3 treats the modifications to the algorithm necessary for a massively parallel 2D
mesh-connected SIMD computer, the MasPar MP-1 (described in Section 3).

Extensions to the basic algorithms are described and demonstrated, including the
generation of multiple stacked or intersecting surfaces, the incorporation of
transparency in the rendering, and the integration of this system with graphically
modelled objects to allow interactive placement of objects on terrain surfaces.

An analysis of performance and complexity issues is given in Section 4, together with a discussion of architectural constraints both specific to the MasPar MP-1, and general to the class of algorithms. The approach taken to maintain generality in the implementation across this class of architecture is also discussed.

2. CORE ALGORITHMS FOR VIEW GENERATION

View generation involves two basic operations: projection according to an appropriate viewing transformation and visibility determination. Rendering may be incorporated at several stages.

2.1 Projection

Computation of viewing projections depends on the chosen viewing paradigm, such as planar or spherical perspective, orthogonal or oblique parallel, or other less regular projections. Projection computation can be performed independently for each point in the input surface height field, although only a subset of the projected points will be visible in the output view. Projection is hence a highly parallel operation, to the order of granularity of the resolution in the scene. In this paper the most demanding case of planar perspective projection is treated; other viewing projections are simplifications of this.

In general, projection computation is performed as a forward projection operation: that is, points in the input 3D (often termed 2.5D due to a height field being single-valued at any point in the 2D spatial field) data set are projected to the output 2D view plane. This is because of the non-uniqueness of points in the output view plane; a given output point may be generated by more than one input point. Unfortunately forward projection poses interpolation problems that are not necessarily solvable, and may require approximation that can introduce artifacts. For example, holes in the output view plane may arise if no input points are mapped to specific output points, with insufficient information to fill them with anything other than a reasonable approximation in the interpolation process. Systems involving physically-based approximations can provide robust solutions.

The process of reverse projection, on the other hand, is potentially artifact-free. If the
reverse projection can be formulated correctly, then correct resampling in the output
domain can also be performed. If the view geometry is regularised such that the output
resampling domains have a constant Nyquist limit then the filtering required to prevent
aliasing is known (Fraser 87).

It is thus desirable to segment the view generation process to limit the domain of
application of forward projection to one in which physical constraints can be applied to
minimise artifacts. As will be seen later, this domain is chosen to also correspond to the
domains for visibility determination. For all other components of the view generation,
reverse projection may be performed.

2.2 Visibility Determination

Visibility is determined by the geometry of the view, the data values, and the rendering
constraints such as opacity or transparency of scene components. Figure 2 shows the
approach taken to localising the domains for visibility computation. Localised visibility
domains are determined by planes perpendicular to the image base plane, and containing
the viewpoint.

".-+ Plane of sight


".-+a.------]r----i
Vie~oint or
light source

Fig. 2 Localisation of domains for visibility computation.



In each case, localisation of the visibility domains in a symmetrical manner guarantees well-balanced parallelism to the order of approximately the horizontal resolution in the view plane. Affine spatial transformations of the data are necessary to localise the visibility domains to any chosen format. Full details of these transformations for the case of scan-line localisation are given by Robertson (87), as are details of the exact level of parallelism in relation to the view resolution. It should be noted, however, that for particular processor configurations or data views, scan-line localisation may not be optimal for computation.

A variation of this approach, using planes co-linear along an axis perpendicular to the
base plane and passing through its centre, can be more efficient for spherical perspective
projections or for varying views computed at the same resolution in the data domain
(Miller 86). It may, however, pose more stringent anti-aliasing filter requirements.

2.3 Choice of Intermediate Domains for Processing

The chosen approach causes the local domains required for projection components,
visibility, and data access regularisation to be identical for each stage of the process.
This involves segmentation of the steps to a level of fragmentation required at any stage
for each of the conditions prescribed in Sections 2.1 and 2.2 to be satisfied. The result is
that some operations may be less efficient independently than they might otherwise have
been, but that effectively the "lowest common denominator" domains guarantee that no
operation can introduce severe penalties that dominate the results, either in data-
dependent time complexity or in the potential introduction of artifacts.

This provides a robust and potentially efficient algorithm. The real benefits, however,
arise from ensuring that the operations performed within the localised domains at any
stage are identical for each domain, and that the number of domains is high compared
with the number of distinct operation steps. These conditions provide for effective
SIMD parallelism, or if the operations are not identical but are predictable in scale (and
largely independent of data values), MIMD parallelism. Flexibility in the exact
allocation of domains provides flexibility in the mode, granularity and implementation
of parallelism.

This paper describes a massively parallel SIMD implementation of domains that are
scan-lines of spatially 2D raster images, the first and last of which are respectively the
input and output data sets, with intermediate images representing transformations
necessary to localise domains as required. Figure 3 shows an overview of the stages in
the algorithm. At intermediate stages useful data sets, such as depth buffers or visibility
maps, are available. The exact algorithms for perspective viewing and shadowing are
described by Robertson (87; 89a).

In brief, the image is first rotated within its 2D base plane to align the view direction
with one of the rectangular axes of the data storage and processing array geometry.
This rotation is performed using scan-line techniques; Fraser (79; 85) and Catmull and
Smith (80) describe essentially the same approach to rotation, while Paeth (86) and
Tanaka et al. (86) describe an alternative approach. Both techniques are explored for their efficiency on a massively parallel SIMD machine in later sections. Following alignment of the view geometry, the image is spatially "squeezed" to localise visibility,
and then projected along the orthogonal direction to determine visibility (either as a
mask, or destructively on non-visible regions). Horizontal projection and
"unsqueezing" are then performed, and if shadow masking, rotation back to original
orientation. Variations on this algorithm are possible, but the above published form was
used for the SIMD implementation. Between stages transposition can provide efficient
row/column data access changes (a transposition involves exchanging the two axes).
The transposition algorithm itself is also SIMD parallel.

[Diagram omitted: five-step sequential pipeline. Step 1: rotate to aligned domains; Step 2: squeeze to localise visibility domains; Step 3: vertical projection, visibility determined; Step 4: horizontal projection; Step 5: de-aligned domains, yielding the perspective view and the shadowed image or visibility map for the viewpoint or light source.]

Fig. 3 Stages in sequential implementation of single surface perspective viewing and shadowing.

Scan-lines are not the only domains possible; line-segments, or tiles of varying aspect,
can regularise processing and viewing operations in different ways, and in fact have
been used with the rationale behind their choice determined on a specific application
basis, rather than on a generalised basis, in many viewing algorithms in the past. Most of
those previous approaches are not SIMD parallel on a domain basis.

This paper explores the scope for SIMD parallel implementation based on scan-line or
scan-line-segment domains, and shows its scope for significant extensions to the core
algorithms within the same parallelism paradigm.

2.4 Rendering

Several approaches to rendering may be taken. The most straightforward is to pre-render the surface view and carry the height field and colour or texture map information throughout the process. This is generally satisfactory for terrain views, but does not allow for more sophisticated scene rendering involving ray-tracing or lighting in shadowed regions.

Pre-projection rendering can also be performed in several passes using shadow maps,
generated by the visibility techniques, as reference masks. Alternatively rendering can
be performed simultaneously with projection. Full treatment of rendering options, and
comparative performance figures under various approaches, are outside the scope of
this paper. Gradient (and hence normal) determination, however, is straightforward to
parallelise in either scan-line or rectangular tile domains. The simple pre-projection
rendering is used in this work because of its suitability to large terrain surfaces. For
most such applications efficiencies gained by simplified lighting models, and not having
to recompute the rendering for different views, offset possible advantages of more
sophisticated rendering.

3. MASSIVELY PARALLEL APPROACH AND IMPLEMENTATION

To achieve interactive viewing rates a significant speed-up of 2 to 3 orders of magnitude over uniprocessor implementation is necessary. This suggests massive parallelism. Although MIMD arrays could provide this speed-up, the scope for massive parallelism and data regularisation, and the SIMD nature of the problem, suggest SIMD approaches.

The machine chosen for SIMD implementation was a 1024 processor MasPar MP-1 2D array processor (Blank 90). This section provides first an overview of the salient characteristics of the architecture, highlighting those that are unique to the MP-1 and those found in other similar architectures. The implementation of the algorithm on the MP-1 is then described, and followed by extensions for additional graphics functionalities.

3.1 MasPar MP-1 Architecture Model

The MP-1 machine contains a 2D grid of processing elements (PEs) for parallel computation and a sequential Harvard-style controller machine, the Array Control Unit (ACU), for sequential processing and control. The PEs are organized as a grid of between 32x32 and 128x128 elements and the architecture is capable of supporting up to 256x256 elements. Each PE is connected with its 8 nearest-neighbours. The MP-1 PEs are based on a RISC-style load-store architecture which has a significant advantage for image processing in its capability for indirect addressing.

Communication between the ACU and the PEs is bidirectional, involving data
broadcasts from the ACU to the PEs, and data reduction operations from the PEs to the
ACU. Communication between PEs (or inter-PE communication) can be performed in
any of three ways. The easiest (and slowest) is simply to move one value at a time from
a PE into the ACU and into another PE. This method does not take advantage of
parallelism. For regular data movement, the 'Xnet' allows communication between
processors in any of eight compass directions with movement time proportional to movement distance (i.e. number of PE to PE steps). The edges of the processor array
are connected toroidally, providing wrap-around. As a result, in an n x n array the
maximum distance between any two processors is n / 2 links. A third method for
communication, the router, allows for arbitrary inter-PE connection. This flexibility
comes at a price; the router is slower than the Xnet, consequently it is normally used for
problems with unpredictable data movement or problems requiring data transfers
across long distances.

Each processing element has fast access to its local memory in which the data to be
processed can be stored. This memory can be accessed in two ways: 'direct' addressing
is used when each processor wants to examine the same location in its local memory, and
'indirect' addressing used when each processor requires data from a different part of its
local memory (access time for indirect addressing is three times slower than direct
addressing access time). Indirect addressing provides for efficient data-value-
dependent access, and other index-based operations.

3.2 Implementation

An interactive surface rendering pipeline is first presented, showing the integration of the different algorithm components as modules. The chosen data mapping onto the array is described, and module implementation details are given.

Interactive Surface Rendering Pipeline

The full rendering and viewing pipeline used for the parallel implementation is
composed of several modules, as shown in Figs 4 and 5. The ordering of these modules
influences the interaction rates that can be achieved under manipulation of different
pipeline parameters. For example, for azimuth changes in viewing, rotation need not be
recomputed. By keeping strategic copies of images across the pipeline better
interactivity can also be achieved. The structure of the pipeline is similar to that of the
sequential implementation although it becomes apparent from the timing tests that
dominating constraints occur at different stages, pointing to different rationales
affecting pipeline implementation.

The surface data set comprises several identically sized regularly gridded 2D arrays of
values, or images. The first image in a set carries the surface height information over a
regular 2D rectangular grid. The other images in the set represent colours or textures
to be mapped onto the surface.

Multiple surfaces are most easily integrated using a z-buffer approach, as shown in Fig.
5. For applications that involve intersection-dependent surface modification, they can
also be handled in a composite projection approach. Architecture constraints such as
local memory size can also make it more efficient to integrate at the projection stage for
large data sets.

[Diagram omitted: four-step pipeline from height field & texture map (Step 1) to shaded surface (Step 2), rotated surface (Step 3) and perspective view (Step 4).]

Fig. 4 Rendering and viewing pipeline.

Rendering, rotation, perspective viewing and multiple surface view composition thus
form the main modules. The parameters associated with the pipeline define the viewing
geometry, the light source position, the height exaggeration of surfaces, base-plane
height offsets, and surface opacity.

[Diagram omitted: surfaces #1 to #n each pass through the perspective viewing stage, producing a perspective view and z-buffer; these, together with parametrically defined objects (with their own z-buffers), enter a multiple surface composition step (z-buffer comparison) that produces the composed perspective view.]

Fig. 5 Multiple surface viewing pipeline.



Data mapping

The data mapping is a 1D hierarchical mapping. Each PE memory stores a row or column of the image as shown in Fig. 6. As described in Section 3.1, the MP-1 is a 2D mesh of PEs. In the chosen mapping scheme, this mesh is viewed as a 1D array of PEs indexed from 0-1023. Each PE index corresponds to either an image row or an image column index. The data type corresponding to this structure in MPL (MasPar's parallel extended C-language) is a plural array; the plural adjective means that the variable allocation is replicated on each individual PE. The algorithms described above allow most of the processing to be performed within a row or column of the image (scan-line processing), and thus within each individual PE memory. Inter-PE communication is needed only for image transposition, the process of exchanging image rows for columns. Thus in fact the 2D connectivity of the array is only used for the transposition stages.

Performing most of the processing within individual PE memory takes advantage of the
fast PE-memory data access rate (in comparison to inter-PE data access rates).
Communication required for the transposition stage is implemented using the MP-1
Xnet and indirect addressing.

[Diagram omitted: a 4x4 PE array shown under row mapping (each PE holds one image row) and column mapping (each PE holds one image column).]

Fig. 6 Mapping the image scan-lines (rows or columns) to the 2D PE array. A 4x4 PE array is used to illustrate the mapping.

Transposition

Because of the chosen scan-line based mapping, transposition is a critical part of the
pipeline. Several approaches to transposition are possible using the Xnet together with
indirect addressing. Full discussion of parallel transposition approaches is beyond the
scope of this paper, but the straightforward approach implemented in this work is as
follows.

The image A has elements A(i,j) ∈ A such that i, j are the row and column indices. Row
mapping corresponds to each element A(i,j) residing in PEj at memory address i.
Transformation to column mapping means that each A(i,j) must be moved into PEi at
memory address j, as shown in Fig. 6.

Assuming a square image of size N x N, there are N steps. At each step N elements (one
per PE) are moved simultaneously using the Xnet and dropped into PEs at their
respective destination addresses. Data movement is partially illustrated in Fig. 7. The
algorithm takes full advantage of the Xnet and the toroidal wrapping to maximize
performance.
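A serial C sketch of this N-step movement is given below (array sizes and names are illustrative); in[p][a] models address a of PE p's local memory, and the inner loop stands in for the N simultaneous Xnet sends of one step.

    #define N 1024   /* number of PEs = number of scan-lines (assumption) */

    /* Row mapping in:   A(i,j) sits in PE j at address i.
       Column mapping out: A(i,j) sits in PE i at address j.
       At step s, every PE p sends the element at address (p+s) mod N a
       uniform toroidal distance of s PEs, where it is stored at address p. */
    void transpose(const unsigned char in[N][N], unsigned char out[N][N])
    {
        for (int s = 0; s < N; s++)            /* N steps                  */
            for (int p = 0; p < N; p++)        /* simultaneous on the MP-1 */
                out[(p + s) % N][p] = in[p][(p + s) % N];
    }

Because every element of a step moves the same distance, each step is a single uniform Xnet shift, which is the regular communication pattern the algorithm is designed to exploit.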

Two algorithms have been implemented in this work: an optimized algorithm which
transposes 1k x 1k x 1byte data sets stored in a scan-line order, and a more general
algorithm which transposes arbitrarily sized data sets. The tradeoff is one of speed
versus memory space. Memory space becomes an important consideration when
processing multiple multi-channel surfaces; for practical reasons it can be important to
allow use of several different approaches to the transposition. Improved timings are
possible under various optimisation approaches but scope for optimisation depends on
the exact algorithm and the image size. Kuszmaul (90) discusses transposition
algorithms for the MP-l and provides some timing information; our implementation of
optimisation has reduced times to 40 msec. for a scan-line mapping of a lk x lk image
(Fletcher 90). The timings given in Table 1, however, are for a general approach that
does not incorporate full optimisations to allow comparisons between algorithms and
across other SIMD 2D array machines.

[Diagram omitted: data movement for one step of the transposition on the 2D PE mesh.]

Fig. 7 Data movement for transposition on a SIMD 2D array.

Table 1: Performance of transposition.

    transposition                              time (msec)
    ---------------------------------------------------------
    1k x 1k x 1byte (special case)             120
    512 x 512 x 1byte (general algorithm)      200

Rotation

As described earlier, two scan-line rotation algorithms have been considered: the Fraser / Catmull and Smith algorithm, and the Paeth / Tanaka et al. algorithm, both shown below for rotation by angle θ.

Fraser/ shear & scale transpose shear & scale transpose


Catmull and Smith
[COSe 0 ]
-sine 1 [ ~ 6] [ tane 0 ]
l/cose 1 [ ~ 6]
Paeth! shear transpose shear transpose shear
Tanaka et al.
1
[ _tan (S/2) ~ ] [ ~ 6] [ si~e ~ ] [ ~ 6][ -tan~s!2) ~ ]
The first uses 2 shear/scale operations and 2 transposes. Fraser (79) provides an
implementation where for rotation angles between 45 and 135 degrees, and 225 and 315
degrees, one transpose is not necessary, being replaced by a scan-line reverse operation.
The shear/scale operation relies on resampling, and thus interpolation. The second
algorithm uses 3 shears with transpositions between them, but avoids the need for
scaling.

If nearest-neighbour resampling is sufficiently accurate for the mapping, the Paeth / Tanaka algorithm may be faster because no interpolation is required. In the Fraser / Catmull and Smith algorithm some form of interpolation is required because of the scale operation. For other resampling schemes, however, this last algorithm is generally faster because it requires one fewer shear.
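The shear passes themselves are simple scan-line operations. A C sketch of one horizontal shear pass with nearest-neighbour resampling follows (shearing about the image centre is an assumption of the sketch); on the MP-1 each PE holds one scan-line, so the outer loop runs entirely in parallel.

    #include <math.h>
    #define N 512

    /* Shift scan-line y by the nearest integer to shear * (y - N/2). */
    void shear_rows(const unsigned char in[N][N], unsigned char out[N][N],
                    double shear)
    {
        for (int y = 0; y < N; y++) {          /* one PE per scan-line */
            int off = (int)floor(shear * (y - N / 2) + 0.5);
            for (int x = 0; x < N; x++) {
                int src = x - off;
                out[y][x] = (src >= 0 && src < N) ? in[y][src] : 0;
            }
        }
    }

A rotation by θ in the Paeth / Tanaka scheme is then shear_rows with -tan(θ/2), a transpose, shear_rows with sin θ, a transpose, and a final shear_rows with -tan(θ/2).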

Comparative timings based on nearest-neighbour resampling are shown in Table 2. Results are comparable for the two algorithms. Individual operation timings are also shown. As before, timings given are for a general approach that does not incorporate optimisations; speed-ups of about a factor of 3 have been achieved by careful optimization.

Surface Rendering

Full description of the rendering used in this system is beyond the scope of this paper,
but is given by Robertson (85), where height exaggeration and illumination angle trade-
offs are also discussed. Gradient approximations are often application dependent and
they may also be dependent on data quantisation levels. Filtering in gradient
determination is often ill-specified and can introduce or modify artifacts. The SIMD
scan-line shading algorithm used is very straightforward, being based on an
approximation to gradient using a pixel's 4 nearest neighbours, with passes in row and
column directions to derive the gradient.
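A C sketch of this shading pass is given below; the exact gradient filter and lighting model are given by Robertson (85), so the central differences and simple diffuse term here are only an illustration.

    #include <math.h>
    #define N 512

    /* Gradient from a pixel's 4 nearest neighbours (one pass per direction),
       shaded against a unit light vector (lx, ly, lz). */
    void shade(const float h[N][N], unsigned char out[N][N],
               double lx, double ly, double lz)
    {
        for (int i = 1; i < N - 1; i++)
            for (int j = 1; j < N - 1; j++) {
                double gx = 0.5 * (h[i][j + 1] - h[i][j - 1]);  /* row pass    */
                double gy = 0.5 * (h[i + 1][j] - h[i - 1][j]);  /* column pass */
                /* normal of the surface z = h(x, y) is (-gx, -gy, 1) */
                double d = (-gx * lx - gy * ly + lz)
                         / sqrt(gx * gx + gy * gy + 1.0);       /* N . L */
                out[i][j] = (unsigned char)(d > 0.0 ? 255.0 * d + 0.5 : 0.0);
            }
    }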

Table 2: Performance comparison of two rotation algorithms using nearest-neighbour resampling.

    transformation (512 x 512 x 1byte)         time (msec)
    ---------------------------------------------------------
    shear                                      11
    shear & scale                              40
    nearest-neighbour resampling               5.5
    rotation Fraser/Catmull                    320
    rotation Paeth/Tanaka                      270

Perspective viewing

The perspective viewing stages are described in Section 2. For a square image, the
perspective viewing algorithm is O(N) where N is the number of elements per line of
the image, for N less than or equal to the total number of PEs. When N is larger
virtualisation must be used; this requires looping over the virtual layers. Timings for
the viewing stage are shown in Table 3 for each stage of the algorithm. These timings
are for nearest-neighbour resampling. Note that timings are view-angle dependent
because for low view angles, later stages have fewer data values to process. A 40° view
elevation angle has been chosen because it represents an average load.

Table 3: Performance for the perspective viewing algorithm using nearest-neighbour resampling.

    perspective view (40°), 512 x 512 x 1byte  time (msec)
    ---------------------------------------------------------
    squeeze                                    33
    vertical projection                        462
    horizontal projection                      263
    transpose (2)                              240
    TOTAL                                      998

Optimization reduces the overall time to approximately 500 msec. Plate 1 shows two
perspective views of a rendered surface representation of a terrain model with viewing
elevation angles of 20° and 40° respectively. View (a) has twice the height exaggeration
of view (b).

Shadowing and Visibility Maps

Generating shadows in a view is a slight modification of the perspective algorithm, with the light source substituting for the viewpoint. A visibility map in the original data is
generated by exactly the same process. The vertical projection stage is used to mark the
visibility (to the light source or viewpoint respectively for the shadowing and visibility
map) of each data element, but the projection is not actually generated. Unsqueezing
aligns the shadow or visibility map with the height data. Shadow map generation times
are not significantly different from perspective view generation times apart from the
additional final rotation.

Plate 2(a) shows a perspective view (elevation 40°) of the terrain surface, with the
visibility map shown in 2(b). White regions in 2(b) indicate regions not visible in the
view of 2(a).

Multiple Surface Viewing

The combination of multiple surfaces can be achieved in several ways; the most efficient
depends on the size of the data set in relation to the array memory, and provides another
example of the trade-off between algorithm and data size. In this implementation
multiple surfaces are handled by comparing their z-buffers and compositing to produce
the resulting image. Note that while general and straightforward to implement, this is
not necessarily the best approach; this is because progression knowledge (the physical
information that under forward projection might allow correct interpolation of holes,
as discussed in Section 2) is not fully preserved.

The input to this multiple surface compositing can either be produced by the perspective
viewing stage or can be the rendered image and z-buffer from a geometrically modelled
object. Thus geometrically defined objects can be integrated with empirical height field
data. Table 4 gives performance timings for multiple surface compositing. As before,
the timings depend on the view angles and also depend approximately linearly on the
number of surfaces being composited.

Table 4: Performance for compositing 2 surface views for opaque and translucent surfaces.

    multiple surface compositing (512 x 512 x 1byte)   time (msec)
    ---------------------------------------------------------
    opaque                                             27
    translucent                                        315

Translucency

Surface opacity can be specified to allow for a simple model of translucent surfaces. Each surface carries a transmittance attribute between 0 (opaque) and 1 (transparent). Thus surfaces may be visible even if they are behind other surfaces. For two surfaces we have:

    I_res = tr_1 × I_1 + (1 - tr_1) × I_2

This simple model has been implemented on the MP-1. Translucent surface compositing performance is more expensive than opaque multiple surface determination because of the extra computation involved, as shown in Table 4.
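A per-pixel C sketch of the composition, covering both the opaque z-buffer comparison and this transmittance blend, is shown below; interpreting tr as the transmittance of whichever surface is nearer (0 = opaque, 1 = transparent) is one reading of the formula above.

    #define N 512

    /* Compose two projected surface views using their z-buffers.  With
       tr = 0 this reduces to plain opaque z-buffering. */
    void composite(const unsigned char i1[N][N], const float z1[N][N],
                   const unsigned char i2[N][N], const float z2[N][N],
                   double tr, unsigned char out[N][N])
    {
        for (int y = 0; y < N; y++)
            for (int x = 0; x < N; x++) {
                int front1 = z1[y][x] <= z2[y][x];   /* surface 1 nearer? */
                double f = front1 ? i1[y][x] : i2[y][x];
                double b = front1 ? i2[y][x] : i1[y][x];
                out[y][x] = (unsigned char)((1.0 - tr) * f + tr * b + 0.5);
            }
    }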

Plates 3 to 6 show examples of multiple and intersecting surfaces. 3(a) shows an empirically derived rendered terrain model representation (in colour) and its
intersection with a parametrically defined sinusoid (in grey-scale) with a view elevation
of 35°. The intersection map is shown in white broken line in 3(b) on the terrain
surface. Plate 4 shows the two surfaces of Plate 3, but with the sinusoid represented in
translucent form, with opacity factors of 0.5 and 0.75 respectively in 4(a) and 4(b).
With surface movements, and viewing at video rates, the original surface structures are
easily distinguishable. Adjustment of transmittance factors can be performed at close to
interaction rates as can be seen from Table 4.

Plate 5(a) shows a subsection of a fractal-generated surface used in terrain modelling, while 5(b) shows its intersection with the relevant section of the empirical terrain data;
again colour and grey-scale are used to distinguish the surfaces clearly in the static
views. This type of visualisation of the comparison between the fractal and empirical
models allows interactive fractal parameter modification to more closely model the
terrain. The fractal itself is also generated with a SIMD parallel algorithm.

Plate 6 shows the scope for multiple intersecting and stacked surfaces. In 6(b), the high
spatial frequency fractal surface (bottom) is used to add a high frequency component to
the empirical terrain model (middle) to generate a composite surface (top) as a basis for
small-scale simulation on the empirical model. The physical models involved lie outside
the scope of this paper.

4. ANALYSIS OF RESULTS

The timing results given must be considered preliminary at this stage. They represent
initial results from the first pass at understanding the dominant characteristics of the
data access versus computation performance trade-offs for a SIMD scan-line massively
parallel implementation. Optimisations have not been included in the timings because
they are less easily related to generic algorithm components. Specific performances can
be improved: for example, an optimal transposition can be performed in approximately
one third of the time given for the straightforward implementation, suggesting that it is
too early to be drawing conclusions on optimal overall performance.

Several initial conclusions can be drawn, however. The first is that for even a modestly sized massively parallel machine, performance on these algorithms is an order of magnitude better than achieved on a medium-powerful workstation. Table 5 gives an overall comparison of the MP-1 performance with that of an implementation of the same algorithm on a Sun SPARCstation 1. Exact timings depend on system status and connection loads. Note that performance scales linearly with the number of scan-lines in an image up to the number of processors. Thus a perspective of a 1k x 1k image takes twice as long (2 seconds) as a perspective of a 512 x 512 image (1 second).

This performance increase brings image viewing close to interactive rates. View generation for a 512 x 512 image takes a second or so, but data access also affects update rates. The MP-1 does not currently have a frame buffer but, using an X-window on a Sun SPARCstation with an ethernet link to the VAXstation host, overall perspective view update rates are of the order of 2 seconds. With the planned MP-1 frame buffer connected to the MPIOC high speed data channel, this additional overhead will be removed.

Table 5: Comparison of Sun SPARCstation 1 and MasPar MP-1 performance.

    single surface perspective (1k x 1k x 1byte)   time (sec)
    ---------------------------------------------------------
    SPARCstation 1 (single CPU)                    80
    MP-1 (1k PEs)                                  2

Increasing the size of the array beyond the number of scan-lines in the image does not
increase performance linearly with the simple scan-line mapping described in this
paper. For a small penalty in line binding in some stages of the algorithm, scan-line
segments can be used to achieve close to linear performance increase; other
fragmentations are possible but have not been fully explored at this stage.

For example, on a Sun SPARCstation 1 a perspective view of a full map sheet (8k x 8k)
DTM takes several hours (a minimum of 8 x 8 x 80 seconds = 1 hr 25 mins if the full image can be kept within memory, but longer if it cannot). On an 8k MP-1 the projected
time is 16 seconds (8 x 2 seconds). This assumes that the transposition algorithm scales
in a close to linear fashion; although the algorithm is not linear, use of the router allows
close to linear scaling. The relatively low cost of additional processors on a SIMD
machine puts this scale of facility to within production time requirements for a large
mapping bureau.
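To make the scaling arithmetic concrete, the following C sketch reproduces the projected
figures; the model (one pass per batch of scan-lines, render time proportional to image
width) and the 2-second baseline are taken from Table 5 and the text, and the function
name is ours.

    #include <math.h>
    #include <stdio.h>

    /* Projected MP-1 render time for an n x n perspective view, given the
       measured time t1k for a 1k x 1k image on 1k PEs.  Assumes the
       close-to-linear scaling discussed in the text; illustrative only. */
    double mp1_time(double n, double pes, double t1k)
    {
        double batches = ceil(n / pes);      /* scan-line batches per pass */
        return t1k * (n / 1024.0) * batches; /* width scales per-line work */
    }

    int main(void)
    {
        printf("8k x 8k on an 8k-PE MP-1: %.0f s\n",
               mp1_time(8192, 8192, 2.0));   /* 16 s, as projected above */
        printf("1k x 1k on a 1k-PE MP-1:  %.0f s\n",
               mp1_time(1024, 1024, 2.0));   /* 2 s, as in Table 5 */
        return 0;
    }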

Virtualisation must be used for images with maximum dimension larger than the
number of PEs. This is complicated by the image data format and the number of images
being handled in a view; the real constraint is the limit of local PE memory (currently
16kBytes per PE, although this will increase with greater DRAM density).

Nearest-neighbour resampling is simple and fast, but to avoid aliasing, improved
sampling schemes, such as linear, cubic convolution, or spline-based interpolation, must
be used. A fuller investigation of resampling alternatives, and their cost on the MP-1, is
currently being undertaken. Peak floating point performance of the MP-1 is just under
100 Mflops (MasPar rating: 94 MFlops, 32-bit) for a 1k PE configuration, while fixed
point (MasPar rating: 1875 Mips) performance is approximately 20 times this. On this
type of problem, close to peak rates can be achieved for local computation. Resampling
can be performed in fixed point, but the normalisation trade-offs are still under
investigation. It should be pointed out, however, that any form of interpolation
approximation may introduce artifacts, and a full analysis of the consequences of fixed
point arithmetic would be required.

Further consideration is also being given to the extent to which the same considerations
apply across other SIMD arrays; a Connection Machine (CM2) implementation is being
addressed. This is one reason for not emphasising specific optimisations at this stage.

5. SUMMARY AND CONCLUSIONS

A scalable SIMD parallelisation of algorithms for rendering and perspective viewing of
terrain surfaces or height fields has been described. This parallelisation preserves the
key aspects of the algorithm design: common localisation of domains for visibility and
processing, and regularisation of processing, data access and handling. These are also
key criteria for SIMD massive parallelism, and an implementation on the MasPar MP-1
computer is described.

Of particular interest is that the parallel implementation preserves the advantages that
the original algorithm offers for regularisation of the geometry to allow correct
resampling. There is thus no integrity penalty in the parallel implementation.

Exact matching of array size (number of PEs) to the number of scan-lines gives optimal
performance, but array sizes smaller or larger than the number of scan-lines can still be
exploited at some cost in complexity in scan-line segment binding.

Constraints arise when the scan-lines of data and associated structures become too large
to fit into local processor memory. Scan-line segments can again provide domain
localisations under these conditions, but further complications arise in virtualization and
data handling. With the current machine configuration, no fast data I/O is available, so
further inefficiencies arise for images which will not fit in local PE memory.
Otherwise scaling is graceful.

There is considerable scope for exploring alternative parallelisations that follow the
same basic principles of the algorithm but preserve, in a SIMD manner, segment or
region binding information. Except for transpositions, the described approach does not
exploit the 2D mesh-connections on the array, but rather treats it as an unfolded linear
array. Potential exists for exploiting the 2D connections for region binding and possibly
also the global communications mechanisms (the router on the MP-1).

Timing results support approximate projections of the algorithm performance based on
data movement and processor performance figures. Speed-ups can be obtained by using
fixed-point arithmetic, but the visual artifacts that might be introduced by aliasing in
this process have not been investigated, particularly for interactive viewing (see
(Amanatides and Mitchell 1990) for a discussion of temporal anti-aliasing filters).

The objective of this work was to investigate the scope of the surface perspective
viewing algorithm, and its extension to integrate graphics objects, on a massively
parallel SIMD computer. The results are highly encouraging, offering close to
interactive viewing of empirically-defined surface representations of digital terrain or
elevation models, and other geoscientific or more general surface representations of
images. Variations on the algorithm, resampling using different interpolation filters,
and more complex graphics integrations are being explored.

6. ACKNOWLEDGEMENTS

The authors would like to thank members of the Visualisation Project for their
comments on the paper and Peter Fletcher for generating the fractal surfaces.

REFERENCES

Amanatides, J and Mitchell, D. P. (1990) Antialiasing of Interlaced Video Animation.
Proc. Siggraph, Dallas, pp.77-85
Blank, T. (1990) The MasPar MP-1 Architecture. Proc. CompCon '90, pp.20-24
Catmull, E. and Smith, A.R. (1980) 3-D Transformations of images in scanline order.
Proc. Siggraph, pp.279-285
Dubayah, R.O. and Dozier, J. (1986) Orthographic Terrain Views Using Data Derived
from Digital Elevation Models. Photogr. Eng. and Rem. Sens. 52 (4), pp.509-518
Evemy, J.D., Allerton, D.J. and Zaluska, E.J. (1989) A Stream Processing Architecture
for Real-Time Implementation of Perspective Spatial Transformations. 3rd Int'l
Conf. on Image Proc. and its Appl., pp. 482-486
Fant, K.M. (1986) A Nonaliasing, Real-Time Spatial Transform Technique. IEEE
CG&A 6(1), pp. 71-80
Fisher, A.L. and Highnam, P.T. (1985) Real-time Image Processing on Scan Line Array
Processors. IEEE Comp. Soc. Workshop on CAPAIDM, pp.484-489
Fisher, A.L. and Highnam, P.T. (1988) Programming considerations in the design and
use of a SIMD image computer. Frontiers of Massively Parallel Computing
Fletcher, P.A. (1990) personal communication, CSIRO Division of Information
Technology, Canberra, Australia
Fraser, D. and O'Brien, E. (1979) Fast Image Rotation Techniques Using a Colour
Image Display. Proc. DECUS Australis Symp., pp.1601-1604
Fraser, D., Schowengerdt, R.A. and Briggs, I. (1985) Rectification of Multichannel
Images in Mass Storage Using Image Transposition. Computer Vision, Graphics,
and Image Proc. 29, pp. 23-36

Fraser, D. (1987) A Conceptual Image Intensity Surface and the Sampling Theorem.
Australian Comp. Jour. 19(3), pp. 119-125
Fraser, D. (1990) Comparison of Image Rectification Algorithms. Submitted to Comp.
Vis. and Image Proc.
Friedmann, D.E. (1981) Two-Dimensional Resampling of Line Scan Imagery by I-D
Processing. Photog. Eng. and Rem. Sens. 47(10), pp.1459-1467
Kuszmaul, C. L. (1990) Rapid Transpose Methods on Massively Parallel SIMD
Computers. MasPar Technical Report, MasPar, Sunnyvale, CA
McDonnell, M. (1990) Scan-line methods in spatial data systems. Proc. 4th Int'l Symp.
on Spatial Data Handling '90, pp.971-980
Miller, G.S.P. (1986) The Definition and Rendering of Terrain Maps. Proc.
SIGGRAPH, 20(4), pp.39-48
Paeth, Alan W. (1986) A Fast Algorithm for General Raster Rotation. Graphics
Interface '86, pp.77-81
Robertson, P.K. and O'Callaghan, J.F. (1985) The Application of Scene Synthesis
Techniques to the Display of Multi-Dimensional Image Data. ACM TOG 4(4),
pp.247-275
Robertson, P.K. (1987) Fast Perspective Views of Images Using One-Dimensional
Operations. IEEE CG&A 7(2), pp.47-56
Robertson, P.K. (1989a) Spatial Transformations for Rapid Scan-line Surface
Shadowing. IEEE CG&A 9(3)
Robertson, P.K. (1989b) Parallel Algorithms for Visualising Image Surfaces. 3rd Int'l
Conf. on Image Proc. and its Appl., pp.472-476
Schmitt, L. A., Wilson, S. S. (1988) The AIS-5000 Parallel Processor. IEEE Trans. on
PAMI 10(3), pp.320-330
Smith, A.R. (1987) Planar 2-Pass Texture Mapping and Warping. Proc. SIGGRAPH,
21(4), pp.263-272
Tanaka, A., Kameyama, M., Kazama, S., and Watanabe, O. (1986) A Rotation Method for
Raster Image Using Skew Transformation. Proc. IEEE Conf. on Comp. Vis. and
Patt. Recogn., June, pp.272-277
Wolberg, G. (1990) Digital Image Warping. IEEE Computer Society Press Monograph

Plate 1(a) Perspective view of a rendered surface representation of a digital terrain
model (empirically-derived) with viewing elevation angle of 20°.

Plate 1(b) As above with viewing elevation angle of 40° and with half the height
exaggeration of view (a).

Plate 2(a) A perspective view of the terrain model (elevation 40°).

Plate 2(b) Visibility map of (a). White regions in the visibility map indicate regions
not visible in the perspective view.

Plate 3(a) The intersection of a rendered terrain model representation (empirically-
derived) with a parametrically-defined sinusoid (in grey-scale).

Plate 3(b) The intersection between the two surfaces in (a) is shown in the terrain
representation with white broken lines.

Plate 4(a) The same two surfaces as in Plate 3 but with the sinusoid represented in
translucent form with opacity factor of 0.5.

Plate 4(b) As above with opacity factor 0.75.



Plate 5(a) Subsection of a fractal-generated surface used in terrain modelling.

Plate 5(b) Intersection of above surface with the relevant section of the measured
(empirical) terrain data. Colour and grey-scale are used to distinguish the
surfaces clearly.

Plate 6(a) The intersection of three surfaces (empirical, fractal and sinusoid).

Plate 6(b) A high spatial frequency fractal surface (bottom) adds a high frequency
component to the empirical terrain model (middle) to generate a composite
surface (top).

Guy Vezina was born in Quebec, Canada. He received the B.Sc. in
Electrical Engineering from Universite Laval, Quebec, in 1986,
and the M.S. in Electrical and Computer Engineering from the
University of Massachusetts, Amherst, in 1988. In 1988-1989 he
was a Visiting Scientist with the Division of Information
Technology of the Australian Commonwealth Scientific and
Industrial Research Organization (CSIRO). He is currently enrolled
in the Ph.D. program at the Australian National University under
CSIRO support. During the Fall of 1989, he was a Visiting Scholar
at Carnegie-Mellon University, Pittsburgh. His current research
interests are in image processing, visualisation of scientific data,
statistical analysis and parallel techniques and environments. He is a
member of IEEE.
Address: CSIRO Division of Information Technology, GPO Box
664, Canberra ACT 2601 Australia. e-mail: guy@csis.dit.csiro.au
fax: +61 6257 1052 tel: +61 62750911

Philip K. Robertson is a senior research scientist at the CSIRO
Division of Information Technology's Centre for Spatial
Information Systems, in Canberra, Australia. He leads the Centre's
Visualisation Group, which addresses graphics and image
processing techniques for interactive visualisation in complex
information systems. He lectures in Computer Graphics at the
Australian National University.
Robertson's research interests cover conceptual and computational
approaches to visualisation, including the development of
visualisation paradigms and methodologies, perceptual colour
interfaces and their application to complex data display, colour
device modelling and control, parallel and multidimensional
algorithms and architectures in graphics and image processing, and
the design of kernel image processing software.
Robertson holds B.Eng. and M.Sc. degrees in Electrical
Engineering, and a Ph.D. in Computer Science from the Australian
National University. He is a member of IEEE and ACM, and a
member of the editorial board of IEEE Computer Graphics and
Applications.
Address: as above. e-mail: robertson@csis.dit.csiro.au
Surface Tree Caching for Rendering Patches
in a Parallel Ray Tracing System
Wim Lamotte, Koen Elens, and Eddy Flerackers

Abstract

Many ray tracing systems lack support for curved surfaces, since calculating the
intersection between a ray and such a surface is computationally intensive. Several
speedup techniques have been presented already, but most of these methods do not use
coherence between neighbouring rays. In order to exploit this coherence in ray tracing
patches, we present a technique of caching subdivision trees, which we have tuned for our
particular implementation of a parallel ray tracer on a network of transputers.

Keywords

Ray tracing - Patches - Caching - Coherence - Newton's algorithm - Curved Surfaces -
Parallelism - Transputers

1. Introduction

Over the past years, computer graphics has focused on generating more realistic scenes.
There are two main aspects to this realism: a modeling aspect and a rendering aspect. Modeling
deals with the ways to represent the objects we want to show in our picture, whereas
rendering is concerned with how we can visualize these models on the computer screen. One has
to find a representation for the objects that best fits the rendering algorithm at hand. We have
chosen the ray tracing technique as our rendering algorithm. This implies that we have to
use an object representation that is well suited for ray tracing.


As we all know, simple objects like spheres, cylinders, cubes, etc. can be ray traced very easily,
since they can be defined by a relatively simple mathematical equation. Calculating the
intersection between one of these objects and a ray (which is, mathematically, a 3D line)
reduces to solving a system of two equations, which is easily done.
Nevertheless, when we want to create more complicated objects (cars, planes, character fonts,
...), these simple building blocks no longer suffice. It turns out that an object representation
based on higher order parametric surfaces ("patches"; see, e.g., [Farin 1988]) is a good candidate
for this kind of complex object. However, there are some well-known problems in ray tracing
parametrically defined surfaces. On the one hand, we have to deal with hard-to-control round-
off errors, while on the other hand there is the performance issue. With respect to both
difficulties, we present some appropriate solutions by utilizing a surface tree caching
methodology implemented on a parallel network of transputers.

In the following section, we give a quick review of previous work done in the field of rendering
curved surfaces. The third section describes the patch intersection algorithm that we started with.
Section four summarises a method of using ray coherence. In the fifth section, we describe the
parallel ray tracing system on which we implemented our algorithm. Section six investigates the
implications that this system has for the usefulness of the algorithms introduced in sections three
and four. Then we are ready to present our own tree caching algorithm in section seven. Section
eight compares the performance of the tree caching method with our implementation of Toth's
algorithm. Finally, we draw some conclusions from this work and show which research can
still be done within the proposed framework.

2. Rendering of Curved Objects

In polygonal scanline algorithms, one of the basic rendering operations is to clip a polygon
against a scanline on the computer screen, so the normal way of rendering objects in this kind of
algorithm is to approximate them by polygons. But this approximation has some consequences.
First of all, in order to produce a smooth surface for curved objects, the normals have to be
interpolated (Phong shading; see, e.g., [Hall 1989]). On the other hand, sufficiently small
polygons have to be used, because otherwise the silhouettes of curved objects will not be smooth
enough. The results from this kind of rendering algorithm are reasonably satisfactory, but the
degree of realism is not high enough yet: there are no reflections or refractions, and shadows are
rather hard to calculate (see, e.g., [Nishita et al. 1990]).

Ray tracing algorithms (for an introduction, see, e.g., [Glassner 1989]) meet this need for
realism. In these algorithms the basic rendering operation is the intersection between a ray and an
object in the scene, which in fact boils down to solving the system formed by the ray's equation
and the object's "equation(s)". But inherent to the higher degree of realism that can be gained here
are some requirements on the accuracy of the intersection calculation. If these requirements are not
met, the result will not be satisfactory. Therefore, it is absolutely necessary to calculate the
intersections as closely as possible to the actual intersection points. As a consequence, the
modeling phase has to be more accurate, too: if we start with a rough approximation to the objects
we want to render, the intersection algorithm can never be precise enough. This is especially true
for curved objects (with a more complex equation than, e.g., a sphere), which have to be
approximated as closely as possible. That is why one should not approximate these curved
objects with polygons, but with parametrically defined curved surfaces, e.g., Bezier patches (see,
e.g., [Farin 1988]). The problem with these patches, as with any third order function, however,
is to calculate the intersection, since the system of equations mentioned earlier cannot easily be
solved analytically; therefore we have to use an appropriate approximation method.

There are many possible ways to render these patches. In the most straightforward way, one
would subdivide patches in a preprocessing step into a fixed number of polygons, which are
then rendered. This approach of course has nasty consequences, similar to the implications
we mentioned earlier with respect to scanline rendering of curved objects approximated by
polygons (normals, silhouettes, ...), and these consequences are in conflict with the concept of ray
tracing. Another problem is the criterion for subdivision depth: shall we subdivide once,
twice, ten times? There is no single answer to this question that is suitable for all kinds of objects
we would like to render, since it depends on the scene and on how precise we want the
approximation to be. E.g., if we zoom into an object which is subdivided a fixed number of
times, the inaccuracies of the approximation can become more and more visible. Moreover,
relatively flat parts will be subdivided exactly as much as very curved parts are.

Adaptive subdivision into polygons ([Lane et al. 1980]) is a better method, since it overcomes
the depth problem of uniform subdivision: more subdivisions can be done wherever the
need arises, while rather flat parts will not be subdivided as deeply. The result of such an
algorithm can be seen in fig. 1. In this kind of technique, however, another artifact arises: if two
adjacent (sub)patches are subdivided to different depths (depending on the
flatness criterion), holes can appear (see, e.g., [Clark 1979]).

Fig. 1: Problems with uniform subdivision

Another kind of adaptive method numerically approximates (by Newton iteration) the actual
curved surface of an object ([Blinn 1978], [Whitted 1978], [Toth 1985]). There are two main
groups of Newton variants, depending on their convergence characteristics. Quadratic
convergence requires a matrix inversion for every step of the iteration. Linear convergence
requires only one matrix inversion at the beginning of the iteration, but in order to use this one
matrix for the whole iteration, the starting value has to be chosen close enough to the actual
intersection point (see [Toth 1985]).
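The distinction between the two variants can be sketched in a few lines of C. This is a
generic 2x2 Newton iteration with a caller-supplied function and Jacobian, not Toth's
actual ray/patch formulation (for which see [Toth 1985]); all names are ours.

    #include <math.h>

    typedef void (*Fn)(const double x[2], double out[2]);
    typedef void (*Jac)(const double x[2], double J[2][2]);

    /* Solve J d = r for d by Cramer's rule; returns 0 on a singular J. */
    static int solve2x2(const double J[2][2], const double r[2], double d[2])
    {
        double det = J[0][0]*J[1][1] - J[0][1]*J[1][0];
        if (fabs(det) < 1e-12) return 0;
        d[0] = ( J[1][1]*r[0] - J[0][1]*r[1]) / det;
        d[1] = (-J[1][0]*r[0] + J[0][0]*r[1]) / det;
        return 1;
    }

    /* quadratic != 0: refresh (and invert) the Jacobian at every step;
       quadratic == 0: invert once at the start and reuse it, which only
       converges (linearly) if the starting value is close enough. */
    int newton(Fn f, Jac jac, double x[2], int quadratic,
               int max_iter, double eps)
    {
        double J[2][2], r[2], d[2];
        jac(x, J);                                    /* initial Jacobian */
        for (int i = 0; i < max_iter; i++) {
            f(x, r);
            if (fabs(r[0]) + fabs(r[1]) < eps) return 1;  /* converged */
            if (quadratic) jac(x, J);                 /* per-step refresh */
            if (!solve2x2(J, r, d)) return 0;
            x[0] -= d[0];
            x[1] -= d[1];
        }
        return 0;
    }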

3. Toth's Algorithm

In our implementation, we chose the algorithm described in [Toth 1985]. For an in-depth
explanation, we refer to that paper, since delving into the mathematics needed for this algorithm
would lead us too far afield in this work. A brief summary can be found in [Lischinski and
Gonczarowski 1990]. The most important concept introduced in Toth's algorithm is the use of
the so-called Krawczyk operator (K-operator). The algorithm basically consists of two phases.
The first phase uses the K-operator to enable stepwise refinement of the search space, yielding a
region where Newton iteration is possible. The second phase is of course the actual Newton
iteration on this region.

When implementing and using this algorithm, however, we met some difficulties. Firstly, a
number of tolerances were needed to control the algorithm, e.g., a flatness criterion, a smallness
criterion, a tolerance to decide whether a point is inside a region, etc. The problem was that these
tolerances actually should be coupled with every object we wanted to render, because a set of
values that was well-defined for one object could yield nasty holes or impurities in other objects.
On the other hand, when the tolerances are chosen too strictly, the algorithm will be very slow,
since too much time is wasted on unnecessary calculations.

Other problems were described extensively in [Lischinski and Gonczarowski 1990], so we refer
readers to this work. To summarize, the main problem with Toth's algorithm comes down to
the inability to use coherence between neighbouring rays. Indeed, two rays that are fired off next
to each other will most probably hit the same object at almost the same intersection point. So,
the calculations needed for one ray will be almost the same as those needed for a neighbouring
ray. Therefore, it should be possible to reuse (at least some of) the calculations from one ray
to another. As Lischinski and Gonczarowski stated in their work, this is impossible in Toth's
algorithm, due to the use of the K-operator to reduce the search space. This is not a regular
reduction, since the operator depends on the surface as well as on the ray direction. So, two
neighbouring rays will produce different K-operators. As a consequence, the K-operator of one
ray is not useful for another ray. Lischinski and Gonczarowski [1990] give a solution to this
problem: they use a regular subdivision scheme together with a tree caching mechanism, similar
to the approach of [Rubin and Whitted 1980].

4. Tree Caching According to Lischinski and Gonczarowski

In their paper, Lischinski and Gonczarowski present a new method for caching the subdivisions
from one ray to another, allowing the algorithm to reuse these subdivisions. In order to
overcome the disadvantage of the ray-dependence of the K-operator in Toth's algorithm, they use
regular subdivision into four subpatches. This way, the search for an intersection can be stored
in a quadtree (called a "surface tree"). In each node, the parametric subregion and the bounding
box for the subsurface are stored.

Their general algorithm distinguishes two different cases. If a patch is searched for an
intersection for the first time, such a quadtree is constructed. Afterwards, the surface tree is
cached. When another ray has to be tested against the same surface, it is first intersected with the
cached tree. Only those leaves whose bounding boxes are intersected by the ray need to be
considered for further testing. Of course, one cannot create an unlimited number of trees: it is
not realistic to assume a cache large enough to store all such surface trees, so a method of
releasing occupied cache space is necessary. Lischinski and Gonczarowski use an LRU (least
recently used) algorithm, similar to algorithms used in virtual memory systems, where memory
pages have to be released in order to use them for another chunk of data. They chose to construct
a queue containing the roots of all cached trees, which is rearranged according to the usage of the
trees. Every time a tree is used in the searching scheme, it is moved to the end of the queue in
order to ensure that the least recently used tree is always the first element in the queue. When the
available space is full, a tree is released from the head of the queue and the freed space can
be used for another tree.
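A minimal C sketch of such an LRU queue (type and function names are ours, not
Lischinski and Gonczarowski's): using a tree moves its root to the tail, so the head is
always the least recently used tree and is the one released when space runs out.

    #include <stddef.h>

    typedef struct TreeRoot {
        struct TreeRoot *prev, *next;
        struct TreeNode *tree;        /* root of one cached surface tree */
    } TreeRoot;

    typedef struct { TreeRoot *head, *tail; } LruQueue;

    static void lru_unlink(LruQueue *q, TreeRoot *r)
    {
        if (r->prev) r->prev->next = r->next; else q->head = r->next;
        if (r->next) r->next->prev = r->prev; else q->tail = r->prev;
    }

    static void lru_touch(LruQueue *q, TreeRoot *r)  /* tree was just used */
    {
        lru_unlink(q, r);
        r->prev = q->tail; r->next = NULL;           /* append at tail */
        if (q->tail) q->tail->next = r; else q->head = r;
        q->tail = r;
    }

    static TreeRoot *lru_evict(LruQueue *q)          /* on cache shortage */
    {
        TreeRoot *victim = q->head;                  /* least recently used */
        if (victim) lru_unlink(q, victim);
        return victim;                               /* caller frees the tree */
    }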

Another part of their method deals with better sampling orders. In traditional ray tracing systems,
there are no special requirements for the sampling order, since every pixel can be calculated
independently from all other pixels. But when coherence is wanted, it is a better practice to use
an appropriate sampling order. If one wants to exploit coherence as much as possible, one has to
be sure that the reuse of cached trees is encouraged by the algorithm. Lischinski and
Gonczarowski investigated two different sampling orders in combination with their tree caching
scheme. Firstly, they used an item buffer (introduced in [Weghorst et al. 1984]) to indicate
which objects are associated with a pixel, so the raytracer can keep working on the same object
(and thus use the cached tree) as long as possible. The item buffer is constructed at a low
resolution (e.g., 64 x 64) and one surface is kept in every buffer entry. This way, it is possible
to determine which surfaces are visible through a certain part of the screen. On this basis, a better
sampling order than scanline order can be deduced.

A second, more effective, order used by Lischinski and Gonczarowski is the Peano curve order
(see, e.g., [Witten and Neal 1982]). Following the path of this curve, we see that it divides a
square region into four quadrants, each of which is fully visited before the curve goes on to the
next quadrant. As a result of using this order for sampling the image screen, successively
sampled pixels stick together much more closely than in scanline order.
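The locality property of such an order is easy to sketch in C. The version below
recursively visits the four quadrants of a (power-of-two) screen square, fully sampling
each before moving on; the exact Peano curve additionally orients sub-quadrants so that
the path is continuous, which this simplified sketch omits.

    /* Quadrant-recursive sampling order: each quadrant is fully sampled
       before the next one, so successive pixels stay close together. */
    static void sample_quadrants(int x, int y, int size,
                                 void (*trace_pixel)(int, int))
    {
        if (size == 1) { trace_pixel(x, y); return; }
        int h = size / 2;
        sample_quadrants(x,     y,     h, trace_pixel);
        sample_quadrants(x + h, y,     h, trace_pixel);
        sample_quadrants(x,     y + h, h, trace_pixel);
        sample_quadrants(x + h, y + h, h, trace_pixel);
    }
    /* e.g. sample_quadrants(0, 0, 512, shade);  assumes a power-of-two
       screen square and a caller-supplied per-pixel shading function */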

The last part of their theory concerns a better treatment of secondary rays. By making some
simple observations about the algorithm, they can easily speed up the evaluation of reflected,
refracted and shadow rays. Since our work will not discuss the handling of secondary rays any
further, we refer the interested reader to their paper.

In the rest of this work we state some of our findings from implementing a tree caching
algorithm within our parallel ray tracing system. The parallel nature of our system forced us to
use, at some levels of the algorithm, a different approach from the one described above. We also
wanted to "fill the gaps" left open in the conclusions section of [Lischinski and Gonczarowski
1990]: further research could be done on depth control and on more elaborate schemes for
releasing cached surface trees.

Let us first explain what our parallel ray tracing system looks like.

5. The Parallel Ray Tracer

There are many ways to speed up the traditional ray tracing algorithm. In our implementation we
used some of them: (i) regular space subdivision into voxels ([Amanatides and Woo 1987]), (ii) a
network of parallel processors and (iii) the tree caching scheme described in this paper.

Our network consists of transputers: fast floating point processors with 4K of on-chip static RAM
and (in our system) 2 MB of dynamic RAM, equipped with four serial links to allow
communication with connected transputers - arranged in the well-known network topology of a
processor farm. In this topology, one processor (the controller) is at the head of a linear list of
"workers" (see fig. 2).

This processor divides the workload between the worker processors, gets the results back from
the farm and sends them to the graphics process. The division is done as follows.


Fig. 2: The processor farm topology

Unlike a scanline algorithm, a ray tracing system can calculate virtually every pixel independently
of every other pixel on the screen. Thus, in our configuration, the coordinates of every pixel
could be given to one processor, which returns the colour of the pixel. In practice, this approach
is not efficient enough, since the overhead implied by the communication needed to send the
coordinates to the workers and to send back the resulting screen colour would be too large.
Therefore it is better practice to use groups of pixels. This is exactly the screen division scheme
we use: every worker gets the coordinates of a small (e.g., 16 x 16 pixels) square part of the
screen, fires rays through each of the pixels in this square and finally sends back the shading
resulting from the intersections between the rays and the objects in the scene. As we will see in the
next section, this scheme has certain implications for our tree caching algorithm.
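A sketch of the controller's side of this scheme in C; wait_for_idle_worker() and
send_tile() are hypothetical message primitives standing in for the actual transputer
link I/O.

    #define TILE 16                          /* 16 x 16 pixel screen squares */

    extern int  wait_for_idle_worker(void);  /* assumed link-I/O primitives */
    extern void send_tile(int worker, int x, int y, int w, int h);

    /* Hand out screen squares to whichever worker is idle; the shaded
       TILE x TILE results come back asynchronously and are forwarded by
       the controller to the graphics process. */
    void distribute(int width, int height)
    {
        for (int y = 0; y < height; y += TILE)
            for (int x = 0; x < width; x += TILE)
                send_tile(wait_for_idle_worker(), x, y, TILE, TILE);
    }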

Another very important aspect of this network is the load balance. This is a problem inherent in
the use of a parallel system: we have to make sure that every processor participates in the
rendering of the scene for as long as possible. For example, if we used a network of four
processors, we could divide the screen into four quadrants and let every processor work on one
quadrant. Suppose a very complex object (in terms of intersection calculation) is visible in the
first quadrant and that no objects are visible in the three other quadrants; then the first processor
would have to work very hard while the three other processors would finish very fast.
Hence, it is very important to divide the workload between the processors in such a way that all
processors have a (more or less) equal amount of work to do throughout the total rendering time. So,
the screen squares we talked about in the previous paragraph should not be too large and should
be distributed uniformly among the workers, in order to avoid the possibility of a bad load
balance.

This scheme of work distribution of course has some severe implications for the tree caching
algorithm.

6. Implications of the Parallel Ray Tracer for the Tree Caching Algorithm

The first implication is the distributed memory. In the whole discussion of the surface tree
caching scheme, the implicit assumption was made that there is just one processor that has to
render the whole screen and manage the cache memory. In our system, however,
there are many processors working together on one screen, each managing its own tree cache
(since each transputer has its own RAM and we do not share memory between processors).
Hence, if one processor has to intersect a patch for which it has no tree cached yet, while another
processor does have a tree cached for this particular patch, a possible solution would be to ask the
latter processor to send its tree to the former, so that this one does not have to calculate these
subdivisions itself. If we adopted this way of sharing trees between processors, the
communication needed to ask other processors whether they have a tree cached for a certain patch
and to send back this tree (if there is one) would have some severe implications. Firstly, the processor
with the cached tree has to be interrupted in its own work in order to send the tree to the one that
is waiting for it. Secondly, the size of the trees implies that the time needed to send such a tree from
one processor to another will produce a significant overhead. And finally, there is the problem of
reachability: the only way of sharing data between two transputers is sending the data via one of
the four communication channels. But one particular processor cannot reach any other processor
in the network, since it is only connected to two neighbouring processors (recall the farm
topology of fig. 2). Therefore, if, e.g., processor 1 wants to send data to processor 6, all
processors on the path between these two processors (i.e. processors 2-5) have to be disturbed,
too: on each of these intervening processors, there has to be a process that simply takes in data
and sends it through to the next processor. Of course, such a process will use some processor
time that could be used for other tasks. Note that this reachability problem is not specific to our
farm topology: up till now, transputers have only four physical communication links, so in any
topology it will be a problem to reach an arbitrary processor from anywhere in the network. To
summarize, the communication overhead implied by the sharing of trees between processors
would be too high. So, every processor works within its own square part of the screen
as if it were the whole screen.

Another consequence of the work distribution in our system is that each processor will only "see"
a relatively small number of objects in the scene. After all, it just works within a small
subscreen. So, when starting on a new screen square, a limited number of objects will have to
be intersected by this processor. Hence, the number of trees cached at any moment will also be
relatively small, so it would not be good practice to limit the tree depth to a given maximum;
otherwise, the profit gained from tree caching would be too small. On the other hand, most of
the time it is not desirable to delete whole trees from the cache, but just to prune some "old"
branches.

Finally, the sampling order cannot be exploited anymore, or at least not to the same extent as in
the research of Lischinski and Gonczarowski, since there is no "global" work distributor as in
a monoprocessor implementation.

In order to make a trade-off between load balance and coherence exploitation, we decided to
adopt a tree management scheme slightly different from the one described in the introductory
sections of this work.

7. Our Surface Tree Caching Algorithm

Now we are ready to describe the data structures and algorithm in more detail.

In the algorithm, a collection of (quad)trees and a list structure are used. A node in these trees
contains the (sub)patch's control points, the bounding box and the parametric subregion on
which the (sub)patch is defined (the root node is defined on the region [0,1] x [0,1]).
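In C, such a node might look as follows (a sketch with our own field names; the 4 x 4
grid of control points assumes bicubic Bezier patches):

    typedef struct TreeNode {
        double control[4][4][3];          /* (sub)patch control points (x,y,z) */
        double bbox_min[3], bbox_max[3];  /* bounding box of the control points */
        double u0, u1, v0, v1;            /* parametric subregion of the patch */
        double tmin;                      /* ray entry parameter into the box,
                                             set during traversal of the tree */
        struct TreeNode *child[4];        /* four subpatches; NULL for a leaf */
    } TreeNode;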

Throughout the whole tree cache, we maintain a doubly linked list of candidate nodes where
pruning can be done: this list contains all the nodes which have 4 leaf children. E.g., in fig. 3,
the grey nodes are members of the list. We will show how this list is used to free cache space.

Fig. 3: A cached tree (with candidate list nodes in grey)

The general outline of the algorithm is as follows:

if <no tree is cached yet for this patch> then
    <create root for new tree> (which will be a leaf)
else
    <take cached root>
current := FirstValidLeaf (root)
while (current <> NIL) do
    <check Toth's criteria on current node>
    case
        <criteria o.k.>:
            <perform Newton iteration>
        <parameter space too small or patch surface too flat>:
            <perform 1 Newton step>
        otherwise:
            <subdivide leaf's patch into four subpatches>
            <leaf becomes an interior node with the 4 subpatches as new leaves>
    current := NextValidLeaf (current)

Essential to the algorithm are the functions FirstValidLeaf and NextValidLeaf. These functions
have several combined tasks. Essentially, they perform a modified depth-first search whose
purpose is to find the first (next) valid leaf. This is a leaf whose bounding box is intersected by
the ray and whose value of tmin (the value of the ray parameter t at the point where the ray
enters the bounding box of the patch's control points) is smaller than the t-value of the closest
intersection found yet. Suppose that our cache mechanism already contains a tree as depicted in
fig. 3. When FirstValidLeaf is called, the algorithm starts the "depth-first" search, with fig. 4 as
result.

Fig. 4: The result of a call to FirstValidLeaf with the cached tree of fig. 3

At each node, the bounding boxes of the four children are intersected; then the nodes of the
children are rearranged so that they are sorted from left to right on their values of tmin. In the
figure, all grey nodes have been sorted. Once a valid leaf has been reached, the function returns a
pointer to this leaf (the black node in the tree). If, however, no such leaf can be found after
scanning the whole tree, the return value is NIL.
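A C sketch of the search just described, reusing the TreeNode sketch given earlier; the
ray/box test and the child sort are assumed helpers, and a real NextValidLeaf would
additionally keep an explicit stack so the walk can resume where it left off.

    typedef struct { double org[3], dir[3]; } Ray;

    extern int  ray_hits_box(const Ray *r, const TreeNode *n, double *tmin);
    extern void sort_children_on_tmin(TreeNode *n, const Ray *r);

    /* Depth-first search for the first valid leaf: bounding box hit by the
       ray, and entry parameter tmin smaller than the closest hit so far. */
    TreeNode *first_valid_leaf(TreeNode *n, const Ray *ray, double t_closest)
    {
        double tmin;
        if (!ray_hits_box(ray, n, &tmin) || tmin >= t_closest)
            return NULL;                  /* box missed, or behind best hit */
        n->tmin = tmin;
        if (n->child[0] == NULL)
            return n;                     /* found a valid leaf */
        sort_children_on_tmin(n, ray);    /* left-to-right on tmin */
        for (int i = 0; i < 4; i++) {
            TreeNode *leaf = first_valid_leaf(n->child[i], ray, t_closest);
            if (leaf != NULL)
                return leaf;
        }
        return NULL;
    }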

Of course, the global list structure may have to be modified each time new leaves are added to a
tree. In fig. 5a, we have a situation in which new leaves have to be added to the current node (the
black one). Since the current node's father is a member of the list, this father has to be removed
from the list. Then the black node has to be inserted at the head of the list, which results in fig.
5b (with a new current node in black).

Fig. 5a: Children (leaves) have to be added to the current node (black)

Fig. 5b: Children are inserted, the candidate list is updated and a new current node is selected

Now we are able to see how memory can be freed using this list. As stated before, we do not use
a method as drastic as that of Lischinski and Gonczarowski: instead of deleting entire trees,
we prune a set of leaves in a tree in order to free the space occupied by those leaves. We
implemented the LRU algorithm such that the nodes that are least recently used are at the tail
of the list. In the situation of fig. 6, the node at the tail of the list is the first candidate to release
its children for use elsewhere in the tree cache. When at a certain moment in the tree-building
phase a shortage of space occurs, the leaves of the tail node are pruned away so that
this space becomes available again; the tail node becomes a leaf itself. This situation is depicted in
fig. 7.

Fig. 6: The node at the tail of the list is the first candidate to release its children

Fig. 7: The result of pruning the children from the tail node
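In terms of the structures sketched earlier, the pruning step of figs. 6 and 7 might look
like this in C (the list type and primitive are assumed names of ours):

    #include <stdlib.h>

    typedef struct Cand { struct Cand *prev, *next; TreeNode *node; } Cand;
    typedef struct { Cand *head, *tail; } CandList;

    extern void list_remove(CandList *l, Cand *c);   /* assumed primitive */

    /* Release the children of the least recently used candidate: the freed
       space becomes available again and the tail node becomes a leaf. */
    void prune_tail(CandList *list)
    {
        Cand     *victim = list->tail;
        TreeNode *node   = victim->node;
        for (int i = 0; i < 4; i++) {
            free(node->child[i]);         /* the four leaves are released */
            node->child[i] = NULL;        /* ... and the node becomes a leaf */
        }
        list_remove(list, victim);
        free(victim);
        /* if the node's father now has four leaf children, it is inserted
           into the candidate list in turn */
    }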

8. Performance

In order to get a rough impression of the relative performance of our tree caching method, in
comparison to previous algorithms, we rendered the classic Utah teapot using the tree caching
algorithm, as well as with our implementation of Toth's algorithm. In this performance test, the
scene contained a reflective teapot (reflection depth 2) and two lights at infinite distance. The
video resolution of 768 x 576 pixels was used. The tree cache size was approximately 832 Kbyte
(the free memory we had in our test program). In a preprocessing step, all patches are
subdivided into four subpatches. Our tests showed that this subdivision yielded an increase in
speed of roughly 10%, at the cost of using more memory for the object database. This speedup
turned out to be a consequence of the fact that the four subpatches are less curved than the
original one. This simple observation was already made when we implemented Toth's algorithm.

We use figs. 8-10 to show the performance of our algorithm in comparison with our
implementation of Toth's algorithm. At the same time we look at the linearity of both algorithms:
we rendered the same teapot scene on a range from 2 to 23 processors.

Figure 8 shows the absolute time needed to render the teapot scene using each of the two
intersection algorithms. It turns out that our Toth implementation is a lot slower than tree
caching: e.g., Toth needs 6 processors to become as fast as tree caching running on just 2
processors (i.e., ± 2000 seconds).

Fig. 8: Rendering times for the teapot scene, depending on
the intersection algorithm and the number of processors

Fig. 9: Absolute speed of both algorithms

In fig. 9, we depict the absolute speed of the two methods, expressed in terms of pixels /
second / processor. It shows that our Toth implementation renders about 35 pixels per processor
per second, while the tree caching method yields about 108 pixels. This gives a rough impression
of the relative speed of the two intersection schemes. The exact ratio "speed (Caching) /
speed (Toth)" is shown in fig. 10: it turns out that tree caching takes roughly 31% of Toth's
rendering time. This ratio is comparable to the figures given in [Lischinski and Gonczarowski
1990].


Fig. 10: Relative speed: ratio Caching/Toth (in percents)

As a more concrete example of performance, we show some pictures, taken directly from the
screen. Figure 11 shows a scene consisting of a glass bottle (with 44 patches defining the
double-sided glass) lying on top of a grid of reflecting cylinders and a procedural water-textured
plane. This scene was rendered with reflection depth 7; it took 2312 seconds on 13 processors
(resolution 768 x 576 - not anti-aliased). For the scene in fig. 12, we drew our inspiration from a
picture by Achim Strasser of the University of Karlsruhe in Germany (appearing in [Magnenat-
Thalmann 1990]). In this picture we also used depth 7 for our reflection tree, together with
reflective cylinders. The cocktail glass is also double-sided and consists of 20 patches. This
anti-aliased (768 x 576) picture took 23212 seconds (nearly 6.5 hours) on 13 processors.

Fig. 11: Example picture: "Bottle on sea"

Fig. 12: Example picture: "Bottle with cocktail glass"



9. Conclusions, current and future research

In this work, we investigated the usefulness of previously developed algorithms for rendering
curved surfaces in the framework of a parallel ray tracing system. We wanted to exploit
coherence between rays as well as retain a good load balance between the worker processes. As
a result, we adopted a tree caching scheme, similar to that of Lischinski and Gonczarowski,
and tuned it for use within our parallel system.

In section seven, we saw a fundamental difference from the algorithm of Lischinski and
Gonczarowski: in their algorithm, a cache tree is only constructed for those patches that do not
have a cache yet. Once this cached tree is built, it remains unchanged and is just used to
start Toth's scheme from a smaller patch. In their method, Toth's algorithm serves two distinct
goals: (1) to assist in the building of a tree cache and (2) to perform patch/ray intersection
calculations, starting from a leaf in the cached tree.

In our algorithm however, caching is performed continuously. Toth's two goals are merged into
one: a patch cache tree will be updated every time the patch is tested for ray intersection in Toth's
algorithm; new nodes may be inserted into the cache tree as soon as Toth focuses on previously
unvisited subpatch areas. This way, our tree cache is updated more dynamically than that of
Lischinski and Gonczarowski.

We are currently investigating the influence of the work distribution on the ray coherence, while
the load balance has to be retained. E.g., we could provide the work distributor with some more
intelligence. In our approach, we do not have explicit control over which parts will be rendered
by which processor; as a consequence, ray coherence cannot be exploited to the extent we want
to. If the work distributor knew in which part of the screen a processor was working, it would
be able to let this processor stick to the current screen area as closely as possible.

Another aspect to be considered is the refinement of a father node's tmin value, when its children
are pruned. Since the father node's bounding box encloses the children's boxes, the father's
original tmin value is never greater than those of the children. As a consequence, the smallest
tmin value amongst the children nodes will never be worse than the father's value (most of the
time even better). So, it would be a better practice to store this smallest tmin in the father node,
instead of its original value. This has the implication that the tree will have to be resorted.

Yet another research topic could be sharing cache memory amongst processors. By using a
processor with a substantially larger memory capacity, one large cache could be used by several
processors.

References

Amanatides J, Woo A (1987), A Fast Voxel Traversal Algorithm for Ray Tracing. In: Marechal
G (ed.), Eurographics '87 Proceedings, 3-10
Blinn JF (1978), Simulation of Wrinkled Surfaces. Computer Graphics (SIGGRAPH '78
Proceedings) 12 (3): 286-292
Clark JH (1979), A Fast Algorithm for Rendering Parametric Surfaces. (Distributed only to
attendees of SIGGRAPH '79). In: Joy KI, Grant CW, Max NL and Hatfield L (eds.)
(1988), Tutorial: Computer Graphics: Image Synthesis. Computer Society Press,
Washington DC
Farin G (1988), Curves and Surfaces for Computer Aided Geometric Design. Academic Press,
London
Glassner A (1989), An Introduction to Ray Tracing. Academic Press, London
Hall R (1989), Illumination and Color in Computer Generated Imagery. Springer-Verlag, New
York
Lane JM, Carpenter LC, Whitted T, Blinn JF (1980), Scan Line Methods for Displaying
Parametrically Defined Surfaces. Communications of the ACM 23 (1): 23-34
Lischinski D, Gonczarowski J (1990), Improved techniques for ray tracing parametric surfaces.
The Visual Computer 6 (3): 134-152
Magnenat-Thalmann N (1990), Computer Art Forum. The Visual Computer 6 (4): 242
Nishita T, Kaneda K, Nakamae E (1990), High-Quality Rendering of Parametric Surfaces by
Using a Robust Scanline Algorithm. CGI '90 Proceedings, 493-506
Rubin SM, Whitted T (1980), A 3-dimensional representation for fast rendering of complex
scenes. Computer Graphics (SIGGRAPH '80 Proceedings) 14: 110-116
Toth DL (1985), On ray tracing parametric surfaces. Computer Graphics (SIGGRAPH '85
Proceedings) 19: 171-179
Weghorst H, Hooper G, Greenberg DP (1984), Improved Computational Methods for Ray
Tracing. ACM Transactions on Graphics 3: 52-69
Whitted T (1978), A Scan Line Algorithm for Computer Display of Curved Surfaces. Computer
Graphics 12 (3): 26
Witten IH, Neal RM (1982), Using Peano Curves for Bilevel Display of Continuous-Tone
Images. IEEE Computer Graphics and Applications 2: 47-52

Authors

Wim Lamotte is a research assistant at the Limburg University Center and a
member of the research staff at the Applied Computer Science Laboratory in
the same university. He obtained his Master's Degree in Computer Science
in 1988 at the Free University of Brussels, Belgium. His research interests
include ray tracing, computer animation and parallel processing.

Koen Elens is a member of the research staff at the Applied Computer Science
Laboratory in the Limburg University Center. He obtained his Master's
Degree in Computer Science in 1990 at the Free University of Brussels,
Belgium. His research interests include parametric surfaces, ray tracing and
computer animation.

Eddy Flerackers is currently full Professor of Computer Science at the
Limburg University Center, Belgium. He studied Physics at the University
of Louvain, Belgium. He received his PhD in Physics in 1980 at the Free
University of Brussels with a thesis on nuclear structure calculations. Since
1987 he has been Director of the Applied Computer Science Laboratory at the
Limburg University Center. He is also promotor of a governmental project
for the introduction of computers and computer science in education. His
research interests include computer graphics, 3D computer animation,
scientific visualisation, simulation and programming environments. He is a
member of the ACM, the IEEE Computer Society, the Computer Graphics
Society, and the Committee of the Computer Graphics and Displays Group
(BCS).

Mailing Address

Applied Computer Science Lab
Limburg University Center
Universitaire Campus, B-3590 Diepenbeek, Belgium
Phone: (32)-11-22 99 61 / (32)-11-24 29 85
Fax: (32)-11-223284
Telex: 39948 LUC b
E-mail (BitNet): LUCLTI@ BDIEHLII / lamw@ BDILUCOI
Chapter 4
Volume Rendering
Context Sensitive Normal Estimation for Volume
Imaging
Roni Yagel, Daniel Cohen, and Arie Kaufman

ABSTRACT

Three-dimensional voxel-based objects are inherently discrete and do not maintain any
notion of a continuous surface or normal values, which are crucial for the simulation of
light behavior. Thus in volume rendering, the normal vector of the displayed surfaces
must be estimated prior to rendering. Unlike existing normal estimation methods, the
context sensitive approach proposed here considers object and slope discontinuities. It
employs segmentation and segment-bounded operators in order to achieve high fidelity
normal estimation for rendering volumetric objects.

Key Words: discrete shading, volume rendering, volume shading, filtering, segmentation.

1. INTRODUCTION

The realistic display of 3D objects on a flat screen depends heavily upon providing a
variety of depth cues in order to convey to the observer the illusion of a third dimension.
One of the most effective depth cues is light behavior on the displayed surfaces. Humans
are sensitive to phenomena such as surface colors (shading), light reflection, transparencies,
light refraction, shadows, perspective, and occlusion of one object by another. The
computer simulation of such light phenomena is based on the physical laws of light
behavior, which functionally bind much light behavior to the inclination of surfaces in the
scene (e.g., [Potmesil and Chakravarty 1982], [Torrance and Sparrow 1967], [Warn 1983],
[Whitted 1980]). Thus, an essential requirement for most rendering methods is the ability
to compute the normal vector to the surfaces comprising the 3D scene. In traditional 3D
graphics, where surfaces are defined by geometric objects (e.g., polygons, parametric
patches), the normal values can be easily calculated. However, in 3D volumetric graphics,
objects are digitized into voxels in a process that does not preserve the notion of surfaces
and loses the object's continuous smoothness due to quantization. Moreover, the
volumetric approach lends itself to the representation of sampled or simulated natural
phenomena (e.g., 3D biomedical data, simulated fluid flows), in which surface inclination is
not present in the data in the first place.


In order to achieve a realistic display of 3D volumetric objects, it is essential that the
volume rendering technique can estimate a normal value for each voxel to be rendered.
The process of computing normal values in a 3D voxel map of sampled phenomena and
synthetic models is referred to as normal estimation and is the subject of this paper. The
next two sections survey previous methods for normal estimation, analyze their
performance, and point out their main flaws and deficiencies. Previous work has usually
compared existing methods by observing either the quality of the images they produce [Bright
and Laflin 1986, Chen et al. 1985, Magnusson, Lenz, and Danielsson 1988, Tam and
Davis 1988] or the estimation of the normal values in a few scenes [Tiede et al. 1990].
Our approach is to analytically examine the normal values themselves and identify the
sources of each method's weaknesses.

In Section 4, we observe that the use of information from a small and fixed neighborhood,
coupled with the inability to detect discontinuities, accounts for most maladies
encountered in the existing methods. Based on these observations, we propose the use of a
discontinuity detection mechanism that discerns the borders of surface regions called
contexts. This process is driven by the requirement that, except at context borders, the
surface segment must exhibit a high level of uniformity, that is, it is a continuous surface
patch (C0 continuity) with a gradually changing tangent (C1 continuity). Successive
operations aiming at noise reduction, smoothing, and normal estimation are all restricted
to the information available in one context, making them context sensitive operators.
Context sensitive prefiltering and postfiltering are also suggested as remedies for the errors
caused by noise and discretization (quantization) artifacts. The context sensitive normal
estimation method is capable of producing high fidelity results that can be used for the
creation of quality shading and can provide accurate surface inclination for volumetric ray
tracing.

2. ANALYSIS OF NORMAL ESTIMATION METHODS

Three-dimensional image rendering of synthetic scenes relies on normal values for the
simulation of light behavior, and thus, the quality of a normal estimation method can be
appreciated by inspecting the resulting images. However, the human visual perception of
"image quality" is deceived by the contribution of a variety of rendering parameters (e.g.,
image and color resolution, specular reflection) and psychological factors (e.g., object
recognition, smoothness). This deception can be so extensive that less accurate normal
estimation might produce a more pleasing image. For example, a method that accurately
restores the normal in most cases while retaining a few (severe) errors forms the basis for
accurate simulation of light behavior, but exhibits a few noticeable areas where unpleasing
artifacts stand out from the smoothly shaded image. On the other hand, a method which
is "smoothly" inaccurate can provide normal values that produce pleasing smooth shading
but cannot support the accuracy needed for more sophisticated light models (e.g., ray
tracing).

As a first step towards accurate quality assessment of normal estimation methods, we
identify the basic possible surface structures which comprise typical scenes (e.g., linear and
quadratic surfaces, surface and tangent discontinuities, noise, discretization artifacts).
These basic surface structures, called configurations, can be geometrically defined and
algorithmically voxelized (3D scan-converted) into their discrete representation [Cohen and
Kaufman 1990, Kaufman and Shimony 1986, Kaufman 1987, Kaufman 1988].
[Figure: panels labeled Continuous Image, Discrete Projection, Analytic Normal,
Discrete Normal, and Error, over configurations A through J.]

Figure 1: Performance of the normal-based contextual method.

The voxelization algorithm also calculates the true normal at each point. In order to emphasize
the visual appearance of the discretization artifacts, the configurations are voxelized into
an image buffer having one fourth of the actual screen resolution. The 3D image buffer is
then projected, yielding a 2D frame buffer as well as a buffer of depth values that is used
as an input to the normal estimation process. The behavior and fidelity of an estimation
method can be judged by observing the discrepancies between the true normal values and
the estimated values.

In order to portray and visually appreciate the behavior of a normal estimation method,
we examine the values it produces for the simpler case of estimating the inclination of 2D
configurations by using a 2D image of linear objects that is projected onto a 1D "screen".
Figure 1 depicts the behavior of a normal estimation method over a set of configurations.
The rectangle area labeled Continuous Image presents the smooth, continuous geometric
definition of the configurations. The area labeled Discrete Projection presents the depth
values produced by a parallel projection of the discrete image downward. This projection
already exhibits discretization artifacts due to its low resolution. The area labeled Analytic
Normal displays the calculated true normal values (measured as an angle in
[0..π]) at each projected point. The area labeled Discrete Normal presents the values
estimated by the method under examination (e.g., contextual). The area labeled Error
presents an exaggerated view of the discrepancies between the estimated normal value and
the true normal value. By analyzing the reasons for the flaws existing in current normal
estimation methods, we are able to design a method that overcomes most of them. We
propose here the context sensitive approach to normal estimation and assess its
performance - not only by observing the quality of the shading of the images it renders,
but mainly by examining the normal values it produces.

Figure 1 depicts the configurations used to analytically evaluate the quality of a normal
estimation method. A discontinuity in the depth values (C0 discontinuity, A and E in
Figure 1) allows us to examine the performance of the estimation method at surface edges
or silhouettes. A discontinuity in the surface inclination (C1 discontinuity, F, G, and I in
Figure 1) makes it possible to verify the ability of the method to avoid inaccuracies at
inflection points. The ramp configuration (B and H in Figure 1) allows the observer to
study the sensitivity of the method to "staircase" discretization artifacts. The noise
configuration is used to examine the sensitivity of the estimation method to noise which is
either attached to or detached from the visible surface (C and D, respectively). Finally,
surface configurations in a variety of orientations (J) facilitate measurement of the
accuracy achieved by the method at a full range of angles.

3. EXISTING METHODS FOR NORMAL ESTIMATION

Existing normal estimation methods are based on examining a close neighborhood of the
voxel in order to estimate the inclination of the surface it belongs to. If the method
examines the voxel neighborhood in the 3D scene, it is referred to as a 3D or voxel-space
method. If it examines the voxel neighborhood in the projected image, it is referred to as a
2D or pixel-space method (see also [Kaufman 1990]).

The contextual method examines a small neighborhood (context) of the current voxel in voxel-space, identifying all voxels in the neighborhood that belong to the same surface. A normal can then be calculated by fitting a surface to this set of points (e.g., a plane [Bryant and Krumvieda 1989], a biquadratic surface [Webber 1990]). However, these techniques tend to smooth away variations in the surface inclination and give rise to oscillation (Gibbs effect) when applied around surface discontinuities [Grimson 1981].

Figure 2: Performance of the depth gradient method, using a 3-neighborhood central difference. (Panels: Continuous Image; Discrete Projection; configurations A-J.)
Since this process is most time consuming, several approximations to the contextual idea
have been devised. Observing that there are a small number of different neighborhood
arrangements representing the passage of a surface through the neighborhood (and the
current voxel), normal values are precomputed for each such arrangement and stored in a
table. When a normal is to be computed for a specific voxel, its immediate fixed size
neighborhood is mapped to the most similar neighborhood arrangement in the
precomputed table [Schlusselberg, Smith, and Woodward 1986].

The normal based contextual shading method [Chen et al. 1985] was the first to apply an
approximation of the contextual idea. This method is based on the cuberille model for
volume representation, in which voxels are assumed to be a unit cuboid (rectangular
parallelepiped) having six faces whose normals are in the directions of the primary axes.
The basic element to be examined is a voxel face. Its normal is estimated based on the
face's own orientation as well as the orientation of its adjacent faces (the four faces that
share an edge with it). Since this method operates on the voxel-space data, it is insensitive
to noise that is detached from the surface (D in Figure 1). However, this method maps all normal values in the range [0..π] to only five values (π/6, 2π/6, 3π/6, 4π/6, 5π/6), which causes an error of up to π/12 for normal values in the range [π/6..5π/6] (B and F in Figure 1) and up to π/6 in the ranges [0..π/6] and [5π/6..π]. The very small neighborhood examined by this method is the culprit for its high sensitivity to discretization artifacts (H in Figure 1). This method is also error prone in the vicinity of C0 discontinuities (A and E in Figure 1) and C1 discontinuities (G and I in Figure 1).

The depth gradient shading method [Cohen et al. 1990, Gordon and Reynolds 1985, Horn 1982] accepts as input a depth buffer, which contains the distance from the observer (z depth) of each visible voxel. The surface normal is obtained from the gradient vector (∂z/∂x, ∂z/∂y, -1), where the partial derivatives are approximated (in pixel-space) by the differences between the depth values of the current pixel and its immediate neighbors in the depth buffer.

Denoting by D_{i,j} the depth value at the point (i,j), we define the backward difference, the forward difference, and the central difference, respectively, at the point (i,j) along the X axis to be:

B_{i,j} = D_{i-1,j} - D_{i,j}    (1)

F_{i,j} = D_{i+1,j} - D_{i,j}    (2)

C_{i,j} = D_{i+1,j} - D_{i-1,j} = F_{i,j} - B_{i,j}    (3)

The values B_{i,j}, F_{i,j}, or C_{i,j}/2 are used to compute Δz/Δx, which approximates the partial derivative ∂z/∂x. Similarly, by defining backward, forward, and central differences along the Y axis, the value of ∂z/∂y is approximated.
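As an illustration (not from the paper), a minimal NumPy sketch of depth gradient shading built on the central difference of Equation 3; the array orientation, boundary handling, and normalization are our own choices:

import numpy as np

def depth_gradient_normals(D):
    # D: 2D depth buffer, D[i, j] = z depth of the visible voxel at pixel
    # (i, j), with i running along X. Border pixels are left with zero slope.
    dzdx = np.zeros(D.shape)
    dzdy = np.zeros(D.shape)
    dzdx[1:-1, :] = (D[2:, :] - D[:-2, :]) / 2.0    # C_{i,j}/2 (Equation 3)
    dzdy[:, 1:-1] = (D[:, 2:] - D[:, :-2]) / 2.0    # central difference along Y
    n = np.dstack((dzdx, dzdy, -np.ones(D.shape)))  # gradient (dz/dx, dz/dy, -1)
    return n / np.linalg.norm(n, axis=2, keepdims=True)  # unit normals

Since the z component is always -1, the vector being normalized is never zero.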

Although this method can restore more normal values than the normal based contextual
method, it still suffers from a low resolution of the normal values (less than 16 significant
values [Cohen et al. 1990]), which causes noticeable errors (B, H, and J in Figure 2).
Figure 3: Performance of the depth gradient method, using a larger 5-neighborhood. (Panels: Continuous Image; Discrete Projection; Error, Depth Gradient [5]; configurations A-J.)

Figure 4: Performance of the weighted depth gradient method. (Panels: Continuous Image; Discrete Projection; Error, Weighted Depth Gradient [3]; configurations A-J.)

Moreover, the set of normal values restored by this method is not evenly distributed, causing the appearance of severe errors, in particular in the range [0..π/4], where the normal resolution is very poor (e.g., the first identifiable normal is 26° [Cohen et al. 1990])
(B in Figure 2). As in the contextual methods, high sensitivity to discretization artifacts
can be observed (originating from the small neighborhood used) (H in Figure 2). Thus, the
ramp ("staircase") configuration produces a high variance in the normal value near the
stairs (H in Figure 2). When the image is shaded, this causes the appearance of unpleasing
dark bands "emphasizing" the stairs [Bright and Laflin 1986], as can be seen in Figure 8.
Compared with the contextual methods, its sensitivity to noise is higher, since this method,
operating on the depth projection buffer, is unable to identify noise that is disconnected
from the surface (D in Figure 2).

The depth gradient method lacks the knowledge of surface boundaries; thus in the vicinity
of surface borders, its calculations are based on information from other surfaces as well.
This causes the undesired averaging of normal values in areas exhibiting depth
discontinuity (A and E in Figure 2) or tangent discontinuity (G and I in Figure 2), which is
manifested in the rendered image as dark bands around object edges and silhouettes.

This method (like most others) examines a small and fixed neighborhood, thereby increasing its sensitivity to noise and discretization artifacts. The size of the considered neighborhood can be increased, in which case Equations 1-3 are generalized. For example, if we denote by N the size of the neighborhood in one direction (e.g., for the 3-neighborhood, N = 1), we redefine the backward difference:

B_{i,j} = (D_{i-N,j} - D_{i,j}) / N    (4)

or

B_{i,j} = (1/N) Σ_{k=1}^{N} (D_{i-k,j} - D_{i,j})    (5)

Notice that Equation 1 is a special case of Equations 4 and 5 where N =1. Enlarging the
neighborhood size might ease errors caused by noise and discretization (H and J in Figure
3) and increase the resolution of the normal values (B in Figure 3), but it will also
emphasize errors caused by discontinuities (A, C, D, G, and I in Figure 3). Several
mechanisms have been proposed to overcome some of these flaws. Gordon and Reynolds
[Gordon and Reynolds 1985] suggest a weighted average function to ease the problem of
discontinuities.
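A small sketch of the generalized differences, assuming Equation 4 takes the single wide-step form (D_{i-N,j} - D_{i,j})/N, which reduces to Equation 1 at N = 1 (the printed form of Equation 4 is illegible in our source, so this is a reconstruction); the paper's sign convention for B is kept:

def backward_difference(D, i, j, N=1, averaged=False):
    # Generalized backward difference along X at (i, j).
    if averaged:
        # Equation 5: average of the N pairwise differences
        return sum(D[i - k][j] - D[i][j] for k in range(1, N + 1)) / N
    # Equation 4 (reconstructed): single wide-step difference, normalized by N
    return (D[i - N][j] - D[i][j]) / N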

Figure 4 shows the results obtained by the weighted depth gradient method, demonstrating the improvement at large depth discontinuities (A, D, and E in Figure 4) but not at tangent discontinuities or at discretization artifacts (B, G, H, I, and J in Figure 4). Bright
and Laflin [Bright and Laflin 1986] propose an interpolated depth gradient method that
examines a variable neighborhood and interpolates the gradients of two adjacent
neighborhoods. This method improves gradient resolution and smoothes the gradient
variance near the stairs.

Figure 5 displays the result of applying this method to the same data, showing the
improvement in areas where error is caused by the small neighborhood used by the original
depth gradient method (B, H, and J in Figure 5). However, this method increases the error
in some locations (A, C, D, and E in Figure 5) by interpolating across discontinuities.
Using a weighted average of all the values in a 3 X 3 neighborhood [Magnusson, Lenz, and
220

Continous Image
A C D F G H J

Discrete Projection

Discrete Normal

Error .. Interpolated Depth Gradient [3]


A CDEFGHI J

Figure 5: Performance of the depth gradient method with


interpolation, using a 9-neighborhood central difference.
221

Danielsson 1988, Tam and Davis 1988] is expected to improve resolution of the normal
values.

The gray-level gradient shading method [Hoehne and Bernstein 1986] uses the gradient of the data values (gray-level values) as a measure of the surface inclination. More formally, denoting by V the gray-level function, the normal is obtained from the gradient vector (∂V/∂x, ∂V/∂y, ∂V/∂z). The partial derivatives are approximated (in voxel-space) by the differences between the gray values of the current voxel and its immediate neighbors. That is, denoting by V_{i,j,k} the discrete value of the gray-level function V at coordinates (i,j,k),

ΔV/Δx = V_{i+1,j,k} - V_{i-1,j,k}    (6)

is the 3-neighborhood central difference approximation of ∂V/∂x. In a similar way, ∂V/∂y and ∂V/∂z are approximated. This method is based on the assumption that
objects exhibit a partial volume effect, which means that the data values in the
neighborhood of a surface voxel reflect the relative average of the various surface types in
them, and thus can be used as a measure for the surface inclination [Goldwasser 1986].
For medical applications where partial volume effect does exist, the gray-level gradient
method has been shown to produce accurate shading [Tiede et al. 1990].
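For concreteness, a direct transcription of Equation 6 and its Y and Z analogues (index bounds checking omitted):

def gray_level_gradient(V, i, j, k):
    # Central differences of the gray-level function V; the result
    # approximates the gradient vector (dV/dx, dV/dy, dV/dz).
    gx = V[i + 1][j][k] - V[i - 1][j][k]    # Equation 6
    gy = V[i][j + 1][k] - V[i][j - 1][k]
    gz = V[i][j][k + 1] - V[i][j][k - 1]
    return (gx, gy, gz)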

Since this method fails whenever the assumption of the partial volume effect is untrue, it is
not suitable for estimating normals of binary objects (e.g., thresholded objects, synthetic
objects that were algorithmically voxelized without anti-aliasing). Moreover, the existence
of binary defined areas in a sampled image (e.g., cut planes) or very thin objects requires
the use of other shading methods for these areas [Hoehne et al. 1990]. Pommert et al.
[Pommert et al. 1990] suggest an adaptive method to ease artifacts in thin objects.
Several researchers [Magnusson, Lenz, and Danielsson 1988, Tiede et al. 1990] reported the
use of all values in a 3 X 3 X 3 neighborhood and the use of a larger neighborhood [Udupa
and Hung 1990] in order to achieve a greater accuracy of the normal values.

The local surface interpolation [Webber 1990] method approximates a biquadratic surface
based on the gray values of the voxel's immediate 26 neighbors. This method is time
consuming but produces accurate results for scenes composed of C1 continuous objects (e.g., quadratic objects) that exhibit the partial volume effect. However, as mentioned above, the fitting of a smooth (biquadratic) surface in areas of high inflection tends to smooth away the C1 discontinuity. In areas of C0 discontinuity, it gives rise to oscillation (Gibbs effect), since it incorrectly lets one surface influence the shape of a second surface
[Grimson 1981].

The local surface interpolation and gray-level gradient shading methods assume partial
volume effect and thus are not suitable for the general case of volumetric surfaces. We do
not analyze these methods and refer the reader to [Tiede et al. 1990].

4. CONTEXT SENSITIVE NORMAL ESTIMATION

Examining the existing estimation methods, we observe the possible causes of their flaws
and difficulties. Based on these observations, we propose the context sensitive normal
estimation method and describe it in this section in general terms. Section 5 contains a
description of our specific implementation of the context sensitive principles.

We first observe that a scene contains areas of C0 and C1 discontinuities that can be
detected. Points that do not have a discontinuity between them are said to be in the same
context. A neighbor context, on the other hand, is a surface region that lies across a
discontinuity. That is, in a single context the normal vector to the surface exhibits low
variance, and therefore can be regarded as an attribute of the context that does not
depend on the inclination of neighboring contexts. Thus, when estimating the normal to a
surface context, the information available from neighboring contexts should be disregarded.

Examining existing normal estimation methods, we observe that inaccuracies are usually
encountered in the vicinity of C0 and C1 discontinuities. This is not surprising since any
normal estimation method fits a smooth surface to the data as the basis of its calculations.
Applying such methods across the discontinuities incorrectly implies that the inclination of
one context may be influenced by the inclination of neighboring contexts [Grimson 1981].

The design of our context sensitive normal estimation method has been directed by these
two observations: estimation must be restricted to the information available in a single
context, and the normal values inside a context exhibit low variance. The context sensitive
method consists of two main phases: segmentation and normal calculation. In the first
phase, context borders (i.e., areas of surface or tangent discontinuities) are detected. In the
second phase, normal values are calculated by applying local operators that are restricted
to data values only from the context they are currently processing. Such operators are
called context sensitive operators.

Ideally, a surface can be fitted to each context to produce accurate normal values. Figure 6
shows the result of fitting curves (least squares) to contexts, taking into consideration all
points in the context. The slight errors appearing near C and F are due to the inability of
our simple segmentation method to detect a boundary at these points (see Section 5).
Applying a more sophisticated segmentation algorithm will remedy this problem. The
errors at G and I are caused by a one-pixel discrepancy between the true location of the
edge and the location detected. The fitting mechanism is very sensitive to such errors,
which cause the whole curve to be slightly inaccurate (J in Figure 6). Although surface
fitting provides accurate normal values, it is extremely time consuming, and a simpler
method is thus desired.

In our normal estimation technique the second phase, the normal calculation phase, is
performed in three stages: prefiltering, discrete gradient calculation, and postfiltering. In
each of these stages a context sensitive operator is applied to the input, and the output is
passed to the next phase in a pipelined fashion. The prefiltering phase employs a context sensitive smoothing filter to remove noise and discretization artifacts. This pass is justified by the observation that the surface function shows low variance inside a context (C0 continuity). In addition, integer based values are averaged into floating-point values, providing a larger resolution range of normals. The next stage estimates the normal values from the smoothed data inside the current context. The last pass performs context sensitive postfiltering, which smooths variances in the normal values. This operation is justified by the observation that the surface tangent changes gradually inside a context (C1 continuity).

It should be noted that we deliberately refrained from specifying the domain (pixel-space or voxel-space) of our method, since the basic ideas of segmentation into contexts and context sensitive normal operations can be implemented in both spaces. For simplicity, we have chosen to present in the next section an implementation of a pixel-based version of the context sensitive method. It employs simple operators so that the quality of the general principle in its simplest implementation can be assessed. Moreover, the simplicity and locality of the operators we used, coupled with the data independence between phases, form the basis for a pipelined hardware implementation of the context sensitive method.

5. IMPLEMENTATION OF THE CONTEXT SENSITIVE METHOD

We conclude by providing some technical details concerning our specific implementation of the four stages of the context sensitive normal estimation method: segmentation, prefiltering, gradient calculation, and postfiltering. As mentioned above, we have chosen the pixel-space as the domain of our implementation, and we have strived to maintain locality and simplicity in the implementation of all four stages. In the first stage, our implementation looks for two types of changes in the depth function in order to identify context edges:

(i) A sharp change in the depth function, indicating a possible C0 discontinuity. This change is detected by observing whether |F_{i,j}| > s0 for some predefined value s0.

(ii) A sharp change in the first derivative of the depth function, indicating a possible C1 discontinuity. This change is detected by observing whether |C_{i+1,j} - C_{i,j}| > s1 for some predefined value s1.

The exact numerical values of the "sharpness" parameters s0 and s1 are application dependent. Dealing with geometrically defined scenes, we discovered that the values 8 and 5, respectively, produce good results in most cases.
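A minimal sketch of the two edge tests, assuming the depth buffer is indexed in bounds (border handling omitted); the defaults follow the values 8 and 5 reported above:

def is_context_edge(D, i, j, s0=8.0, s1=5.0):
    # Test (i): |F_{i,j}| > s0 flags a possible C0 discontinuity.
    # Test (ii): |C_{i+1,j} - C_{i,j}| > s1 flags a possible C1 discontinuity.
    F = D[i + 1][j] - D[i][j]             # forward difference (Equation 2)
    C_here = D[i + 1][j] - D[i - 1][j]    # central difference at (i, j)
    C_next = D[i + 2][j] - D[i][j]        # central difference at (i + 1, j)
    return abs(F) > s0 or abs(C_next - C_here) > s1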

In the prefiltering and postfiltering stages we use simple, context sensitive, 3×3 weighted averaging filters. In order to formalize the computation of the context sensitive filtering, we use the following notation:

P(p_{i,j}, p_{r,s})  A predicate that returns true (1) if the pixels p_{i,j} and p_{r,s} are in the same context, and false (0) otherwise.

N_x, N_y  The neighborhoods included in the window seen by the filter along the X and Y axes (e.g., in the case of a 5×3 filter, N_x = {0, ±1}, N_y = {0, ±1, ±2}).

w_{k,l}  The user controllable filter weights, where Σ_{k∈N_x} Σ_{l∈N_y} w_{k,l} = 1.

V_{i,j}  The original data value at (i,j).

Using this notation, R_{i,j}, the final filtered value at (i,j), is:

R_{i,j} = [ Σ_{k∈N_x} Σ_{l∈N_y} P(p_{i,j}, p_{i+k,j+l}) w_{k,l} V_{i+k,j+l} ] /
          [ Σ_{k∈N_x} Σ_{l∈N_y} P(p_{i,j}, p_{i+k,j+l}) w_{k,l} ]    (7)
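A direct, unoptimized rendering of Equation 7 for the 3×3 case; representing a context by an integer label per pixel (so that the predicate P becomes label equality) is our assumption:

def context_filter(V, ctx, w):
    # V: 2D data values; ctx: 2D context labels; w: 3x3 weights summing
    # to 1 with a positive center weight. Border pixels are left untouched.
    H, W = len(V), len(V[0])
    out = [row[:] for row in V]
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            num = den = 0.0
            for k in (-1, 0, 1):
                for l in (-1, 0, 1):
                    if ctx[i + k][j + l] == ctx[i][j]:     # predicate P
                        num += w[k + 1][l + 1] * V[i + k][j + l]
                        den += w[k + 1][l + 1]
            out[i][j] = num / den   # den > 0: the center pixel always matches
    return out

The same routine serves both the prefiltering and the postfiltering stages, applied to depth values and to normal values respectively.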

Although averaging filters used in the prefiltering and postfiltering phases cope with white
noise to some extent, it is recommended to run a median filter [Rosenfeld and Kak 1982]
before segmentation when the image contains binary noise ("salt and pepper") [Rosenfeld
and Kak 1982] in order to reduce false boundary detection.
Figure 6: The result of applying curve fitting to contexts. (Panels: Continuous Image; Discrete Projection; Error, Curve Fitting; configurations A-J.)

Figure 7: Performance of the context sensitive normal estimation method, using s0 = 5 and s1 = 3. (Panels: Continuous Image; Discrete Projection; Analytic Normal; Discrete Normal; Error, Context Sensitive Gradient; configurations A-J.)

Figure 8: Error comparison between the surveyed normal estimation methods, showing the Error region from Figures 1-7. (Rows: Contextual; Depth Gradient [3]; Depth Gradient [5]; Weighted Depth Gradient [3]; Interpolated Depth Gradient [3]; Curve Fitting; Context Sensitive Gradient.)

Normal values are calculated by using a variation of the central difference (Equation 3) as an estimation of the surface inclination. Since the computation must be context sensitive, forward or backward differences are used at the context boundaries. We define G_{i,j}, the gradient along the X axis at (i,j), as follows:

G_{i,j} = C_{i,j}/2   if P(p_{i,j}, p_{i+1,j}) and P(p_{i,j}, p_{i-1,j})
          F_{i,j}     if P(p_{i,j}, p_{i+1,j})
          B_{i,j}     if P(p_{i,j}, p_{i-1,j})
          0           otherwise                                          (8)

It should be noted that although the normal is not defined where the derivative is not
continuous (e.g., G in Figure 7), we regard each point in discrete space as if it belonged to
an infinite imaginary surface (e.g., the line F or the line H in Figure 7). The normal to the
discrete point is defined as the normal to the imaginary object at that point.
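A sketch of Equation 8 under the same label-equality reading of the predicate P; the backward-difference case keeps the paper's sign convention from Equation 1:

def context_gradient(D, ctx, i, j):
    # Context sensitive gradient along X at (i, j).
    same_right = ctx[i + 1][j] == ctx[i][j]    # P(p_{i,j}, p_{i+1,j})
    same_left = ctx[i - 1][j] == ctx[i][j]     # P(p_{i,j}, p_{i-1,j})
    if same_right and same_left:
        return (D[i + 1][j] - D[i - 1][j]) / 2.0   # C_{i,j}/2
    if same_right:
        return D[i + 1][j] - D[i][j]               # F_{i,j}
    if same_left:
        return D[i - 1][j] - D[i][j]               # B_{i,j}
    return 0.0                                     # isolated point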

The actual implementation of the four stages of our technique can be adjusted as a
tradeoff between accuracy and computing time. The segmentation phase can either employ
simple local edge detectors (e.g., in a noiseless image) or apply sophisticated discontinuity
detection algorithms (e.g., [Grimson and Pavlidis 1985, Saito and Takahashi 1990]),
possibly using previous knowledge on the image domain. Prefiltering and postfiltering
operators can either employ simple local smoothing filters or examine larger or variable size
neighborhoods and employ a variety of smoothing operators [Rosenfeld and Kak 1982].
Normal estimation can use the simple central difference approximation or use values from
the whole context.

Figure 7 depicts the behavior of our context sensitive method over a set of configurations.
Figure 8 collapses Figures 1-7 by displaying only the Error region for purposes of
comparison; the superiority of the context sensitive normal estimation is clear. The slight
errors appearing near C and F (see Figure 7) are due to the application of smoothing
operators across undetected context boundaries at these points. Employing a more
sophisticated segmentation method will remedy this problem. Alternatively, voxel-space
segmentation is expected to detect most context boundaries not detected by the pixel-space
segmentation since it is not affected by the data reduction caused by projection.
Compared to the curve fitting method, our method is less sensitive to inaccuracies of the
segmentation process because a slight error in the exact position of the context boundary
influences the normal estimation calculation only in a local neighborhood (compare J in
Figures 6 and 7).

Figures 9-12 show the results of shading a volumetric image using normal values provided
by our context sensitive normal estimation method. Although we emphasize in this paper
the analytic examination of normal estimation methods and warn against the use of images
to assess their fidelity, we present these figures as illustrative examples. Figures 9 and 10 portray the superiority of the context sensitive method over depth gradient shading in the case of sampled data, while Figures 11 and 12 demonstrate it in the case of synthetic models. Shading of the 200³ resolution images took less than 16 seconds on a Sun-4 workstation, while the 256³ resolution images took 24 seconds. Table 1 presents the average error in
degrees of each method in a variety of surface arrangements. The behavior of each method
was measured in a scene rich with staircase artifacts (Stairs in Table 1), in a set of objects

Table 1: Average error in degrees of normal estimation methods in different types of scenes.

              Contextual   Depth      Gradient      Weighted   Curve    Context
                           Gradient   Interpolate   Gradient   Fit      Sensitive
Stairs          14.73      12.54       2.09         12.54       1.30     2.69
Angles          10.51       8.95       2.97          8.95       1.39     2.58
C0 Discont.      4.57       7.41      18.32          2.31       1.99     2.04
C1 Discont.     14.56       8.38      11.96         12.97       6.71     6.92
C2 Discont.     11.82      10.64       4.95         10.64      13.13     4.48
Variety         10.04      13.62      18.15          8.83       5.81     4.43
Average         11.04      10.26       9.74          9.37       6.06     3.94

with different surface inclinations (Angles in Table 1), in scenes with C0, C1, and C2 discontinuities, and in a scene containing a variety of configurations (Variety in Table 1), which is the one shown in Figures 1-7.

6. CONCLUDING REMARKS

In this paper we have analyzed existing methods for normal estimation for discrete
surfaces. Based on the observations made concerning the causes of the flaws encountered in
these methods, we have devised the context sensitive approach. Although we employ
simple operators in our implementation, we are able to demonstrate the superior accuracy
of our approach.

Our method can be enhanced and extended in various ways. A set of increasingly complex
(and more accurate) operators for segmentation and smoothing can be devised and applied
in an adaptive refinement fashion [Bergman et al. 1986]. It should be noted that in a
volume-rendering architecture where shading is performed in a separate process pipelined
with the projection stage [Cohen et al. 1990, Goldwasser et al. 1989], the time consumed
by normal estimation is significantly smaller than the time spent in the viewing operation.
This organization allows us to employ a more sophisticated and time consuming
implementation of the normal estimation method without any performance penalty.

Another promising enhancement is the replacement of the pixel based approach by a voxel
based approach in which the volumetric scene is divided into three-dimensional contexts,
and filtering and normal estimation are performed in voxel-space.

Acknowledgment

This project has been supported by the National Science Foundation under grants MIP-
8805130 and IRI-9008109, and grants from Hughes Aircraft Company, Hewlett Packard,
and Silicon Graphics. We thank Dan Gordon for his thorough and enlightening comments.
We thank Dr. P. Adams and Mr. B. J. Burbach of the Howard Hughes Medical Institute
for the use of the confocal microscope, and R. Avila for his help in processing the cell data.

Figure 9: A 200³ MRI dataset of a head, shaded with depth gradient shading (left) and context sensitive shading (right).

Figure 10: A 256³ dataset from a confocal microscope of a dissociated bullfrog sympathetic ganglion cell, shaded with depth gradient shading (left) and context sensitive shading (right).

Figure 11: Newell's teapot voxelized into a 200³ volume, shaded with depth gradient shading (left) and context sensitive shading (right).

Figure 12: A polyhedron voxelized into a 256³ volume, shaded with depth gradient shading (left) and context sensitive shading (right).

7. REFERENCES

Bergman, L., Fuchs, H., Grant, E., and Spach, S., "Image Rendering by Adaptive Refinement", Computer Graphics, 20, 4, 29-37, (August 1986).

Bright, S. and Laflin, S., "Shading of Solid Voxel Models", Computer Graphics Forum, 5, 2, 131-138, (June 1986).

Bryant, J. and Krumvieda, C., "Display of Discrete 3D Binary Objects: I-Shading", Computers & Graphics, 13, 4, 441-444, (1989).

Chen, L., Herman, G. T., Reynolds, R. A., and Udupa, J. K., "Surface Shading in the Cuberille Environment", IEEE Computer Graphics and Applications, 5, 12, 33-43, (December 1985).

Cohen, D. and Kaufman, A., "Scan-Conversion Algorithms for Linear and Quadratic Objects", in Volume Visualization, A. Kaufman, (ed.), IEEE Computer Society Press, Los Alamitos, CA, 280-301, 1990.

Cohen, D., Kaufman, A., Bakalash, R., and Bergman, S., "Real-Time Discrete Shading", The Visual Computer, 6, 1, 16-27, (February 1990).

Goldwasser, S. M., "Rapid Techniques for the Display and Manipulation of 3-D Biomedical Data", Proceedings of NCGA '86 Conference, II, 115-149, (May 1986).

Goldwasser, S. M., Reynolds, R. A., Talton, D. A., and Walsh, E. S., "High Performance Graphics Processors for Medical Imaging Applications", in Parallel Processing for Computer Vision and Display, P. M. Dew, R. A. Earnshaw, and T. R. Heywood, (eds.), Addison Wesley, Reading, MA, 461-470, 1989.

Gordon, D. and Reynolds, R. A., "Image Space Shading of 3-Dimensional Objects", Computer Vision, Graphics, and Image Processing, 29, 3, 361-376, (March 1985).

Grimson, W. E. L. and Pavlidis, T., "Discontinuity Detection for Visual Surface Reconstruction", Computer Vision, Graphics, and Image Processing, 30, 316-330, (1985).

Hoehne, K. H. and Bernstein, R., "Shading 3D-Images from CT Using Gray-Level Gradients", IEEE Transactions on Medical Imaging, MI-5, 1, 45-47, (March 1986).

Hoehne, K. H., Bomans, M., Pommert, A., Riemer, M., Schiers, C., Tiede, U., and Wiebecke, G., "3D-Visualization of Tomographic Volume Data Using the Generalized Voxel Model", The Visual Computer, 6, 1, 28-37, (February 1990).

Horn, B. K. P., "Hill Shading and the Reflection Map", Geo-Processing, 2, 65-146, (1982).

Kaufman, A. and Shimony, E., "3D Scan-Conversion Algorithms for Voxel-Based Graphics", Proceedings of the 1986 Workshop on Interactive 3D Graphics, Chapel Hill, NC, 45-75, October 1986.

Kaufman, A., "Efficient Algorithms for 3D Scan-Conversion of Parametric Curves, Surfaces, and Volumes", Computer Graphics, 21, 4, 171-179, (July 1987).

Kaufman, A., "Efficient Algorithms for 3D Scan-Converting Polygons", Computers & Graphics, 12, 2, 213-219, (1988).

Magnusson, M., Lenz, R., and Danielsson, P. E., "Evaluation of Methods for Shaded Display of CT Volumes", Proceedings 9th International Conference on Pattern Recognition, II, 1287-1294, (November 1988).

Pommert, A., Tiede, U., Wiebecke, G., and Hoehne, K. H., "Surface Shading in Tomographic Volume Visualization: A Comparative Study", Proceedings of the First Conference on Visualization in Biomedical Computing, Atlanta, GA, 19-26, May 1990.

Potmesil, M. and Chakravarty, I., "Synthetic Image Generation with a Lens and Aperture Camera Model", ACM Transactions on Graphics, 1, 85-108, (1982).

Saito, T. and Takahashi, T., "Comprehensible Rendering of 3-D Shapes", Computer Graphics, 24, 4, 197-206, (August 1990).

Schlusselberg, D. S., Smith, K., and Woodward, D. J., "Three-Dimensional Display of Medical Image Volumes", Proceedings of NCGA '86 Conference, III, 114-123, (May 1986).

Tam, Y. W. and Davis, W. A., "Display of 3D Medical Images", Proceedings of Graphics Interface '88, Edmonton, Alberta, 78-86, June 1988.

Tiede, U., Hoehne, K. H., Bomans, M., Pommert, A., Riemer, M., and Wiebecke, G., "Investigation of Medical 3D-Rendering Algorithms", IEEE Computer Graphics & Applications, 10, 3, 41-53, (March 1990).

Torrance, K. E. and Sparrow, E. M., "Theory for Off-Specular Reflection from Roughened Surfaces", Journal of the Optical Society of America, 57, 1105-1114, (1967).

Udupa, J. K. and Hung, H. M., "Surface Versus Volume Rendering: A Comparative Assessment", Proceedings of the First Conference on Visualization in Biomedical Computing, Atlanta, GA, 83-91, May 1990.

Warn, D. R., "Lighting Controls for Synthetic Images", Computer Graphics, 17, 13-21, (1983).

Webber, R. E., "Ray Tracing Voxel Based Data via Biquadratic Local Surface Interpolation", The Visual Computer, 6, 1, 8-15, (February 1990).

Whitted, T., "An Improved Illumination Model for Shaded Display", Communications of the ACM, 23, 6, 343-349, (June 1980).

Roni Yagel is a PhD candidate at the Department of Computer Science and a researcher at the Department of Physiology and Biophysics at the State University of New York at Stony Brook. He received his BSc Cum Laude and MSc Cum Laude from the Department of Mathematics and Computer Science at the Ben-Gurion University of the Negev, Israel, in 1986 and 1987, respectively. His PhD research deals with both hardware and software methods for efficient volume rendering. His research interests also include algorithms for voxel-based graphics and imaging, three dimensional user interfaces, visualization in biology, and animation.
Address: Department of Computer Science, State University of New York at Stony Brook, Stony Brook, New York 11794-4400, USA.
Electronic mail: yagel@sbcs.sunysb.edu

Daniel Cohen is a PhD candidate at the Department of Computer Science at the State University of New York at Stony Brook. He is currently working on voxelization techniques for his doctoral dissertation. His research interests also include volume visualization and architectures and algorithms for voxel-based graphics. Prior to coming to Stony Brook, he was a software engineer at Afkon, Ltd., working on bitmap graphics. He holds a BSc Cum Laude in both Mathematics and Computer Science (1985) and an MSc Cum Laude in Computer Science (1986) from Ben-Gurion University.
Address: Department of Computer Science, State University of New York at Stony Brook, Stony Brook, New York 11794-4400, USA.
Electronic mail: dany@sbcs.sunysb.edu

Arie Kaufman is a Professor of Computer Science at the State University of New York at Stony Brook. He is the director of the Cube project for volume visualization supported by the National Science Foundation, Hughes Aircraft Company, Hewlett-Packard Company, Silicon Graphics Company, and the State of New York. Kaufman has held positions as a Senior Lecturer and the Director of the Center of Computer Graphics of the Ben-Gurion University in Beer-Sheva, Israel, and as an Associate and Assistant Professor of Computer Science at FIU in Miami, Florida. His research interests include volume visualization; computer graphics architectures, algorithms, and languages; user interfaces; and scientific visualization. Professor Kaufman has lectured widely and published numerous technical papers in these areas. He has been the Papers Chair and Program co-Chair for the Visualization '90 and Visualization '91 Conferences, respectively, co-Chair for several EUROGRAPHICS Graphics Hardware Workshops, and a member of the IEEE CS Technical Committee on Computer Graphics. He received a BS in Mathematics and Physics from the Hebrew University of Jerusalem in 1969, an MS in Computer Science from the Weizmann Institute of Science, Rehovot, in 1973, and a PhD in Computer Science from the Ben-Gurion University in 1977.
Address: Department of Computer Science, State University of New York at Stony Brook, Stony Brook, New York 11794-4400, USA.
Electronic mail: ari@sbcs.sunysb.edu
Rapid Volume Rendering Using
a Boundary-Fill Guided Ray Cast Algorithm
Peter M. Hall and Alan H. Watt

ABSTRACT

Motivated by experience in building radiotherapy planning systems, this article is concerned with reducing the time taken to render a three dimensional data volume via volume rendering methods. The method described here reduces the render time by minimising the number of rays cast. This is achieved by adapting the boundary fill algorithm of two dimensional raster graphics to guide ray casting. Because ray casting occurs in an order dictated by the boundary fill algorithm, the approach will be called a boundary-fill guided ray cast (bfg ray cast). This technique can also be used to isolate critical regions within the volume.

Keywords: Volume Rendering, Boundary Fill, Ray Casting

INTRODUCTION

Volume rendering is a group of well established techniques used to display scalar fields of three spatial variables using three dimensional rendering primitives. A scalar field or data volume consists of voxels, usually cubic and arranged to partition the data volume exactly. In addition to the primary scalar value, each voxel contains values calculated from the scalar field, such as colour, opacity and field gradient, which are used during rendering. Such algorithms first appeared in 1988 (Drebin 1988; Levoy 1988; Upson 1988; Sabella 1988). These approaches are distinguished not only by the method in which the voxel properties are calculated but also by the way in which the voxel data is mapped onto the image plane pixels (Westover 1989). Forward mapping techniques (Drebin 1988; Upson 1988) answer the question "to what pixels does a voxel contribute?" whereas backward mapping techniques answer the question "from which voxels does a pixel receive a contribution?" (Upson 1988; Levoy 1988; Sabella 1988). Voxel properties invariably include colour and opacity terms as well as a surface normal derived from the local field values by means of, say, central differences. A typical volume rendering, of a human head, is shown in plate 1.


Volume rendering can prove costly in terms of time, a fact previously recognised
(Levoy 1990a). This is especially so where only a small part of the whole data
volume is required for display. For example, it might be that only the spinal
column of a patient is of interest. The algorithms previously cited process the
whole of the data volume in order to display just that part required. It was the
desire to efficiently display small, isolated three dimensional regions of interest
within a large data volume that motivated the work presented in this paper. The
algorithm described works by constraining the rays cast to those regions which
are of interest. This reduces the number of rays cast and hence time is not wasted
by processing rays which contribute no additional information to the final image.
It is a ray casting technique and so falls into the backward mapping class of
volume renderers.

To the best knowledge of the authors, there is only one other existing approach which attempts to reduce the number of rays cast for the purpose of saving time when using volume rendering: adaptive refinement by Levoy (Levoy 1990a) (which is similar to the technique described by Mitchell (Mitchell 1987), who adaptively casts rays to produce low cost antialiased images). In Levoy's approach, rays are initially cast into the data volume every few pixels and interpolation is used to fill in between casting points. If the colour difference between two neighbouring casts exceeds some threshold value then the line joining them is subdivided and the ray cast - interpolate - threshold check cycle restarts. So long as the colour difference is sufficiently large, this process continues until the distance between neighbours becomes pleasingly small, so that antialiasing may occur. As the author points out, this method trades image quality for an improved rendering time. The bfg ray cast algorithm makes no such trade off.

Reducing the number of rays cast is only one of two general approaches which
might be adopted in reducing overall rendering time. The alternative general
approach is to reduce the average processing time for each voxel. A variety of
methods exist for this, which will only be touched upon here. One such method is
to track integration points along the ray measured with respect to two frames of
reference simultaneously. One frame is rigidly attached to the image plane on
which the pixels rest, here called image space, the other is rigidly attached to the
data volume, here called object space. This means that all the integration points
need not be transformed from image space into object space so that the object
voxel can be indexed. Instead the position within object space is automatically
updated for each step along the ray, hence the number of matrix calculations is
very much reduced. The approach has been compared to drawing a line in three
dimensional space using a digital differential analyser (Westover 1989).

A second method to reduce process time per voxel is to use a simplified shading scheme. These schemes affect the way the surface normal in a voxel is computed; a large body of work exists on the subject, to enumerate: constant shading (Herman 1979; Chen 1985); depth only shading (Herman 1981; Vannier 1983); gradient shading (Gordon 1985); normal-based contextual shading (Chen 1985); congradient shading (Cohen 1990). An adequate review of these methods can be found in (Cohen 1990). These shading methods all trade image quality for time, and a cross section of shading methods with associated errors has been compiled (Heinz-Horne 1990). Adaptive termination of rays may also be used, in which the ray cast step terminates when the accumulated opacity at the pixel rises to saturation. Equivalently, the intensity of the ray being cast is reduced below a threshold.

By restructuring the data base into the form of an octree, the average time taken to process each voxel in the data volume can be further reduced. This approach is taken by Levoy (Levoy 1990b), who generates an octree from the underlying voxel data. Because voxels with a common value form leaves of the octree, entire groups of voxels may be processed simultaneously.

The boundary-fill guided ray casting (bfg ray cast) algorithm described here is a volume rendering algorithm which reduces turn round time. It works when the data volume contains some spatially coherent "object" which is of interest to the viewer. Topologically the "object" should form an n-connected region in the data volume. Those portions of the data volume which are not in the object are in "vacuum" (a set of disjoint n-connected regions). The need for the object to form an n-connected region may seem rather restrictive; it may even be seen as cutting across the volume rendering principle of displaying the whole data volume at once. However, in practice there are important application areas where the data volume can be expected to hold a suitable object. To give just one instance, medical data might hold the torso of a patient as an object, with the air around the patient as 'vacuum'. Again, plate 1 serves to illustrate the point, for it shows the binary division of the data volume into object and vacuum.

THE BOUNDARY-FILL GUIDED RAY CAST

The boundary fill algorithm is one of a family of area filling algorithms in which a 'seed' is used. These algorithms, common in two dimensional raster graphics, are employed when the area boundary is unknown but a pixel inside the area, a seed, is known. The boundary fill is used to fill an area up to, but not including, a specific border colour. The algorithm covers all pixels of all colours up to that border. For a detailed description of this general type of algorithm the reader is referred to (Rogers 1985). A filling algorithm of this type has four conceptual parts (Fishkin 1984): the propagation method, which traces a line of search over the pixel plane to the edges of a boundary; the start procedure, which is used to initiate the process; the inside procedure, a test which discriminates between regions on the plane; and the set procedure, which colours the pixels deemed to lie inside.

The major departure of the fill procedure described here from conventional seed fill algorithms is that there is no initial boundary marked on the pixel plane. Instead, the boundary is formed by rays cast which do not pass through the object of interest; these rays contribute no additional information to the image and the pixels from which they are cast remain unaltered. This induces the use of a 'cast-flag' plane over the pixel plane which charts the progress of the fill. Hence the bfg ray cast is more like a one-pass pattern fill algorithm; these fill bounded areas with a pattern which may include the boundary colour, rather than with solid colour.

Fig. 1: The object of interest projected onto the pixel plane delimits the ray casting process. The object of interest is projected onto the pixel plane, marked by the dark shaded region; the perimeter of its projection by the pale shaded region. Rays are cast from both shaded regions; those cast from the light region do not pass through the object and terminate the fill. The fill propagates scanline by scanline over spans from the seed S, between terminating pixels L and R. No rays at all are cast from the blank region on the pixel plane.

One-pass pattern fill algorithms also use a 'flag' plane to control fill propagation. Boundary fill algorithms seek out the perimeter of the region which they fill, and the bfg ray cast is no exception; ray casts are limited to the area which lies within the perimeter of the area covered by the object when projected onto the pixel plane. This is in contrast to the more usual scanline order approach to guiding the ray cast. Because rays are cast from every important pixel site when filling, there is no trade off between image quality and time.

The situation is depicted in fig 1, where the area in which rays are cast using the bfg system is shown. The advantage of this approach is that the number of rays which do not pass through the object is minimised, and so a fall in turn round time can be expected. The fraction by which the turn round time is changed can be computed: suppose the pixel plane has area A, and that the object seen in projection on this plane has area fA, where 0 <= f <= 1. Then, because turn round time is directly proportional to the number of rays cast, the time to render using bfg ray casting is approximately fT, where T is the time to volume render in traditional scanline order. This computation is an approximation, for there is more overhead involved in the bfg ray cast than in the scanline order ray cast, so that the time taken to cast one ray is scaled up from t to kt, where k is an 'efficiency factor' of at least one. Hence the turn round time is now kfT, and this will be less than T if kf < 1. This gives a guideline as to when the bfg ray cast can be efficiently used, namely when f < 1/k. For a fixed efficiency factor, k, this provides an upper limit of 1/k on the area fraction projected by the object of interest.

An Overview of the Algorithm

The boundary-fill guided ray cast is an adaptation of the raster graphic method.
Overall the method works as follows: First a seed is required; by seed is meant a
pixel whose cast ray passes through the object of interest. At this seed pixel a ray
is cast into the 3D image to determine the colour of the pixel; to prevent future
casts from this seed, the cast is recorded in an array of flags parallel with the pixel
plane, the cast-flag plane. The colouring of pixels, using the value returned from
the ray cast, is the set procedure for the bfg ray cast.

Next a span on the scan line of the seed is determined by evaluating pixels to the
left and right of the seed, as seen in fig 1. This process involves casting a ray from
each pixel to determine its colour. It is terminated whenever a pixel is reached
from which a ray has previously been cast, or the cast ray misses the image body.
Hence the inside procedure consists of testing the cast-flag plane and, if no ray is
cast, raising the flag and casting a ray. The inside procedure must return a flag
indicating a 'hit' or 'miss' of the object of interest and also the colour of the ray
cast for a hit.

The extreme left and right pixels found either side of a seed, here called the terminating pixels, are stored. Finally, in the closed interval between these extreme pixels the scan lines immediately above and below the current scanline are treated as potential seed pixels and the algorithm recurses. This propagation process (which makes the fill a 'scan-line' boundary fill (Rogers 1985)) is repeated for all of the seeds, one seed for each distinct image body; the result is a volume rendered image in a reduced amount of time. Pseudo code, phrased in terms of the four major fill components, for the bfg ray cast algorithm is presented in fig 2.

The inside procedure used affects which parts of the data volume count as being inside the region or objects of interest. For instance, if the termination criterion is simply that the cast ray must be attenuated, having passed through at least one object of interest, then all of the region which has an opacity greater than zero will be displayed. When used like this the algorithm may be compared to the so called 'tint fill' (Smith 1979), which fills a two dimensional region according to value level in the HSV colour system. If, however, it is desired to display just a part of the non-vacuum region, then a more complicated 'hit' condition may need to be imposed. For example, the ray must pass through at least one voxel which contains a scalar value within a given range so as to yield a surface within the volume.

Finding a Seed

As previously mentioned, the boundary fill algorithm needs a seed to kick it off
and this is found using a start procedure. If the user already knows the
whereabouts of a point in the object, measured in the frame of the data volume,
then a seed on the pixel plane can be found by projection of the known point onto
the pixel plane. The position of the projected pixel can then be measured in both
coordinates systems and the bfg ray cast algorithm becomes feasible.

If no seed is known then one must be found, for which a plethora of techniques exist, only two of which are mentioned here. Perhaps the quickest way to locate a seed is to search in the frame of the data volume, because the search

inside(pixel,colour) {Return TRUE if a ray is cast which hits the object, FALSE otherwise}
Begin {Return the colour of the ray cast if TRUE as a 'side effect'}
    If no ray has been cast from this pixel Then
        cast ray to yield a colour and a hit/miss flag
        raise the cast flag {This prevents further rays being cast from this pixel}
        If ray hit object of interest Then
            inside returns TRUE {the colour yielded by ray cast needs to be added to image}
        Else
            inside returns FALSE
        Endif
    Else
        inside returns FALSE
    Endif
End

set(pixel,colour)
Begin
    colour pixel {using colour yielded by ray cast}
End

locate_left_terminating_pixel(seed)
Begin
    start pixel at left of seed
    While inside(pixel,colour) returns TRUE Do
        set(pixel,colour)
        move pixel to the left
    EndWhile
End

locate_right_terminating_pixel(seed)
    {As locate_left_terminating_pixel, except substitute right for left}

propagate(pixel)
Begin
    If inside(pixel,colour) returns TRUE Then
        set(pixel,colour)
        locate_left_terminating_pixel(pixel) {Find the left and}
        locate_right_terminating_pixel(pixel) {right limits of this span}
        For each pixel between terminating pixels on the scan line above Do
            propagate(pixel) {recurse fill on the span above}
        EndFor
        For each pixel between terminating pixels on the scan line below Do
            propagate(pixel) {recurse fill on the span below}
        EndFor
    Endif
End

start
Begin
    find a seed pixel {say by projection of a point inside the region of interest}
    propagate(seed_pixel) {do the bfg ray cast}
End

Fig 2: Pseudo Code for the Boundary Fill Guided Raycast Algorithm,
based loosely on PASCAL. Key words in Bold.
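As an illustration (not part of the paper), a compact Python rendering of the logic of fig 2; cast_ray(x, y) -> (hit, colour) and set_pixel(x, y, colour) are assumed to be supplied by the renderer, and an explicit stack replaces the recursion so that large regions cannot exhaust the call stack:

def bfg_ray_cast(seed, width, height, cast_ray, set_pixel):
    cast = [[False] * width for _ in range(height)]    # the cast-flag plane

    def inside(x, y):
        # Cast at most one ray per pixel; raised flags bound the fill.
        if x < 0 or x >= width or y < 0 or y >= height or cast[y][x]:
            return False, None
        cast[y][x] = True
        return cast_ray(x, y)

    stack = [seed]                     # seed: an (x, y) pixel inside the object
    while stack:
        x, y = stack.pop()
        hit, colour = inside(x, y)
        if not hit:
            continue
        set_pixel(x, y, colour)
        lx = x - 1                     # grow the span to the left ...
        while True:
            hit, colour = inside(lx, y)
            if not hit:
                break
            set_pixel(lx, y, colour)
            lx -= 1
        rx = x + 1                     # ... and to the right
        while True:
            hit, colour = inside(rx, y)
            if not hit:
                break
            set_pixel(rx, y, colour)
            rx += 1
        for yy in (y - 1, y + 1):           # pixels of this span on the lines
            for xx in range(lx + 1, rx):    # above and below become seeds
                stack.append((xx, yy))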

position need only be expressed in frame coordinates of the data volume until a
point in the object is located. This point is then treated as above, by projection
onto the pixel plane. If more than one object of interest exists then a three
dimensional boundary fill might be used which marks out connected regions in
space. This prevents two seeds from the same region being collected. Hence the
algorithm works in two phases: first an off-line collection of a seed list, and then the use of the seed list to initiate the fill process.

An alternative, but slower, method is a pixel plane based search. Here a 'feeler ray' is sent out, in scanline order, from each pixel whose cast-flag is low. If the ray misses the object the cast-flag is raised; otherwise the fill procedure is invoked. The 'feeler ray' need contain no more than a simple threshold test to determine whether or not the ray passes through the object of interest. This is slower than the previous method, but will render distinct objects. The method can be expected to be faster overall than scanline order volume rendering because the feeler ray kernel is so simple. Moreover, this process need be carried out once only, for it can be used to locate a point within the three dimensional region of interest which can then be used as a known seed in the fashion described above.

Efficiency Considerations

To promote efficiency the seed used is measured with respect to two reference
frames simultaneously; the image space frame and the object space frame. The
former measure allows easy use of the cast-flag plane and plotting of pixel
colours. This measure is exactly the conventional seed used in two dimensional
raster graphic boundary fill algorithms. The latter measure tracks the seed in the
three dimensional frame of object space. This allows for use of the 'digital
differential analyser' (Westover 1989) approach to ray casting described above.

It has been previously shown that the time taken in ray casting for the bfg method is fkT. Further analysis shows that the value of k depends, for a given propagation method, on the shape of the projection of the region of interest. Fishkin and Barsky state that the time complexity of a seed fill algorithm is proportional to the sum of the number of pixels in the interior of the region filled and the number of pixels which bound the region (Fishkin 1984). Hence the time complexity of a seed fill algorithm depends on both the shape and the size of the region to be filled. The efficiency factor, k, takes only the shape into account. It can be shown that

k = 1 + n*g >= 1,

where n is the average number of visits to each pixel and g is the ratio of the average time to visit each pixel to the time to cast one ray. The value of n depends on the shape of the region and on the propagation method used. The value of g depends on both the propagation method and the ray cast method. The bfg ray cast relies on having very small values of g to compensate for the quite large values of n, which can reach as high as four for poor propagation methods.
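To make the criterion concrete, a worked sketch using the measurements reported at the end of this section (pixel visit time 2.16×10⁻⁵ s, ray cast time 1.2×10⁻³ s, about three visits per pixel for a rectangle under the simple propagation method):

g = 2.16e-5 / 1.2e-3    # ratio of pixel-visit time to ray-cast time
n = 3                   # average visits per pixel (rectangle)
k = 1 + n * g           # predicted efficiency factor, about 1.054
print(k, 1 / k)         # bfg pays off whenever the projected area fraction f < 1/k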

RESULTS

The bfg algorithm was implemented on a SUN SPARCstation 1. The host machine is attached to a network of other SUN workstations, so the results given below are the average of many trials, with an error margin of 5% either way. Implementation was made fairly general so that different methods of computing voxel properties could be used by the inclusion of appropriate object modules at link time. The propagation method used was that presented in fig 2. Two approaches to assignment of voxel properties were used: one to demonstrate the use of the bfg in the medical field by isolating critical structures, the other to show that test values support the theoretical result for the time spent bfg ray casting.

By computing the voxel properties in a manner similar to that described by Drebin et al (Drebin 1988), plates 1, 2 and 3 were produced. Within the generalised
framework this code was written to be as efficient as possible without explicit
optimisation; use was made, for example, of adaptive ray termination. Plate 1
shows a human skull, whilst plate 2 shows a human female torso. The bfg raycast
was used to isolate the backbone of this torso; the image appears in plate 3. This
demonstrates the potential of the bfg ray cast in rapidly isolating internal critical
structures.

These results, which are presented in table 1, indicate that the bfg ray cast can produce volume renderings using medical data much more quickly than the scanline order method. The values of f and of k were computed from the measured values of rays cast and time taken in seconds for each image as

f = bfgcast/scancast,

k = bfgtime/(f*scantime),

where bfgcast is the number of rays cast using bfg, scancast the number of rays cast using a scanline approach; bfgtime and scantime are the times taken to bfg ray cast and scanline ray cast respectively. The value of k varies with picture contents, reflecting the efficiency of the algorithm for a particular view of a particular data volume.
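For example, a few lines of Python reproduce the plate 1 entries of table 1 from the measured counts and times:

def efficiency(bfgcast, scancast, bfgtime, scantime):
    f = bfgcast / scancast        # fraction of rays actually cast
    k = bfgtime / (f * scantime)  # overhead of the guided fill
    return f, k

print(efficiency(54733, 325440, 2851, 15246))   # about (0.17, 1.1)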

In support of the theoretical results that time to bfg ray cast is a fraction kf of the
time to scanline ray cast, and that k is given by 1 + n*g, a number of test objects
were produced and timing measurements were made. Whilst ray casting the test
objects the computation of voxel properties was such that each voxel took the

Table 1

                bfg order              scanline order
image       rays cast  time (s)     rays cast  time (s)      f      k     1/k
plate 1       54733      2851        325440     15246      0.17    1.1    0.9
plate 2      134958     14615        423120     22634      0.32    2.0    0.5
plate 3       58445      1610        423120     10477      0.14    1.1    0.9

same amount of time to process regardless of what it contained. This exposed the
effects of the bfg ray cast, leaving aside efficiency measures such as adaptive ray
termination. Because the bfg ray cast is driven only by the silhouette shape of the
regions of interest and because each voxel took the same time to process, the test
objects were made to be planar objects in a three dimensional space.

The measured values of the number of rays cast, the number of rays which hit, and the total time taken to render are presented in table 2, with corresponding images in plate 4. Each of the images appears in a frame of 256*260 = 66560 pixels and each took an average of 80.0 seconds to volume render in scanline order. The 'rayscast' column records the values for the total number of rays cast using the bfg method. The 'interior' column indicates the number of rays cast which pass through the region of interest, whilst the column 'boundary' contains values for the number of rays cast which miss the region of interest (computed as rayscast - interior). The 'fulltime' and 'bfgtime' columns record the time to scanline and to bfg ray cast respectively. Values of f and k were computed in similar fashion to those in table 1.

Shapes 0, 1, 2, 3 appear from left to right on the top row of plate 4. These shapes are of constant perimeter but increasing area from left to right. They demonstrate that the time to ray cast is proportional to the fractional area (given by rayscast/66560 for this table) of the pixel plane covered by the projection. Shapes 4, 5, 6, 7, which appear from left to right in the middle row of plate 4, are of constant area but increase in perimeter from left to right. These show that the time to render is proportional to the length of the projection perimeter. On the bottom row appear shapes 8, 9, 10, 11, from left to right, which are of constant shape. These shapes increase in both area and perimeter from left to right. They demonstrate that the value of k is constant for a given shape; the variation in k is well within experimental error.

The results of the bottom row can also be used to show that the limiting fraction of 1/k holds; observe that the average time to bfg ray cast for shape 11 (81.9 seconds), which covers the field exactly, is greater than the average time to scanline ray cast (80.0 seconds).

Table 2

shape      rayscast  interior  boundary  fulltime  bfgtime      f      k    1/k
shape 0        9971      9295       676     81.10   12.407  0.150  1.021  0.979
shape 1       15904     15228       676     79.16   19.134  0.239  1.012  0.989
shape 2       21844     21168       676     80.80   26.862  0.328  1.013  0.987
shape 3       28900     28224       676     79.82   35.603  0.434  1.027  0.973
shape 4       14884     14400       484     79.01   17.962  0.224  1.016  0.984
shape 5       16312     14400      1912     80.49   20.551  0.245  1.042  0.960
shape 6       29143     14400     14743     82.48   38.008  0.438  1.052  0.950
shape 7       43206     14400     57606     80.12   54.321  0.649  1.045  0.957
shape 8        1156      1024       132     78.85    1.399  0.017  1.022  0.979
shape 9        4422      4160       262     78.96    5.377  0.066  1.025  0.975
shape 10      17160     16640       520     78.95   20.709  0.258  1.017  0.983
shape 11      66560     65532      1028     80.68   81.868  1.000  1.015  0.986

The average time to access a pixel was also measured, at 2.16×10⁻⁵ seconds; the average time to cast one ray is 1.2×10⁻³ seconds. Thus the value of g is 2.16×10⁻⁵ / 1.2×10⁻³ = 1.8×10⁻². For a rectangle each pixel is visited approximately three times by the simple propagation method used, hence k is predicted to be 1 + 3 × 0.018 = 1.054 for rectangles. This value lies within the accuracy of the experiment.
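
The prediction can be checked directly from the measured constants just quoted; this two-line Python sketch is our own.

g = 2.16e-5 / 1.2e-3       # pixel access time divided by ray cast time = 0.018
k_predicted = 1 + 3 * g    # n = 3 pixel visits per pixel for rectangles
print(round(k_predicted, 3))   # 1.054, within experimental error of Table 2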

CONCLUSIONS

The utility of volume rendering applied in the medical arena depends on the
development of interactive tools which allow the user to perform tasks such as
moving the viewpoint, cutting the volume or isolating specific internal structures.
This means bringing the processing time into an interactive range. Algorithms can
be made more efficient by controlling the number of rays cast or by reducing the
processing time per ray. This paper demonstrates that a simple but effective
approach to the first option is available without reducing image quality.

In addition to working rapidly, the bfg ray cast is capable of isolating internal structures within a region of interest. This feature makes it suitable for radiotherapy planning purposes, where the planners wish to ensure that treatment beams of radiation do not pass through critical structures of the body. By isolating these features the passage of the beam can be more clearly envisaged. Careful study of plates 2 and 3 will reveal that not all the bony structure present in plate 2 is represented in plate 3. This is because the original data set, taken at quite low resolution from a living patient, was reconstructed in such a way that the bones form a discontinuous region in three-dimensional space. When projected onto the pixel plane, regions of bone which are disconnected from the region which contains a seed are not rendered. Moreover, the parts of bone left out of the bfg ray cast picture will depend on the view angle. This difficulty represents an objection to the bfg ray cast. If a seed were either known or found for each discontinuous region then this difficulty would disappear.

The method works for data volumes which contain objects that are n-connected regions. It works especially well if the projected area of these objects is not too large compared to the area of projection of the total data volume, and the perimeter of an object's projection is not long compared to its projection area. A quantitative guideline, the measurement of k, has been proposed to test the algorithm's effectiveness for a given data volume. For a given data volume k will differ depending on the angle of view, but a statistical value may be associated with the volume. It has been found that the bfg works for all the practical data it has been presented with. For it to become less efficient than the scanline order ray cast, either a highly contorted region of interest is required or the region must occupy a large proportion of the data volume.

The approach is independent of the particular ray cast kernel, requiring only that the kernel return, in addition to the final ray colour, a flag indicating whether or not the ray passed through the region of interest. Thus it should work for octree data bases as well as voxel data bases. It should be possible, with a suitable adjustment of the cast flag plane, to implement Levoy's adaptive ray casting on top of the bfg ray cast. Antialiasing by supersampling might also be used. To date bfg ray casting has remained in a voxel environment. If it were implemented in a standard ray tracer it could still produce benefits, for costly spatial searches are no longer required.

ACKNOWLEDGMENTS

Thanks to Mr. Mark Fuller for proof reading and photographic skills. Medical CT
data supplied courtesy of Weston Park Hospital, Sheffield.

REFERENCES

Chen L.S., Herman G.T., Reynolds R.A., Udupa J.K. (1985) "Surface Shading in the Cuberille Environment" IEEE Computer Graphics and Applications, 5(12): 33-43

Cohen D., Kaufman A., Reuven B., Bergman S. (1990) "Real Time Discrete Shading" The Visual Computer (1990)6: 16-27

Drebin R.A., Carpenter L., Hanrahan P. (1988) "Volume Rendering" Computer Graphics (SIGGRAPH 1988 Proceedings), 22(4): 65-74

Fishkin K.P., Barsky B. (1984) "A Family of New Algorithms for Soft Filling" Computer Graphics (SIGGRAPH 1984 Proceedings), 18(3): 235-244

Gordon D., Reynolds R.A. (1985) "Image space shading of three-dimensional objects" Computer Vision, Graphics and Image Processing 29: 361-376

Höhne K.H., Bomans M., Pommert A., Riemer M., Wiebecke G. (1990) "Investigation of Medical 3D-Rendering Algorithms" IEEE Computer Graphics and Applications, March 1990: 41-53

Herman G.T., Liu H.K. (1979) "Three-dimensional display of human organs from computed tomograms" Computer Graphics and Image Processing 9: 1-21

Herman G.T., Udupa J.K. (1981) "Display of three dimensional discrete surfaces" Proc. SPIE 283: 90-97

Levoy M. (1988) "Display of Surfaces from Volume Data" IEEE Computer Graphics and Applications, 8(3): 29-37

Levoy M. (1990a) "Volume Rendering by Adaptive Refinement" The Visual Computer (1990)6: 2-7

Levoy M. (1990b) "Efficient Ray Tracing of Volume Data" ACM Transactions on Graphics, July 1990, 9(3): 245-261

Mitchell D.P. (1987) "Generating Antialiased Images at Low Sampling Densities" Computer Graphics (SIGGRAPH 1987 Proceedings), 21(4): 65-69

Rogers D.F. (1985) "Procedural Elements for Computer Graphics" McGraw-Hill, New York

Sabella P. (1988) "A Rendering Algorithm for Visualizing 3D Scalar Fields" Computer Graphics (SIGGRAPH 1988 Proceedings), 22(4): 51-55

Smith A.R. (1979) "Tint fill" Computer Graphics (SIGGRAPH 1979 Proceedings), 13(2): 276-283

Upson C., Keeler M. (1988) "V-BUFFER: Visible Volume Rendering" Computer Graphics (SIGGRAPH 1988 Proceedings), 22(4): 59-64

Vannier M.W., Marsh J.L., Warren J.O. (1983) "Three-dimensional computer graphics for craniofacial surgical planning and evaluation" Computer Graphics (SIGGRAPH 1983 Proceedings), 17(3): 263-273

Westover L. (1989) "Interactive Volume Rendering" Chapel Hill Volume Visualization Workshop 1989: 9-16

COLOUR PLATES

plate 1: A volume rendering of a human head.

The head was produced using a voxel property computation approximating that due to Drebin et al (Drebin 1988). Bone is shown as opaque white, brain and muscle are in translucent green, and skin is in translucent red. All else is air. The 'anomalies' in the image appearing around the mouth and down the side of the neck are part of the structure used to hold the patient still. Through use of the bfg ray cast, no rays are cast outside the perimeter of the head.

plate 2: Volume rendering of a human female torso

This image was produced using exactly the same algorithm and colour scheme as plate 1: bone in white, muscle in green and soft tissue (skin and lungs) in red.

plate 3: Critical structure isolation using bfg ray casting

Showing the use of the bfg ray cast in rapidly isolating critical structures within the data volume. In this case the spinal column and ribs are isolated from the female torso.

plate 4: Test shapes

Shapes 0, 1, 2, 3, from left to right on the top row, are of constant perimeter but increase in area from left to right. Shapes 4, 5, 6, 7, from left to right on the middle row, are of constant area but increase in perimeter from left to right. Shapes 8, 9, 10, 11, from left to right on the bottom row, are of constant shape but increase in both area and perimeter from left to right. Results from these shapes are used to demonstrate properties of the bfg ray cast. Shape 6 is a very tight square spiral and shape 7 is a fine comb.

AUTHOR BIOGRAPHIES

Mr. Peter Hall holds a B.Sc. in Physics-with-Astrophysics from the University of Leeds, England and an M.Sc. in Computer Graphics from Middlesex Polytechnic, England. He has also spent time in the computer graphics industry in both London and Cambridge.

Currently working toward a Ph.D. at Sheffield University, his research interests include the rendering of computed tomography data and the reconstruction of cerebral blood vessels from a pair of angiograms.

Dr. Alan Watt is a lecturer in Computer Graphics at the University of Sheffield. His research interests include ViSC and advanced rendering techniques.

After receiving a Ph.D. in Pattern Recognition from the University of Nottingham, he subsequently developed an interest in Computer Graphics.

He is the author of the textbook "Fundamentals of 3D Computer Graphics", published in 1989. He has also published several textbooks on programming techniques.

Corresponding Author

P.M. Hall
VisLab
Department of Computer Science
University of Sheffield
Sheffield
England
Telephone: +44 742 768555 x 5577
Fax: +44 742 750318
Telex: +44 742 739826
Email: ac2pmh@uk.ac.sheffield.primea
A 3D Surface Construction Algorithm for
Volume Data
Renben Shu and Richard C. Krueger

Abstract
Surface construction is an important and precursory step in volume visualization. A three dimensional volumetric scene is represented by an array of volume elements (also called "voxels"), and a solid in the scene is defined as a connected set of voxels. Surface construction involves the detection of all the boundary faces of a given solid. There are two efficient voxel based surface construction algorithms which have been successfully applied in medical display systems. However, these algorithms track solids in a discrete manner. No matter what kind of shape the solid takes, the number of boundary faces which are visited more than once is always of order O(n²), where n³ is the number of voxels in the solid. In this paper we propose a new algorithm which takes advantage of the fact that many solids have large contiguous runs of boundary faces with the same orientation. In the best case, the number of boundary faces which are visited more than once is of order O(n).

Keywords: Surface tracking, medical imaging, volume rendering, scientific visualization, solid modeling.

1 Introduction

Volume rendering is no longer just a useful supplement to scientific and engineering progress; it is now an integral part of many disciplines. From multi-beam echo sounders mapping the ocean bottom to radio telescopes studying the heavens, sensors provide more data than can ever be examined point by point (Frenkel 1989). Visualization has been well used in medical applications. To display scalar fields of 3-D data obtained from Computed Tomography (CT) and magnetic resonance imaging, voxel based techniques have been developed (Levoy 1988, Levoy 1990, Fuchs, Levoy and Pizer 1989, Upson and Keeler 1988). Among them, surface tracking of 3D objects has attracted much attention. Surface tracking has the following advantages:

1. It is considerably faster than other techniques such as adaptive ray tracing or V-buffer (Levoy 1990, Upson and Keeler 1988).

2. Connected objects can be isolated automatically from the rest of the background and the surface of an object can be easily computed.


3. High quality surface-shaded images can be rendered very quickly using a high power
graphics workstation (Akeley 1989).

There are two kinds of surface tracking algorithms. The "Marching Cubes" algorithm proposed by Lorensen and Cline (1987) is the first kind. This algorithm is a comprehensive surface tracking algorithm insofar as it can detect all disconnected surfaces in a volumetric scene. "Marching Cubes" has complexity of order O(n³), where n³ is the number of voxels in the scene. The second kind of surface tracking algorithm is less costly but is also more restrictive, in that it can only detect the surface of one specific connected set of voxels in the scene. In 1981, E. Artzy proposed a voxel based surface tracking algorithm for a connected object. This algorithm has been applied in medical turn-key display systems for years. It tracks all boundary faces in the object twice and requires a pair of queue/dequeue operations for each boundary face. This may result in excessively large memory access times and searching operations. To improve the performance, Gordon and Udupa proposed a new algorithm which resulted in an overall time saving of thirty five percent (Gordon and Udupa 1988). In this paper, we further improve Gordon and Udupa's algorithm by reducing the number of times a boundary face is visited. Our algorithm takes advantage of the natural coherence in the shape of most objects. We assume that most objects have large flat spots of boundary faces, all of which have the same orientation. Our algorithm would be especially applicable to the reverse engineering problem of detecting the surface of a mechanical part from X-ray data.

In section 2, we present the terminology, definitions, and basic ideas for the new algorithm. Section 3 gives the details of the algorithm itself, while section 4 provides an analysis of the time complexity of the algorithm.

2 Terminology and Definitions

Binary Scene

We assume that 3D space is partitioned into a grid of n_x × n_y × n_z volume elements (or voxels). All voxels are equally spaced cubes of constant value which are orthogonal to the x, y, and z axes of the coordinate system.

There are two kinds of voxels: 1-voxels and 0-voxels. The set V of n_x × n_y × n_z voxels is referred to as the binary scene.

Each voxel in the binary scene V has an integer coordinate (x, y, z) which uniquely identifies its position in the 3D space, where 0 ≤ x ≤ n_x − 1, 0 ≤ y ≤ n_y − 1, and 0 ≤ z ≤ n_z − 1.
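
As an aside, a binary scene of this kind is conveniently held as a three-dimensional array. The following minimal Python sketch (array shape, helper name, and example solid are all illustrative, not taken from the paper) is reused by a later sketch in this section.

import numpy as np

nx, ny, nz = 64, 64, 64
scene = np.zeros((nx, ny, nz), dtype=np.uint8)   # all 0-voxels initially
scene[20:40, 20:40, 20:40] = 1                   # a cubic solid of 1-voxels

def voxel(scene, x, y, z):
    """Value of the voxel at integer coordinate (x, y, z); 0 outside the scene."""
    inside = (0 <= x < scene.shape[0] and
              0 <= y < scene.shape[1] and
              0 <= z < scene.shape[2])
    return int(scene[x, y, z]) if inside else 0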

Boundary Face

Each voxel is a cube which has six faces. A 1-voxel is said to have a boundary face with a 0-voxel if the 1-voxel is adjacent to the 0-voxel (see Fig. 1).

Figure 1: Boundary Face

There are six types of boundary faces corresponding to the six faces of the cube. The six types of boundary faces are referred to as: X, Y, Z, X̄, Ȳ, and Z̄ (see Fig. 2 for a pictorial definition). Each boundary face in the binary scene belongs to exactly one of the six face types. We use the "∈" notation to indicate that a boundary face is of a particular face type. For example, if a boundary face f is of face type X then we say that f ∈ X.

Figure 2: Face Types

Tracking Functions

Each voxel has six tracking directions which correspond to the positive and negative displacements along the x, y, and z axes. The positive tracking directions are referred to as δx, δy, and δz. The negative tracking directions are referred to as δ̄x, δ̄y, and δ̄z. A positive tracking direction can be derived using the right hand rule, i.e., clench all fingers but the thumb; the thumb represents the x, y, or z axis, and the remaining fingers give the tracking direction (see Fig. 3).

Figure 3: Six Tracking Directions

A tracking direction defines an orientation on a boundary face. Each boundary face can have exactly four tracking directions. For example, a boundary face f ∈ X can have tracking directions δz, δ̄z, δy, and δ̄y (see Fig. 4). We can similarly determine the tracking directions for the five other boundary face types.

Figure 4: Four orientations

Tracking functions are used to determine adjacent boundary faces in specific tracking directions. There are six tracking functions, one for each tracking direction: Tx, Ty, Tz, T̄x, T̄y, and T̄z. A tracking function takes a boundary face as input and returns the adjacent boundary face in the specific tracking direction. A tracking function returns one out of a set of three possible boundary faces, depending on whether the adjacent voxels are 0 or 1. Fig. 5 illustrates the three possible boundary faces returned by a specific tracking function Tx, where f2 = Tx(f1).

Figure 5: Tracking Functions

Two boundary faces fc and fn are adjacent, denoted fc a fn, if and only if:

fn = Ti(fc) for some i ∈ {X, Y, Z, X̄, Ȳ, Z̄}

Two boundary faces fc and fn are connected if and only if exactly one of the following cases is true:

1. fc a fn
2. ∃ f1, .., fk boundary faces, k ≥ 1, such that fc a f1 a .. a fk a fn
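
For illustration only, one tracking step can be sketched in Python under the common face-adjacency convention used in surface tracking; this is our reading of the three cases of Fig. 5, not the authors' code. A boundary face is represented by its 1-voxel coordinate plus outward unit normal, and the voxel helper from the binary-scene sketch above is reused.

def track(scene, voxel_pos, normal, direction):
    """Boundary face adjacent to (voxel_pos, normal) in tracking direction.

    voxel_pos -- (x, y, z) of the 1-voxel owning the face
    normal    -- outward unit vector of the face (points into a 0-voxel)
    direction -- unit tracking vector, perpendicular to normal
    """
    add = lambda a, b: tuple(p + q for p, q in zip(a, b))
    ahead = add(voxel_pos, direction)    # next voxel along the track
    diagonal = add(ahead, normal)        # voxel across the shared edge
    if voxel(scene, *diagonal) == 1:
        # Case 1: the surface folds towards us; the next face belongs to the
        # diagonal voxel and points back against the tracking direction.
        return diagonal, tuple(-d for d in direction)
    if voxel(scene, *ahead) == 1:
        # Case 2: the surface continues flat onto the next voxel.
        return ahead, normal
    # Case 3: the surface folds away; the next face is on the same voxel.
    return voxel_pos, direction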

Surface Detection Problem

We are now ready to define the surface detection problem. Given a binary scene V, a connected set S of 1-voxels in V, and a boundary face f_seed of a voxel in S, produce the set BF of all boundary faces which are connected to f_seed.

The set S of connected 1-voxels is referred to as the solid. The set BF of boundary faces on the solid S is referred to as the surface.

3 Surface Construction Algorithm

Basic Idea

The idea behind the surface construction algorithm is simple. The solid S is conceptually divided into one voxel wide slices which are parallel to the x-z plane. The Ty tracking function is used to detect all the Z, X, Z̄, and X̄ boundary faces in each slice. The Ty tracking function walks the surface of the slice in the positive tracking direction δy around the y axis, parallel to the x-z plane. The Tz tracking function is used to track all possible Y or Ȳ boundary faces on the slice. The Tz tracking function moves in the positive tracking direction δz around the z axis, parallel to the x-y plane. The Ty and Tz tracking functions are sufficient to cover all the boundary faces of the solid. Fig. 6 illustrates how the boundary faces of a 3 × 3 cube of voxels would be tracked.

Figure 6: Tracking the surface of a cube

Tracks and Decision Points

To formally understand how this algorithm works we must introduce the notions of tracks and decision points.

A track is an adjacent set of boundary faces which are all of the same type, and which have been continuously tracked using either the Ty or the Tz tracking function (but not both at the same time). More formally, a set Q = {f1, .., fk}, k ≥ 1, of boundary faces is a track if and only if:

1. fk = Ti(fk−1), .., f2 = Ti(f1), for some Ti ∈ {Ty, Tz}

2. f1, .., fk ∈ F where F ∈ {X, Y, Z, X̄, Ȳ, Z̄}.

Clearly, there are two kinds of tracks: δy tracks and δz tracks. The first boundary face in a track is called the initial boundary face. The process of tracing all the boundary faces in a δi track from its initial boundary face using the Ti tracking function (where Ti ∈ {Ty, Tz}) is called clearing the track.

A decision point is a boundary face which is associated with a track. A decision point is special insofar as it requires the algorithm to visit adjacent boundary faces so that the decision to initiate a new track or look for another decision point can be made. Decision points occur at the beginning of a δy track, and they occur at the end of a δz track. One can think of decision points, for the purpose of computing computational complexity, as those boundary faces in Gordon and Udupa's algorithm (1988) which need to be visited twice.

Data Structures

Two queues Qy and Qz are maintained at all times. Qy contains initial boundary faces for tracks in the δy direction, and Qz contains initial boundary faces for tracks in the δz direction. Qy contains X, X̄, Z, and Z̄ faces, while Qz contains Y and Ȳ faces. Both queues have a first-in/first-out (FIFO) order imposed on them.

Figure 7: Tracks and Decision Points

The following notation is used to specify the enqueue and dequeue operations:

f → Q: put face f on queue Q (enqueue).

f ← Q: take face f from queue Q (dequeue).

A δy or δz track is initiated by putting its initial boundary face in Qy or Qz, respectively.

A set BF is maintained to store all the boundary faces which have been visited by the algorithm. This set is initially empty. The implementation of BF must be optimized for the membership "∈" operation. A hashing scheme is recommended.

Algorithm

We will assume for the purpose of simplicity that f_seed is an initial boundary face for a track in the δy direction. If this were not the case we could easily find such a boundary face using a few simple preprocessing steps.

f_seed → Qy
BF = ∅
while (Qy ≠ ∅ or Qz ≠ ∅) do
    if (Qy ≠ ∅) then
        fc ← Qy
        if (fc ∉ BF) then
            Clear δy track with initial boundary face fc.   (1)
        fi
    else if (Qz ≠ ∅) then
        fc ← Qz
        Clear δz track with initial boundary face fc.   (2)
    fi
od
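
The dispatch loop above translates almost line for line into Python. In this sketch the helpers clear_y_track and clear_z_track are our placeholders for Steps 1 and 2 below; faces are assumed to be hashable values, so the set BF provides the recommended constant-time membership test.

from collections import deque

def detect_surface(f_seed, clear_y_track, clear_z_track):
    Qy, Qz = deque([f_seed]), deque()   # FIFO queues of initial boundary faces
    BF = set()                          # boundary faces visited so far
    while Qy or Qz:
        if Qy:
            fc = Qy.popleft()
            if fc not in BF:
                clear_y_track(fc, BF, Qy, Qz)   # Step 1
        else:
            fc = Qz.popleft()
            clear_z_track(fc, BF, Qy, Qz)       # Step 2
    return BF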

STEP 1: Clear δy track with initial boundary face fc.

There are four types of δy tracks depending on the type of the initial boundary face fc, since fc ∈ {X, X̄, Z, Z̄}.

CASE 1: fc ∈ X

In this case the decision points occur at the beginning of the track. As we start clearing the track we must ensure that we detect every initial boundary face fi of any adjacent δz track (see Fig. 8).

Figure 8: fc ∈ X

The algorithm to clear a δy track of X faces is:

/* Clear track and detect δz initial boundary faces */
while (fc ∈ X) do
    fi = Tz(fc)
    fc → BF
    fc = Ty(fc)
    if (fi ∈ Y or fi ∈ Ȳ) then
        /* Record the start of the δz track */
        fi → Qz
    else
        /* Stop looking for δz track, and record start of δy track. */
        fp = Ty(fi)
        if (fp ∉ X) then
            fi → Qy
        fi
        break   /* out of while loop */
    fi
od

/* Clear the rest of the δy track */
while (fc ∈ X) do
    fc → BF
    fc = Ty(fc)
od

/* Record initial boundary face of next δy track */
fc → Qy

CASE 2: fc ∈ X̄

This case is simply a symmetrical version of case 1. The algorithm steps are almost identical, except that X̄ is substituted for X.

CASE 3: fc ∈ Z

This case is more tricky. Again the decision points occur at the beginning of the track. In case 1, for each boundary face fc ∈ X we could have checked whether Tz(fc) was an initial boundary face for a δz track. Instead, we only checked for the first such occurrence, where Tz(fc) ∈ {Y, Ȳ}. In order to ensure the correctness of the algorithm, we must use the decision points in case 3 and case 4 to detect the other such occurrences.

In case 3, where the initial boundary face fc ∈ Z, we need to check whether there is an orthogonal δy track of X faces (see Fig. 9(a)) from which we can detect each initial boundary face fi of any δz track (where fi ∈ {Y, Ȳ}). More formally, whether:

∃ fi, fn ∈ BF such that fi = Tx(fc), fi ∈ {Y, Ȳ}, fn = Tz(fi), and fn ∈ X.

We also need to check whether there is an orthogonal δy track of X̄ faces (see Fig. 9(b)), i.e. whether:

∃ fi, fn ∈ BF such that fi = T̄x(fc), fi ∈ {Y, Ȳ}, fn = Tz(fi), and fn ∈ X̄.

Figure 9: fc ∈ Z

The algorithm to clear a δy track of Z faces is:

/* Check for an orthogonal δy track */
fp = Ty(fc)
fpp = Tz(fp)
if (fp, fpp ∈ X) then
    fi = Tx(fc)
    if (fi ∈ Y) then
        fn = Tz(fi)
        while (fn ∈ X) do
            fi = Tz(fn)
            fn = Ty(fn)
            if (fi ∈ Y or fi ∈ Ȳ) then
                /* Record the start of the δz track */
                fi → Qz
            else
                /* Stop looking for δz track, and record start of δy track. */
                fi → Qy
                break   /* out of while loop */
            fi
        od
    fi
fi
if (fp, fpp ∈ X̄) then
    fi = T̄x(fc)
    if (fi ∈ Y) then
        fn = Tz(fi)
        while (fn ∈ X̄) do
            fi = Tz(fn)
            fn = Ty(fn)
            if (fi ∈ Y or fi ∈ Ȳ) then
                /* Record the start of the δz track */
                fi → Qz
            else
                /* Stop looking for δz track, and record start of δy track. */
                fi → Qy
                break   /* out of while loop */
            fi
        od
    fi
fi
fi
/* Clear the rest of the δy track */
while (fc ∈ Z) do
    fc → BF
    fc = Ty(fc)
od

/* Record initial boundary face of next δy track */
fc → Qy

CASE 4: fc ∈ Z̄

This case is simply a symmetrical version of case 3. The algorithm steps are identical.

STEP 2: Clear δz track with initial boundary face fc.

This step is very easy. The decision point is always at the end of the track. The next face in the δz direction (after the last face in the track) will always be X or X̄ (see Fig. 10).

Figure 10: next face in a δz track

The algorithm to clear the δz track is:

/* Clear the δz track */
while (fc ∈ Y or fc ∈ Ȳ) do
    fc → BF
    fc = Tz(fc)
od

/* Check whether fc is an initial boundary face, */
/* i.e. fc ∈ X or fc ∈ X̄ */
fp = Ty(fc)
if (fp ∉ X and fp ∉ X̄) then
    /* Record the start of a δy track */
    fc → Qy
fi

4 Time Complexity

Our algorithm improves on Gordon and Udupa's algorithm by reducing the number of times a boundary face is visited. For the purpose of simplicity we will compare the time complexities of the two algorithms when faced with the detection of the surface of a solid cube.

Figure 11: Detecting the surface of a cube

In Gordon and Udupa's algorithm each X and X̄ boundary face is visited two times (see Fig. 11(a)). Thus the number of boundary faces which are visited twice is of order O(2n²), where n³ is the number of voxels in the solid cube.

In our algorithm, on the other hand, only the boundary faces which constitute decision points produce visits to other boundary faces which can be counted twice (see Fig. 11(b)). Therefore in the cube example the number of boundary faces which are visited twice is of order O(8n). This is an order of magnitude improvement.

Furthermore, in our algorithm the number of queue/dequeue operations is directly proportional to the number of decision points, since only the initial boundary faces are queued. Therefore, the time complexity of these operations is O(8n). In Gordon and Udupa's algorithm all boundary faces of the solid are queued. This means a time complexity of O(n³). Again, our new algorithm is faster by an order of magnitude.

Finally, our algorithm only performs the membership operation "∈" for the initial boundary faces of the tracks it clears. Thus, the time complexity for membership operations is O(8n). Gordon and Udupa's algorithm performs the membership operation on roughly one third of all boundary faces (all X and X̄ faces). This leads to a time complexity of O(n³/3).

Of course, a cube is an ideal case. The performance of the two algorithms will vary according to the shape of the surface being detected. Nevertheless, our new algorithm takes advantage of the coherence in the shape of the object being tracked. It is for this reason that we see such improvements in the algorithm's time complexity.

5 Implementation and Results

We implemented both our algorithm and Gordon and Udupa's algorithm on an IBM RS-6000/320 workstation. The same data structure implementations were used for both algorithms. On average we found that our algorithm performed significantly better than Gordon and Udupa's. Both algorithms computed a set of boundary faces BF from a three dimensional binary scene. The boundary faces were rendered using the GL option of the attached High Performance 24-bit 3D Color Graphics Processor from Silicon Graphics. The rendered images were then converted to PostScript format and printed on an Apple LaserWriter.

We ran the algorithms against the two volume data sets which are presented below.
The tables presented for each data set contain the times (in seconds) it took for both
algorithms to detect the boundary faces of the surface of the solid in question.

Dataset 1

The first data set consisted of a 240 × 164 × 30 binary scene containing the CT data of a human torso.

Algorithm Time
Shu and Krueger: 13 sec
Gordon and Udupa: 17 sec

Dataset 2

The second data set consisted of a 256 × 256 × 43 binary scene containing the X-ray data of a cylinder head.

Algorithm Time
Shu and Krueger: 58 sec
Gordon and Udupa: 81 sec

Figure 12: Human Torso

Figure 13: Cylinder Head

6 Conclusion

In this paper, we propose a new surface construction algorithm which reduces by an order of magnitude the number of times a boundary face is visited. We believe that our algorithm will significantly speed up surface generation in the volume rendering process.

Voxel based surface tracking algorithms are useful for performing fast volume renderings of CT and MRI data sets. The work reported here is part of a larger volume visualization effort underway at the National University of Singapore. In particular, we are using surface tracking as a volume previewing technique to interactively see the effects of changing the lighting and rotational parameters. We then use a more sophisticated volume rendering algorithm such as adaptive ray tracing to produce the final image. Nevertheless, we believe that surface tracking will still play a crucial role in providing a fast and efficient volume visualization method for years to come.

References

E. Artzy, G. Frieder and G.T. Herman (1981) The theory, design, implementation and evaluation of a 3-dimensional surface detection algorithm. Computer Graphics and Image Processing. 15, pp.1-24.

D. Gordon and J.K. Udupa (1988) Fast surface tracking in 3-dimensional binary images. Computer Vision, Graphics and Image Processing. 45, pp.196-214.

M. Levoy (1988) Display of Surfaces from Volume Data. IEEE Computer Graphics & Applications. 8(3), pp.29-37.

M. Levoy (1990) Volume rendering by adaptive refinement. The Visual Computer. 6, pp.2-7.

H. Fuchs, M. Levoy, and S.M. Pizer (1989) Interactive Visualization of 3D Medical Data. IEEE Computer. 22(8), pp.46-51.

W.E. Lorensen and H.E. Cline (1987) Marching Cubes: A high resolution 3D surface construction algorithm. Computer Graphics. 21(4), pp.163-169.

C. Upson and M. Keeler (1988) V-Buffer: Visible Volume Rendering. Computer Graphics. 22(4), pp.59-64.

K.A. Frenkel (1989) Volume Rendering. Communications of the ACM. 32(4), pp.426-435.

K. Akeley (1989) Superworkstation: The Silicon Graphics 4D/240GTX Superworkstation. IEEE Computer Graphics & Applications. July, pp.71-83.

Renben Shu is a Member of Research Staff at the National University of Singapore with the Institute of Systems Science. His research interests include algorithm development, computer architecture and organization, parallel processing, scientific visualization, numerical analysis, and medical imaging. Renben Shu received his bachelor's degree in Mathematics from Fudan University, China in 1970. From 1970 to 1983 he was involved in computer system design and was the principal designer of two computer systems. From 1983 to 1985, he was a research fellow at the Supercomputer Institute of the University of Minnesota. He was awarded his Ph.D in Computer Science from the University of Minnesota in 1989. He is the project leader for the Scientific Visualization System project at the National University of Singapore. He is a member of IEEE.

e-mail: issrbs@nusvm.bitnet

Current Address: Institute of Systems Science, National University of Singapore, Heng Mui Keng Terrace, Kent Ridge, Singapore 0511.

Richard C. Krueger is an Associate Research Staff at the National University of Singapore with the Institute of Systems Science, and is doing research in the area of Volume Visualization and Medical Imaging. He graduated in Computer Science in 1984 from Cornell University, Ithaca N.Y. In 1989, he obtained a Masters in Computer Studies from North Carolina State University. From 1984 to 1987 he worked as Associate Programmer for the IBM Corporation, Cary N.C. in Object Oriented Systems development. From 1987 to 1990 he worked for SAS Institute, Cary N.C. in C-compiler construction and computer graphics. He designed and developed the SAS/C Compiler Student Edition and the NeoPaint software for the NeoVisuals graphics modeling system. He was also a consultant to the United Nations International Centre for Genetic Engineering and Biotechnology (ICGEB), in Trieste, Italy. His research interests include 3D computer graphics, interactive user interfaces, volume visualization, and object oriented languages. He is a member of ACM/SigGraph.

e-mail: issrck@nusvm.bitnet

Current Address: Institute of Systems Science, National University of Singapore, Heng Mui Keng Terrace, Kent Ridge, Singapore 0511.
Chapter 5
Visualization Methods
Compositional Analysis and Synthesis of Scientific
Data Visualization Techniques
Hikmet Senay and Eve Ignatius

ABSTRACT

The scientific data visualization process involves a sequence of transformations that convert a data set into a displayable image. One of the most important transformations in this process is the visualization mapping, which defines a set of bindings between data and graphical primitives. Since these bindings describe how data is going to be visualized, the effectiveness of visualization critically depends on the mapping defined at this stage. Establishing a proper mapping which leads to an effective data visualization requires significant knowledge in several fields, such as data management, computer graphics, and visual perception. However, scientists who could benefit most from data visualization usually lack this knowledge. In order to identify, acquire, formalize, and provide this knowledge, the existing visualization techniques that are known to be useful have been thoroughly analyzed. The analysis shows that most of the existing data visualization techniques can be described in terms of attributes of data, a set of primitive visualization techniques, marks (graphical symbols) that modify primitive visualization techniques, and a set of rules used in their design. The analysis further suggests a design process leading to the automatic synthesis of scientific data visualization techniques.

KEYWORDS: visualization mapping, visual perception, composition rule, visualization tree, glyph.

INTRODUCTION

Technological advances in the earth and space sciences continue to provide scientists with enormous amounts of data that lead to discoveries through data analyses. However, the amount of data that is generated is usually much larger than the amount of data that is actually analyzed. In order to close this gap and increase the utilization of data, there is a strong need for innovative tools and techniques to examine the data. While data visualization tools and techniques offer a way to alleviate the problem, they are still not widely available to the scientific community. On the other hand, most of the available tools and techniques require significant knowledge in data management, computer graphics, and visual perception for their effective use. Scientists who are primarily concerned with the content of the data usually lack the knowledge of how the data is organized in a database or how it could be most effectively visualized using computer graphics. Having this knowledge as an integral part of the visualization system is essential for facilitating the use of visualization tools and techniques by a larger population of scientists.

In general, the scientific data visualization process can be viewed as a sequence of transformations that convert a data set into a displayable image. There are typically three transformations in this process (Haber 1988): (1) data manipulation, which transforms data into a form that is suitable for subsequent visualization operations and includes such operations as gridding, interpolation, and smoothing, (2) visualization mapping, which maps data processed in the previous stage into a set of visualization primitives, such as positional parameters, color, texture, animation and so on, which effectively convey the data, and (3) rendering, which produces an image of the data, as defined in the visualization mapping, using such rendering operations as projection, shading, and hidden surface removal. The most important transformation in this process is the visualization mapping, which defines an abstract visualization technique by establishing a set of bindings between the data and the visualization primitives. This is the stage at which scientists need the most assistance, since the effectiveness of data visualization directly depends on the bindings defined at this stage. Although the existing visualization systems, such as NOS (Treinish 1988), FOTO (Cognivision 1989), AVS (Upson 1989), DataScope (NCSA 1989), apE (Crusi 1987), and EXVIS (Grinstein 1989), provide the basic mechanisms to establish these bindings, they generally leave this responsibility to the scientists, who are not visualization experts.

Based on previous research in visual perception (Bertin 1983; Cleveland 1985; Kosslyn 1989; Tufte 1983) and the analyses of existing visualization techniques, the work presented in this paper attempts to shift that responsibility from the scientists to the visualization system. In order to accomplish this objective, we first identified the essential components of data visualization techniques involved in visualization mapping. The section on components of scientific data visualization is devoted to describing these components, which include data abstractions, marks (graphical elements) that are used in creating visualization techniques, primitive visualization techniques that can encode simple relations, and rules that are useful in creating effective visualization techniques. Most existing visualization techniques can be represented by tree structures comprised of these components. The subsequent section on analysis of scientific data visualization techniques presents two examples of visualization technique analysis to illustrate this process. These analyses indicate that data visualization technique design can be automated by defining visualization mappings according to the rules described in the preceding section. A synthesis algorithm, which extends the compositional approach to designing two dimensional graphical presentations (Mackinlay 1986) to three dimensions, is outlined in the section on automatic synthesis of scientific data visualization techniques. This algorithm is employed by VISTA (Senay 1990b), a visualization tool assistant, which is being developed to assist scientists in data visualization. The final section discusses the implications of the analysis and synthesis approach described in this paper.

COMPONENTS OF SCIENTIFIC DATA VISUALIZATION

The scientific data visualization process involves two main components: data and graphics. In the visualization mapping stage, characteristics of the data determine the type of graphical primitives which can be used to effectively display the data. For instance, a scalar field can be effectively displayed using color, whereas a vector field cannot be represented using color alone. Identifying the characteristics of data that are relevant for visualization mapping is the first important step in designing effective data visualization techniques. The most primitive components of visualization techniques are the marks which constitute the graphics. In general, a visualization technique can be viewed as a collection of marks where each mark encodes a certain data value. While it is possible to describe visualization techniques in terms of simple marks, such as points, lines, areas, and volumes, a higher level description based on commonly used primitive visualization techniques is more comprehensive. In a sense, marks and primitive visualization techniques form a graphical vocabulary for visual data representation. Similar to grammar rules and idioms in a natural language, there are composition and heuristic rules for generating compound visualization techniques in terms of marks and primitive techniques. In the rest of this section, each of these components is described in detail.

Data

Scientific data can broadly be classified into two groups: qualitative and quantitative. Qualitative data is
further subdivided into two groups: nominal and ordinal. Nominal data types are unordered collections
of symbolic names without units. For instance, the names of the planets, Mercury, Venus, Earth, Mars,
Jupiter, Saturn, Uranus, Neptune, and Pluto, form a nominal data set. Ordinal data types are rank
ordered only, where the actual magnitudes of the differences are not reflected in the ordering itself. A
typical example of an ordinal data set is the names of the calendar months, January through December.

Compared to qualitative data, quantitative data is more common in scientific disciplines. Quantitative data is typically classified in two dimensions: (1) based on the number of components which make up the quantity, and (2) based on the scales of the values. Along the first dimension, quantitative data can be scalar, vector, or tensor. Scalar data types possess a magnitude, but no directional information other than a sign. They are simply defined as single numbers. Vectors have both direction and magnitude. Quantitatively, their mathematical representation requires a number (equal to the dimensionality of the coordinate system) of scalar components. In general, a vector is a unified entity. This implies that the problem of visualizing vector fields is not equivalent to the problem of displaying independent, multi-variate scalar fields. The number of components which specify a tensor depends on the dimensionality of the coordinate system and the order of the tensor. Along the other dimension, quantitative data can be classified as interval, ratio, and absolute (Kosslyn 1989). Interval data scales preserve the actual quantitative difference between values (such as Fahrenheit degrees), but do not have a natural zero point. Ratio data scales are like interval scales but they do have a natural zero and can be defined in terms of arbitrary units. For instance, two hundred dollars is twice as much as one hundred dollars. Absolute data scales are also ratio scales which are well-defined in terms of non-arbitrary units, such as inches, feet, and yards.

Other important attributes of data, which play a role in selecting visualization primitives, include functional dependencies between data variables, spacing between sampling points, cardinality of the data set, upper and lower bounds of values, units of measurement, coordinate system, and scale and continuity of data.
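
To make the data side of the mapping concrete, the attributes listed in this section can be recorded in a small structure. The following Python sketch is our own illustration; the class names, fields, and the example bounds are not taken from any cited system.

from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple

class Scale(Enum):
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4
    ABSOLUTE = 5

class Structure(Enum):
    SCALAR = 1
    VECTOR = 2
    TENSOR = 3

@dataclass
class Variable:
    name: str
    scale: Scale
    structure: Structure
    unit: Optional[str] = None                    # e.g. degrees for temperature
    bounds: Optional[Tuple[float, float]] = None  # (lower, upper)
    continuous: bool = True

# e.g. a scalar temperature field (illustrative bounds):
temperature = Variable("temperature", Scale.INTERVAL, Structure.SCALAR,
                       unit="deg C", bounds=(10.0, 35.0))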

Marks
A mark is the most primitive component that can encode some useful information in any data
visualization. In general, marks can be classified as simple or compound. Simple mark types include
points, lines, areas, and volumes. A point has a single conceptual center that can indicate a meaningful
position, a line has a conceptual spine that can indicate a meaningful length or connection, an area has a
single conceptual interior that can indicate a meaningful region or cluster of marks, and a volume has a
single conceptual interior that can indicate meaningful space in three dimensions. Of the four simple
mark types, the first three are identified by Bertin (1983) as being the most primitive components of
two dimensional graphics and used by Mackinlay (1986) to automate the design of graphical
presentations. The fourth mark type, volume, is a natural extension of Bertin's classification to three
dimensions and appropriate when the third dimension can be perceived effectively. A compound mark
is a collection of simple marks which form a single perceptual unit. Contour lines, wire meshes,
glyphs, arrows, flow ribbons, and particles are all compound marks. A useful analogy is that simple
marks are like letters in the alphabet, whereas compound marks are like words in a dictionary.

Data can be encoded in a visualization technique by varying the positional, temporal, and retinal properties of its marks (Bertin 1983). A positional encoding of information is a variation of the positions of the marks in the image. A temporal encoding of information is a variation of the mark properties over time. A retinal encoding of information is any variation of the "retinal" properties of the marks that the retina of the eye is sensitive to, independent of the position of the marks. The retinal properties are size, texture, orientation, shape, transparency, and the three dimensions of color, namely hue, saturation, and brightness. While size, texture, orientation, shape, hue and saturation were described by Bertin as being the only retinal properties of marks in two dimensional graphics rendered on paper media, the remaining retinal properties, brightness and transparency, can also be used to encode information on modern graphics hardware.

The marks can further be classified as to whether they represent single or multiple data variables and single or multiple data points. A single variable (SV) mark is associated with one variable, whereas a multiple variable (MV) mark is associated with several variables. A single data (SD) mark conveys a single value for a single data point, whereas a multiple data (MD) mark shows a range of summary information regarding the local distribution of several data points. This classification is particularly useful when visualizing large multi-variate data sets.

Primitive Visualization Techniques


Primitive visualization techniques are those which encode one dependent and up to four independent variables. Additional variables (dependent or independent) that may exist in a given data set can further be encoded by manipulating retinal properties of marks within the primitive visualization techniques or, equivalently, by composing two or more primitive visualization techniques into a single design. In general, the set of primitive visualization techniques is classified into three categories, that is, positional, temporal and retinal, depending on which mark properties the techniques manipulate. Positional techniques can be one, two, and three dimensional, such as single axis, contour plot, and surface diagram, respectively. There is only one temporal technique, animation, and one retinal technique for each retinal property associated with marks. Following the analogy made previously between simple (or compound) marks and letters (or words), the primitive visualization techniques may be viewed as forming simple sentences in a language.

The set of primitive visualization techniques that are commonly used include the following:

single axis        isosurface            texture
line plot          arrow plot            orientation
scatter plot       particle advection    shape
bar plot           flow ribbons          transparency
contour plot       deformation           hue
pseudo-color       animation             saturation
surface diagram    size                  brightness

Although some of these techniques may be considered compositions of others (for instance, an arrow plot may be viewed as a composition of a scatter plot and a shape, where the scatter points have an arrow shape), it is more appropriate to include them in the set of primitive visualization techniques rather than construct them in terms of other techniques. The set also contains techniques that manipulate more than one of the positional, temporal, and retinal properties of marks. For example, a pseudo-color image is basically a positional technique, which also uses one or more of the color parameters: hue, saturation, and brightness.

Rules
Rules of scientific data visualization generally fall into two groups: (1) composition rules, and (2)
heuristic rules. Composition rules define under which circumstances two visualization techniques can
be combined to form a composite technique. Heuristic rules are primarily concerned with the
expressiveness and the effectiveness of individual marks and primitive visualization techniques.

There are six rules corresponding to different types of composition: (1) single axis composition, (2)
double axes composition, (3) triple axes composition, (4) mark composition, (5) composition by mark
interleaving, and (6) composition by transparency. The first three types can compose two visualization
techniques that have one, two, and three axes in common, respectively, provided that the total number
of positional axes in the composite is not greater than three. Mark composition merges marks of two
visualization techniques by pairing each and every mark of one technique with a compatible mark of the
other. For the mark composition to be applicable, marks of different visualization techniques involved
in the composition must encode information using different retinal properties. Composition by mark
interleaving selects marks of the composite by alternating at regular intervals the marks of two
visualization techniques. Since certain marks of the composed visualization technique are not included
among the marks of the composite, composition by mark interleaving may cause loss of information.
Composition by transparency merges marks of two visualization techniques by manipulating the
transparency of marks in at least one of the techniques. Composition by transparency is quite common
in volume visualization. Of the six compositions, the first, the second, and the fourth were introduced
by Mackinlay (1986). The remaining compositions are new additions to the set and are mainly
applicable to three dimensional graphics.
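
The axis-counting conditions of the first three composition types can be stated compactly in code. In the sketch below (our own simplification) a technique is represented only by its set of positional axes.

def axis_composition(axes_a, axes_b):
    """Composite axis set for single/double/triple axes composition, or None."""
    shared = axes_a & axes_b
    if len(shared) not in (1, 2, 3):      # must share one, two, or three axes
        return None
    composite = axes_a | axes_b
    if len(composite) > 3:                # at most three positional axes in total
        return None
    return composite

# e.g. a surface diagram over (lon, lat, height) and a pseudo-color image
# over (lon, lat) share two axes, so double axes composition applies:
print(axis_composition({"lon", "lat", "height"}, {"lon", "lat"}))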

The same data can be visualized with varying degrees of effectiveness by alternative visualization techniques. Selecting the most effective technique among alternatives requires human expertise which is not well defined. Such expertise is often in the form of heuristics. Through literature surveys and discussions with visualization experts, several heuristic rules that are useful in visualization technique design have been acquired. In general, these heuristics relate to the expressiveness and the effectiveness of marks and primitive visualization techniques. While the expressiveness rules identify visualization techniques capable of expressing the desired information, the effectiveness rules identify those techniques which are the most effective in a given situation at exploiting the capabilities of the output medium and the human visual system (Mackinlay 1986). For instance, an expressiveness rule states that "If it is desirable to display twisting effects in a flow field, use flow ribbons rather than particles because movement of particles does not necessarily show twisting in a flow field". A typical effectiveness rule states that "Our visual perception system is better tuned to quantitative understanding using geometry rather than color". Although far from complete, a set of heuristic rules similar to the ones presented here was compiled (Senay 1990a) for designing effective data visualization techniques.

ANALYSIS OF SCIENTIFIC DATA VISUALIZATION TECHNIQUES

Most existing visualization techniques can be represented in terms of visualization trees that are formed by the four components described in the previous section. These trees are similar to the parse trees used to represent the meaning of sentences in a formal language. The leaf nodes of visualization trees correspond to the primitive visualization techniques. The root and the internal nodes represent composite visualization techniques. At each node, a visualization technique, either primitive or composite, is described in terms of its graphical elements, that is, marks, and the associated bindings established by the heuristic rules. Two nodes are merged to form a composite by applying an appropriate composition rule. Figures 1 and 6 depict two visualization trees corresponding to the composite techniques shown in Figures 2 and 7, respectively.

Figure 2 presents a visualization of temperature and humidity distribution in Northwest Peru during the 1982-1983 El Niño period. The visualization tree of Figure 2, which is shown in Figure 1, has two leaf nodes. The left leaf node in this tree corresponds to the visualization of temperature as a function of longitude and latitude. Since the temperature distribution forms a scalar field, it is encoded by the position of marks on a surface diagram as shown in Figure 3. The selection of surface diagram, instead of another primitive visualization technique such as pseudo-color or contour plot, is made based on effectiveness rules, such as (1) position is more effectively perceived than color, and (2) contour plots require some effort on the part of the viewer to establish quantitative relations between different contour levels, since it is not always obvious whether a local extremum is a minimum or a maximum. The right leaf node in the visualization tree corresponds to the visualization of humidity as a function of longitude and latitude. Although the humidity distribution also forms a scalar field and can be expressed using a surface diagram, it is encoded by pseudo-color as shown in Figure 4 because the more effective technique, surface diagram, has previously been used within the same visualization tree. Again, the decision to select pseudo-color among the remaining alternatives has been based on the effectiveness rules. Since both visualization techniques corresponding to the left and the right leaf nodes have two identical axes, a double axes composition is used to form the composite technique at the root node, resulting in the visualization of Figure 2. In this visualization of temperature and humidity data, the temperature has been considered more important than the humidity, resulting in a more effective binding for the temperature. If the importance ordering had been different, that is, if the humidity had been more important than the temperature, the same data would have been visualized as shown in Figure 5, which encodes the humidity and the temperature by the surface diagram and the pseudo-color, respectively.

Similarly, trees for more elaborate visualization techniques can be constructed through detailed analyses. For instance, the visualization of injecting plastic into a mold as shown in Figure 7, which was generated at NCSA (Elison 1988), can be described by the visualization tree in Figure 6. In this visualization, three dependent variables, pressure in the mold, temperature of the plastic, and velocity of plastic flow, are visualized as functions of three positional parameters, X, Y, and Z. This technique also includes a compound mark to encode a collection of flow vectors, which form the marks of the 3D arrow plot. Although it is more convenient to consider compound marks as primitives, it is illustrative to analyze one for the purpose of designing effective compound marks. Figure 8(a)-(e) shows several steps in designing the 3D color coded glyphs used in the visualization of injecting plastic into a mold. Figure 8(a) shows the first attempt to visualize five parallel vectors using line arrows in three dimensional space. Since it is difficult to perceive the direction of line arrows in three dimensional space, arrows with volumes are used instead, as shown in Figure 8(b). With appropriate projection, it is now easier to perceive the direction of the arrows. In this figure, the retinal properties of arrows other than length and orientation do not encode any information. Since there is no restriction on the height (h) of arrows as long as individual arrows do not overlap, it is possible to change the height from h to h*. In doing so, it is now possible to view the five vectors as a single perceptual unit, as shown in Figure 8(c). Smoothing provides the perception of continuity in the flow field, as shown in Figure 8(d). This glyph corresponds to the rightmost leaf node in Figure 6. After the last step, the length, orientation, and height cannot be further manipulated since each carries some meaningful information, but any other retinal property, such as the unused part of the color range or texture, can convey additional information, as shown in Figure 8(e). Since color is a retinal technique and the temperature is encoded as a separate perceptual unit, it is superimposed on the 3D glyph by mark composition in Figure 6.

AUTOMATIC SYNTHESIS OF SCIENTIFIC DATA VISUALIZATION TECHNIQUES

The analyses presented in the previous section suggest a design process which leads to the automatic synthesis of scientific data visualization techniques. The process model involves defining a visualization mapping between data descriptors and visualization primitives according to the appropriate composition and heuristic rules. This model is quite similar to that described by Mackinlay (1986). It involves three major design steps: (1) data decomposition, (2) synthesis of components, and (3) visualization technique composition.

In the first step, the data set is decomposed into subsets, each of which can be visualized by one of the primitive visualization techniques. The data decomposition takes place entirely at the abstract data description level without accessing the actual data. The result of this step is an ordered list of relations corresponding to a collection of simple data sets, where each data set is in one of the following relational forms: (1) X → A, (2) X,Y → A, (3) X,Y,Z → A, or (4) X,Y,Z,W → A. During data decomposition, the relations can be ordered either arbitrarily or based on an importance ordering defined externally. The ordering defined in this step partially determines the components of the final composite design.

In the second step, a primitive visualization technique is found for each relation using the
expressiveness and effectiveness rules. The search space for primitive visualization techniques is
primarily organized using the effectiveness rules. Before the search starts, the applicable expressiveness
rules identify those primitive visualization techniques which are capable of expressing the relation. At
this point, all primitive visualization techniques which do not satisfy the expressiveness criteria are
removed from the search space. The space for primitive visualization techniques can be searched either
depth-first or breadth-first. Visualization primitives that have already been used to encode some data
relation are also removed from the search space to avoid incompatibility between component designs.

In the final step of design, primitive visualization techniques representing simple relations are combined to form a composite visualization technique by applying the appropriate composition rules. These rules primarily check for compatibility of component visualization technique designs and define how to compose them. If the component designs are not compatible, some of the previous design steps are redone in order to find an alternative solution. The design process terminates as soon as a visualization technique is found for the original data set.
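To make the three design steps concrete, the following sketch outlines a greedy version of the process in Python. The rule tables, data-set layout, and the composition check are hypothetical stand-ins, not the VISTA implementation; the sketch only illustrates decomposition, effectiveness-ordered selection, and composition as described above.

```python
# A minimal sketch of the three design steps, with hypothetical rules.

# Effectiveness-ordered primitives for 2D scalar fields (most effective first).
EFFECTIVENESS_ORDER = ["surface_diagram", "pseudo_color", "contour_plot"]

def decompose(dataset, importance):
    """Step 1: split a multi-variable data set into simple relations
    (e.g. LON,LAT -> TEMP), ordered by an externally given importance."""
    relations = [(dataset["independent"], dep) for dep in dataset["dependent"]]
    return sorted(relations, key=lambda r: importance.index(r[1]))

def select_primitive(relation, used):
    """Step 2: pick the most effective primitive that can express the
    relation and has not already been used elsewhere in the tree."""
    for technique in EFFECTIVENESS_ORDER:
        if technique not in used:          # avoid incompatible reuse
            used.add(technique)
            return (technique, relation)
    raise ValueError("no primitive left; backtrack to step 1")

def compose(components):
    """Step 3: compose components that share identical axes."""
    axes = {tuple(rel[0]) for _, rel in components}
    if len(axes) != 1:                     # incompatible: redo earlier steps
        raise ValueError("components do not share axes; backtrack")
    return ("double_axes_composition", components)

weather = {"independent": ["LONGITUDE", "LATITUDE"],
           "dependent": ["TEMP", "HUMIDITY"]}
used = set()
parts = [select_primitive(r, used)
         for r in decompose(weather, ["TEMP", "HUMIDITY"])]
print(compose(parts))
```

With the importance ordering reversed, the humidity would receive the surface diagram and the temperature the pseudo-color, reproducing the alternative design of Figure 5.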

Following the completion of the visualization technique design, it is also possible to interactively refine
the design either before or after the actual rendering as long as these refinements do not cause
inconsistencies in the final design.

CONCLUDING REMARKS

The scientific data visualization process involves a sequence of transformations that convert a data set into a displayable image. One of the most important transformations in this process is the visualization mapping, which defines a visualization technique for the data. The overall effectiveness of visualization techniques critically depends on the mappings defined at this stage. Establishing a mapping which leads to an effective data visualization technique requires significant knowledge in several fields, such as data management, computer graphics, and visual perception. However, the scientists who could benefit most from the visualization tools and techniques usually lack this knowledge. Fortunately, there is strong evidence that this knowledge can be identified, acquired, and formalized to assist scientists in data visualization, and the analyses presented in this paper support this view. According to these analyses, there are four key components of scientific data visualization. Even though no claim of completeness can be made, these components seem to be sufficient to describe most of the existing visualization techniques. Undoubtedly, further analyses are needed to strengthen the base established in this paper. In particular, it is essential that formal empirical studies of visualization techniques be undertaken to assess the effectiveness of the primitives involved in their design.

The class of components defined in this paper also supports domain independence and interactive visualization. Since the same components can be used in different visualization technique designs, reusability is naturally ensured. Furthermore, new components can easily be added to the existing set of components, which supports extensibility.

The analyses further suggest a design process leading to the automatic synthesis of scientific data
visualization. In the previous section, a systematic approach to the automatic synthesis of visualization
techniques has been presented. Based on this approach, we are currently developing a visualization tool
assistant (VISTA) which will assist scientists in selecting and creating data visualization techniques. As
illustrated in the glyph analysis, it may even be possible to automate the synthesis of compound marks
whose designs require far more expertise than synthesizing visualization techniques.

ACKNOWLEDGEMENTS

We would like to thank members of the Graphics and User Interface Research Group at The George
Washington University who provided an intellectually stimulating environment for this work. Special
thanks to Professors James Foley and John Sibert who made valuable suggestions. Financial support
was provided by USRA Center of Excellence in Space Data and Information Sciences (CESDIS) under
NASA contract SIC 550-64.

REFERENCES

Bertin J (1983) Semiology of Graphics. University of Wisconsin Press, Madison, WI. Translated by W. Berg and P. Scott from La Graphique et le Traitement Graphique de l'Information, Flammarion, Paris, 1977.

Cleveland WS, McGill R (1985) "Graphical Perception: Theory, Experimentation, and Application to
the Development of Graphical Methods". Science 229(August): 828-833

Cognivision Incorporated (1989) FOTO Reference Manual. Cognivision, Inc., Westford, MA

Crusi C, et al. (1987) "apE: A Flexible Integrated Environment for Supercomputers and
Workstations". Proceedings of the Third International Symposium on Science and Engineering on
Cray Supercomputers. September: 533-588

Ellson R, Cox DJ (1988) "Visualization of Injection Molding". Simulation 51(5): 184-188

Grinstein G, Pickett R, Williams M (1989) "EXVIS: An Exploratory Visualization Environment". Proceedings Graphics Interface '89, CIPS, Toronto: 254-261

Haber RB (1988) "Visualization in Engineering Mechanics: Techniques, Systems, and Issues". Visualization Techniques in the Physical Sciences, SIGGRAPH'88 Tutorial Notes (August): 89-111

Kosslyn S (1989) "Understanding Charts and Graphs". Applied Cognitive Psychology 3: 185-226.

Mackinlay JD (1986) Automatic Design of Graphical Presentations, PhD Thesis, Dept. of Computer
Science. Stanford University, Stanford, CA

National Center for Supercomputer Applications (NCSA) (1989) NCSA DataScope Reference Manual. NCSA, University of Illinois, Urbana-Champaign, IL

Senay H, Ignatius E (1990a) "Rules and Principles of Scientific Data Visualization". Technical Report
No: GWU-IIST-90-13, Department of Electrical Engineering and Computer Science, The George
Washington University, Washington, D.C. 20052. Also in State of the Art in Scientific
Visualization, SIGGRAPH'90 Tutorial Notes (August)

Senay H, Ignatius E (1990b) "VISTA: Visualization Tool Assistant for Viewing Scientific Data".
Technical Report No: GWU-IIST-90-14, Department of Electrical Engineering and Computer
Science, The George Washington University, Washington, D.C. 20052. Also in State of the Art in
Scientific Visualization, SIGGRAPH'90 Tutorial Notes (August)

Treinish LA (1988) "An Interactive, Discipline-Independent Data Visualization System". Technical Report, NSSDC, NASA/Goddard Space Flight Center

Tufte E (1983) The Visual Display of Quantitative Information. Graphics Press, Cheshire, CT

Upson C, Faulhaber T, Kamins D, Laidlaw D, Schlegel D, Vroom J, Gurwitz R, van Dam A (1989)
"The Application Visualization System: A Computational Environment for Scientific Visualization".
IEEE Computer Graphics and Applications 9(4): 30-42

[Figure: a two-level visualization tree. Root: Visualization of Temperature and Humidity Data. Left leaf: Surface Diagram (axis-1: LONGITUDE, axis-2: LATITUDE, elevation: TEMP). Right leaf: Pseudo-Color (axis-1: LONGITUDE, axis-2: LATITUDE, color: HUMIDITY).]

Fig. 1. Tree for Visualization of Temperature and Humidity Data



Fig. 2. Visualization of Temperature and Humidity Data

Fig. 3. Visualization of Temperature Data



Fig. 4. Visualization of Humidity Data

Fig. 5. Alternative Visualization of Temperature and Humidity Data



[Figure: a visualization tree. Root: Static Visualization of Injecting Plastic into a Mold, formed by composition of a 2D Pseudo-Color (axis-1: X, axis-2: Y, color: PRESSURE) and a 3D Arrow Plot with Coded Glyphs, itself composed of a 2D Arrow Plot (axis-1: X, axis-2: Y, arrow: VELOCITY) and a Coded 3D Glyph.]

Fig. 6. Tree for Static Visualization of Injecting Plastic into a Mold

Fig. 7. Visualization of Injecting Plastic into a Mold (Courtesy of Ellson & Cox, 1988)


Fig. 8. Glyph Analysis: (a) Visualizing 3D vector fields with simple arrows; (b) Arrows with
volumes; (c) Increasing the height of arrows; (d) A compound mark after smoothing; and (e) Use
of gray scale to encode an additional data variable.

AUTHORS' BIOGRAPHIES

Hikmet Senay is an assistant professor in the Department of Electrical Engineering and Computer Science at the George Washington University. He is responsible for the ongoing development of a knowledge-based advisory system for scientific data visualization for the Center of Excellence in Space Data and Information Sciences (CESDIS) at NASA Goddard Space Flight Center. His research interests include scientific data visualization, user-computer interfaces, artificial intelligence, and visual programming environments for artificial intelligence languages. He received his BS degree in electrical engineering from Istanbul Technical University in 1979. He was awarded an MS in computer engineering, an MS in computer science, and a PhD in computer engineering from Syracuse University in 1982, 1986, and 1987, respectively. Dr. Senay is a member of the IEEE Computer Society, ACM (SIGCHI and SIGART), AAAI, and the Software Psychology Society.
Address: Department of Electrical Engineering and Computer Science, The George Washington University, 801 22nd Street, N.W., T-624A, Washington, D.C. 20052

Eve Ignatius received the B.Sc. in mathematics from Clemson University, Clemson, SC, in 1975. From 1976 to 1985 she worked for Burroughs Corp (Unisys) in the United States and Britain, handling software development for financial systems and other projects. In 1987 she was awarded the M.S. degree in computer science from the George Washington University, Washington, DC, where she is currently a doctoral student. Her research interests include visual perception, artificial intelligence and scientific data visualization. Ms. Ignatius is a member of the IEEE Computer Society and the Software Psychology Society.
Address: Department of Electrical Engineering and Computer Science, The George Washington University, 801 22nd Street, N.W., Washington, D.C. 20052
Precise Rendering Method for Edge Highlighting
Toshimitsu Tanaka and Tokiichiro Takahashi

Abstract
This paper introduces the Precise Rendering Method which generates very accurately
highlighted images from tessellated polygons. Highlights appearing on edges are strong
cues for recognizing three dimensional shapes. To improve the photorealism of computer
generated three dimensional images, accurate highlighting is necessary. This requires very
effective anti-aliasing, since rounded edges can be tessellated as thin polygons which often lead to jagged artifacts.
The Precise Rendering Method first detects exact polygon areas in each pixel by
using the Cross Scanline Algorithm. The algorithm can efficiently detect all polygons
and calculate their exact areas projected onto each pixel even if the polygons are much
smaller than the pixel. The intensity of reflection is then integrated for each polygon
area. This integration is quickly performed by referencing a two dimensional table.
Several synthesized images are created to show the efficiency of the method.

Key words: Edge highlighting and shading, Integration of small reflection, In-
tegration by table referencing, Horizontal and vertical scanning, Anti-aliasing

1 Introduction
Most edges and vertices of objects surrounding us are not sharp but rounded by design. They
are manufactured as smooth tightly curved surfaces which reflect light rays over a wide angle
and cause highlights.
Edge highlights are very important in making photorealistic images. The two computer generated images in Fig. 1 clearly show the effect of edge highlights: the image in Fig. 1(a) is identical to that in Fig. 1(b) except for its lack of highlights. Consequently, Fig. 1(b) is much more realistic.
In the field of technical illustration for industrial design, emphasizing edge highlights has long been known as a highly effective technique. Renderers often highlight the edges of illustrations, but the effect depends entirely on individual skill. Kondo et al. (1988) developed an interactive rendering system which can create effective edge highlights. However, it requires excessive effort to specify candidate edges and produce acceptable highlighting.
Conventional shading methods are, of course, theoretically applicable to edge highlighting.
However, rounded edges are so thin and their curvature is so high that severe aliasing artifacts
often occur (Crow 1978; Nishita 1984).
Saito et al. (1989) proposed a rendering method to highlight edges efficiently. The method
treats edges as thin arcs of cylinders. These rounded edges are rendered separately from
ordinary surfaces as shaded wire-frames, and then superimposed onto the normally shaded


(a) without highlights

(b) with highlights

Fig. 1 Effect of edge highlights



surface image. This method is simple; however, visible highlighted edges can be wrongly eliminated.

We have proposed the Cross Scanline Algorithm (Tanaka 1990), which can efficiently detect all polygons and calculate their exact areas projected onto each pixel even if the polygons are much smaller than the pixel. This paper introduces the Precise Rendering Method, which employs the Cross Scanline Algorithm to accurately render rounded edges and vertices. In this paper, we also propose a quick calculation technique that references a two dimensional table for integrating the intensity of each surface.

This paper discusses conventional methods (Section 2), presents details of the Precise Rendering Method (Section 3), and shows experimental results (Section 4). The conclusions of our work can be found in Section 5.

2 Conventional Methods
In this section, two conventional methods for generating edge highlights will be discussed.

2.1 Polygonization
The first method converts rounded edges and vertices to polygons, then renders them using conventional hidden surface removal algorithms and smooth shading techniques (Blinn 1977). For example, the rounded edges and the vertex shown in Fig. 2(a) are converted to roughly equivalent polygons in Fig. 2(b). In general, the minimum radii of curvature of the rounded edges and vertices are much smaller than the object's other dimensions. Therefore, equivalent polygons projected onto the pixel plane are usually thinner or smaller than individual pixels. If the polygons in Fig. 2(b) are projected onto pixels as shown in Fig. 3 and rendered by a conventional Z-buffer method, the pixel P1 in Fig. 3 is determined to be covered by the triangle ABC. This is because the method only detects the polygon that covers the pixel center D. However, triangle ABC only partially covers P1.

The surface normals N_A, N_B, and N_C at points A, B, and C, respectively, are perpendicular to each other. The intensity of light reflection is closely related to the surface normals. When the surface normals vary widely, it is highly likely that some reflection will be seen from any viewpoint. However, the conventional Z-buffer calculates the polygon intensity using only N_D, the surface normal at point D in Fig. 3. Therefore, the calculated intensity is not accurate.

These problems can be reduced, but not eliminated, by using many sampling points in each pixel; however, the computing cost of such sampling methods can be extremely high.

2.2 Thin Cylinders


The second highlight generation method (Saito 1989) accurately renders rounded edges and vertices separately from ordinary surfaces. In this method, rounded edges are regarded as thin cylinders. This shape is so simple that accurate pixel intensities can be calculated. A shaded wire-frame image is accurately generated from the rounded edges and vertices. A shaded image of the bulk surfaces is generated by a conventional Z-buffer algorithm. Finally, in each pixel, the shaded wire-frame image is superimposed on the shaded image if its depth value saved in the Z-buffer is smaller. So as not to eliminate visible edges, the rounded edges are shifted a little closer to the viewpoint, because the depth value of a rounded edge is the same as that of its corresponding surface boundary. Since this method does not tessellate rounded edges, its computing cost for highlighting is low.

(a) described by curved surfaces (b) described by polygons

Fig. 2 Rounded edges and vertices

Fig. 3 Sampling by Z-buffer



However, this method is not robust, because choosing the correct shifting depth is very difficult in complex scenes. If the shifting depth is too small, visible edges are eliminated; if it is too large, invisible edges may be displayed. The radii of the rounded edges projected onto the pixel plane are assumed to be much smaller than the pixel dimensions. Therefore, it is very difficult to use the method with zooming, which is common in animation. The shaded images include some aliasing artifacts. A supersampling algorithm can reduce aliasing, but at the cost of very long computing times.

Moreover, when a flat object surface is brightly illuminated, its rounded edges must be shaded, because they are strongly curved. This effect of indicating surface borders with lower intensity is very important for image photorealism. However, the second highlighting method cannot produce this effect because it merely adds the intensities of a shaded wire-frame to a shaded image.

3 Precise Rendering Method


Rounded edges and vertices are tessellated and converted to thin and small polygons, respectively. In this case, two problems make accurate rendering difficult.

1. The polygons are much thinner or smaller than the pixels. However, their exact areas
within each pixel must be calculated, because highlights are very important.
2. The directions of surface normals given at the polygon vertices are very different. Since
the intensity of light reflection depends on the direction of the normals, a polygon
intensity calculated from just a few surface normals is probably inaccurate.

The Precise Rendering Method overcomes these problems. This method consists of two
steps. First, polygon areas projected onto each pixel are accurately calculated using the Cross
Scanline Algorithm (Tanaka 1990). The Cross Scanline Algorithm is the first implementation
of a full-rendering algorithm (Catmull 1978; Barros 1979). It can calculate exact polygon
areas by using horizontal and vertical scanlines. Next, the intensity of reflection is integrated
throughout the areas. The intensity at each point is calculated from its surface normal. When
original surface normals are given at polygon vertices, a surface normal at each point can be
determined by interpolating the original normals. The algebraic integration of the intensity is
computationally expensive; however, an approximate numerical integration can be calculated
by referring to a two dimensional table.
Details of the Precise Rendering Method are presented in the following sections.

3.1 Cross Scanline Algorithm


The Cross Scanline Algorithm uses horizontal and vertical scanlines named H-scanlines and
V-scanlines, respectively. H-scanlines are located at each horizontal pixel boundary and their
length equals the image width.
The conventional Scanline Algorithm estimates polygon areas as trapezoids or triangles
clipped by scanlines. Thus, the algorithm produces area estimation errors, which are shown
as dark areas in Fig. 4(a). The estimation errors occur at pixels that include polygon vertices
or edge crossing points. To prevent such errors, scanlines must pass through such points.
However, since that would require an excessive number of horizontal scanlines, the computing cost would be prohibitive. This cost is reduced by using V-scanlines, as shown in Fig. 4(b).
V-scanlines are located at:

(1) points where edges intersect with the upper H-scanline (SL1),

[Figure: (a) the conventional Scanline Algorithm, with the area estimation error shown as dark regions; (b) the Cross Scanline Algorithm, with H-scanlines SL1 and SL2 and V-scanlines between them.]

Fig. 4 Polygon area calculation

(2) points where edges intersect with the lower H-scanline (SL2),
(3) polygon vertices between the H-scanlines,
(4) points where edges of polygons intersect, and
(5) the right and left ends of the pixel row.
Each V-scanline spans two adjacent H-scanlines and is one pixel height in length.
Polygons are separated by V-scanlines. Between any two V-scanlines, polygon edges
are continuous and do not intersect except at a V-scanline. Therefore, when vertical pixel
boundaries appear between two V-scanlines, the points where polygon edges intersect the
boundaries can be easily calculated from the points where the edges intersect the V-scanlines.
As shown in Fig. 4, many V-scanlines may be required for exact area calculation. However,
in experiments, a 512 x 512 image consisting of 10,000 small polygons is generated in the
same time as required to scan 8 lines per pixel (Tanaka 1990). This efficiency results because
V-scanlines are just one pixel high.
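The paper computes exact per-pixel polygon areas with the Cross Scanline Algorithm itself; as a rough point of comparison only, the following sketch computes the exact area of one convex polygon inside one pixel by Sutherland-Hodgman clipping plus the shoelace formula. It is a hypothetical stand-in for the area computation, not the authors' scanline formulation.

```python
def clip(poly, inside, intersect):
    """Clip a polygon (list of (x, y)) against one half-plane."""
    out = []
    for i, p in enumerate(poly):
        q = poly[(i + 1) % len(poly)]
        if inside(p):
            out.append(p)
            if not inside(q):
                out.append(intersect(p, q))
        elif inside(q):
            out.append(intersect(p, q))
    return out

def clip_to_pixel(poly, px, py):
    """Clip against the four edges of the unit pixel [px,px+1] x [py,py+1]."""
    for axis, bound, keep_low in [(0, px, False), (0, px + 1, True),
                                  (1, py, False), (1, py + 1, True)]:
        def inside(p, a=axis, b=bound, lo=keep_low):
            return p[a] <= b if lo else p[a] >= b
        def intersect(p, q, a=axis, b=bound):
            t = (b - p[a]) / (q[a] - p[a])
            return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
        poly = clip(poly, inside, intersect)
        if not poly:
            break
    return poly

def area(poly):
    """Shoelace formula; exact for the clipped polygon."""
    return abs(sum(p[0] * q[1] - q[0] * p[1]
                   for p, q in zip(poly, poly[1:] + poly[:1]))) / 2

# A thin triangle covering only part of pixel (0, 0):
print(area(clip_to_pixel([(0.1, 0.1), (3.0, 0.4), (0.1, 0.3)], 0, 0)))
```

Clipping every polygon against every pixel this way is far more expensive than the scanline organization described above; the sketch only makes the notion of "exact area within a pixel" concrete.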

Fig. 5 Blinn's reflection model (showing the vectors N, H, L, and E)

The Cross Scanline Algorithm can accurately detect polygon regions in each pixel, so
accurate anti-aliasing can be achieved. The Cross Scanline Algorithm produces much better
image quality than the Multi-Scanning Algorithm for equivalent computing times.

3.2 Intensity Integration


Popular reflection models have been proposed by Phong (1975), Blinn (1977), and Cook and Torrance (1981). A simplified Blinn model was chosen because its computing cost is relatively low. The calculation of the reflected light intensity I_out is described in Equations 1-3. The vectors in these equations are shown in Fig. 5.

$$I_{out} = k_d I_d + k_s I_s \qquad (1)$$

$$I_d = R_d (N \cdot L)\, I_{in} \qquad (2)$$

$$I_s = R_s (N \cdot H)^n I_{in} \qquad (3)$$

where:
N: unit surface normal
L: unit vector in the direction of the light
E: unit vector in the direction of the viewer
H: unit angular bisector of E and L
I_in: intensity of the incident light
I_out: intensity of the reflected light
I_d: intensity of diffuse reflection
I_s: intensity of specular reflection
k_d: fraction of reflectance that is diffuse
k_s: fraction of reflectance that is specular (k_d + k_s = 1)
R_d: diffuse bidirectional reflectance
R_s: specular bidirectional reflectance
n: index of reflection
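For reference, a minimal Python sketch of Equations 1-3 follows. The vector names mirror the list above; the specific light, normal, and material values are made-up illustrations, not values from the paper.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def blinn(N, L, E, I_in, kd, ks, Rd, Rs, n):
    """Equations 1-3: I_out = kd*Id + ks*Is with the halfway vector H."""
    H = normalize(tuple(l + e for l, e in zip(L, E)))     # bisector of E and L
    Id = Rd * max(dot(N, L), 0.0) * I_in                  # Equation 2
    Is = Rs * max(dot(N, H), 0.0) ** n * I_in             # Equation 3
    return kd * Id + ks * Is                              # Equation 1

# Illustrative values only:
N = normalize((0.0, 0.0, 1.0))
L = normalize((1.0, 0.0, 1.0))
E = normalize((-1.0, 0.0, 1.0))
print(blinn(N, L, E, I_in=1.0, kd=0.7, ks=0.3, Rd=0.8, Rs=0.9, n=20))
```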

Calculating the intensity of specular reflection is discussed for the case shown in Fig. 6(a). Surface normals at the polygon vertices A, D, and E are known. Points B and C are where the polygon edges AD and AE intersect the vertical pixel boundary.

Fig. 6 Specular integration: (a) surface normals in a pixel; (b) approximation

The surface normals N_B and N_C can be determined by interpolating the surface normals at A and D, and at A and E, respectively. Surface normals at points inside the triangle ABC are calculated by interpolating N_A, N_B, and N_C. Triangle ABC's specular reflection is calculated by Equation 4.

$$I_s = R_s \int_{\triangle ABC} (N \cdot H)^n I_{in} \, dS \qquad (4)$$

Here, H and I_in vary within a pixel because they are functions of position on the triangle ABC. However, if the light sources and the viewpoint are far from the triangle, the variations of H and I_in within a pixel are negligibly small compared with the variation of N. Thus, we assume H and I_in are constant within a pixel, and their values at point G, H_G and I_inG, are applied. Point G is the center of gravity of the triangle ABC. This approximation is illustrated in Fig. 6(b). I_s is then represented by Equation 5.

$$I_s = R_s I_{inG} \int_{\triangle ABC} (N \cdot H_G)^n \, dS \qquad (5)$$

To convert to a polar coordinate system, point G is chosen as the origin, and the direction of H_G is the z axis. As shown in Fig. 7, the surface normals N_A, N_B, and N_C are plotted on the unit sphere, and intermediate surface normals fall in the hatched area on the unit sphere. Equation 5 is represented by Equation 6 in the polar coordinate system. To simplify the integration, the Jacobian of the transformation, ∂(x, y)/∂(θ, φ), is defined as a constant, as shown in Equation 7. This is not strictly correct; however, it is adequate for calculating edge highlights because surface normals vary smoothly throughout Ω_{N_A N_B N_C}. Therefore, the specular reflection I_s can be described by Equation 8.

$$I_s = R_s I_{inG} \int_{\Omega_{N_A N_B N_C}} \cos^n\theta \; \frac{\partial(x, y)}{\partial(\theta, \phi)} \, d\omega \qquad (6)$$

$$\frac{\partial(x, y)}{\partial(\theta, \phi)} = K_{ABC} = \frac{S_{ABC}}{\Omega_{N_A N_B N_C}} \qquad (7)$$
Fig. 7 Integration in the polar coordinate system

Fig. 8 Definition of the integral table

$$I_s = R_s I_{inG} K_{ABC} \int_{\Omega_{N_A N_B N_C}} \cos^n\theta \, d\omega \qquad (8)$$

where:
S_ABC: area of triangle ABC
Ω_{N_A N_B N_C}: solid angle of N_A, N_B, and N_C.

It is difficult to calculate Equation 8 numerically. However, the function cos^n θ is rotationally symmetric around the z axis. Thus, this integral can be approximately calculated by using Feibush's filtering method (1980), modified for the polar coordinate system. The integrated value is given by referencing a two dimensional table. The table is designed as follows.

As shown in Fig. 8, the points H, R, S, T, U and the unit circle Γ are defined. Here, unit circles are great circles of the unit sphere whose center is the origin O and whose radius is 1. H is the point on the z axis with z = 1. R lies on the xz plane and the surface of the sphere; the angle of the vector OR from the z axis is given as α. T lies on the xy plane and the surface of the sphere; the angle of the vector OT from the x axis is given as β. U is the point on the y axis with y = 1. Γ is the unit circle that runs through R and U. S is the point where Γ intersects the unit circle that runs through H and T.

$$J(HRS) = \int_{\Omega_{HRS}} \cos^n\theta \, d\omega \qquad (9)$$

The function J(HRS) is defined by Equation 9. Ω_HRS in Equation 9 is the solid angle of the vectors OH, OR, and OS, and θ is the angle measured from the z axis to a point in Ω_HRS. Ω_HRS on the unit sphere is indicated by hatching in Fig. 8. Since R and S are described by α and β, J(HRS) is a function of α and β. Therefore, we designed a Table(α, β) whose elements contain the values of J(HRS).

The intensity of specular reflection I_s is described by Equation 10, obtained from Equations 8 and 9; J(N_A N_B N_C) is determined by referring to Table(α, β). In Fig. 7, J(N_A N_B N_C) can be divided into three terms using the point H_G, as described by Equation 11.

$$I_s = R_s I_{inG} K_{ABC} \, J(N_A N_B N_C) \qquad (10)$$

$$J(N_A N_B N_C) = J(H_G N_B N_C) - J(H_G N_A N_B) - J(H_G N_A N_C) \qquad (11)$$

$$J(H_G N_A N_B) = J(H_G R_{AB} N_B) - J(H_G R_{AB} N_A) \qquad (12)$$

For calculating J(H_G N_A N_B), the points U_AB and R_AB, the unit circle Γ_AB, and the plane Σ_AB are defined as shown in Fig. 9. Γ_AB is the unit circle that runs through N_A and N_B. Point U_AB is where Γ_AB intersects the xy plane. Plane Σ_AB includes O and is perpendicular to the vector OU_AB. Point R_AB is where Γ_AB intersects Σ_AB. J(H_G N_A N_B) is described by Equation 12 using R_AB. The coordinate system is rotated around the z axis so that the vector OU_AB becomes the new y axis; the function J is invariant under this rotation. As a result, R_AB and Γ_AB in Fig. 9 correspond to R and Γ in Fig. 8. Therefore, J(H_G R_AB N_A) and J(H_G R_AB N_B) are both given by the integral table.
The angles α and β are defined on [0°, 90°], because J(HRS) is symmetrical. If the interval of α and β in the table is 1 degree, the table occupies 90 x 90 x 4 bytes. Integration tables have to be designed for each index of reflection n. However, the number of different indices that appear in one data set is not usually very large. If the number of indices is limited to less than 10, the total required memory size is 324 kbytes. This is smaller than a 512 x 512 pixel full color image (about 786 kbytes).
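As a rough illustration of the table idea only (the geometric construction of J(HRS) above is more involved), the following sketch precomputes a 90 x 90 table of integrals of cos^n θ over small solid-angle cells and sums table entries for a lookup. It is a simplified stand-in assuming direct numerical quadrature, not the authors' construction.

```python
import math

def build_table(n, steps=90):
    """Precompute integrals of cos(theta)**n over 1-degree cells of the
    upper hemisphere, indexed by (alpha, beta) in degrees.  A simplified
    stand-in for Table(alpha, beta)."""
    table = [[0.0] * steps for _ in range(steps)]
    d = math.radians(1.0)
    for a in range(steps):            # polar angle alpha
        theta = math.radians(a + 0.5)
        # d(omega) = sin(theta) d(theta) d(phi) over a 1 x 1 degree cell;
        # cos**n is rotationally symmetric, so cells repeat across beta.
        cell = math.cos(theta) ** n * math.sin(theta) * d * d
        for b in range(steps):        # azimuthal angle beta
            table[a][b] = cell
    return table

def lookup_sum(table, alpha_deg, beta_deg):
    """Sum table cells up to (alpha, beta); plays the role of one J() term."""
    return sum(table[a][b]
               for a in range(int(alpha_deg))
               for b in range(int(beta_deg)))

tbl = build_table(n=20)
print(lookup_sum(tbl, 30, 45))   # integral over a 30 x 45 degree patch
```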

Fig. 9 Integration in Ω_{H N_A N_B} with the table

4 Experimental Results
4.1 Comparison with the Conventional Scanline Algorithm
The Precise Rendering Method was compared with the conventional Scanline Algorithm using smooth shading. Radial patterns with rounded edges were generated: Fig. 10(a) was generated by the conventional Scanline Algorithm, and Fig. 10(b) by the Precise Rendering Method. Strongly jagged edges appear on nearly horizontal lines in Fig. 10(a), while Fig. 10(b) benefits from their absence as well as from accurate highlights.

Multiple exposed images of a moving object are shown in Fig. 11. Since the object is illuminated by a simple directional light source, the intensity at the object's vertices must be constant, as shown in Fig. 11(b). However, the intensity depends on the object position in the conventional scanline image in Fig. 11(a). This is because its sampling interval is too long for accurate rendering of highlights, which means that highlights flicker when the object moves.

4.2 Image Generation


Images of a telephone are shown in Fig. 12. Edges and vertices of the buttons are realistically rounded in Fig. 12(b), while they are not in Fig. 12(a). Figure 12(c) shows the model from a closer viewpoint. These images indicate the importance of correct highlighting for enhanced photorealism.

The effect of shaded edges is shown in Fig. 13. In this example, flat surfaces of the block are strongly illuminated by a single directional light source. In Fig. 13(a), edges are not rounded. Since the intensities of the top surfaces are nearly equal across their border, the border line cannot be discerned. However, in Fig. 13(b), the surfaces are clearly separated by a shaded line. The shaded line is caused by the top rounded edge, because reflection from its curved surface is weaker than from the flat surfaces. This effect is also important for realistic images, but it cannot be produced by the highlight superimposition method.

5 Conclusion
The Precise Rendering Method was developed to render rounded edges and vertices accu-
rately. Accurate rendering requires edges and vertices to be tessellated and converted to
small polygons. Since these polygons are smaller than the pixels, projected polygon areas on
each pixel must be exactly calculated. The Precise Rendering Method makes this possible by
employing the Cross Scanline Algorithm.
Surface normals are very different at the vertices of tessellated polygons. Therefore, the
intensity of specular reflection must be integrated. The Precise Rendering Method realizes
this integration by referring to a small number of two dimensional tables. Thus, this method
efficiently generates accurately highlighted images.
Experimental results showed accurate edge highlights are generated even if the edges are
nearly horizontal. Shaded edges were also accurately generated. Highlights and shading of
rounded edges greatly enhanced photorealism. The Precise Rendering Method is thus shown
to be a powerful technique for generating photorealistic images.

Acknowledgment
We would like to thank Dr. Rikuo Takano and Dr. Masashi Okudaira for their continuous support. We would like to thank Dr. Nelson Max, Mr. Mikio Shinya, and Dr. Takafumi Saito for their advice and encouragement.

(a) conventional Scanline Algorithm (b) Precise Rendering Method

Fig. 10 Radial pattern with rounded edges

(a) conventional Scanline Algorithm (b) Precise Rendering Method

Fig. 11 Moving object



(a) without rounded edges

(b) with rounded edges

(c) key pad

Fig. 12 Telephone

(a) not rounded (b) rounded

Fig. 13 Shaded edge

We would like to thank Mr. Pierre Poulin for his instructive review of this paper.

Reference
Barros J (1979) Generating smooth 2-D monocolor line drawings on video displays. Proc.
SIGGRAPH'79: 260-269
Blinn J (1977) Methods of light reflection for computer synthesized pictures. Proc. SIG-
GRAPH'77: 192-198
Catmull E (1978) A hidden-surface algorithm with anti-aliasing. Proc. SIGGRAPH'78:
6- 11
Cook R, Torrance K (1981) A Reflectance Model for Computer Graphics. Proc. SIGGRAPH'81: 307-316
Crow F (1978) The use of grayscale for improved raster display of vectors and characters.
Proc. SIGGRAPH'78: 1-5
Feibush E, Levoy M, Cook R (1980) Synthetic texturing using digital filters. Proc. SIG-
GRAPH'80: 294-301
Kondo K, Kimura F, Tajima T (1988) An interactive rendering system with shading. In:
T.Kitagawa (ed) Japan Annual Reviews in Electronics, Computers & Telecommunica-
tions, Computer Science and Technology 18. Ohbunsya and North-Holland, Tokyo, pp
255-271
Nishita T, Nakamae E (1984) Half-tone representation of 3-D objects with smooth edges by using a multi-scanning method. Trans. Information Processing Society of Japan 25(5): 703-711

Phong B (1975) Illumination for Computer Generated Pictures. Communications of the ACM 18(6): 311-317
Saito T, Shinya M, Takahashi T (1989) Highlighting Rounded Edges. Proc. CG International '89: 613-629
Tanaka T, Takahashi T (1990) Cross Scanline Algorithm. Proc. Eurographics'90: 63-74

Toshimitsu Tanaka is currently a research engineer of the Autonomous Robot Systems Laboratory, NTT (Nippon Telegraph and Telephone) Human Interface Laboratories. He received the B.E. degree in electrical engineering and the M.E. degree in information engineering at Nagoya University in 1982 and 1984, respectively. His research interests include computer graphics and computer modeling of three-dimensional shapes. He is a member of The Institute of Electronics, Information, and Communication Engineers of Japan and The Information Processing Society of Japan.
E-Mail: tanaka%nttcvg.ntt.jp@relay.cs.net

Tokiichiro Takahashi is currently a supervisor and senior research engineer of the Autonomous Robot Systems Laboratory, NTT Human Interface Laboratories. He received the B.E. degree in electronic engineering at Niigata University in 1977. After graduating, he joined NTT and has been doing research into computer graphics since 1984. His current research interests include both photorealistic and comprehensible rendering algorithms, as well as their implementation on parallel processors. He is a member of IEEE and The Institute of Electronics, Information, and Communication Engineers of Japan.
E-Mail: toki%nttcvg.ntt.jp@relay.cs.net

Address: Autonomous Robot Systems Laboratory, NTT Human Interface Laboratories, 1-2356, Take, Yokosuka-shi, Kanagawa, 238-03 Japan
Fractals Based on Regular Polygons
and Polyhedra
Huw Jones and Aurelio Campa

ABSTRACT:

The Sierpinski triangle or gasket and the Sierpinski tetrahedron are well known fractal objects.
Principles used in their creation are extended to other regular polygons to create other forms of
"gasket". These new fractal objects are illustrated using a recursive method and a version of the
"chaos game" method developed by Barnsley. The fractal dimension of these new forms is given
and they are related to other fractal objects. A similar extension is made to the development of a
fractal octahedron, using the principles underlying the construction of the Sierpinski tetrahedron.
This new form of fractal is illustrated by two recursive and one "chaos game" method. Extensions
of the technique to the other Platonic solids are suggested.

KEY WORDS:

Deterministic fractals, Platonic solids, Sierpinski tetrahedron, Sierpinski triangle.

1. FRACTALS BASED ON REGULAR POLYGONS

1.1 The Sierpinski Triangle or Gasket

The Sierpinski triangle is a well known fractal object which can be generated recursively through replacing a single triangle by three half linear scale similar triangles placed within the original triangle, one at each corner (Sierpinski, 1915; Saupe, 1988). The result of several stages of recursion of this process is given in fig. 1.

If the recursion is repeated ad infinitum, the object so created contains three exact half scale copies
of itself, so has fractal dimension

D = log(3)/log(2) = 1.585.

This is derived from the general formula for the fractal dimension D of exactly self similar objects containing n copies of the whole at scale factor f (Mandelbrot 1977; Voss 1988), given as

D = log(n)/log(1/f).


Fig. 1. A Sierpinski triangle or gasket after 6 subdivisions

Barnsley has shown how this deterministic fractal object can be constructed using the random process of the "chaos game" (Barnsley 1988a and Barnsley 1988b). Suppose the three vertices of the original triangle are given as V1, V2 and V3. Given any starting point P in the plane of the triangle, select one of the three vertices Vi at random and find the point P' which is midway between P and Vi. If the points are defined in a Cartesian coordinate system, the position of P' is found by applying the formula

P' = (Vi + P)/2

to the x and y coordinates of the points concerned. The point P' is plotted and the process is repeated with P' replacing P as the starting point. If the initial point P is not in the Sierpinski triangle generated, a few stray points will be plotted initially, but the sequence converges rapidly to its attractor, the Sierpinski triangle or gasket. If P is chosen to be one of the triangle's vertices, no such stray points will be visible. Fig. 2 gives the result of plotting 20 000 points in this way.
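A minimal Python sketch of this chaos game follows; the vertex coordinates and the point count are illustrative choices, and display of the points is left to whatever plotting routine is at hand.

```python
import random

# Vertices of an equilateral triangle (illustrative coordinates).
V = [(0.0, 0.0), (1.0, 0.0), (0.5, 3 ** 0.5 / 2)]

def chaos_game(n_points=20000):
    """Generate points of the Sierpinski gasket: repeatedly move halfway
    from the current point toward a randomly chosen vertex."""
    p = V[0]                      # start at a vertex: no stray points
    points = []
    for _ in range(n_points):
        vx, vy = random.choice(V)
        p = ((vx + p[0]) / 2, (vy + p[1]) / 2)   # P' = (Vi + P)/2
        points.append(p)
    return points

pts = chaos_game()
print(len(pts), pts[:3])
```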

Fig. 2. A Sierpinski triangle generated using the chaos game

The illustrations given are based on an equilateral triangle, but the process works equally well for any shape of triangle.

1.2 A Pentagonal Fractal


It has been shown (Jones, 1991) that fractal objects similar in concept to the Sierpinski triangle can be created for regular polygons with five or more sides, and that such objects can be constructed by either a recursive method or an adaptation of the chaos game method. Suppose we start with vertices V1, V2, ..., Vn at the corners of any regular n-sided polygon. Imagine that the polygon has very small copies of itself placed within it at each of its vertices.

Fig. 3. The generating "rosette" used for the fractal pentagon



Now make the small copies grow, still placed at the vertices and still remaining equal to each other, until they just touch. If you apply this process to a pentagon, the shape produced will be the generating shape shown in fig. 3. To produce a fractal gasket, the contents of the original polygon need to be shrunk to fit inside the sub-polygons, so the size of these copies compared with the size of the original polygon is needed. In evaluating this size, the exterior angle of the polygon, 360°/n, is needed to determine the form of the portion to be cut out of the original polygon. If 360°/n is not less than 90°, that is, if n is 4 or less, the edges of the smaller polygons meet exactly at the edge of the original polygon (triangle or square in these cases), as in figs. 4 and 5.

Fig. 4. The generating shape used for the Sierpinski triangle

Fig. 5. Subdivision of a square into four half squares


303

When n is 5 or more, the polygon's exterior angle (calculated as 360°/n) is acute, so that the smaller copies touch at a point inside the original polygon (as in fig. 3). When n is 5, 6, 7 or 8, the shape cut out at an edge is a triangle. For larger values of n, twice the polygon's exterior angle is acute, meaning that more complicated shapes are cut out. The value of

k = trunc(90°/{360°/[n-1]}) = trunc([n-1]/4),

where trunc(x) is the truncated integer part of the real value x, determines how many sides of a small interior polygon are projected onto the cut-out portion of a side of the large polygon; fig. 6 shows that two sides of each small nonagon are projected onto a side of the original large nonagon to create a pentagonal cut-out.

Fig. 6. The pentagonal shape cut out in creating a nonagonal fractal

The length cut out of the side of the larger polygon is given by

$$2s\sum_{i=1}^{k}\cos(360i^\circ/n) = 2sc, \text{ say,}$$

where

$$c = \sum_{i=1}^{k}\cos(360i^\circ/n).$$

Thus, if S is the side length of the original polygon, we have

S = 2s + 2sc = 2s(1 + c).

The required shrink factor is

f = s/S = 1/{2(1 + c)}.

The gasket can then be constructed by recursively drawing n subpolygons within an original regular
polygon, shrinking the subpolygons by a factor f at each stage (see fig. 7 for a pentagonal version
of this method).
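The shrink factor and the resulting fractal dimension are easy to compute; the sketch below follows the formulas just derived (k, c, f, and D = log(n)/log(1/f)) and reproduces, as a check, the dimensions quoted later for the pentagonal, hexagonal and nonagonal gaskets.

```python
import math

def shrink_factor(n):
    """Shrink factor f for the n-sided gasket: f = 1 / (2(1 + c)),
    with c = sum_{i=1..k} cos(360 i / n) and k = trunc((n - 1) / 4)."""
    k = (n - 1) // 4
    c = sum(math.cos(math.radians(360.0 * i / n)) for i in range(1, k + 1))
    return 1.0 / (2.0 * (1.0 + c))

def fractal_dimension(n):
    """D = log(n) / log(1/f) for n exact subcopies at scale factor f."""
    return math.log(n) / math.log(1.0 / shrink_factor(n))

for n in (3, 5, 6, 9):
    print(n, round(shrink_factor(n), 4), round(fractal_dimension(n), 4))
# n = 3 gives f = 1/2 and D = 1.585; n = 5, 6, 9 give
# D = 1.6723, 1.6309 and 1.6208, matching the values cited below.
```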

Fig. 7. A pentagonal fractal after 5 recursive subdivisions

Alternatively, an adaptation of Barnsley's chaos game method can be used. Given an original point P, select one of the vertices Vi of a regular n-sided polygon at random and then plot a new point P' at a fraction f of the distance from Vi to P, where f is as calculated above. If P is any point in the original polygon, this places P' in the scaled-down copy which lies next to vertex Vi. For the triangular gasket, f will be 1/2, as specified in Barnsley's method. Calculation of the position of P' is simple: merely apply the formula

P' = (1 - f)Vi + fP

Fig. 8. A pentagonal fractal created using the chaos game



Fig. 9. A hexagonal fractal created using the chaos game

Fig. 10. A nonagonal fractal created using the chaos game

to the x and y coordinates of Vi and P in turn. Some results of this process are shown in figs. 8 to 10.
The pentagonal based process (fig. 8) has interesting correspondences with some classical figures. The triangular cut-out is in the shape of the "golden triangle", known in the days of Pythagoras (Boyer 1968), and the basic generating shape was called a "rosette" by the Nuremberg-born artist Albrecht Dürer in the generation of tiling patterns (fig. 11) in the year 1525 (Dürer 1977; Dixon 1987). The hexagonal form (fig. 9) has internal and external bounding lines which take the familiar form of the Koch curve. This is generated by recursively replacing single line segments by four line segments as in fig. 12, and was introduced by von Koch in 1904 (Boyer 1968). For large numbers of sides in the original polygon, the pattern produced becomes narrower, resembling a laurel wreath (fig. 10). If this procedure is attempted with a square, the resulting figure has a set of random points evenly distributed within the square (Jarrett 1990), as the four half-size sub-squares

Fig. 11. A tiling pattern using pentagonal "rosettes" published by Dürer in 1525

Fig. 12. Four stages in the generation of a Koch curve

completely fill the original square (fig. 5). Calculation of the fractal dimension for each of these figures is simple, as they are all exactly self-replicating. For an n-sided gasket, the dimension is

D = log(n)/log(1/f),

as each gasket contains n exact subcopies of itself at scale factor f. Thus, the fractal dimensions for the pentagonal, hexagonal and nonagonal gaskets illustrated (figs. 8, 9, 10) are 1.6723, 1.6309 and 1.6208 respectively.

2. FRACTALS BASED ON REGULAR POLYHEDRA

2.1 The Sierpinski Tetrahedron


The above fractals are based on regular polygons, which exist within two dimensional space. The equivalent forms in three dimensional space are the regular polyhedra, otherwise known as the Platonic solids (Cundy 1961). There are only five solids which have identical regular polygonal faces similarly oriented with respect to each other: the tetrahedron, the hexahedron or cube, the octahedron, the dodecahedron and the icosahedron, having 4, 6, 8, 12 and 20 faces respectively. The simplest is the tetrahedron, which can be more familiarly described as a triangular based pyramid. It has four triangular faces, six edges and four vertices, the faces taking the form of equilateral triangles in its regular form. The Sierpinski tetrahedron, a well known fractal object, can be constructed from a tetrahedron using a principle similar to that used for the construction of polygonal gaskets in two dimensions. The original tetrahedron is replaced by four half-scale copies of itself placed within the original shape so that each has one vertex coinciding with one of the original tetrahedron's vertices (fig. 13).

Fig. 13. A Sierpinski tetrahedron after one subdivision

This is repeated recursively to create the Sierpinski tetrahedron; fig. 14 shows the result of several stages of this recursion. Each face of the original tetrahedron is reduced to a Sierpinski triangle. This exactly self-replicating object has four half-scale copies of itself, so has fractal dimension

D = log(4)/log(2) = 2.

Here we have a case of a fractal object whose fractal dimension is an integer, although it is less than the dimension of the space within which the object lies. The four half-sized tetrahedra which replace the original tetrahedron comprise exactly half the volume of the original, as each is one eighth the volume of the original. Thus, one half of the existing volume is eliminated at each stage of the recursive process. The total volume eliminated if the procedure is continued ad infinitum forms an infinite geometric series

1/2 + 1/4 + 1/8 + 1/16 + ...

which sums to unity. This shows that the Sierpinski tetrahedron has zero volume, as the whole volume of the original tetrahedron is eliminated in its creation.
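A recursive construction of the Sierpinski tetrahedron's vertex sets is compact enough to sketch; the canonical vertex coordinates below are an illustrative choice, and rendering of the resulting tetrahedra is left to whatever display routine is available.

```python
def subdivide(tetra, depth):
    """Recursively replace a tetrahedron (4 vertices) by four half-scale
    copies, each keeping one original vertex; return the leaf tetrahedra."""
    if depth == 0:
        return [tetra]
    result = []
    for v in tetra:
        # Half-scale copy anchored at v: midpoints of v to each vertex.
        small = [tuple((a + b) / 2 for a, b in zip(v, w)) for w in tetra]
        result.extend(subdivide(small, depth - 1))
    return result

# A regular tetrahedron (illustrative coordinates: alternate cube corners).
T = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
leaves = subdivide(T, 3)
print(len(leaves))   # 4**3 = 64 small tetrahedra after 3 subdivisions
```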

Fig. 14. A Sierpinski tetrahedron after 3 recursive subdivisions

2.2 A Fractal Octahedron


The authors have applied the same principle of fractal generation to other Platonic solids. For a hexahedron or cube, the result does not produce a fractal: the eight half-scale cubes that fit into the corners of a larger cube exactly fill the larger cube. This is analogous to the two dimensional situation when the gasket process is applied to a square (fig. 5). The octahedron produces a much more interesting figure. A regular octahedron is an eight faced solid, each face being a triangle. A canonical form can be modelled with the object's six vertices at the points (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1) and (0, 0, -1). To create a fractal octahedron, it is necessary recursively to draw six half-scale octahedra within the existing octahedron such that each has a vertex coincident with one of the vertices of the original octahedron. This is a three dimensional analogy of the method used to draw the gaskets developed above. The object produced has some fascinating properties. Even at the first stage of subdivision (fig. 15 and plate 1) the object is non-manifold, in that four faces (as opposed to the normal two) emanate from certain edges.

Plate 1. A fractal octahedron after one subdivision

Plate 2. A fractal octahedron after 4 recursive subdivisions



After several stages of recursion, we notice that the pattern remaining on each face of the object is
the Sierpinski triangle or gasket (fig. 16 and plate 2).

Fig. 16. A fractal octahedron after 3 recursive subdivisions

If the fractal octahedron were used as a seal, it would leave the shape of the Sierpinski gasket in the wax. Unlike the Sierpinski tetrahedron, there are no through holes in the object. The fractal octahedron contains six half-scale copies of itself, giving it fractal dimension

D = log(6)/log(2) = 2.5850.

The six half-scale octahedra created at the first stage comprise six eighths, or three quarters, of the volume of the original octahedron, meaning that a quarter of the original volume has been discarded. At the next stage, the discarded volume is

6 x (1/4) x (1/8) = (1/4) x (3/4)


of the original volume. Continuing this procedure leads to the infinite geometric series

(1/4){1 + (3/4) + (3/4)² + (3/4)³ + ...}

for the total volume eliminated, which again sums to unity, meaning that if the procedure for creating the fractal octahedron were continued ad infinitum, the remaining object would be reduced to zero volume. However, certain planes are never penetrated. For example, the four square cross sections of the original octahedron lying in diamond orientation in the main coordinate planes are never penetrated. Similar smaller diamonds of the successive sub-octahedra will also remain. The spatial occupancy of the fractal octahedron can be reduced to a set of interpenetrating squares lying parallel to the main coordinate planes in Cartesian space. This gives another way of constructing the object by synthesis, as opposed to the above method, which effectively works by cutting out eight tetrahedra from the existing octahedron. A z-buffer algorithm is particularly effective in illustrating this method, as shown in plates 3 and 4.

These two methods create objects with the same spatial occupancy but with different surface normals, so the objects will look different when entered into conventional renderers. This should be expected: the object created is a fractal whose surface normal is nowhere properly defined, a property similar to that of the Koch curve (fig. 12), whose gradient is nowhere properly defined.

There is yet another method of generating the fractal octahedron based on the chaos game method
devised by Barnsley (Barnsley 1988a and Barnsley 1988b). In this case, the six vertices of the

Plate 3. The first stage of generating a fractal octahedron from penetrating squares

Plate 4. The fourth stage of generating a fractal octahedron from penetrating squares

Plate 5. A fractal octahedron generated by the chaos game method

Plate 6. A fractal dodecahedron generated by the chaos game method



original octahedron in three dimensional Cartesian coordinate space are identified as V1, V2, ..., V6. A random starting point P can be chosen, but the authors selected one of the vertices to be P. The point P' which is midway between P and a randomly selected vertex Vi is found by applying the formula

P' = (P + Vi)/2

to the x, y and z coordinates of P and Vi. The point P' is marked in the three dimensional space and the process is repeated, this time replacing the starting point P by the point P'. If P lies within the fractal octahedron, then the point P' will also lie within this object. Plotting several thousand points in this way will gradually build up the spatial occupancy of the fractal octahedron. Viewing is aided if depth cueing is used to colour the points plotted in a continuum leading from the view position. A z-buffer algorithm ensures correct spatial ordering of points when depicted. A result of this procedure is shown in plate 5. A similar procedure could be applied to the generation of the Sierpinski tetrahedron. At the time of going to press, the authors have generated the first images of a fractal dodecahedron using this method (plate 6). Details of this object and a fractal icosahedron will appear in a future publication.
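The 3D chaos game differs from the 2D one only in the vertex set and the extra coordinate; a minimal sketch follows, using the canonical octahedron vertices given earlier. Depth cueing and z-buffered display are omitted and assumed to be supplied by the surrounding graphics system.

```python
import random

# The six vertices of the canonical octahedron.
V = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def chaos_game_3d(n_points=50000):
    """Midpoint chaos game in 3D; the attractor is the fractal octahedron."""
    p = V[0]                      # start at a vertex, as the authors did
    points = []
    for _ in range(n_points):
        v = random.choice(V)
        p = tuple((a + b) / 2 for a, b in zip(p, v))   # P' = (P + Vi)/2
        points.append(p)
    return points

pts = chaos_game_3d()
print(len(pts), pts[:2])
```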

3. CONCLUSION

A number of new fractals based on regular polygons and polyhedra have been presented and
illustrated using recursive and stochastic "chaos game" methods. Relationships between them and
other fractal objects have been demonstrated and their fractal dimensions calculated. The authors
hope that the shapes produced will encourage others to explore the beautiful but addictive world of
fractals.

References

Barnsley MF (1988a) Fractal Modelling of Real World Images. In: Peitgen H-O and Saupe D (eds) The Science of Fractal Images, Springer-Verlag, New York, pp 219-242

Barnsley MF (1988b) Fractals Everywhere, Academic Press, San Diego

Boyer CB (1968), A History of Mathematics, Wiley, New York

Cundy HM and Rollett AP (1961) Mathematical Models (2nd ed), Oxford University Press,
Oxford

Dixon R (1987) Mathographics, Blackwell, Oxford

DUrer A, trans Strauss WL (1977, first published 1525) The Painter's Manual, Abaris Books,
New York

Jarrett D (1990) Personal Correspondence, Middlesex Polytechnic

Jones H (1991) DUrer, Gaskets and the Chaos Game. Computer Graphics Forum (in press)

Mandelbrot B (1977) Fractals, Form, Chance and Dimension, W.H.Freeman, San Francisco

Saupe D (1988) A Unified Approach to Fractal Curves and Plants. In: Peitgen H-O and Saupe D
(eds) The Science of Fractal Images. Springer-Verlag, New York, pp 273 - 286

Sierpinski W (1915) Sur une courbe dont tout point est un point de ramification. Comptes Rendus (Paris) 160: 302

Voss RF (1988) Fractals in Nature: from Characterisation to Simulation. In: Peitgen H.-O. and
Saupe D. (eds) The Science of Fractal Images. Springer-Verlag, New York, pp 21 - 70

The Authors

Huw Jones graduated from University College, Swansea with a BSc in Applied Mathematics and a Post Graduate Certificate in Education. He later obtained an MSc in Statistics from Brunel University and has worked in Higher Education for 20 years. He is currently Principal Lecturer in Computer Graphics in the School of Mathematics, Statistics and Computing at Middlesex Polytechnic, where he is leader of the MSc Computer Graphics course. He is a Fellow of the Royal Statistical Society, a member of Eurographics and is vice-Chair of the British Computer Society's Computer Graphics and Displays Group. His wife, Judy, teaches Mathematics and his son, Rhodri, and daughter, Ceri, study, all in Secondary Schools in North London.
School of Mathematics, Statistics and Computing,
Faculty of Engineering, Science and Mathematics,
Middlesex Polytechnic,
Bounds Green Road,
London N11 2NQ, UK.
Tel: 081-368 1299
Fax: 081-361 1726
Telex: 8954762
Email: huw1@uk.ac.mx.cluster

Aurelio Campa graduated from University College London with a BSc in Computer Science and Electronic Engineering. He has just completed the MSc Computer Graphics course at Middlesex Polytechnic, where he works at the Centre for Advanced Studies in Computer Aided Art and Design.
Centre for Advanced Studies in Computer Aided Art and Design,
Faculty of Art and Design,
Middlesex Polytechnic,
Cat Hill,
Barnet,
Herts EN4 8HT, UK.
Tel: 081-368 1299
Fax: 081-440 9541
Telex: 8954762
Email: aureliol@uk.ac.mx.cluster
Chapter 6
Ray Tracing/Rendering
Ray Tracing Gradient Index Lenses
Kevin G. Suffern and Phillip H. Getto

ABSTRACT

The trajectories of light rays are studied in media in which the refractive index varies with position.
These are known as gradient index media, and the equation of motion of light rays through such media
can be written in the same form as Newton's law of motion for a particle moving in a conservative force
field. We present ray traced images of two families of gradient index lenses: Luneberg Lenses and
gradient index rod lenses. The equations of motion of light rays through the lenses are either solved
exactly, or solved accurately by numerical means. Ray tracing is an important tool for visualizing the
optical properties of such lenses and other gradient index media.

Key Words ray tracing, variable refractive index, gradient index media, Luneberg Lenses,
gradient index rod lenses

1. INTRODUCTION

Previous ray tracing studies of transparent objects have been confined to rendering materials in which
the refractive indices are constant. In this situation, refraction of light rays occurs only at the boundary
between two media of different refractive indices. When travelling through a transparent medium of
constant refractive index, a light ray travels in a straight line. In contrast, when the refractive index
varies with position, the light ray will travel along a curved path in such a way as to minimize the travel
time. This is Fermat's Principle (Halliday and Resnick, 1978).

It is quite common for transparent media to have refractive indices which vary with position, as this
applies to planetary and stellar atmospheres. Many common atmospheric phenomena are caused by light
rays travelling along curved paths. One example is the apparent flattening of the sun and moon when
observed near the horizon (Walker, 1975). Mirages were thought to be the result of the refractive index of the air varying with height above a hot surface (or a cool surface in the case of water). Berger et al. (1990) recently produced images of mirages, but approximated the atmosphere by a layered box with a constant refractive index associated with each layer. Musgrave (1990) recently argued that the principal cause of mirages is total internal reflection, not a varying refractive index.

Other examples of gradient index media are gravitational fields, which bend the paths of light rays. As
an example, light rays are bent as they pass by the sun, resulting in a small apparent shift in the positions
of stars that are observed near the sun during total solar eclipses. This was predicted by Einstein's
general theory of relativity in 1916. Much stronger bending occurs near white dwarfs, neutron stars, and
black holes (Misner, Thorne, and Wheeler, 1973).

We provide in this paper an introduction to the techniques of ray tracing gradient index media. We treat
the media exactly as far as the refractive index is concerned, and use analytic solutions for the ray
trajectories where these can be found, or accurate numerical solutions. We hope this paper will stimulate
further studies in this fascinating area. Section 2 outlines the theory of light rays in gradient index media
and Section 3 presents two examples of ray tracing gradient index lenses.


2. THEORY OF GRADIENT INDEX MEDIA

A very elegant formulation of the theory of light rays in gradient index media was presented by Evans
and Rosenquist (1986) and Evans (1990). In these papers the equation of motion of a light ray was cast
into the same form as Newton's law of motion for a particle in a conservative force field. In this
formulation, the equation of motion is

    d²x/da² = ∇( ½ n²(x) ),    (1)

where x is the position of the photon moving through a medium of variable refractive index
n(x). The independent variable a is an affine parameter defined by

    |dx/da| = n(x).    (2)

Equation (1) is formally equivalent to Newton's law of motion governing the motion of a particle in a
conservative force field U(x),

    m d²x/dt² = −∇U(x),    (3)

provided we make the following associations:

    t → a
    m → 1
    x(t) → x(a)
    U(x) → −½ n²(x).
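As a quick consistency check (ours, not part of the original text), substituting these associations into (3) reproduces (1):

    m d²x/dt² = −∇U(x)   becomes   d²x/da² = −∇( −½ n²(x) ) = ∇( ½ n²(x) ),

which is exactly Eq. (1).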

The advantage of having the equation of motion (1) in the form of Newton's law of motion (3) is that all
the techniques for solving Newton's law of motion can be used to find the trajectories of light rays. This
can considerably simplify the calculations compared with solving the eikonal equation of geometrical
optics (Born and Wolf, 1980), particularly as we are only interested in the shape of the light rays, and
not the position of the photon as a function of time. The eikonal equation is non-linear, and is thus
difficult to solve analytically. In contrast, Equation (1) is linear so that in many simple geometries the
shape of the ray trajectories can be found analytically. This is very useful for ray tracing the spheres
discussed in the next section, but of no use for the rods, where the analytic solution is too complicated.

3. GRADIENT INDEX LENSES

We discuss here the theory of light rays in two families of lenses where the refractive index varies with
position, and present some ray traced images.

The first family is based on the Luneberg Lens (Luneberg, 1944), which is a sphere of radius r0 in
which the refractive index increases towards the center according to the formula

    n(r) = (2 − r²/r0²)^(1/2).    (4)

Here, the refractive index at the surface is one. As Luneberg demonstrated, this lens focuses all parallel
rays to a point on the surface, so there is no spherical aberration. This is demonstrated in Fig. 1 which
shows a series of parallel rays being refracted through the lens. Rays which are tangent to the sphere at
the impact point travel one quarter of the way around the surface before exiting, and are bent through 90
degrees.

Fig. 1 A set of parallel rays passing through a Luneberg Lens with refractive index given by
Equation (4).

We study lenses with a generalization of the refractive index formula (4) to

    n(r) = (C − r²/r0²)^(1/2),    (5)

where C ≥ 2 is a constant. With formula (5), the refractive index varies from √(C − 1) at the surface of
the sphere, to √C at the center. Figure 2 shows a family of rays passing through a lens with C = 3. As
can be seen from this figure, the rays are no longer focussed to a single point, so aberration is present.
The refractive index at the surface of this sphere is √2, and for comparison, Fig. 3 shows a set of
parallel rays passing through a sphere with constant refractive index of n = √2. As C is increased, the
relative difference between the refractive indexes at the surface and center becomes smaller, with the
result that these lenses behave more like spheres with constant refractive indexes.

In order to ray trace a generalized Luneberg Lens with refractive index (5), a ray tracing program needs
to be able to perform the following operations: intersect an external ray with the sphere; calculate the
normal, reflected, and internal refracted unit vectors at the intersection point; calculate where the internal
refracted ray intersects the sphere (the exit point); and calculate the normal, internal reflected, and external
refracted unit vectors at the exit point. With one exception, these operations can be carried out in the
same way as for ray tracing a sphere with constant refractive index. The exception is calculating where
the internal ray intersects the sphere. For spheres of constant refractive index the usual process in ray
tracing is to use the same procedure (called recursively) for intersecting spheres with internal and
external rays, but since the internal ray follows a curved path in our spheres, a special procedure is
required for the internal processing. The mathematical theory of the ray trajectories inside generalized
Luneberg Lenses is discussed in Appendix A.

Figure 4 is a ray traced image of a number of Luneberg Lenses with various values of C in front of a
vertical grid. In all these spheres, and the spheres in the next two figures, the coefficient of reflection
from their surfaces has been set to zero, to emphasize the refraction effects. The original Luneberg Lens
is at the top left where the effect of the 90 degree bending of rays near the surface is apparent. There is a
bright white light source above and slightly to the left of the spheres which accounts for the bright
appearance of the original Luneberg Lens when reflected light from the background is taken into
account. In contrast, the lenses with C > 2 appear similar to lenses with constant refractive indices, as
they refract light in a similar manner (see Figs. 2 and 3). Figures 5 and 6 are other scenes containing
Luneberg Lenses.

Fig. 2 A set of parallel rays passing through a sphere with refractive index given by Equation (5) with
C = 3.0.

Fig. 3 A set of parallel rays passing through a sphere with constant refractive index n = √2.

Fig. 4 Ray traced image of four Luneberg Lenses in front of a vertical grid. From top left to bottom
right these lenses have C = 2, 3, 4, 5.

Fig. 5 Ray traced image of three Luneberg Lenses and an ordinary sphere in front of a texture map.
From left to right in the top row the lenses have C = 2, 3, 4, and the bottom sphere has refractive index
n = √2.

Fig. 6 Ray traced image of three Luneberg Lenses with three scenes. The top middle lens has C = 2,
the bottom left has C = 3, and the bottom right C = 4.

It is unlikely that anyone will ever make a Luneberg Lens with refractive index (4) because its surface
refractive index is the same as air, although it is possible that the generalized lenses with refractive index
(5) could be manufactured. The great utility of ray tracing is that it has allowed us to visualize these
lenses.
The second family of lenses are gradient index rod lenses, which are circular cylinders whose refractive
index varies with distance from the central axis of the cylinder. In contrast to the Luneberg Lenses, these
are available from a number of suppliers, for example Melles Griot (1988). The particular rods we study
here have refractive indexes given by

    n(r) = n0 (1 − ½ A r²),    (6)

where r is distance from the central axis, n0 is the refractive index along the axis of the rod, and A is a
constant. The rods are quite small, having maximum diameters of 2mm and maximum lengths of
12.7mm, and can be used as focussing and diverging lenses for lasers of specific wavelengths. Because
of their small size, ray tracing is an ideal method for visualizing some of their optical properties. In order
to create ray traced images of the rods, the same set of tasks must be performed as for the Luneberg
Lenses. Appendix B discusses the mathematical theory of rays inside gradient index rods. For the
parameters n0 and A in equation (6), and the radius R and length L, we use values for commercially
available rods. Specifically, we use n0 = 1.658, A = 0.183, R = 0.9mm, and L = 3.7mm and 7.4mm.
With these values, the term ½Ar² in equation (6) has a maximum value of only 0.0741 at r = R, so the
actual variation in the refractive index is quite small. These rods can still exhibit a variety of interesting
optical phenomena, including rays which travel around the central axis in elliptic and circular helixes
(Evans, 1990).

Before looking at ray traced images, it is instructive to plot the trajectories of a set of parallel rays which
are normally incident on the end of the rods, as these show the behavior of rays confined to planes
which contain the central axis. This would apply to light emitted from a laser placed at the end of the
rod. Figure 7 shows rays passing through a rod with L = 3.7mm, and Fig. 8 shows a rod with L =
7.4mm. The rays oscillate about the central axis, and although the rods were designed to focus parallel
rays to a point (or points) on the axis, some aberration is present. Evans (1990) showed the focal length
to be

    focal length = (π / 2√A) (1 − (5/16) A r0²)

if terms of order (A r0²)² and higher are neglected. Here, r0 is the original distance of the ray from the
axis. For the rods considered here, there is a 4.6% difference in focal length for rays entering near the
axis and those entering near the outer boundary (Evans, 1990). The aberration can be seen in Figure 7,
and is the reason the rays do not emerge parallel from the rod in Fig. 8.

Figure 9 shows a set of parallel rays passing through the rod that are perpendicular to the central axis.
Because of the small value of ½AR² there is not much difference between the passage of these rays
and the passage of rays through a rod of constant refractive index n = n0. The main difference appears
for rays travelling down the rod.

The theory of ray trajectories inside the rods is presented in Appendix B. Figure 10 is a ray traced image
of four rods in front of a vertical grid, and Fig. 11 shows the same four rods as Fig. 10 in front of a
texture map. The reflection coefficient of the rods has been set to zero to emphasize the refraction
effects. A difference between the appearance of these rods and rods with constant refractive indices is
apparent from the middle rod, where the view is straight down the central axis. When looking down the
central axis of an ordinary rod the entire far end of the rod is always visible, and the outline appears as a
concentric circle inside the near end. This is not always the case for gradient index rods, where only a
small part of the far end may be visible. The exact appearance depends on the length of the rod and the
distance of the viewer from the end.

Fig. 7 A set of parallel rays normally incident on the end of a gradient index rod lens of radius 0.9mm
and length 3.7mm.

Fig. 8 A set of parallel rays normally incident on the end of a gradient index rod lens of radius 0.9mm
and length 7.4mm.

Fig. 9 A set of parallel rays passing through a gradient index rod lens of radius 0.9mm. These rays are
perpendicular to the central axis of the rod.

Fig. 10 Ray traced image of four gradient index rod lenses in front of a vertical grid. The values of n0,
A, and R are as given in Section 3 of the text, and for the rods at the top, from left to right, the lengths
are L = 7.4mm, L = 4.0mm, and L = 3.7mm.

Fig. 11 Ray traced image of the same four rods as in Fig. 10, in front of a texture map.

4. SUMMARY AND DIRECTIONS FOR FURTHER RESEARCH

We have discussed the propagation of light rays through gradient index media, where the refractive
index varies with position. As shown by Evans and Rosenquist (1986), the equations of motion of light
rays passing through such media can be cast into the same form as Newton's law of motion governing
the motion of a particle in a conservative force field. This considerably simplifies the task of finding the
shape of the light ray trajectories compared with solving the eikonal equation, because unlike the eikonal
equation, the equation to be solved is linear. This makes it much easier to find analytic solutions.

We have incorporated two types of gradient index lenses into our ray tracing system and produced ray
traced images of them. These are Luneberg Lenses and gradient index rod lenses, which are finite sized
objects. The only changes required to the system were to add special purpose procedures for processing
the rays in the interiors of the lenses, and to tell the ray tracer when the rays being processed were
interior rays.

The ray tracer is part of The CLOCKWORKS, an object oriented computer animation system developed
at Rensselaer Polytechnic Institute (Getto and Breen, 1990). It is written in C and runs in a variety of
UNIX environments.

The Luneberg Lenses do not exist physically, and although the rod lenses are manufactured, they are
very small. The line drawings in Figs. (1) - (3) and (7) - (9) show the shapes of rays passing through
the lenses, and ray tracing has allowed us to visualize their appearance in a very flexible manner.

We have only studied two types of finite lenses in this paper, and both merit a lot more study. For
example, animated sequences could be used to demonstrate how rods of different lengths affect the
passage of light. Obviously the same techniques used here could be used to ray trace lenses of different
shapes and with different refractive indices. Other simple lenses with spherical symmetry have n(r) =
c/r^m, for various values of m, and there is Maxwell's fish eye with n(r) = n0 [1 + (r/b)²]⁻¹. Future
work in this area could also include atmospheric simulations, taking into account the varying index of
refraction of the air. Ray tracing in general relativity, where the curved paths of light rays in
gravitational fields are taken into account, would be another interesting area to investigate.

ACKNOWLEDGEMENTS

This work was performed while Kevin Suffern was on leave at Rensselaer Polytechnic Institute, Troy,
New York. He thanks Professor Michael J. Wozny of the Rensselaer Design Research Center for
providing the hospitality and facilities which allowed this work to be carried out. Figures 4, 5, 6, 10, and
11 were produced by Phil Getto.

APPENDIX A

Ray Trajectories Inside Generalized Luneberg Lenses

Because of the spherical symmetry, light rays inside the sphere are confined to a plane, and so it is
sufficient to consider a ray confined to the (x, y) plane that travels in the x direction and hits the sphere
with impact parameter b. This is illustrated in Fig. 12, where the coordinates of the impact point are
(x, y) = [−(r0² − b²)^(1/2), b]. The affine parameter a is set equal to zero at the point where the ray
enters the sphere. For the refractive index (5), the right hand side of equation (1) is

    ∇( ½ n²(x) ) = −x / r0²,

which is independent of C. This means that both the original Luneberg Lens with refractive index (4)
and the generalization (5) behave like optical harmonic oscillators, and the solution for a ray trajectory in
the interior is of the same form as given by Evans and Rosenquist (1986) for the Luneberg Lens:

    x(a) = A sin(a/r0 + α)    (A1)

    y(a) = B sin(a/r0 + β).    (A2)

In these equations the constants A, B, α, and β are determined from the position and orientation of the
ray just after it has entered the sphere and been refracted at the surface. When a = 0, the position of the
ray gives

    x(0) = A sin α = −(r0² − b²)^(1/2)    (A3)

    y(0) = B sin β = b.    (A4)

Fig. 12 Ray striking a Luneberg Lens with impact parameter b and passing through the lens.

Just inside the sphere let the derivatives of x and y with respect to a be denoted by

    x'0 = dx/da evaluated at a = 0,   y'0 = dy/da evaluated at a = 0.

Expressions are derived below for x'0 and y'0. The above derivatives give

    A cos α = r0 x'0    (A5)

    B cos β = r0 y'0.    (A6)

From equations (A3)-(A6) it is simple to derive the following expressions:

    A = (r0² x'0² + r0² − b²)^(1/2),   sin α = −(r0² − b²)^(1/2) / A,   cos α = r0 x'0 / A,
                                                                                        (A7)
    B = (r0² y'0² + b²)^(1/2),         sin β = b / B,                   cos β = r0 y'0 / B.

Equations (A1), (A2), and (A7) give the following expressions for the trajectory inside the sphere

    x(a) = r0 x'0 sin(a/r0) − (r0² − b²)^(1/2) cos(a/r0)    (A8)

    y(a) = r0 y'0 sin(a/r0) + b cos(a/r0).    (A9)

The derivative vector dx/da is tangent to the trajectory, and hence from equation (2), the initial velocity
components are given by

    x'0 = n0 Tx,   y'0 = n0 Ty.    (A10)

In these expressions, n0 = √(C − 1) is the refractive index at the surface of the sphere, and Tx and Ty
are the components of the unit transmitted vector into the sphere. These are given by the standard
expressions (see, for example, Watt, 1989):

    Tx = Ix/n0 + (cos θ1/n0 − cos θ2) Nx

    Ty = Iy/n0 + (cos θ1/n0 − cos θ2) Ny,    (A11)

where (Ix, Iy) is the unit incident vector, (Nx, Ny) is the unit surface normal on the incident side, and

    cos θ2 = [1 − (1 − cos²θ1)/n0²]^(1/2),   cos θ1 = (1 − b²/r0²)^(1/2).

For the original Luneberg Lens, Tx = 1, Ty = 0, n0 = 1, x'0 = 1, and y'0 = 0, as there is no refraction
at the surface. These lead to the relations r0 = A cos α, β = π/2, and B = b, which were derived by
Evans and Rosenquist (1986).

The next task is to find the coordinates (xe, ye) of the point where the ray emerges from the sphere.
This is simple for the original Luneberg Lens, as substituting a/r0 = π/2 into (A1) and (A2) gives
ye = 0 and xe = r0, as discussed by Evans and Rosenquist (1986) and illustrated in Fig. 1.

For the generalized lenses, the parameter a in equations (A1) and (A2) can be eliminated to give

    y = γ cos(β − α) x + B sin(β − α) (1 − x²/A²)^(1/2),    (A12)

where γ = B/A. Eliminating the square root from (A12) gives

    y² − 2γ cos(β − α) x y + (B²/A²) x² = B² sin²(β − α),

which shows the trajectory inside the sphere to be an elliptic arc. When the ray reaches the surface of the
sphere at (xe, ye), ye² = r0² − xe², which leads to the following quadratic in xe²:

    C1 xe⁴ + C2 xe² + C3 = 0,    (A13)

where

    C1 = (1 − B²/A²)² + 4 (B²/A²) cos²(β − α)

    C2 = −2 (r0² − B² sin²(β − α)) (1 − B²/A²) − 4 (B²/A²) r0² cos²(β − α)

    C3 = [r0² − B² sin²(β − α)]².

The appropriate root of (A13) for the exit point is the one that reduces to xe = r0 when C = 2.

At the exit point the affine parameter is

    a/r0 = sin⁻¹(xe/A) − α,    (A14)

with A and α given by expressions (A7). Expressions (A8), (A9), and (A14) were used to plot the
trajectories in Figs. (1) and (2).

As indicated in Fig. (2), Luneberg Lenses with C > 2 exhibit aberration, and it is instructive to write
down the expression for x when y = 0. From equations (A1) and (A2) it follows that, when y = 0,

    x = r0 [ b x'0 + y'0 (r0² − b²)^(1/2) ] / (r0² y'0² + b²)^(1/2),

which shows some of the dependence of x on the impact parameter b. The quantities x'0 and y'0 also
depend on b through expressions (A10) and (A11).

At the exit point, the derivatives of x and y with respect to a follow from equations (A1), (A2), and
(A14):

    x'e = A (1 − xe²/A²)^(1/2) / r0

    y'e = B (1 − xe²/A²)^(1/2) cos(β − α) / r0 − γ xe sin(β − α) / r0.

Dividing these expressions by n0 gives a unit tangent vector to the internal trajectory at the exit point.
With this information, the normal, internal reflected, and external refracted unit vectors at the exit point
can be calculated in the usual manner.
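The calculations in this appendix translate directly into a short routine. The sketch below (ours, not the authors' CLOCKWORKS code) evaluates the interior trajectory of Eqs. (A8) and (A9) from the entry data and, for brevity, locates the exit parameter by bisection on x² + y² = r0² rather than through the closed-form root of (A13) and (A14); the impact parameter b and the transmitted vector (Tx, Ty) are assumed to come from the usual intersection and refraction code.

```c
#include <math.h>

typedef struct { double x, y; } Vec2;

/* trajectory constants, set up by luneberg_exit() below */
static double xp0, yp0, r0g, bg;

/* interior position at affine parameter a: Eqs. (A8) and (A9) */
static Vec2 ray_pos(double a)
{
    double s = sin(a / r0g), c = cos(a / r0g);
    Vec2 p;
    p.x = r0g * xp0 * s - sqrt(r0g * r0g - bg * bg) * c;
    p.y = r0g * yp0 * s + bg * c;
    return p;
}

/* Returns the affine parameter at which a ray with impact parameter b
 * exits a generalized Luneberg Lens of radius r0 and constant C.      */
double luneberg_exit(double r0, double b, double C, double Tx, double Ty)
{
    double n0 = sqrt(C - 1.0);            /* surface refractive index   */
    double lo, hi, mid, da = 0.01 * r0;
    Vec2 p;

    r0g = r0;  bg = b;
    xp0 = n0 * Tx;                        /* Eq. (A10): x'0 = n0 Tx     */
    yp0 = n0 * Ty;                        /* Eq. (A10): y'0 = n0 Ty     */

    lo = da;                              /* march until outside ...    */
    do { lo += da; p = ray_pos(lo); }
    while (p.x * p.x + p.y * p.y < r0 * r0);
    hi = lo;  lo -= da;

    while (hi - lo > 1e-10 * r0) {        /* ... then bisect to surface */
        mid = 0.5 * (lo + hi);
        p = ray_pos(mid);
        if (p.x * p.x + p.y * p.y < r0 * r0) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}
```

The exit point is then ray_pos of the returned parameter, and differentiating (A8) and (A9) there and dividing by n0 gives the unit tangent needed for the exit refraction, as above.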

APPENDIX B

Ray Trajectories Inside Gradient Index Rod Lenses


The theory of rays inside gradient index rods with refractive index given by (6) was studied by Evans
(1990) in cylindrical polar coordinates, which are naturally suited to the cylindrical geometry of the rods.
However, Cartesian coordinates are most convenient for incorporating the rods into our ray tracer,
where rays are specified in Cartesian coordinates. The coordinate system we use is shown in Fig. 13.

Fig. 13 The coordinate system used for solving the ray trajectories in gradient index rods.

In these coordinates, Equation (1), with refractive index (6), can be written as the set of differential
equations

    d²x/da² = −n0² A [1 − ½ A (x² + y²)] x    (B1)

    d²y/da² = −n0² A [1 − ½ A (x² + y²)] y    (B2)

    d²z/da² = 0.    (B3)

Equation (B3) shows the derivative dz/da = pz to be a constant of the motion, as expected, since the
refractive index (6) does not depend on z. The solution of (B3) is

    z = z0 + pz a,

where z0 and pz are determined by the position and orientation of the ray just after it has entered the
rod and been refracted at the surface.

It is possible to obtain an analytic solution to equations (B1) and (B2) by writing them in polar
coordinates (see Evans, 1990). The equations can be manipulated into a single equation which can be
solved in closed form. The solution is of little use for tracing the ray through the rod, however, as it is of
the form z = f(r) where f is quite complicated, involving elliptic integrals. For ray tracing purposes, it is
far more efficient and convenient to integrate equations (B1) and (B2) by a numerical technique, and we
have used a fourth-order Runge-Kutta scheme; see, for example, Press et al. (1988).

The initial conditions for the numerical integration are determined by the position of the intersection of
the external ray with the rod, and the ray's direction after refraction. If the intersection point is
(x0, y0, z0), the initial derivatives are

    dx/da = n(r0) Tx

    dy/da = n(r0) Ty

    dz/da = n(r0) Tz = pz, a constant.

Here, r0 = (x0² + y0²)^(1/2) if the ray strikes the end of the rod, and r0 = R if the ray strikes the curved
surface, where R is the radius of the rod. The components of the unit tangent vector to the ray inside the
rod, T = (Tx, Ty, Tz), are calculated in the usual manner, as if the rod had constant refractive index.

Each ray is integrated in steps of h = 0.1 in the parameter a until the ray leaves the rod, as this step size
has proved accurate for rendering purposes. At each integration step, a test is made to see if the end of
the ray is inside the cylinder, and the integration is stopped when this test returns false. This means that
the last integration step overshoots the boundary, and an accurate exit point is then calculated from the
last two integration points. The straight line segment joining these points is intersected with the cylinder
to obtain the coordinates of the exit point, and the direction of this line segment is used as the tangent to
the internal ray at the exit point. With this information, the external transmitted ray and the internal
reflected ray at the exit point can be calculated in the usual manner.
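As a concrete illustration of this procedure, the sketch below (ours, not the authors' CLOCKWORKS code) advances the state with a classical fourth-order Runge-Kutta step applied to Eqs. (B1) and (B2), moves z linearly as (B3) requires, and stops when the inside-the-rod test fails. The rod is assumed to occupy 0 ≤ z ≤ L with n0 and A held in file-scope variables, and the entry state is assumed to come from the usual intersection and refraction code.

```c
#include <math.h>

typedef struct { double x, y, z, px, py, pz; } State;  /* p = d/da     */

static double N0 = 1.658, A_ = 0.183;     /* rod parameters, Section 3 */

/* right hand sides of Eqs. (B1) and (B2) */
static void accel(double x, double y, double *ax, double *ay)
{
    double f = -N0 * N0 * A_ * (1.0 - 0.5 * A_ * (x * x + y * y));
    *ax = f * x;
    *ay = f * y;
}

/* one classical 4th-order Runge-Kutta step of size h in parameter a */
static void rk4_step(State *s, double h)
{
    double ax, ay;
    double k1x, k1y, k1px, k1py, k2x, k2y, k2px, k2py;
    double k3x, k3y, k3px, k3py, k4x, k4y, k4px, k4py;

    accel(s->x, s->y, &ax, &ay);
    k1x = s->px;  k1y = s->py;  k1px = ax;  k1py = ay;
    accel(s->x + 0.5*h*k1x, s->y + 0.5*h*k1y, &ax, &ay);
    k2x = s->px + 0.5*h*k1px;  k2y = s->py + 0.5*h*k1py;  k2px = ax;  k2py = ay;
    accel(s->x + 0.5*h*k2x, s->y + 0.5*h*k2y, &ax, &ay);
    k3x = s->px + 0.5*h*k2px;  k3y = s->py + 0.5*h*k2py;  k3px = ax;  k3py = ay;
    accel(s->x + h*k3x, s->y + h*k3y, &ax, &ay);
    k4x = s->px + h*k3px;  k4y = s->py + h*k3py;  k4px = ax;  k4py = ay;

    s->x  += h/6.0 * (k1x  + 2.0*k2x  + 2.0*k3x  + k4x);
    s->y  += h/6.0 * (k1y  + 2.0*k2y  + 2.0*k3y  + k4y);
    s->px += h/6.0 * (k1px + 2.0*k2px + 2.0*k3px + k4px);
    s->py += h/6.0 * (k1py + 2.0*k2py + 2.0*k3py + k4py);
    s->z  += h * s->pz;                   /* pz is constant, Eq. (B3)  */
}

/* inside test; the rod is assumed to span 0 <= z <= L (cf. Fig. 13)  */
static int inside(const State *s, double R, double L)
{
    return s->x*s->x + s->y*s->y <= R*R && s->z >= 0.0 && s->z <= L;
}

/* Step in h = 0.1 until the ray leaves the rod.  The returned pair of
 * states brackets the boundary; the straight chord between them is
 * intersected with the cylinder to get the exit point and tangent.   */
void trace_rod(State entry, double R, double L,
               State *last_in, State *first_out)
{
    State s = entry, prev;
    do { prev = s; rk4_step(&s, 0.1); } while (inside(&s, R, L));
    *last_in   = prev;
    *first_out = s;
}
```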

REFERENCES

Berger M, Trout T, Levit N (1990) Ray Tracing Mirages. IEEE Comp. Graph. and Appl., May 1990: 36-41
Born M, Wolf E (1980) Principles of Optics, 6th ed. Pergamon, Oxford, pp 101-132
Evans J (1990) Simple forms for equations of rays in gradient-index lenses. Am. J. Phys., 58: 773-778
Evans J, Rosenquist M (1986) F = ma optics. Am. J. Phys., 54: 876-883
Getto P H, Breen D (1990) An Object Oriented Architecture for a Computer Animation System. Visual Computer 6: 79-92
Halliday D, Resnick R (1978) Physics, 4th ed. John Wiley & Sons, New York, pp 972-973
Luneberg R P (1944) Mathematical Theory of Optics. Brown University mimeographed notes; reprinted by University of California Press, Berkeley, CA, 1964
Melles Griot (1988) Optics Guide 4. Melles Griot, Irvine, California
Misner C W, Thorne K S, Wheeler J A (1973) Gravitation. Freeman, San Francisco
Musgrave F K (1990) A Note on Ray Tracing Mirages. IEEE Comp. Graph. and Appl., November 1990: 10-12
Press W H, Flannery B P, Teukolsky S A, Vetterling W T (1988) Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, Cambridge
Walker J (1975) The Flying Circus of Physics With Answers. John Wiley & Sons, New York, p 120
Watt A (1989) Fundamentals of Three Dimensional Computer Graphics. Addison-Wesley, New York, p 166

Kevin Suffern received an M.Sc. from Cornell University in Astronomy in
1973 and a Ph.D. in Applied Mathematics from the University of Sydney
in 1978. From 1979 to 1981 he worked in the School of Mathematics and
Physics at Macquarie University in Sydney, before joining the School of
Computing Sciences at the University of Technology, Sydney, where he is
currently a Senior Lecturer. In 1986 he was a Visiting Research Scientist
in The Center for Interactive Computer Graphics, Rensselaer Polytechnic
Institute, and in 1990 he was a Visiting Associate Professor in the
Rensselaer Design Research Center, Rensselaer Polytechnic Institute. His
main interests are computer graphics, computer aided geometric design,
and computer art. He is a member of ACS, ACM, and ACM SIGGRAPH.

Address: School of Computing Sciences, University of Technology, Sydney, PO Box 123,
Broadway, NSW, AUSTRALIA. E-mail: kevin@ultima.socs.uts.edu.au

Phillip H Getto currently works at rasna Corporation. Before joining rasna
he was a research engineer with Rensselaer Polytechnic Institute where he
was co-leader of the Visual Technologies Program in the Rensselaer
Design Research Center. His research interests focus on realistic image
synthesis, sampling theory, computational geometry, object-oriented
computer graphics, computer animation, and user interface design. He has
produced several computer generated animations, which have been shown
at the SIGGRAPH conference. He is also responsible for the SIGGRAPH
'89 poster image and print logo.
Getto is currently a PhD candidate in the Department of Electrical,
Computer and Systems Engineering at Rensselaer. He has received a BS
and ME in Computer and Systems Engineering from Rensselaer. Getto is
a member of the IEEE, IEEE Computer Society, ACM SIGGRAPH, and
Computer Professionals for Social Responsibility.

Address: rasna Corporation, 2590 N. First Street, Suite 200, San Jose, CA 95131 USA. E-mail:
phil@rasna.com
Shapes and Textures for Rendering Coral
Nelson L. Max and Geoff Wyvill

ABSTRACT

A growth algorithm has been developed to build coral shapes out of a tree of
spheres. A volume density defined by the spheres is contoured to give a "soft
object". The resulting contour surfaces are rendered by ray tracing, using a
generalized volume texture to produce shading and "bump mapped" normal
perturbations.

KEYWORDS: growth model, soft object, volume texture, bump mapping, coral

1 INTRODUCTION

We constructed the coral as a soft object [Wyvill86] inside the Katachi system
[Kunii85] and ray traced it [Wyvill90]. We built the coral by a parametrized
growth algorithm, similar to those which have been used to build trees ([Bloom-
enthal85], [Aono84], [DeReffye88], [Viennot89], and [Prusinkiewicz90]). Kawagu-
chi [Kawaguchi82] has also animated undersea creatures, defined by metaballs
[Nishimura85], the Japanese equivalent of soft objects.

The surface of a soft object is the contour of a volume density, defined as the sum of
spherically or elliptically symmetric densities centred at various points. In this
work, we used only spherical densities, and each was specified by a centre and an
outer radius at which the density function becomes zero. Section 2 explains the
growth algorithm which governed the placement of these spheres.

We wanted to change the surface colour and perturb the surface normal vector
[Blinn78] at each point on the coral, in order to render a realistic texture. The
standard methods of texture and "bump" mapping look up values in a 2-D colour
or bump height table. However such mappings are difficult to apply to a general
contour surface, because there is no simple method to assign the coordinate
functions into the 2-D texture tables. Therefore volume textures ([Peachey85],
[Perlin85], [Wyvill87]) are the preferred method for contour surfaces. These are
defined directly as functions of the 3D co-ordinates of a surface point, often by
analytic computation rather than table look-up. To generate our coral texture, we
have generalized this idea, and compute the colour and normal perturbations as
functions of both the 3D surface point and the normal to the contour surface.
Section 3 describes this generalized volume texture. Section 4 gives our results.


2 CORAL STRUCTURE
The coral structures are "soft objects" contoured from the sum of spherically
symmetric density functions, so the structure is determined by the centres of the
spheres, and the outer radii at which the density functions become zero. The
structures were made by randomly adding one layer or "generation" at a time to a
growing collection of spheres, starting from a single core sphere, and building up a
tree graph.

All the spheres in a given layer n have the same radius r(n). The distance
bond_r(n) between the centre of a new sphere and its parent in the previous layer
is a fixed multiple of the sum of their radii. The direction from a parent to a new
child sphere is chosen randomly on a truncated unit sphere. The truncation
removes those directions with z component below zlow, in order to cause an
upward tendency in the growth.

The maximum number of spheres in a layer n is a predefined constant
max_sphere(n). Parents and directions are chosen randomly, and the resulting
new sphere is tested for collision and shape criteria, as described below, and
rejected if it fails. The trials continue until max_sphere(n) new spheres have been
accepted, or the number of trials exceeds maxtrials = tryfac * max_sphere(n), for
a chosen factor tryfac.

The square of the distance between each trial sphere and every accepted sphere is
tested against one of two collision criteria. The choice between them is determined
by a generation count g. If the trial sphere and the test sphere have a common
ancestor back at most g generations (e.g., a common grandparent), then a smaller,
more lenient collision factor "adjacent" is multiplied by the sum of their radii to
determine the distance of closest approach. This factor may be chosen so that the
contour surface connects the two spheres into one blob. If there is no close enough
common ancestor, then a larger factor "nonadjacent" is used, which makes the
two spheres of non-zero density disjoint, and causes the blobs to separate.
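A sketch in C of the direction choice and collision test just described (ours; the original implementation lives inside the Katachi system): random_direction() draws a direction uniformly on the truncated unit sphere, and collides() applies the lenient "adjacent" or the stricter "nonadjacent" factor according to how far back the first common ancestor lies. drand48() is the POSIX uniform random generator, and common_ancestor_depth() is a hypothetical helper that walks the tree of parent links.

```c
#include <stdlib.h>
#include <math.h>

typedef struct { double x, y, z, r; int parent, level; } Sphere;

/* hypothetical helper: generations back to the first common ancestor */
extern int common_ancestor_depth(const Sphere *s, int i, int j);

/* random unit vector with z component >= zlow; choosing z uniformly
   gives a uniform distribution over the truncated sphere of directions */
static void random_direction(double zlow, double d[3])
{
    double z   = zlow + (1.0 - zlow) * drand48();
    double phi = 2.0 * M_PI * drand48();
    double s   = sqrt(1.0 - z * z);
    d[0] = s * cos(phi);  d[1] = s * sin(phi);  d[2] = z;
}

/* squared-distance collision test of trial sphere s[trial] against all
   accepted spheres, with the two-level closest-approach criterion      */
static int collides(const Sphere *s, int trial, int naccepted,
                    int g, double adjacent, double nonadjacent)
{
    int i;
    for (i = 0; i < naccepted; i++) {
        double f = (common_ancestor_depth(s, trial, i) <= g)
                     ? adjacent : nonadjacent;
        double dmin = f * (s[trial].r + s[i].r);
        double dx = s[trial].x - s[i].x;
        double dy = s[trial].y - s[i].y;
        double dz = s[trial].z - s[i].z;
        if (dx*dx + dy*dy + dz*dz < dmin*dmin)
            return 1;                      /* too close: reject trial */
    }
    return 0;
}
```

A trial is accepted when collides() returns 0 and the shape constraints of the next paragraph are also met; trials repeat until max_sphere(n) spheres are placed or maxtrials is exceeded.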

At each level there are two other shape constraints for the sphere centres: a
minimum distance from the central root node, and a minimum z value. These can
be used to force the coral to grow outward and/or upward.

In the current system, the minimum distance and minimum z limits are
incremented at each level n by fixed multiples of bond_r(n), so only the multipliers
need be specified. In turn, bond_r(n) depends on the sphere radius r(n), which is
taken to be of the form a * b^n, so only a few constants need be specified to generate
a coral shape. These are listed in Table 1.

Table 1
1 The total number of levels in the tree
2 The initial radius, a, for the root sphere
3 The radius decrease per level, b, with b < 1, so that r(n) = b * r(n - 1)
4 The bondfactor, so that bond_r(n) = bondfactor * (r(n) + r(n - 1))
5 The number g of generations back for which a common ancestor allows a lenient
approach distance
6 The factor "adjacent" which multiplies r(n) + r(i) to give the lenient distance for closest
approach of two spheres at levels i and n with a common ancestor at level max(0, n - g, i - g)
7 The corresponding factor "nonadjacent" for more distantly related spheres
8 The multiple of bond_r(n) used to increment the minimum distance from the root node
9 The multiple of bond_r(n) used to increment the minimum z value
10 The truncation level zlow for the unit sphere of directions between parent and child
11 The base d in the formula max_sphere(n) = floor(c * d^n). (c is always 2)
12 The factor tryfac in the formula maxtrials = tryfac * max_sphere(n) for the number of
random spheres to try at level n before giving up
13 An initial seed for the random number generator, used to vary the shape even when the
above 12 constants are the same

There are two other features which can be turned on or off. The first interpolates
an extra sphere between a parent A and child B, in order to keep the density high
when the distance between parent and child spheres is large. We placed this
sphere to make the branches curve smoothly upward. The second feature
attempts to continue the curve of this branch from child B to grandchild C, in the
case that B has no other children, by defining the coordinates of C from those of A
and B. The position of C is originally generated in the random search, and this
random position is only revised if the smooth continuation does not violate the
collision constraints.

By changing the 13 constants in Table 1 and deciding whether to enable the above
two features, a large variety of coral shapes can be generated. Initially these
shapes were examined in hidden line sketches of the spheres, and only the
pleasing ones were used to generate soft contour surfaces. The sketches were
generated using a fraction of the outer radii for the spheres, chosen to approximate
the eventual soft contour surface.

3 ENHANCED VOLUME TEXTURES

We tried to generate shapes and textures for the coral to approximate those in the
photographs in [Wood83] and [Kaplan82]. According to Wood [Wood83 (page 18)]
the coral polyps lay down a skeleton with a cylindrical wall (theca) surrounding
the chalice which contains the digestive system of the polyp. Radial walls called
septa divide the chalice, and extend outward beyond the theca towards
neighbouring polyps.

We wanted to show the chalices as round depressions, uniformly distributed on
the surface of the coral. A straightforward method to give the locations for these
depressions is to intersect the contour surface with a collection of spheres. We
chose a cubic close packing of spheres for its uniform density and because it was
easy to determine the closest sphere centre to a given 3-D point. This packing is
based on a regular cubic lattice, with a sphere at the midpoint of each cube edge,
and also one at the centre of each cube. Each sphere is surrounded by 12 other
equidistant spheres.

Figure 1 shows the result of intersecting an ellipsoid with a close packed collection
of spheres. Note that the approximate circles in which the ellipsoid intersects the
spheres have differing radii.

Fig. 1 Intersection of ellipsoid with spheres in cubic close-packing

We wanted all the chalices to have about the same radius. In order to achieve this,
we used an "enhanced volume texture" which is a function of both the 3-D coordi-
nate on the contour surface, and the surface normal at that point.

Figure 2 shows a contour surface S passing near two sphere centres C1 and C2.
Points Pi, i = 1, 2, are the closest points on S to Ci, so that the normals Ni lie along
the lines CiPi. Then right circular cylinders of radii r and axes CiPi intersect the
surface in approximate circles of radius r, so using these cylinders will make the
circles have equal radii.

Fig. 2 Cylinders from close packed centres Ci, normal to surface S

For any point P in the surface S, we could find the nearest centre Ci of the cubic
close packing, find the distance s to the corresponding axis CiPi (C2P2 in Figure 2),
and calculate the texture as a function of s. However, finding the points Pi would
involve a lot of iterative calculation, and also storage unless the calculation is
repeated for each texture evaluation. Instead we draw a line through Ci parallel
to the unit normal N at P, and take q as the distance from P to this line. If the
principal radii of curvature of the surface S are large compared to the distance
between the centres Ci, q is a good approximation to s. Thus, we find the centre Ci
of the close packing which is nearest to P, and compute

    q = | (P − Ci) − ((P − Ci) · N) N |.

The texture is then a function of q. Figure 3 shows the approximate circles
coloured white whenever q < r. Note that the circles have approximately the same
radius, but that occasionally two circles intersect at the plane of transition between
two different closest centres Ci and Cj. (This intersection can be interpreted in our
application as polyp budding, as described in [Wood83].)
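In code the computation is compact. The sketch below (ours, not the authors' implementation) takes the close-packed centres, in units where the cubic cell has edge 1, to be the points c with 2c integer and odd coordinate sum, which reproduces the edge-midpoint-plus-cube-centre arrangement described above; the scale between these units and object space is left out.

```c
#include <math.h>

typedef struct { double x, y, z; } Vec3;

/* nearest close-packed centre to P: round 2P to the nearest integer
   vector, then, if its coordinate sum is even, push the coordinate
   with the cheapest penalty to its second-nearest integer            */
static Vec3 nearest_centre(Vec3 P)
{
    double q[3] = { 2.0 * P.x, 2.0 * P.y, 2.0 * P.z };
    double n[3], best = 1e30;
    int i, sum = 0, flip = 0;
    Vec3 c;

    for (i = 0; i < 3; i++) { n[i] = floor(q[i] + 0.5); sum += (int)n[i]; }
    if ((sum & 1) == 0) {
        for (i = 0; i < 3; i++) {
            double cost = 1.0 - 2.0 * fabs(q[i] - n[i]);
            if (cost < best) { best = cost; flip = i; }
        }
        n[flip] += (q[flip] >= n[flip]) ? 1.0 : -1.0;
    }
    c.x = 0.5 * n[0];  c.y = 0.5 * n[1];  c.z = 0.5 * n[2];
    return c;
}

/* q = distance from P to the line through the nearest centre parallel
   to the unit surface normal N                                       */
double texture_q(Vec3 P, Vec3 N)
{
    Vec3 C = nearest_centre(P);
    double dx = P.x - C.x, dy = P.y - C.y, dz = P.z - C.z;
    double t  = dx * N.x + dy * N.y + dz * N.z;  /* component along N */
    double d2 = dx*dx + dy*dy + dz*dz - t*t;
    return d2 > 0.0 ? sqrt(d2) : 0.0;
}
```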

Fig. 3 Enhanced volume texture version of Figure 1

Figure 4 shows an ellipsoid with a bump mapped coral texture. The surface
normal to the ellipsoid was perturbed radially as a function of q, to indicate the
raised wall (theca) of the chalice and the depression it surrounds. In addition, the
normal was perturbed tangentially to represent the septa which divide the chalice
into sections, and also extend between them. This tangential perturbation was
computed as a function of the direction of the vector TP in Figure 2. The areas
where the septa cross the theca were made whiter to indicate the tentacles, by
changing the surface shade.

4 RESULTS

The images were ray traced by the algorithm described by Wyvill and Trotman
[Wyvill90]. This algorithm starts by tracing rays on a lattice of test points spaced
several pixels apart, and only traces intermediate rays inside a region if the colors
at the corners of the region differ. This scheme follows profiles of objects and
contours of shading exactly, but does not waste rays on regions of constant shade.
When the stopping condition for the subdivision is less than a pixel, it can provide
antialiasing, which we used in Figures 6 and 7.
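A minimal sketch of that subdivision scheme (ours, not the published algorithm's actual code): trace() and fill_rect() are hypothetical callbacks into the ray tracer and frame buffer, and a real implementation would cache the corner samples rather than retrace them on recursion.

```c
typedef struct { float r, g, b; } Colour;

extern Colour trace(float x, float y);                 /* hypothetical */
extern void   fill_rect(float x0, float y0,
                        float x1, float y1, Colour c); /* hypothetical */

static int similar(Colour a, Colour b)
{
    float dr = a.r - b.r, dg = a.g - b.g, db = a.b - b.b;
    return dr*dr + dg*dg + db*db < 0.0001f;
}

/* Fill the region if its four corner rays agree; otherwise split it in
 * four.  With minsize below one pixel the leaves can be averaged to
 * provide antialiasing.                                               */
void refine(float x0, float y0, float x1, float y1, float minsize)
{
    Colour c00 = trace(x0, y0), c10 = trace(x1, y0);
    Colour c01 = trace(x0, y1), c11 = trace(x1, y1);

    if ((similar(c00, c10) && similar(c00, c01) && similar(c00, c11))
        || x1 - x0 <= minsize) {
        fill_rect(x0, y0, x1, y1, c00);    /* constant-shade region */
    } else {
        float xm = 0.5f * (x0 + x1), ym = 0.5f * (y0 + y1);
        refine(x0, y0, xm, ym, minsize);
        refine(xm, y0, x1, ym, minsize);
        refine(x0, ym, xm, y1, minsize);
        refine(xm, ym, x1, y1, minsize);
    }
}
```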

Figure 5 shows a single coral, rendered with bump mapped texture as described
above. The first twelve shape parameters listed in Table 1 were set respectively as
8, 2., .9, 1.6, 2, .2, 1.12, .67, .55, -1., 1.34, and 16., and the extra sphere and smooth
continuation features were used. Figure 6 shows two corals with the same
texture, positioned in a sandy bottom, and rendered with shadows. The orange coral

Fig. 4 Bump-mapped ellipsoid, using enhanced volume texture

Fig. 5 A single coral, with bump-mapped texture of polyps



on the left is the same as the one in Figure 5. The blue-green coral on the right
had shape parameters 11, 2., .85, .7, 4, .2, .7, .47, .45, 0., 1.34, and 30., and did not use
the two features.

Fig. 6 Two corals with shadows, anti-aliased

Figure 7 shows two more corals without shadows. The red one on the left had
parameters 11, 1.6, .85, 1.67, 2, .35, 1.16, .6, .67, .22, 1.26, and 26., and used the two
features. The volume texture scale was set large enough to be visible, which made
the polyp distribution at the tips of the thin branches unreasonably sparse. The
green coral on the right had parameters of 8, 2.6, .92, .52, 5, .1, .35, .37, .22, -.2, 1.34
and 30., and did not use the two features. For positioning the raised wall of the
theca in the texture, we did not use a function of q, as described in Section 3. We
instead used a function of the ratio of the distances of a surface point to the two
nearest cubic close packing sphere centres. This caused the walls to butt against
each other.

Fig. 7 Two more corals

ACKNOWLEDGEMENTS

This work was performed under the auspices of the U.S. Department of Energy
under contract number W-7405-Eng-48 to the Lawrence Livermore National
Laboratory. We are also indebted to the University of Otago William Evans Fund
for financial support and to Television New Zealand Limited for use of equipment.
Carol Smith adapted the soft object ray tracing algorithm described in Wyvill and
Trotman [Wyvill90] to accept spheres of varying radii, and proofread the
manuscript. Paul Sharp helped with the production of the photographs. Kaye Saunders
and Debra Streets did the typing.

REFERENCES

Aono, M. and Kunii, T.L. (1984) Botanical Tree Image Generation. IEEE
Computer Graphics and Applications, 4(5): 10-34
Blinn, J. (1978) Simulation of Wrinkled Surfaces. Computer Graphics, 12(3):
286-292 (SIGGRAPH '78 Proceedings)
Bloomenthal, Jules (1985) Modeling the Mighty Maple. Computer Graphics,
19(3): 305-311 (SIGGRAPH '85 Proceedings)
De Reffye, P., Edelin, C., Francon, J., Jaeger, M. and Peuch, C. (1988) Plant
Models Faithful to Botanical Structures and Development. Computer
Graphics, 22(4): 151-158 (SIGGRAPH '88 Proceedings)
Kaplan, E. H. (1982) A Field Guide to Coral Reefs. Houghton Mifflin Company,
Boston, MA
Kawaguchi, Y. (1982) A Morphological Study of the Form of Nature. Computer
Graphics, 16(3): 223-232 (SIGGRAPH '82 Proceedings and numerous subse-
quent computer animated films.)
Kunii, T. L. and Wyvill, G. (1985) A Simple but Systematic CSG System.
Proceedings of Graphics Interface '85: 329-336. (Conference held in Montreal,
Canada, May 27-31, 1985.) Proceedings also published as Computer-Gener-
ated Images, Nadia Magnenat-Thalmann and Daniel Thalmann, eds. Springer
Verlag, Tokyo
Nishimura, H., Hirai, M., Kawai, T., Kawata, T., Shirakawa, I., and Omura, K.
(1985) Object Modeling by Distribution Function and a Method of Image
Generation. Journal Electronics Communication Conference, J68-D(4) (in
Japanese). English translation in Kesson, M.A. (1989) An Investigation into
the Modelling of Iso-Surfaces of Scalar Fields. MA thesis, Middlesex
Polytechnic
Peachey, D. (1985) Solid Texturing of Complex Surfaces. Computer Graphics,
19(3): 279-286 (SIGGRAPH '85 Proceedings)
Perlin, K. (1985) An Image Synthesizer. Computer Graphics, 19(3): 287-296
(SIGGRAPH '85 Proceedings)
Prusinkiewicz, P. & Lindenmayer, A. (1990) The Algorithmic Beauty of Plants.
Springer Verlag, New York
Viennot, X. G., Eyrolles, G., Janey, N. and Arques, D. (1989) Combinatorial
Analysis of Ramified Patterns and Combinatorial Imagery of Trees. Computer
Graphics, 23(3): 31-40 (SIGGRAPH '89 Proceedings)
Wood, E. M. (1983) Corals of the World. T.F.H. Publications inc. Ltd, Hong Kong
Wyvill, G. and Trotman, A. (1990) Ray-tracing Soft Objects. CG International
'90: 469-475 (T.S. Chua and T.L. Kunii, eds.) Springer Verlag, Tokyo
(Conference held in Singapore, June 25-29, 1990)
Wyvill, G., Wyvill, B. and McPheeters, C. (1986) Data Structure for Soft Objects.
The Visual Computer, 2: 227-234.
Wyvill, G., Wyvill, B. & McPheeters, C. (1987) Solid Texturing of Soft Objects.
IEEE Computer Graphics and Applications, 7(12): 20-26.

Max, Nelson L. is a Professor of Computer Science at the
University of California, Davis/Livermore, and a Computer
Scientist at the Lawrence Livermore National Laboratory.
He received a Ph.D. in mathematics at Harvard University
in 1967. His research interests are in realism in images of
nature, molecular graphics, computer animation, and sci-
entific visualization. He has served as computer graphics di-
rector for two dome-screen stereo movies, produced for the
Fujitsu pavilions at Expo '85 and Expo '90 in Japan, and has
produced many other award winning computer animated
films.
Address: Nelson Max can be reached by mail at Lawrence
Livermore National Laboratory, P.O. Box 808, Livermore,
CA 94550, U.S.A., and by e-mail at max2@llnl.gov.

Wyvill, Geoff graduated in physics from Jesus College,
Oxford, and started working with computers as a research
technologist with the British Petroleum Company. He
technologist with the British Petroleum Company. He
gained MSc and Ph.D. degrees in computer science from the
University of Bradford where he lectured in computer sci-
ence from 1969 until 1978. He is currently senior lecturer in
computer science at the University of Otago. He is on the edi-
torial board of The Visual Computer and is a member of
SIGGRAPH, ACM, CGS and NZCS.
Address: Geoff Wyvill can be reached by mail at Department
of Computer Science, University of Otago Box 56, Dunedin,
New Zealand, and by e-mail at geoff@otago.ac.nz.
A New Color Conversion Method for Realistic
Light Simulation
Toshiya Naka, Kenji Nishimura, Fumihiko Taguchi, and Yoshimori Nakase

ABSTRACT

In this paper a new color conversion method to replace the RGB approximation based on the
perceived change in color when a colored light source illuminates a surface is proposed. In our
study 1200 uniformly distributed samples were taken from Commission Internationale de
l'Eclairage 1976 L*a*b* uniform color space and used to construct a uniform subset.
Photographic patches of the colors in this subset were then subjected to color light sources and the
color changes were measured. From these measurements rules which govern the corresponding
change within the color space were determined. Changes within the color space were then defined
by simple linear equations for any light source illuminating any surface. Lastly, experiments were
conducted to confirm that color conversion within the CIE color space generates more realistic
images than RGB approximations for a variety of light sources.

Key Words Photorealistic, Radiosity, Color Conversion, Uniform Color Space,
Illumination

1 INTRODUCTION

Currently, in computer graphics, methods for faithfully expressing physical phenomena are
being widely researched. Optical simulation is one of the most important elements necessary for
realistically generating computer graphics. The present paper will discuss a new color conversion
method which provides faithful reproduction of changes in hue and saturation of an object under
various illuminating conditions.
In computer graphics based optical simulation, two aspects of light must be carefully
considered: energy and color. Viewing light as energy is the most common way of considering
light. In the 1980's two scientists (Whitted 1980; Hall 1986) proposed the ray-tracing method
which considers only the specular reflection component of light energy on an object's surface. In


order to reproduce more realistic lighting conditions, other researchers (Kajiya 1986; Cohen
1986; Cohen 1988) developed the radiosity method, which takes into consideration diffuse
inter-object reflection. These methods formulate a light energy relationship between objects.
For photorealistic image generation the color of light must also be taken into consideration. All
types of light have a relative spectral energy distribution which causes the human eye to perceive
color. Accurate simulation of color changes of an illuminated object requires the calculation of the
energy transfer for all wavelengths in the visible range (λ = 380 to 780 nm). Such simulation,
however, requires a great deal of computation, and therefore is not practical in computer graphics.
Thus, in conventional computer graphics, the color of light, such as is needed for reproducing
color changes of objects under color light sources, has been considered in only a few cases
(Hall 1987; Hall 1988; Meyer 1988).
Our present study proposes a new color conversion model for expressing the change in color
of an object under a light source with a smoothly varying spectral distribution. The features of this
method can be summarized as follows:
(1) The CIE (Commission Internationale de l'Eclairage) color space is used instead of
conventional spectral band approximation such as RGB, because it is highly uniform with respect
to human visual perception.
(2) Rules governing color changes, based on empirical findings, are defined by simple linear
equations.
(3) This method can be incorporated into conventional illumination calculation algorithms by
converting RGB values to the CIELAB color space.
Repeated experiments performed using this technique with a number of color light sources
confirm that color conversion within the CIELAB color space produced less color difference than
the conventional RGB approximation.
Section 2 discusses problems with color calculation methods and then proposes a new color
conversion method to be used for illumination simulation under a color light source. Section 3
describes simulation experiments with a number of color light sources. Section 4 provides an
evaluation and Section 5 a summary.

2 COLOR EXPRESSION METHOD


2.1 Problems

In the RGB system the color of an object is quantified by the three values, R, G, and B, which
can be calculated by Eq. (2.1). In this equation, light with spectral distribution L(λ) is incident
on an object surface with spectral reflectance distribution ρ(λ). The RGB values are also
weighted by the spectral response curves R(λ), G(λ), and B(λ) of the visual cells of the human
eye, which have three different spectral distributions.

    R = Σ ρ(λ) L(λ) R(λ) Δλ
    G = Σ ρ(λ) L(λ) G(λ) Δλ    (2.1)
    B = Σ ρ(λ) L(λ) B(λ) Δλ

where λ is wavelength in the visible range from 380 nm to 780 nm, and Δλ is the change in
wavelength.
According to this mechanism, accurate simulation of the color of an object requires calculation
for all possible λ's in the visible range. This calculation is computationally expensive and
therefore not very practical. Hence, in conventional illumination algorithms an energy relation
equation has been established with three spectral bands to express light colors: RGB.
Achieving white balance is fundamental to satisfactory reproduction of colors. Experiments
were conducted to determine the number of wavelength samples necessary for acceptable color
expression. In these experiments two lights, magenta and cyan, are projected onto a standard white
surface as shown in Fig. 2.1(a). The relative spectral energy distributions of the light sources are
shown in Fig. 2.1(b). The results of these experiments are shown in Fig. 2.2. The vertical axis of
Fig. 2.2 represents the ratios R/G and B/G. The closer to 1.0 both of these are, the better the white
balance. Using Eq. (2.1) for calculation and varying the number of wavelength samples taken, it
was determined that at least 20 wavelength samples must be taken in the visible range to achieve
white balance.
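For concreteness, evaluating Eq. (2.1) with a fixed number of wavelength samples looks like the sketch below (ours, not the authors' code); the reflectance rho[], source distribution Lsrc[], and the three response curves are assumed to be tabulated at the same NSAMP points across 380-780 nm, and the experiment above suggests NSAMP should be at least 20.

```c
#define NSAMP 20                    /* wavelength samples, per Fig. 2.2 */

/* Eq. (2.1): RGB from sampled spectral distributions */
void spectral_rgb(const double rho[NSAMP],  const double Lsrc[NSAMP],
                  const double Rbar[NSAMP], const double Gbar[NSAMP],
                  const double Bbar[NSAMP], double out[3])
{
    double dl = (780.0 - 380.0) / (NSAMP - 1);       /* delta lambda */
    int i;

    out[0] = out[1] = out[2] = 0.0;
    for (i = 0; i < NSAMP; i++) {
        double e = rho[i] * Lsrc[i] * dl;  /* reflected spectral energy */
        out[0] += e * Rbar[i];
        out[1] += e * Gbar[i];
        out[2] += e * Bbar[i];
    }
}
```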
Summarized below are problems with conventional illumination simulation.
(1) In conventional methods, such as ray tracing and radiosity, the energy balance at the
surface of an object can be obtained to a reasonable degree of accuracy. However, in converting the
colors of surface texture based on this illumination data, shades are produced using the RGB
approximation, thereby introducing a degree of inaccuracy.
(2) Furthermore, in conventional methods, all light sources are colorless so the change in color
on the surface of an object due to a colored light source cannot be simulated.

Fig. 2.1 Experimental setup (a) to obtain the white balance, and (b) the
relative spectral energy distribution of the lights.

Fig. 2.2 The number of samples with wavelength λ and the ratios R/G and B/G.

Fig. 2.3 Relationship between illuminance and luminance on an object surface.

In order to find a solution to these problems a new color conversion method which could
replace the RGB approximation method was investigated.

2.2 Uniform Color Space

In response to the fact that the RGB color system corresponds poorly to the characteristics of
human vision, the Commission Internationale de l'Eclairage (CIE) in 1976 proposed the CIELAB
uniform color space (CIE 1976). In any region within this space, human perception of the
difference between any two colors corresponds to the actual distance between the two colors in
the space. Therefore, with the introduction of this space it became possible to formulate linear
approximations of color conversion rules. Hence, in the following discussion, the CIELAB
uniform color space is used instead of the RGB color system.
In order to convert from the conventional RGB system to the CIELAB color space, RGB
values in Eq. (2.1) must first be converted to the CIEXYZ color system using Eq. (2.2). A point
worth noting about the XYZ system is that the primaries X and Z, representing hue and saturation,
are chosen on the non-luminance plane. Because of this, lightness information is represented in Y
only.

    (X, Y, Z)ᵗ = A (R, G, B)ᵗ    (2.2)

where A is the 3x3 color space conversion matrix (refer to Appendix, Color Transformation).

Final conversion into the CIELAB uniform color space is accomplished by using the resulting
X, Y, and Z values from Eq. (2.2) in Eq. (2.3), where X0, Y0, and Z0 are the values for standard
white light and Y/Y0 expresses the reflectance of the object.

    L* = 116 (Y/Y0)^(1/3) − 16    (Y/Y0 > 0.008856)
    L* = 903.25 (Y/Y0)            (Y/Y0 ≤ 0.008856)
    a* = 500 [ (X/X0)^(1/3) − (Y/Y0)^(1/3) ]
    b* = 200 [ (Y/Y0)^(1/3) − (Z/Z0)^(1/3) ]    (2.3)
    X0 = 0.9804, Y0 = 1.0000, Z0 = 1.1812

Color in the uniform color space is expressed in three components, (L*, a*, b*), where L*
represents lightness, and (a*, b*) contain saturation and hue information.
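The two-step conversion of Eqs. (2.2) and (2.3) is compact in code. In the sketch below (ours), the paper's matrix A is not reproduced here, so the classic NTSC RGB-to-XYZ matrix stands in for it; that choice is at least consistent with the quoted white point, since it maps (1, 1, 1) to approximately (0.9804, 1.0000, 1.1812). Following Eq. (2.3) as quoted, the linear dark branch is applied to L* only.

```c
#include <math.h>

static double cbrt_(double t) { return pow(t, 1.0 / 3.0); }

/* Eqs. (2.2)-(2.3): RGB -> XYZ -> CIELAB */
void rgb_to_lab(double R, double G, double B,
                double *Lstar, double *astar, double *bstar)
{
    const double X0 = 0.9804, Y0 = 1.0000, Z0 = 1.1812;

    /* Eq. (2.2), with the NTSC primaries standing in for matrix A */
    double X = 0.6067 * R + 0.1736 * G + 0.2001 * B;
    double Y = 0.2988 * R + 0.5868 * G + 0.1143 * B;
    double Z =              0.0661 * G + 1.1149 * B;
    double yr = Y / Y0;

    /* Eq. (2.3) */
    *Lstar = (yr > 0.008856) ? 116.0 * cbrt_(yr) - 16.0 : 903.25 * yr;
    *astar = 500.0 * (cbrt_(X / X0) - cbrt_(yr));
    *bstar = 200.0 * (cbrt_(yr) - cbrt_(Z / Z0));
}
```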

2.3 New Color Conversion Method

Conventional illuminance algorithms calculate the surface illuminance, I, of an object as shown
in Fig. 2.3. This value represents the energy that illuminates a surface under given lighting
conditions. For realistic images, it is necessary to convert texture color on a surface based on this
illuminance value. Formerly, as an approximation, each spectral band of RGB was weighted with
illuminance I.
As an alternative, the following discussion proposes using the uniform color space to convert
illuminance data, I, into the lightness, L*, of the texture. The relation equations for this are given
in (2.4)-(2.6). After illuminance calculations have been done for all patches, the illuminance of a
given patch i is represented as Ii, the maximum value of all Ii as Imax, and the reflectance of
patch i as ρi. Assuming Y0 to be the value of luminance Y for standard white, the luminance
value of light entering patch i, represented as Yin, is given by Eq. (2.4):

    Yin = (Ii / Imax) Y0.    (2.4)

Thus, the luminance Yi of light reflected from patch i is equal to the Y value of the incident light
multiplied by the reflectance, as given by Eq. (2.5), where the constant k defaults to a value of 1.0
but may change depending on dynamic range and other characteristics of the display device.

    Yi = k ρi Yin = k ρi (Ii / Imax) Y0.    (2.5)
Further, by converting the relationship between luminance Y and lightness L* using the first
expression in Eq. (2.3), the relationship given in Eq. (2.6) can be obtained (for simplicity, the
coefficient k Ii/Imax that results is written as α). L*i in Eq. (2.6) is the lightness of the texture
illuminated by illuminance Ii.

    L*i = α^(1/3) (L*i0 + 16) − 16    (2.6)

where L*i0 is the value of the texture lightness under the standard C light source.

Thus, conversion to texture lightness values is possible by calculating the illuminance of
individual objects under given illuminating conditions using conventional illuminance calculation
algorithms, and then introducing this illuminance Ii into the lightness calculation in accordance with
Eqs. (2.4) to (2.6). As demonstrated above, when light energy is converted to the lightness of the
surface texture of an object, color changes in the texture when subjected to color light can be
expressed via color conversion within the uniform color space. Next, experiments were conducted
to determine the color changes of an object under color light.
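Folded together, Eqs. (2.4)-(2.6) reduce to a one-line conversion. The sketch below (ours) uses α = k·Ii/Imax, which is our reading of the coefficient in Eq. (2.6); Ii comes from a conventional illuminance pass, and L*i0 is the texture lightness under the standard C source.

```c
#include <math.h>

/* Eqs. (2.4)-(2.6): patch illuminance Ii -> texture lightness L*i */
double texture_lightness(double Ii, double Imax, double k, double Lstar_i0)
{
    double alpha = k * Ii / Imax;  /* our reading: alpha = k * Ii/Imax */
    return pow(alpha, 1.0 / 3.0) * (Lstar_i0 + 16.0) - 16.0;  /* (2.6) */
}
```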

3 EXPERIMENT
3.1 Lightness Correction

In order to determine the color change when a texture, mapped on an object surface, is
illuminated with a non-colorimetry (colorless) source, a number of color samples were prepared, and
these changes were measured. Surfaces were assumed to be completely diffuse planes in this study.

Fig. 3.1 Distribution of color samples in the uniform color space.

Fig. 3.2 Color patches used in the color conversion experiment (L* = 75 to 90 in
intervals of 5, hue angle = 0 to 360 degrees, and saturation = 70).

Fig. 3.3 Configuration of the measuring system.

Fig. 3.4 Example of measurements of changes in the color space under light sources
differing in illuminance.

First, photographic samples of 1200 colors distributed within the CIELAB color space were
prepared. These color samples were uniformly distributed within the uniform color space as shown
in Fig. 3.1. Each lattice intersection in Fig. 3.1 represents the chromaticity coordinate of a color
sample in the color space. Fig. 3.2 shows color samples with lightness L* ranging from 75 to 90 in
increments of 5, with hue ranging from 0 to 360 degrees, and with a saturation value of 70.
The illuminance of the light on these 1200 color samples was varied in four steps (with L* set
to 45, 55, 60 and 70), and the amount of color change within the color space was measured. Table
3.1 provides the illuminating conditions. For measurement, a high performance CCD image
scanner was used (PIC-2350, Ikegami Tsushinki Co.). Color sample data were converted from
RGB values read from the scanner to the CIELAB uniform color space via Eqs. (2.2) and (2.3).
Fig. 3.3 shows the configuration of the measuring system. In this measuring system it was
possible to vary light source illuminance by varying power, and to vary hue by using color filters.
Fig. 3.4 provides examples of measurements taken of changes in color samples within the space
(a*-b* plane; L* = 70). In the uniform color space, lightness and luminance change virtually linearly
with changes in light source illuminance. Based on these measured results and taking into
consideration the color continuity within the space, changes within the color space as illuminance
varies were approximated by the linear equations shown in (3.1). The chromaticity coordinates
(L*i, a*i, b*i) in the color space change to (L*j, a*j, b*j) as the light source illuminance changes.

Table 3.1 Illuminating conditions

  Light source          FL205W-EDL-50K (Toshiba Co., Ltd.)
  Color temperature     5000 K
  Maximum illuminance   4000 lx
  Resolution            250 dpi

Table 3.2 Chromaticity coordinates

  Color        L*    a*    b*
  L1 Yellow    80    -4    40
  L2 Green     70   -20    20
  L3 Red       80    10    40
  L4 Yellow    90    10    40
  L5 Red       90    20    20

a"j (j-:J L"w) a"i


b", (l:JL"w) b"i
J
where L"w is the lightness of standard white under standard light source,
and L"c the lightness of standard white as illuminance varies.
(3:1)
Eq. (3.1) allows the conversion of illuminance data obtained under specific illuminating
conditions to scenes where the illuminance is arbitrarily changed. In this conversion, the illuminance of
the light source is converted to the lightness value L*_c using Eqs. (2.4) to (2.6). The second and third
expressions in Eq. (3.1) formulate the relationship between the hue and saturation of the
texture as the light source illuminance is changed.
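In code this conversion is a single scale factor. The C sketch below (names ours) applies Eq. (3.1) as reconstructed above, with L*_c obtained from the new illuminance via Eqs. (2.4)-(2.6).

/* Sketch of Eq. (3.1): rescale a color measured under the standard source
 * (lightness of white L*_w) to a new illuminance (lightness of white L*_c). */
void rescale_for_illuminance(double Lc, double Lw,
                             double *L, double *a, double *b)
{
    double s = Lc / Lw;   /* linear change observed in the measurements */
    *L *= s;
    *a *= s;
    *b *= s;
}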

3.2 Color Conversion of Color Light Source

This section describes a color conversion method for applying the method described in Section
3.1 to color light sources. In Section 3.1 we considered cases where the light source was colorless.
Conventional illumination algorithms handle only colorless sources, so no accurate color
conversion model has been presented for color light sources. The principal purpose of using the
uniform color space in the present study is to obtain faithful visual reproduction of a scene with a
color light source. To this end, the illuminating condition under color light is reproduced by
converting the texture color.
Hence, as in Section 3.1, the existing color samples were illuminated with a color source, and the
changes within the color space were measured. Table 3.2 shows the chromaticity coordinates (values
converted into the CIE space) of three representative light sources selected from those used for the
experiment. Fig. 3.5 shows sample measurements of color changes within the color space of 30
color samples when illuminated with light source L1 (a*-b* plane; L* = 70, for two types).

Fig. 3.5 Example measurements of changes in color samples under color light source L1 (a*-b* plane).

Fig. 3.6 Changes of color samples in the color space under a color light source. The color of the shaded region under standard white light P changes to the one in the hatched region under color light Q.

These experimental results indicate that changes in existing colors under color light follow
certain rules, and thus, by using the method of least squares, the 1200 sample colors in the space
can be approximated as given by Eq. (3.2).

L*_j = (L*_w / L*_max) L*_i
a*_j = (L*_w / L*_max) a*_i + (1 + m) (L*_i / L*_max) a*_w
b*_j = (L*_w / L*_max) b*_i + (1 + m) (L*_i / L*_max) b*_w
(m = m_0 for hues within θ_0 of the hue of the light source; m = 0 otherwise)   (3.2)
The rules of change of the color samples measured under color light are given below. In Fig. 3.6,
point P represents the chromaticity coordinate value (L*_max, 0, 0) of standard white light in the
uniform color space, and point Q the chromaticity coordinate value (L*_w, a*_w, b*_w) of the color light
source. When texture at chromaticity point I (L*_i, a*_i, b*_i) is illuminated with color light, the point
changes to chromaticity point J (L*_j, a*_j, b*_j) within the color space.
First, the color space shrinks in proportion to the amount of change in light source lightness
from point P to Q (L*_w / L*_max). This is indicated by the first term of each expression in Eq. (3.2),
and is the term corresponding to the lightness correction in Eq. (3.1). In Fig. 3.6, a color existing on
the shaded plane changes to one on the hatched plane within the color space. Moreover, since
light source Q has color components, the color space is distorted such that the amount of distortion
is proportional to the amount of inclination of the colorless axis OP to the axis OQ. This is indicated by the
second term of the expressions in Eq. (3.2). The coefficient m in this term is an approximation of the
saturation reduction of the color corresponding to the complementary color of the light source. In
Eq. (3.2), all hues within θ_0 of Q are uniformly distorted by m_0 only. These coefficients
correspond to the color purity of the light source; θ_0/2 is equal to the angle of expansion from the
plane OPQ shown in Fig. 3.6. These coefficients are determined from measurements, using a
least-squares fitting method.
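As a worked illustration, the complete conversion of a texture color under a color source can be coded directly from Eq. (3.2). The C sketch below follows our reading of the equation as reconstructed above; in particular, the rule selecting m (m = m_0 within θ_0 of the source hue, 0 elsewhere) and all names are reconstructions, not the authors' code.

#include <math.h>

/* Sketch of Eq. (3.2): texture color (Li, ai, bi) under color source
 * Q = (Lw, aw, bw). m0 and theta0 (degrees) are the fitted coefficients. */
void convert_under_color_light(double Li, double ai, double bi,
                               double Lw, double aw, double bw,
                               double Lmax, double m0, double theta0,
                               double *Lj, double *aj, double *bj)
{
    const double RAD2DEG = 57.29577951308232;
    double d = fabs(atan2(bi, ai) - atan2(bw, aw)) * RAD2DEG;
    if (d > 180.0) d = 360.0 - d;            /* shortest hue-angle distance */
    double m = (d <= theta0) ? m0 : 0.0;     /* our reading of Eq. (3.2)    */

    double s = Lw / Lmax;                    /* shrink of the color space   */
    *Lj = s * Li;
    *aj = s * ai + (1.0 + m) * (Li / Lmax) * aw;   /* distortion toward Q */
    *bj = s * bi + (1.0 + m) * (Li / Lmax) * bw;
}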

4 EVALUATION
4.1 Comparison with Conventional Method

As a means of evaluating the color conversion model proposed in Section 3, approximation by
the conventional RGB color system was compared with conversion using Eqs. (3.1) and (3.2). In
the RGB method, each RGB value of a texture is merely weighted with the illuminance of the object
obtained from an illuminance calculation. The color difference within the color space, ΔE*_ab,
between the respective approximations and the measured values was obtained with respect to the
three light sources used for measurement in Section 3. This color difference, ΔE*_ab, is the
CIELAB color difference expressed by Eq. (4.1), and is the difference between two colors in a
color space expressed quantitatively (Robertson 1977).

ΔE*_ab = [ (ΔL*)^2 + (Δa*)^2 + (Δb*)^2 ]^(1/2)                  (4.1)
Table 4 Optimum parameters and color differences

  Light source   m_0   θ_0 [deg.]   ΔE*_ab (CIELAB)   ΔE*_ab (RGB)
  L1             0.2       39             2.6             12.8
  L2             0.2       32             4.1             18.9
  L3             0.1       48             2.2             14.8
  L4             0.3       20             2.3             13.6
  L5             0.4       23             3.1             15.5

Fig. 4.1 Relationship between changes of the color space (measured values and the approximation by Eq. (3.2); a*-b* plane, L* = 70).

Table 4 provides the color differences between the values optimized by the method of least
squares over the coefficients m_0 and θ_0 and the measured values with respect to each light source used for
measurement (light source parameters are shown in Table 3.2; light source L1 is yellow, L2
green, and L3 red). Fig. 4.1 reveals a close correspondence between the measured values of color
distortion under light source L1 and those obtained through approximation using Eq. (3.2) (a*-b*
plane; L* = 70). When m_0 is 0.2 and θ_0 is 39 degrees in Eq. (3.2), the optimum approximation is
achieved, reaching the values marked with the broken line. For all three light sources used for the
experiment, our method reduces ΔE*_ab to 1/3 of that obtained using the RGB
approximation. Although these color differences are noticeable in blue, where large measuring
errors occur, the differences are within a permissible tolerance in the other hues. Thus this color
conversion method can provide sufficiently acceptable image quality for illumination simulation
with computer graphics.

4.2 Simulation with Radiosity

Illumination scenes in which the present color conversion is applied to texture were constructed
using illuminance data calculated by the radiosity method (the values of the coefficients m_0 and θ_0
were taken to be as shown in Table 4). Fig. 4.2 shows the results of our simulation. The scene in Fig.
4.2 uses seven light sources and some 1,300 polygons, with some 110,000 patches employed for the
radiosity illuminance calculation.
Fig. 4.2 (a) is an illumination scene under the standard C light source, and (b) and (c) are rooms
generated by the color conversion method under color light sources L2 and L3, respectively. Fig.
4.3 (b) shows the scene of a room converted under another light source, L6. Light L6 has the
chromaticity coordinates (L* = 80, a* = -20, b* = -15) and coefficients θ_0 = 35 degrees and m_0
= 0.7. Repeated experiments with different light sources further confirmed that more faithful
simulation was possible with the proposed method than with the RGB approximation.

5 SUMMARY

In the conventional illumination algorithm, conversion equations to correct the texture color on an
object surface based on illuminance data are not fully established, and, accordingly, the conversion is
carried out by multiplying the texture RGB values by the illuminance. Moreover, since no consideration is
given to color light sources, it is difficult to achieve faithful reproduction of the subtle color changes
caused by color light sources.
In the present color conversion method, these problems have been solved by introducing the
uniform color space to process image data. The features of this method can be summarized as
follows:
(1) The method succeeds in formulating the relationship between illuminance and
lightness;
(2) The method succeeds in approximating changes in texture hue and saturation linearly
based on measured values.
Models determined by Eq. (3.2) permit the generation of illumination scenes under given color
light sources with as small an amount of computation as the conventional method. There is a
growing need for computer graphics to model the color components of light sources efficiently in order
to obtain photorealistic images. The presented method is widely applicable to such a need.

(a) Conventional radiosity and texture mapping  (b) Illuminated under light source L2

(c) Illuminated under light source L3

Fig. 4.2 A test scene for color illumination

(a) Conventional radiosity and texture mapping  (b) Illuminated under light source L6

Fig. 4.3 Another test scene for color illumination



ACKNOWLEDGEMENTS

The authors would like to express their thanks to Mr. Nishizawa, Mr. Nishimura, Mr. Hirai
and other members of the SIG group of Matsushita Electric Industrial Co., Ltd. for their
encouragement and assistance. The authors also thank Richard Doerksen and Dabney Israel for
their help in improving the manuscript.

REFERENCES

Cohen, M.F., D.P. Greenberg, D.S. Immel, and P.J. Brock, "An Efficient Radiosity
Approach for Realistic Image Synthesis," IEEE Computer Graphics and Applications, 6(3),
March 1986, pp. 26-35.
Cohen, M.F., S.E. Chen, J.R. Wallace, and D.P. Greenberg, "A Progressive Refinement
Approach to Fast Radiosity Image Generation," ACM SIGGRAPH 88, New York,
August 1988, pp. 75-84.
Hall, R.A. and D.P. Greenberg, "A Testbed for Realistic Image Synthesis," IEEE Computer
Graphics and Applications, 3(8), November 1983, pp. 10-20.
Hall, R.A., "Hybrid Techniques for Rapid Image Synthesis," in Whitted, T., and Cook, eds.,
Image Rendering Tricks, Course Notes 16 for ACM SIGGRAPH 86, Dallas, TX,
August 1986.
Hall, R.A., "Color Reproduction and Illumination Models," in Techniques for Computer
Graphics, edited by D.F. Rogers and R.A. Earnshaw, 1987, pp. 194-238.
Hall, R.A., Illumination and Color in Computer Generated Imagery, Springer-Verlag,
New York, 1989.
Kajiya, J.T., "The Rendering Equation," ACM SIGGRAPH 86, Dallas, TX, August 1986,
pp. 143-150.
Meyer, Gary W., "Wavelength Selection for Synthetic Image Generation," Computer
Vision, Graphics, and Image Processing, vol. 41, 1988, pp. 55-79.
Robertson, A.R., "The CIE 1976 Color Difference Formulae," Color Research and Application,
vol. 2, 1977, pp. 7-11.
Supplement No. 2 to CIE Publication No. 15, Colorimetry, "Official Recommendations on
Uniform Color Spaces, Color-difference Equations, and Metric Color Terms," 1976.
Whitted, T., "An Improved Illumination Model for Shaded Display," Communications of
the ACM, 23(6), June 1980, pp. 343-349.

Appendix
Color Transformation

Color transformation from the RGB color space to XYZ is performed as follows:

  [X]   [0.6067  0.1736  0.2001] [R]
  [Y] = [0.2988  0.5868  0.1144] [G]
  [Z]   [0.0000  0.0661  1.1150] [B]

X_0 = 0.9804,  Y_0 = 1.0000,  Z_0 = 1.1812

where X_0, Y_0 and Z_0 are the values for standard white light.
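A direct C transcription of this matrix is given below for convenience; the coefficients are those printed above, and the function name is ours.

/* Sketch: RGB to XYZ using the matrix above. */
void rgb_to_xyz(double R, double G, double B,
                double *X, double *Y, double *Z)
{
    *X = 0.6067 * R + 0.1736 * G + 0.2001 * B;
    *Y = 0.2988 * R + 0.5868 * G + 0.1144 * B;
    *Z = 0.0000 * R + 0.0661 * G + 1.1150 * B;
}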

Toshiya Naka is a Research Engineer in the Information and
Communications Research Laboratory of the Matsushita Electric
Industrial Co., Ltd. His research interests include computer
graphics and color image processing. He received the B.Sc.
and M.Sc. degrees in electronics engineering from the
University of Osaka Prefecture in 1983 and 1985, respectively.
He is a member of the IIEE of Japan and the IPS of Japan.
Address: Matsushita Electric Industrial Co., Ltd., Information
and Communications Kansai Research Laboratory, 1006,
Kadoma, Kadoma-shi, Osaka, 571, Japan.
E-mail: naka@sy3.isl.mei.co.jp

Kenji Nishimura is a Research Engineer in the Information
and Communications Research Laboratory of the Matsushita
Electric Industrial Co., Ltd. His research interests include
computer graphics and geometric modeling. He received the B.Sc.
and M.Sc. degrees in electronics engineering from Osaka
University in 1984 and 1986, respectively. He is a member of the IEICE of Japan.
Address: Matsushita Electric Industrial Co., Ltd., Information
and Communications Kansai Research Laboratory, 1006,
Kadoma, Kadoma-shi, Osaka, 571, Japan.
E-mail: kenji@sy3.isl.mei.co.jp

Fumihiko Taguchi is a Research Engineer in the Matsushita
AVC Software Co., Ltd. His research interests include
computer graphics and its applications. He received the B.Sc. degree
in electronics engineering from Kansai University in 1989.
Address: Matsushita Electric Industrial Co., Ltd., Information
and Communications Kansai Research Laboratory, 1006,
Kadoma, Kadoma-shi, Osaka, 571, Japan.
E-mail: taguchi@sy3.isl.mei.co.jp

Yoshimori Nakase is a Research Engineer in the Information
and Communications Research Laboratory of the Matsushita
Electric Industrial Co., Ltd. He leads a research group on
computer graphics and image processing. He received the B.Sc. degree
in electronics engineering from the Nagoya Institute of Technology in 1977
and the M.Sc. degree in electronics engineering from the Tokyo Institute of
Technology in 1979. He is a member of the IPS of Japan.
Address: Matsushita Electric Industrial Co., Ltd., Information
and Communications Kansai Research Laboratory, 1006,
Kadoma, Kadoma-shi, Osaka, 571, Japan.
E-mail: nakase@sy3.isl.mei.co.jp
Chapter 7
Picture Generation
Two-Dimensional Vector Field Visualization by Halftoning
R. Victor Klassen and Steven J. Harrington

Abstract

Digital halftoning is a technique for converting an image with multiple levels of
grey into a bi-level (bitmap) image, typically in preparation for printing on paper. It
is standard practice to "optimize" the halftoning process to reduce the visibility of
artifacts that appear as textures within what should be a region of uniform or slowly
varying intensity. This paper describes a method of manipulating the halftoning
process to cause the texture to give an indication of field direction, while the field
magnitude is displayed using the intensity.

Key Words: Error diffusion, dot design, efficiency, directional halftone cell.

1 Introduction

This paper presents a fast method for producing a bi-level image portraying field
strength and orientation. The image produced is suitable for a laser printer or
bitmapped display. The method is appropriate for displaying a field that has been
either measured or computed, on a grid of moderate resolution (sizes greater than
roughly 100 x 100 require a larger format than the 6 inches normally used on a printed
page). The method is appropriate for both electromagnetic fields and flow fields,
although electric fields are used for illustrative purposes.
Two dimensional vector fields are commonly displayed using contour maps (of
potential) and stream lines. Contour maps are good for displaying the gradient and


direction of a field, but reading field strength from a contour map is more difficult.
False colouring makes this easier, but requires a colour printer for hardcopy, and is
not acceptable for journals that do not contain colour pages. Streamlines give a more
intuitive sense of field strength and direction, but the orientation information is lost
where the lines are either too sparse or too dense. In order to draw either a contour
map or streamlines, it is necessary to track the curves to be drawn, and then scan
convert them (unless they are tracked on the raster grid as in (Lee and Chang, 1990)).
A significant body of knowledge exists for tracking such curves (see e.g. (Sabin,
1985) and, more recently, (Dickinson et al., 1989)). While such techniques are
fine in their place, the method we propose is much faster, and in some applications
conveys as much or more information. Contour maps are also of no use in non-conservative
fields (where the number of sources differs from the number of sinks).

For many of us, our first visualization of a magnetic field involved a real-time
analog computation by a collection of iron filings on a sheet of paper, with a
magnet below. The image that comes to mind is similar to those produced by
particle-system based scientific visualization tools (Fowler and Ware, 1989, van
Wijk, 1990). Having been exposed to a real physical particle system, we then
proceeded to learn about fields illustrated using contour and stream lines. Other
techniques for displaying fields have existed for some time. Edmund Halley (1686)
used pen strokes - short dashes, thicker at one end and tapering to a point at
the other - to show prevailing wind direction on a map of the world's oceans
(Halley's map is discussed by Tufte (1983)). Short lines from each grid point have
been used as well (Roberts and Potter, 1970); line length was tied to field strength
with reasonable results in regions of low field strength, less so in regions of high
field strength, where the picture becomes cluttered. In regions of moderate strength,
the orientation can be determined by close examination, although the visual system
tends to connect grid points rather than parallel lines (more on this in section 3.1).
Lavin and Cherveny (1987) have used the density of line segments, rather than
their length, to display the field intensity. More recently, particle systems have come
into vogue, as machines powerful enough to compute a useful number of trajectories
become available (Fowler and Ware, 1989, van Wijk, 1990).

The remainder of the paper describes our technique of using halftoning to display
orientation and field strength in a single image. We begin with a short tutorial on
digital halftoning. This is followed by two sections giving details of the modifications
to the standard halftoning algorithms we have used. After showing the usefulness
of the method using several examples, we discuss efficient implementation of the
ideas previously described. Finally, we discuss the relative merits of the method,
and future work.

2 Digital Halftoning

The phrase digital halftoning refers to any technique for converting a continuous
tone image to a bi-level image. Ulichney (1987) surveys most of the techniques
in use at this time. In this paper the primary technique is ordered dither, which
provides the intensity variation, along with various modifications that are used in
controlling the display of direction. This section provides an outline of the ordered
dither technique for those unfamiliar with this method of halftoning. The description
is more functional than practical, paying no regard to efficiency.

Ordered dither translates a b-bit-per-pixel input image, represented as an n x m
array I_ij, to a one-bit-per-pixel image O_kl, represented as a p x q array, with n ≤ p
and m ≤ q. An auxiliary table of thresholds, normally known as a halftone cell,
is used to control which pixels are turned on. The halftone cell is normally much
smaller than either image, but it is generally at least as large as the scale factor
required to translate from input to output resolution. Repetitions of the halftone
cell are tiled over the domain of the output image to give the halftone screen. At
each pixel of the output image, the corresponding location in the halftone screen
is compared with the corresponding location in the (possibly scaled) input image.
If the input image is darker than the threshold in the halftone screen, the pixel is
black, otherwise it is white.
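The following C sketch shows ordered dither in this functional form, tiling the 4 x 4 dispersed-dot cell of Figure 1 (scaled here by 15 to span a 0-255 intensity range); the array layout and names are our assumptions, not the paper's code.

/* Ordered dither: 'in' holds width x height pixels, 0 = black, 255 = white;
 * output pixels are 1 for black, 0 for white. The thresholds are the
 * dispersed-dot cell of Fig. 1 scaled by 15. */
void ordered_dither(const unsigned char *in, unsigned char *out,
                    int width, int height)
{
    static const unsigned char cell[4][4] = {
        {  45, 225,  15, 195 },
        { 165, 105, 135,  75 },
        {  30, 210,  60, 240 },
        { 180, 120, 150,  90 }
    };
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            /* darker than the threshold -> black */
            out[y * width + x] = in[y * width + x] < cell[y % 4][x % 4];
}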

Halftone cells are usually designed with a different threshold for each pixel within
the cell. In a region of slowly varying intensity, pixels are turned on (black) one
by one, as the desired intensity drops. Halftone cells in which adjacent pixels come
on in sequence are referred to as clustered-dot. Clustered-dot cells are normally
designed to have a dark region that grows uniformly until adjacent regions touch,
after which a symmetric light region shrinks uniformly. The shape of the dark region
before adjacent regions touch is called the dot shape. If widely dispersed pixels
come on in sequence, the halftoning is said to be dispersed-dot. The dispersed dot is
preferred for its greater frequency response; the clustered dot is however necessary
for most print media, since isolated pixels have a relatively slim chance of appearing.
Figure 1 shows a simple 4 x 8 clustered dot cell and a 4 x 4 dispersed-dot cell.

Halftone cells are normally designed to give a uniform checkerboard pattern at
neutral grey. At other intensities, visible patterns are avoided wherever possible, as
they detract from the illusion of a continuous tone image. Increasing the size of
the halftone dot increases the intensity resolution available, at the expense of spatial
resolution. The main idea of this paper is to exploit the patterns that are normally
avoided, by redesigning the halftone cell to display an oriented pattern, and to use
different cells to provide different orientations.

Clustered-dot cell (4 x 8):          Dispersed-dot cell (4 x 4), part of its tiling:

  15  8 12 16 18 25 21 17              3 15  1 13   3 15  1 13
  11  2  1  5 22 31 32 28             11  7  9  5  11  7  9  5
   7  4  3  9 26 29 30 24              2 14  4 16   2 14  4 16
  14 10  6 13 19 23 27 20             12  8 10  6  12  8 10  6

Fig. 1 A possible 4 x 8 clustered dot (left) and part of the tiling of a possible 4 x 4 dispersed
dot. (For illustrative purposes only; neither is necessarily optimum.) If the input image
value at a point is darker than the corresponding halftone value, the pixel is set to black.
Intensity is not generally a linear function of the number of pixels turned on in a cell.
This is a result of the physical processes involved in translating from a pixel being
set in memory to a point on the page or screen being blackened or intensified. Most
output devices in use today have some overlap between pixels, and the way they
add is not necessarily linear (this is particularly the case for print media). The best
way to compensate for this nonlinearity is to print a uniform grey region for each
halftone cell intensity, and measure the light reflected. From this information, the
correct set of thresholds can be found and substituted into the tables. This process
is known as tone reproduction curve (TRC) correction. For the purposes of this
paper, TRC correction has not been done. In order to improve the reproductions,
the images have been printed with enlarged pixels, so that multiple pixels are used
to simulate a single image pixel. This reduces the need for TRC correction.

3 Dot Design

This section addresses issues which arise in designing dots intended to portray
orientation. The obvious choice of dot is a clustered dot cell with a rectangular dot.
Different cells with rectangles at various orientations can be designed in order to
convey direction information. Such a simple dot design using uniform tiling of the
halftone cells is insufficient for the task, for reasons related to the way the human

visual system does edge detection. In order that regions of coherent orientation
be perceived, adjacent dots must connect to give longer lines or curves. This is
discussed in section 3.1. Section 3.2 discusses some of the factors in the resolution
trade-off that are unique to this application.

3.1 Alignment

The alignment of adjacent halftone dots is surprisingly important (but note that if
the dot is sufficiently large, and the intensity range sufficiently small, we are back
to the method of lines at grid points used by Roberts and Potter (1970)). When we
first conceived of the method, we imagined an image not unlike that produced by
iron filings in a magnetic field on a piece of paper. Each halftone dot would behave
as an individual iron filing, with the result that we could see the field in much the
same way as in the grade-school demonstration. Figure 2 shows the twelve 4x4
cells (at 25% on) for this simple arrangement.
A combination of effects makes the halftone cell of Figure 2 surprisingly ineffective.
The problem is best explained by demonstration. Figure 3 shows a region of
vertically varying field intensity with orientation rotating through 180° from left to
right. While continuously varying orientations are portrayed, only four (horizontal,
vertical, and two diagonal orientations) are visible. The eye is tracking the wrong
features.
One method of reducing the alignment errors is to offset halftone cells a constant
amount depending on their position on the page and their dot slope. The vertical,
horizontal, and diagonal cells are unchanged. The second cell in the first row of
Figure 2 is offset vertically by 2 (mod 4) times the horizontal position in units of
cells, the third cell is offset vertically by 3 (mod 4), the fifth and sixth cells are
offset horizontally by 3 and 2 (mod 4), and so forth (Figure 4). One can improve
this by increasing the number of orientations beyond those available with integer
endpoints, and changing the offsets accordingly. The left image of Figure 6 shows
the effectiveness of this method.
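In code, the offsetting amounts to shifting the row used for the threshold lookup by an orientation-dependent multiple of the cell column. The C sketch below handles the vertically offset orientations of a 4 x 4 cell; the horizontally offset ones are symmetric. The offset table values and the interface are illustrative assumptions, not the authors' implementation.

enum { CELL = 4, NORIENT = 12 };

/* Threshold lookup with per-cell offsets: the row index is shifted by
 * voffset[orient] times the cell column (mod CELL), so that sloped lines
 * in adjacent cells join up. voffset values are illustrative. */
unsigned char threshold(const unsigned char screen[NORIENT][CELL][CELL],
                        int orient, int x, int y)
{
    static const int voffset[NORIENT] =
        { 0, 2, 3, 0, 0, 0, 0, 0, 0, 3, 2, 0 };
    int cx = x / CELL;                        /* cell column on the page */
    int row = (y + voffset[orient] * cx) % CELL;
    return screen[orient][row][x % CELL];
}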

Fig. 2 Twelve orientations possible with a single line halftone dot in a 4 x 4 cell.

Fig. 3 A simple test pattern displayed using a single-line halftone dot in an 8 x 8 cell.
Intensity varies linearly vertically; orientation passes from horizontal through vertical to
horizontal from left to right. There are 96 distinct orientations displayed, not all of them
differentiable at all intensities, but most of them can be differentiated by close inspection in
regions of high and low intensity. The dominant patterns in the image are vertical, horizontal,
and diagonal stripes. This is a result of the way the visual system tries to connect edges.

Offsetting cells as described above gives much improved results in homogeneous


regions and works tolerably well at transitions between regions of constant
orientation, if the orientation gradient is parallel to the orientation vector. It fails
completely if the gradient of the orientation vector is perpendicular to the orientation
(Figure 5). In that case, lines on opposite sides of the boundary are essentially
uncorrelated, and the resulting pattern of light and dark gives completely wrong
information (Figure 6). Unfortunately, in the typical field, orientations change most
in the direction orthogonal to field lines.

Fig. 4 By offsetting adjacent cells, one


can increase the number that line up. Two
different orientations are shown, and they
line up perfectly along the boundary between
the third and fourth columns of cells.

Fig. 5 Cells of the same orientations as


in Figure 4, but with the change occurring
along the vertical direction rather than
horizontal. Figure 6 shows the effect
of this on a large scale.

We have had the best success by designing halftone cells with multiple lines, rather
than single lines, and ensuring that lines in one cell are aligned with those in its most
likely neighbours. For a 15 x 15 cell, three lines can be drawn in twelve positions as
shown in Figure 7. With the positions of the endpoints of the lines limited to three
positions along an edge, all those between horizontal and 45° connect smoothly,
as do all those between diagonal and vertical. Figure 8 shows a region of a field
near a pole, displayed in this way.

Fig. 6 On the left is the left quarter of the field of Figure 3, but with constant intensity
(orientation goes from horizontal to 45°). Similar cells are aligned by offsetting them as
described in the text. On the right, the same algorithm is applied to a field with orientation
changing from top to bottom, rather than left to right. At the top the image consists of
horizontal lines, while at the bottom it consists of diagonal lines. Except at the left and
right edges of the image the predominant pattern is unrelated to the intended orientation.

.........

Fig. 7 A 15 x 15 cell, with three lines in the halftone "dot". As long as


orientation does not change too rapidly, dots are aligned with their neighbours.
Of the twelve possible orientations, four are shown; the remaining orientations
are obtained by mirroring and simple rotations.

Fig. 8 Using three stripes in a 16 x 16 square. The field is part of the field surrounding a
dipole, showing one of the poles. The twelve orientations give a good impression of field
direction. In this and all remaining figures, intensity is strong in white regions, weak in dark.

3.2 Resolution

The usual trade-off between spatial and intensity resolution is not an issue once
a multi-stripe cell is being used. In a 15 x 15 cell, there are 225 intensity levels
available, of which some are lost due to non-linearities in the printing process, but
there are still enough levels for a smooth appearance. With this method there are
other issues that restrict the usable range of resolutions for a given device.

Normally an increase in halftone cell size leads to a decrease in apparent resolution,


but does not necessarily affect the size of input image that can be fit on a page.
When the input image has more input pixels on a scanline than there are halftone
cells within the page width, multiple input pixels are used within one halftone cell,
giving a higher resolution than the halftone cell size. However, for this visualization
method, in order to preserve the connectivity of lines within a halftone cell only
one input pixel per cell may be used.
With the availability of higher resolution printers, one has the choice of increasing
the input resolution or increasing the cell size. Keeping the cell size (in pixels)
constant results in a smaller (in area on the page) halftone cell, and finer lines. If
the line spacing drops much below 120 per inch, the orientation information is lost,
as it becomes unresolvable at normal reading distances. Hence the input resolution
is essentially determined by the size of the page (in inches, not pixels).¹ Increasing
the cell size does improve the rasterization of the lines within a cell, so there is some
value in using a higher resolution printer. Note that the images used to illustrate
this paper were magnified to improve the quality of the reproduction. Figure 8, for
example, would normally be 2.25 inches wide by 1 inch high.
Increasing the cell size to hold more stripes than about four does increase the
orientation resolution, but reduces the spatial resolution correspondingly (assuming
the stripe width is held fixed). In a 16 x 16 cell one can fit four stripes, providing
16 distinct angular orientations. However, at 50% each stripe and space is only two
pixels wide, and this can be difficult to resolve. Using only two stripes in a 16 x 16
cell makes the stripes much easier to see, but yields only eight distinct angles. We
found three stripes to be a good compromise.

4 Angular Quantization and Error Diffusion

Twelve orientations is better than none, but not quite good enough. In order to
increase the number of orientations available in a single cell, a larger number of
lines would be required. Increasing the number of lines requires either going to a
higher resolution output device (leading to unresolvably thin textures) or reducing
the resolution of the input. The latter approach requires a reduction in the amount
of data that is represented on a page. (The number of angles available is four
times the number of lines in a cell, and the number of lines can be increased in
proportion to the perimeter of the cell, not the area, so the improvements to be
had by this technique diminish rapidly.) The alternative is to use error diffusion
(Floyd and Steinberg, 1975), a technique normally used to compensate for intensity
quantization. Applying error diffusion to the orientations, rather than the field
strengths, allows us to portray a larger number of orientations with lines that are
"on average" at the right orientation.

¹ The appearance on a 19-inch bitmapped display at 1280 x 1280 was consistently better than that of a 7-inch square
on a printed page at 2100 x 2100, in spite of the fact that we could in principle display more than two and a half
times as much data on paper.
The simplest form of error diffusion (not recommended for diffusing intensity
quantization errors) is to proceed from left to right across a scanline, and at each
pixel, quantize that pixel to the nearest level (in our case one of twelve orientations),
and compute the error resulting from quantization. The error is then added to the
pixel to the right before that pixel is considered. More sophisticated forms of
error diffusion generally distribute some fraction of the error to each of the three
neighbouring pixels on the next scanline as well as the next pixel on the current
scanline, and traverse alternate scanlines in reverse order.
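This simplest variant is only a few lines of C. In the sketch below the field orientations arrive in degrees in [0, 180) and are quantized to the twelve cell designs; all of the error is carried to the right-hand neighbour. Names and the interface are ours.

/* One scanline of orientation error diffusion: quantize each orientation
 * to the nearest of NORIENT levels and push the residual to the right. */
#define NORIENT 12

void diffuse_orientations(const float *theta, int *index, int width)
{
    const float step = 180.0f / NORIENT;     /* 15 degrees per level */
    float err = 0.0f;
    for (int x = 0; x < width; x++) {
        float want = theta[x] + err;
        int q = (int)(want / step + 0.5f);   /* nearest available level */
        err = want - q * step;               /* residual carried right  */
        index[x] = ((q % NORIENT) + NORIENT) % NORIENT;
    }
}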
The eye is particularly good at recognizing vertical and horizontal lines, but has
remarkably little resolution for other orientations. This means that the error diffusion
need not be very good to be sufficient. We tried three different variants, the simplest
of which is to send all of the error to the right neighbour. The other two variants
were to send to the right neighbour or the neighbour below, depending on the
orientation of the field. The second variant was to send to the direction parallel
to the orientation, and the third was to send to the direction perpendicular to the
orientation. Quality was improved by the adaptive versions, but not enough to
survive reproduction. The differences between the second and third choices were
subtle enough that it was a matter of personal opinion which one was better. The
first method was good enough, except in a small number of cases, that its slight
speed advantage is likely to outweigh the subtle differences in quality. (This was
a surprise to us).

5 Examples

Figures 9 and 10 show two fields displayed using the method described. To aid
in the reproduction, they were computed at 150 dots per inch and scaled up by
replication. Resolution can be doubled by omitting the scaling step, and improved
slightly further by using a higher resolution printer. As discussed in section 3.2, a
major increase in resolution produces unresolvable lines. The examples shown use
the simple error diffusion technique described above.

Fig. 9 The same field as in Figure 8, displayed with error diffusion, and showing
more of the field. Note the effectively continuous range of orientations.

6 Efficiency

Without any optimization the algorithm will run almost as fast as halftoning. It is
generally much faster than drawing arrows by scan-converting lines. Because one
input pixel maps to an entire halftone cell, the algorithm can be made to run faster
than more general halftoning algorithms. The cost is essentially one doubly-indexed
table lookup per 16 output pixels.

Fig. 10 A parallel plate capacitor.

For speed, the halftone cells for all orientations are pre-rasterized into lookup tables
for all intensities. For eight bits of intensity, 256 halftone cells are computed for each
orientation. For a 16 x 16 halftone cell, each cell requires 32 bytes. With this size,

three lines is about right, so there are twelve orientations. Five of the orientations
are vertical mirror images of another five, so with next to no performance cost they
can be re-used, resulting in seven orientations. In total 56K bytes of lookup table
are required for this halftone cell. The lookup tables can be pre-calculated and read
in or even linked in to the program if speed is a major concern.
For each input pixel, the orientation is quantized, its error passed on to its neighbour,
and the quantized orientation is then used as the first index into the lookup table.
The second index is the intensity (which can be quantized and error diffused, if
memory is tight), and the third index is the scanline within the halftone cell. Sixteen
16-bit words are copied from the lookup table to the frame buffer or output line
buffer, and the process is repeated. Quantizing the orientation requires one integer
multiplication and an integer division. If orientations are computed in the range -96
to 96, rather than -128 to 127, this can be reduced to a single shift. Error diffusion
requires a multiplication and a subtraction to calculate the error, and an addition
to spread it. (Again with special case code, for -96 to 96, the multiplication and
subtraction can be done with a mask). Finding the first word of a cell requires
two multiplications (by powers of two in this case but not in general) and an
addition; remaining words are adjacent. In all, four multiplications, two additions, a
subtraction, a division and fifteen increments, along with sixteen 16-bit word moves
are required per input pixel, and this can be improved slightly as indicated.
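The inner loop is therefore little more than a table lookup and a block copy. The C sketch below assumes the 7 x 256 x 16 table layout implied above (seven stored orientations, the other five obtained by mirroring); the names are ours, not the authors' code.

typedef unsigned short word16;

/* Emit one 16x16 cell: one doubly-indexed lookup selects a pre-rasterized
 * cell (sixteen 16-bit words, one per scanline), and sixteen word moves
 * fill the cell's slice of the output. 7 x 256 x 16 words = 56K bytes. */
void emit_cell(const word16 table[7][256][16],
               int orient, int intensity,
               word16 *out, int words_per_row)
{
    const word16 *cell = table[orient][intensity];
    for (int row = 0; row < 16; row++)
        out[row * words_per_row] = cell[row];  /* 16 output pixels at once */
}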
In the discussion above a 15 x 15 cell was used in conjunction with a three-stripe dot,
whereas for the last two paragraphs a 16 x 16 cell was used for efficiency. Experience
shows that making one stripe slightly wider than the other two to increase the cell
size by one row and column does not introduce any objectionable artifacts.
The algorithm has been implemented in relatively conservative C, and run on a SUN-4/260
to produce the images in this paper. A 64 x 64 image, suitable for display on
a high resolution workstation screen, requires approximately 60 ms to halftone. This
is sufficient for real-time animation at 16 frames per second. The images shown as
examples took approximately four times as long to produce, from 128 x 128 images.

7 Discussion

As the examples show, we have succeeded in displaying both orientation and


intensity information by halftoning. This makes the technique accessible to anyone
with access to a laser printer or bitmapped display. As intended, it is fast; indeed,
fast enough for interactive use in browsing through large pre-computed fields. The

appearance of the resulting image is less like that of iron filings on paper than
we had imagined, but it portrays the information at least as well. The intensity
resolution is as good as halftoning can provide, by virtue of the large cell size,
while orientation resolution is good, but imperfect.

The primary need for TRC correction is between orientations within a single
intensity. This is most obvious when error diffusion is not used (Figure 8). When
images are computed with single device pixels per image pixel, some correction
within a single orientation is needed as well.

There are a number of weaknesses to the method. One is the directional ambiguity.
This ambiguity is no different from the iron filings model. We are looking into
ways of creating a sense of direction, using the ideas from (Fowler and Ware,
1989). The spatial resolution is more limited than we would like, although this is at
least in part a result of the resolution of the human visual system. It is inherently
two-dimensional: the lack of an obvious extension to three dimensions reduces the
number of problems to which it can be applied. There are still many field problems
that are solved in a two dimensional domain either because of the prohibitive costs
of computing in three dimensions or because of the inherently two-dimensional
nature of the problem. For them a two dimensional presentation is perfectly natural.

The method differs from a method that draws stream lines or potential contours in
that the number of lines is independent of field strength. This is both a disadvantage
and an advantage. On the one hand people are used to seeing streamlines drawn at
higher concentration in regions of greater field strength, so it takes some adjustment
to read these images. On the other hand, this method is not restricted to conservative
fields, which is useful in some applications.

References

Dickinson RR, Bartels RH, Vermeulen AH (1989). The interactive editing and
contouring of empirical fields. IEEE Computer Graphics and Applications,
9(3):34-43.
Floyd R, Steinberg L (1975). An adaptive algorithm for spatial gray scale. In: Society
for Information Display 1975 Digest of Technical Papers, pages 36-37.
Fowler D, Ware C (1989). Strokes for representing univariate vector field maps. In:
Proceedings of Graphics Interface '90, pages 249-253.
Halley E (1686). An historical account of the trade winds, and monsoons, observable
in the seas between and near the tropicks; with an attempt to assign the phisical
cause of said winds. Philosophical Transactions, 183:153-168.
Lavin S, Cherveny R (1987). Unit-vector density mapping. The Cartographic
Journal, 24:131-141.
Lee YP, Chang CS (1990). Field visualization by interactive computer graphics. In:
CG International '90, pages 403-422. Springer-Verlag.
Roberts KV, Potter DE (1970). Magnetohydrodynamic calculations. In: Alder B
et al. (eds.), Methods in Computational Physics: Volume 9, Plasma Physics, page
402. New York.
Sabin MA (1985). Contouring - the state of the art.
Tufte ER (1983). The Visual Display of Quantitative Information. Graphics Press,
Cheshire, Connecticut.
Ulichney R (1987). Digital Halftoning. MIT Press.
van Wijk JJ (1990). A raster graphics approach to flow visualization. In: Proceedings
of Eurographics '90, pages 251-269.

Victor Klassen is a Post-Doctoral Fellow in the Systems
Sciences Laboratory of Xerox Webster Research Center. His
research interests include rendering, scientific visualization,
and natural phenomena. He received a BSc in Physics in
1983, followed by an MMath and a PhD in Computer Science
in 1986 and 1989, all from the University of Waterloo. He
is a member of ACM, ACM SIGGRAPH, IEEE Computer
Society, and Eurographics.

Steven Harrington is a Principal Scientist in the System
Sciences Laboratory at Xerox Webster Research Center.
His research interests include document representation and
rendering. He is the author of Computer Graphics: A
Programming Approach, and Interpress: The Source Book.
He received BS degrees in Mathematics and Physics from
Oregon State University in 1968. Besides a PhD degree
in Physics received in 1976, he has MS degrees in both
Computer Science and Physics from the University of
Washington. Before joining Xerox he was a Post-Doctoral
Fellow at the University of Utah and taught Computer Science
at the State University of New York, Brockport.

Address: Xerox Corporation, Webster Research Center, 128-29E, 800 Phillips Rd.,
Webster, New York 14580, USA.
Three Plus Five Makes Eight: A Simplified Approach to Halftoning
Geoff Wyvill and Craig McNaughton

Abstract
We wish to represent a halftone picture on a digital device like a laser printer. One
cheap and effective way is to trace a space filling curve over the picture,
accumulating colour density at each pixel. When the accumulated density is high
enough, we draw a dot.

We present a simple, robust and practical way to do this. Our tests have shown that
the way the colour density is scaled is important and the choice may be different for
natural and synthetic images. The same algorithm gives us a way to represent colour
pictures effectively on cheap displays with few available colours.

Keywords: Dithering, halftoning, greyscale, grayscale, space filling curves.

1. Introduction
The idea for this work came from a paper on halftoning by Cole (1990). He observes
that many of the refinements of dithering techniques are actually attempts to remove
artifacts put there by the original (faulty) algorithms.

Cole's basic algorithm may be described as follows:

sum = 0;
for each pixel do
begin
    sum = sum + pixel_value
    if sum > threshold_value then
    begin
        sum = sum - threshold_value
        print a dot at pixel_position
    end
end

The idea of this algorithm is not new. The well-known Floyd-Steinberg algorithm
[Burger 1989] is a refinement of the same idea where the accumulated error is
allowed to propagate to more pixels than just the next one in the scan. Cole's original
contribution was in choosing the order of the scan.


The quality of the resulting picture depends a lot on the order in which the pixels are
scanned. A simple linear scan gives quite a good result but tends to produce dots
lining up in 'contours.' A much better result comes from scanning the image with a
crinkly space-filling curve such as a Peano or Hilbert curve. In any case, the results
seem to be much better than those produced by dithering or the Floyd-Steinberg
algorithm. Cole generates such a space-filling curve with the 'murray scan,' an
elegant algorithm using mixed-base arithmetic.

This algorithm suffers from some minor inconveniences. It works on pixel arrays of
m by n pixels when m and n are odd and have some low factors, e.g. 3, 5, 7. (In some
cases m and n can be permitted up to two even factors.) For best results, m and n
should both be divisible by three. If your starting image does not conform to this
pattern, then you have to lose a few edge pixels or add extra rows and columns of
black (zero valued) pixels. The algorithm, as published, does not include this selection
of suitable m and n, although it is not difficult to find suitable values close to the
actual picture size.

There have been many algorithms for halftoning published over the years and we will
not review them here. Cole [1990] and Knuth [1987] list the major ones. To some
extent comparisons are subjective, but it seemed to us that Cole's method was a
substantial improvement on its predecessors except for the lack of generality. One of
our images, for example, was of width 600 pixels. The nearest odd number below
600 for a good murray scan is 585. Using an even number special case, 594 is
possible. Or we could add three extra lines of black to make 603.

Our first idea was to find a generalization to extend Cole's technique. Rather than
pursue the wonders of mixed-base arithmetic, we looked at the patterns from a
purely geometric point of view. This led quickly to a cheap and rather nasty solution
that nonetheless produces good results and works on pixel arrays of any size larger
than 7 by 7. This was supposed to be an interim measure but, alas, we have yet to find
anything better, even including Professor Cole's delightful 'murray' polygons. The
scanning algorithm is described in Section 2.
At first, we found that some images processed by ordered dithering appeared to be
clearer. But it turned out that the dithering program was scaling the grey values to
increase contrast. This can equally well be done with scanned images. Some results
are given in Section 3.
Finally, we applied the same process to produce versions of some of our ray-traced
images to be shown on an 8-bit colour display. These 'low quality' images are almost
indistinguishable from the originals. Heckbert [1982] made a similar statement about
colour images dithered by the Floyd-Steinberg algorithm, but we believe that ours are
better. Examples are given in Section 4.

2. Three Plus Five Makes Eight


We need to traverse every point of an m by n grid with a path that never moves too
far without changing direction. An ideal way to do this is to traverse some sub-
rectangle in the grid and link these traversals up with unit lines. For example, in

Fig. 1, four identical sub-paths are linked by three unit lines to give a complete
traversal of an 8 by 8 grid. We would like to be able to do this for any integers m and
n, but Fig. 1 is a special case.

We choose to scan each sub-rectangle starting and ending at a corner. If the start and
end points are adjacent corners of the rectangle, we describe the path as an 'edge'
path. If the start and end points are diagonally opposite, we call it a 'diagonal' path. If
we colour alternate points in the grid black and white, then where the number of
points is odd, the scan must start and finish on points of the same colour. From this it
can be seen that there are three classes of grid:

• odd/odd, where m and n are odd,
• even/even, where m and n are even,
• odd/even, where m is odd and n is even.

Fig. 1. Combining sub-paths


Fig. 2. Classes of grid

Figure 2 shows one of each of these types. The first is even/even and a diagonal
traversal is not possible. The second is even/odd and an edge traversal is possible
along the even (vertical) edge but not along the other. The third grid is odd/odd.
Diagonal and both edge traversals are possible because all the corner points are the
same colour.

If we can divide a grid into smaller grids so that all the smaller grids are odd/odd,
then we can traverse the whole grid with a simple pattern such as that of Fig. 3. A, B,
C, D, E, F, G, H, I, J, K, L represent the sub-rectangles and the bold lines show how
they are connected. Notice that A, C, J and L must be scanned in 'diagonal' fashion,
but the rest are all scanned in 'edge' fashion. Since all the sub-rectangles are odd/odd,
this can be done and we have a method of scanning the whole rectangle.

A B C D
E F G H
I J K L

Fig. 3. Building a scan from smaller odd/odd scans

How do we divide a rectangular grid into odd/odd rectangles? Observe that every
number greater than seven can be made by adding the numbers three and five. The
following table covers numbers up to fifteen.

 8 = 3 + 5
 9 = 3 + 3 + 3
10 = 5 + 5
11 = 5 + 3 + 3
12 = 3 + 3 + 3 + 3
13 = 5 + 5 + 3
14 = 5 + 3 + 3 + 3
15 = 5 + 5 + 5

Higher numbers can be expressed as a multiple of eight plus one of these. So, for any
rectangle for which m > 7 and n > 7, both m and n can be expressed as a sum of
threes and fives. Suppose there are m rows and n columns. First express n as a sum of
threes and fives and divide the rectangle into columns, three or five points wide and
m tall. Then express m as a sum of threes and fives so that each column is divided into
sub-rectangles, and each sub-rectangle is odd/odd. Indeed, there are only four possible
sub-rectangles, and these are shown in Fig. 4. Figure 4 also shows how they may be
scanned diagonally. Figure 5 shows how they may be scanned edgewise.
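The decomposition itself is a two-line loop. The following C sketch (names ours) peels off fives until the remainder is divisible by three, which for any n > 7 requires at most two fives.

/* Split n > 7 into threes and fives; returns the number of parts written
 * to out[]. n mod 3 == 0 needs no fives, == 2 needs one, == 1 needs two. */
int split_into_threes_and_fives(int n, int out[])
{
    int count = 0;
    while (n % 3 != 0) { out[count++] = 5; n -= 5; }
    while (n > 0)      { out[count++] = 3; n -= 3; }
    return count;
}

For the 600-pixel-wide image mentioned in Section 1, for example, this yields two hundred threes.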

Figure 6 shows how these basic rectangles can be combined to cover a 36 by 24 pixel
grid. Notice that most of the grid is made of 8 by 8 units built from four basic
rectangles. The last four columns of 3 by 3 rectangles make up the 36 points. This
technique of scanning a pixel array has become known in our group as the "Geoff
scan" to distinguish it from the murray scan.

Fig. 4. Basic rectangles and diagonal traversals

Fig. 5. Some 'edge' traversals



Fig. 6. Composite grid

Inspection of Fig. 6 shows that the longest straight line is four units long, connecting
five dots. It is possible to further refine the choice of basic rectangles and their
orientation so that the longest such run is only three units connecting four dots. Our
results, however, suggest that this is unnecessary, and we have not pursued the idea.

3. Greyscales
Figure 7 shows a natural scene halftoned by three different methods. 7a is done by
ordered dither, 7b with a murray scan and 7c with a Geoff scan. In theory, the
murray scan should be slightly less prone to patterning effects than the Geoff scan,
but we have not seen any evidence of this in natural pictures.

In artificial images, however, one is more likely to encounter large areas of exactly
constant shading, and patterning can be a problem. Figure 8 shows such an image,
again with dithering (a), murray scan (b) and Geoff scan (c). Both murray scans and
Geoff scans produce patterning at certain greyscale values. It may be possible to
improve the situation by choosing which grey levels to use. In an artificial picture,
you have this freedom. It amounts to choosing a slightly different picture that is more
acceptable to the particular halftoning algorithm.

Figure 9 shows an artificial scene that suffers from low contrast problems. In Fig. 10,
the grey levels have been mapped in a non-linear fashion to improve this. Each grey
level is mapped using a scaled square root function so that the full scale from black to
white is represented.

"'--- a

c
Fig. 7. Halftoning of a natural scene

Fig. 8. Halftoning of an artificial scene: (a) ordered dither, (b) murray scan, (c) Geoff scan

Fig. 9. Low contrast

Fig. 10. Contrast enhanced

4. Coloured Images
The simplest way to apply halftoning to coloured images is to perform a separate scan
in red, green and blue and combine the results. The basic algorithm is:

red = 0;
for each pixel do
begin
    red = red + red_pixel_value
    red_dot_value = red div red_step
    red = red mod red_step
end

Fig. 11. Ray traced images with reduced colour sets



Fig. 13. Colour sorted images

where red_pixel_value is the original value and red_dot_value is the new value in a
smaller range. Fig. 11 shows a ray-traced image reproduced with successively fewer
bits per pixel. 11a is the original 24-bit image. 11b is reduced to 216 colours
representing six levels of red, green and blue. 11c uses five bits, providing three
levels of red, green and blue. This was included because some cheap displays use this
system. Finally, 11d shows the same image reduced to three bits, one each for red,
green and blue.
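For reference, one channel of this scan is shown below in C; the pixels are assumed to be presented in Geoff-scan order, and with 0..255 input and six output levels red_step would be 51. Names and the interface are ours.

/* One colour channel of the halftoning scan: accumulate input values and
 * emit the quantized dot value, carrying the remainder to the next pixel. */
void scan_channel(const unsigned char *in, unsigned char *out,
                  int npixels, int step)
{
    int acc = 0;
    for (int i = 0; i < npixels; i++) {
        acc += in[i];
        out[i] = (unsigned char)(acc / step);  /* red_dot_value      */
        acc %= step;                           /* red mod red_step   */
    }
}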

Figure 12 shows the same colour reductions done with a natural scene. It is
interesting to compare 11b and 12b with Fig. 13, where the images are produced by
reducing the number of colours without halftoning. Instead, the colours have been
sorted and a reduced set of 'representative' colours chosen from the palette. Because
of the large number of colours needed to antialias the edges, there are insufficient
shades left to permit smooth shading of the larger areas, so shade lines appear. The use
of Geoff scans for halftoning seems to give a better result for the areas without
compromising the antialiasing. Figure 13a uses 225 colours. Figure 13b uses 195
colours. It would be interesting to see whether the halftoning results could be
improved by using the sorted colours rather than a fixed palette, but it is not at all
clear how to do this.

Figure 14 shows the image of Fig. 11 processed with a Floyd-Steinberg scan as
described by Heckbert [1982]. Comparing this with Fig. 11b, we see some colour
banding on the back wall and other minor patterning.

Fig. 14. Floyd-Steinberg scan

5. Conclusion
A simplified space-filling curve, the Geoff scan, has been described. It appears to be
as good as the murray scan without restricting the size of grid.

Applied to halftoning, Geoff scans give excellent results. With artificial images,
however, there is still patterning over areas with certain grey levels.

The technique also gives us an algorithm to reproduce coloured images on displays


with relatively few bits per pixel. The results are better than we have seen from
colour sorting techniques or other dithering methods.

Acknowledgements
This research has been funded by the University of Otago. We are also grateful to
Television New Zealand Limited for loan of equipment.

References
Burger P, Gillies D (1989) Interactive Computer Graphics, Addison-Wesley.
Cole AJ (1990) Naive Halftoning, Proceedings of CGI '90, Springer-Verlag, 203-222.
Heckbert P (1982) Colour Image Quantization for Frame Buffer Display, Proceedings
of SIGGRAPH, Computer Graphics 16 (3): 297-307.
Knuth DE (1987) Digital Halftones by Dot Diffusion, ACM Transactions on
Graphics, 6 (4): 245-273.

Geoff Wyvill graduated in physics from Jesus College,
Oxford, and started working with computers as a research
technologist with the British Petroleum Company. He gained
MSc and PhD degrees in computer science from the University
of Bradford, where he lectured in computer science from 1969
until 1978. He is currently senior lecturer in computer science
at the University of Otago. He is on the editorial board of The
Visual Computer and is a member of SIGGRAPH, ACM, CGS
and NZCS.

Address: Department of Computer Science


University of Otago Box 56
Dunedin, New Zealand

e-mail: geoff@otago.ac.nz

Craig McNaughton is a graduate student at Otago University.
His research interests include constructive solid geometry and
computer animation. He completed his BSc degree in
computer science in 1989 and he is a student member of ACM.

Address: Department of Computer Science, University of Otago, Box 56,
Dunedin, New Zealand

e-mail: cosc546@otago.ac.nz
Chapter 5
Computational Geometry
A Theory of Geometric Contact for Computer
Aided Geometric Design of Parametric Curves
Chih Lee, Bahram Ravani, and An Tzu Yang

ABSTRACT

This paper develops a theory of contact for piecewise parametric curves based on the
differential geometry of evolutes, polar curves and binormal indicatrices. This theory is
completely geometric, independent of parametrization and generalizes to any order.
Two sets of dimensionless characteristic numbers describing the local geometry of a
curve up to the nth order are defined. These characteristic numbers can be used to
describe conditions for higher order contact in an algebraic fashion. The same
characteristic numbers can also be used to interpret contact conditions of up to nth order
in terms of the geometry of higher evolutes and binormal indicatrices. The resulting
geometric contact conditions are used to design piecewise parametric curves for
Computer Aided Geometric Design (CAGD) applications.

Key Words: Geometric Contact, Principal Evolute, Polar Curve, Binormal Indicatrix,
Geometric Continuity.

1. INTRODUCTION
In differential geometry, two curves are said to have contact of order n if they have n+1
consecutive points in common (see, for example, Kreyszig 1959; Struik 1950). In
CAGD, this theory of contact has been used for the design of smooth spline curves. The
actual differential geometry of contact for curves and surfaces was developed by
Lagrange (1797) and Cauchy (1826). It was, apparently, Geise who first used the theory
of contact to define smooth spline curves (see Boehm 1988a). Bezier (1970) introduced
the concepts of tangent and curvature continuity that gave geometric interpretation to
piecewise curves having geometric continuity of order one and two, respectively.
Manning (1974) and Nielson (1974) developed a theory for the design of spline curves
with curvature continuity. These works were refined and extended by several
researchers, including Barsky (1981, 1983), Boehm (1984, 1985, 1987, 1988a), Farin
(1982, 1985, 1988), Hagen (1986) and Herron (1987). The basic advantages of these
works are that they describe the continuity conditions in a geometric fashion and that
they are independent of parametrization. The disadvantage has been that, in their
present form, they do not easily generalize to high order contacts. Design of spline
curves with higher order geometric continuity has been considered, most extensively, by
DeRose and Barsky (1985a, 1985b, 1989, 1990). They have developed a powerful
approach using geometric shape parameters. Their approach, however, results in
contact conditions that are described algebraically, based on the use of the chain rule of
differentiation. Boehm (1988b) also develops higher order contact conditions
algebraically in terms of the end-point derivatives. Herron (1987) states that "the
geometric quantities which are independent of parametrization and embody the
information of the nth order derivatives are not known to exist in forms simple enough
to work with".

In this paper we develop geometric conditions for contact of spline curves. These
geometric conditions are not only independent of parametrization but also generalize to
any order. The theory is developed based on higher order curvature and torsional
properties of space curves, described in terms of the differential geometry of their polar
curves, principal evolutes and binormal indicatrices. The concept of evolutes
(French: développées) for space curves was introduced by Monge in 1785. Here we use
the concepts of polar curves and principal evolutes and develop a completely geometric
theory of contact for spline curves in a CAGD environment. In the much simpler case of
planar curves, a classical theory of contact based on higher order curvature properties
of the evolutes was developed in the past (Muller 1891). In that case a set of
dimensionless numbers, called curvature ratios, was used to characterize planar curves,
at a point, up to the nth order. A more recent account of the differential geometric
properties of evolutes of planar curves can be found in Guggenheimer (1963).
In this paper, we show that higher order curvature and torsional properties of a space
curve can be described in terms of the intrinsic geometry of its polar curves, principal
evolutes and binormal indicatrices. We introduce two sets of characteristic numbers
which are defined in terms of intrinsic properties of the polar curves and the binormal
indicatrices. We show that these dimensionless characteristic numbers describe the local
geometry of a general space curve up to the nth order. We will use these characteristic
numbers to relate geometric properties of the principal evolutes and binormal
indicatrices to the contact conditions of different orders. Based upon these numbers, we
use the tangent and curvature vectors of the principal evolutes and the torsions of the
binormal indicatrices to develop a theory of contact for spline curves which is not only
geometric and independent of the parametrization but also generalizes to any order. The
theory is illustrated by constructing $G^3$ spline curves.

In the following sections, we begin with a study of the differential geometry of polar
curves and their principal evolutes insofar as it is needed for the development of the
theory of higher order contact of space curves. This is given in section 2. The main new
contributions of this paper, which are the characteristic numbers and the geometric
contact conditions, are given in sections 3 and 4.

2. POLAR CURVES, PRINCIPAL EVOLUTES, AND BINORMAL INDICATRICES

In this section, we describe the concepts of polar curves and principal evolutes, and
review the definitions of the polar developable and the binormal indicatrices of space
curves (see Graustein 1935; Kreyszig 1959; and Struik 1950).

The osculating circle¹ of a point P on the curve C: x = x(S) is the circle which has three
points in common (contact) with C; the osculating sphere² is the sphere which has four
points in contact with the curve C. The center of the osculating circle is referred to as the
center of curvature at P. A polar axis is the axis which goes through the center of
curvature of the corresponding point and is parallel to the binormal of the curve.
Therefore, the center of the osculating sphere also lies on the polar axis of the curve.
The surface generated by these polar axes is called the polar developable of the curve.
Accordingly, every curve, other than a circle, has a unique polar developable on which
lie the locus of the center of curvature and the locus of the center of the osculating sphere.

The edge of regression of the polar developable is called the polar curve (Mitrinovic
1969) of the original curve. It has been shown that the polar curve is the locus of the
center of the osculating sphere for the space curve. On the polar developable also lies
the locus of the center of curvature, which is, here, referred to as the principal
evolute of the curve. Since the polar curve and the principal evolute are unique for any
space curve, they are useful for our purposes.
A space curve, not a straight line, has a unique principal evolute and a unique polar
curve. The polar curve itself has a unique principal evolute and a unique polar curve.
We shall refer to these as the second principal evolute and the second polar curve,
respectively, of the original curve (see Fig. 1). In this manner, we can define the nth
principal evolute and the nth polar curve of a space curve (see also Scheffers 1915; Loria
1902; Barner 1961).
[Figure: an original curve C together with its polar curve P1 and higher polar curves]

Fig. 1. The Concept of Higher Polar Curve


The binormal indicatrix (i.e. the spherical indicatrix of the binormal) of a curve is the
locus, lying on the surface of a unit sphere, defined by the binormal of the curve; it
characterizes the torsion of the original curve. In the case of planar curves, the
binormal indicatrix degenerates to a point.

¹,² The actual definitions of the osculating circle and osculating sphere can be found in standard
differential geometry books such as Kreyszig (1959) and Struik (1950).
The nth binormal indicatrix of the curve is defined similarly, as the binormal indicatrix
of the (n-1)th binormal indicatrix. This enables us to describe higher order intrinsic
properties of the original curve up to any order in terms of the geometric properties of
its successive principal evolutes, polar curves and binormal indicatrices. The
differential geometry of higher evolutes seems to have been developed in the German
literature (see, for example, Scheffers 1915; Loria 1902; Barner 1961). Here we study
them together with the differential geometry of higher order polar curves and binormal
indicatrices and cast them in a form suitable for the subsequent developments in this
paper.

2.1 The Differential Geometry of Polar Curves

Since the definitions of the nth principal evolute and nth polar curve are introduced here
based on the (n-1)th polar curve, we will study the differential geometry of the polar
curves and principal evolutes insofar as necessary for the subsequent development in
this paper.

2.1.1 The Polar Curve


It has been shown that the polar curve is the locus of the center of the osculating sphere
for a space curve. Therefore, the equation of the polar curve $P_1(S)$, which is the edge of
regression of the polar developable of a space curve (see Struik 1950; Kreyszig 1959), is
given as

$$ P_1 = x + \rho N + \frac{d\rho}{dS}\,\sigma B \qquad (2\text{-}1) $$

where the radius of curvature $\rho(S) = 1/\kappa(S)$, the radius of torsion $\sigma(S) = 1/\tau(S)$, the normal
vector $N(S)$ and the binormal vector $B(S)$ are all functions of the arc length S of the
space curve $x(S)$.

It should be noted that equation (2-1) is only applicable for space curves at points with
non-zero torsion, and that the polar curve equation for planar curves is not a proper
specialization of this equation.

We differentiate (2-1) with respect to S, the arc length of the space curve:

$$ \frac{dP_1}{dS} = \frac{dP_1}{dS_1}\,\frac{dS_1}{dS} = T_1\,\frac{dS_1}{dS} = \left(\frac{d^2\rho}{dS^2}\,\sigma + \frac{d\rho}{dS}\,\frac{d\sigma}{dS} + \frac{\rho}{\sigma}\right) B $$

where $S_1$ is the arc length of the polar curve.

We may measure $S_1$ so that the unit tangent

$$ T_1 = B \qquad (2\text{-}2) $$

and therefore

$$ \frac{dS_1}{dS} = \frac{d^2\rho}{dS^2}\,\sigma + \frac{d\rho}{dS}\,\frac{d\sigma}{dS} + \frac{\rho}{\sigma} $$

The curvature of $P_1$ can be found by differentiating the equation $T_1 = B$ with respect to
$S_1$:

$$ \frac{dT_1}{dS_1} = \kappa_1 N_1 = \frac{dB}{dS}\,\frac{dS}{dS_1} = -\tau N\,\frac{dS}{dS_1} $$

We let the principal normal $N_1$ be in the direction of $-N$, i.e.

$$ N_1 = -N \qquad (2\text{-}3) $$

and we have the curvature of $P_1$:

$$ \kappa_1 = \tau\,\frac{dS}{dS_1}, \quad\text{or the radius of curvature of } P_1 \text{ is } \rho_1 = \sigma\,\frac{dS_1}{dS} $$

Thus the binormal of $P_1$ is

$$ B_1 = T_1 \times N_1 = B \times (-N) = T \qquad (2\text{-}4) $$

The torsion of $P_1$ can be found by differentiating the binormal $B_1 = T$ with respect to $S_1$:

$$ \frac{dB_1}{dS_1} = -\tau_1 N_1 = \frac{dT}{dS}\,\frac{dS}{dS_1} = \kappa N\,\frac{dS}{dS_1} $$

Thus $\tau_1 = \kappa\,\dfrac{dS}{dS_1}$, or the radius of torsion of $P_1$ is $\sigma_1 = \rho\,\dfrac{dS_1}{dS}$.

From (2-2), (2-3) and (2-4), we have the Frenet trihedron of the polar curve:

$$ T_1 = B, \qquad N_1 = -N, \qquad B_1 = T $$
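
As a numerical check of these relations (our illustration, not part of the original development), consider a circular helix, for which $\rho$ and $\sigma$ are constant, so (2-1) reduces to $P_1 = x + \rho N$; a finite-difference computation confirms that the tangent of the polar curve is the binormal of the original curve:

```python
import numpy as np

a, b = 2.0, 1.0                          # helix radius and pitch parameters
c = np.hypot(a, b)                       # so the arc length is S = c * t

def helix(t):                            # x(S) with t = S / c
    return np.array([a*np.cos(t), a*np.sin(t), b*t])

def frenet(t):
    T = np.array([-a*np.sin(t), a*np.cos(t), b]) / c
    N = np.array([-np.cos(t), -np.sin(t), 0.0])
    return T, N, np.cross(T, N)

rho = c*c / a                            # 1/kappa, constant along a helix
def polar_curve(t):                      # (2-1) with d(rho)/dS = 0
    return helix(t) + rho * frenet(t)[1]

t, h = 0.7, 1e-5
T1 = polar_curve(t + h) - polar_curve(t - h)   # central difference tangent
T1 /= np.linalg.norm(T1)
print(np.allclose(T1, frenet(t)[2], atol=1e-6))   # True: T_1 = B, eq. (2-2)
```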
2.1.2 The Principal Evolute
The equation of the principal evolute $E_1$, the locus of the center of curvature, is given as

$$ E_1 = x + \rho N \qquad (2\text{-}5) $$

Note that the principal evolute is, in general, a space curve, since x and N are vectors in
space.

We differentiate (2-5) with respect to S, the arc length of the curve C:

$$ \frac{dE_1}{dS} = \frac{de_1}{dS}\,T_{e1} = \frac{d\rho}{dS}\,N + \frac{\rho}{\sigma}\,B \qquad (2\text{-}6) $$

where $e_1$ is the arc length of the principal evolute.

It follows that the unit tangent of the principal evolute lies in the normal plane of the
original curve, and hence cuts the original curve orthogonally. If the tangent is inclined
to the principal normal N at an angle $\theta$, then

$$ T_{e1} = \cos\theta\,N + \sin\theta\,B \qquad (2\text{-}7) $$

where $\cos\theta = \dfrac{d\rho}{dS}\,\dfrac{\sigma}{r}$, $\sin\theta = \dfrac{\rho}{r}$,

$$ \frac{de_1}{dS} = \sqrt{\left(\frac{d\rho}{dS}\right)^2 + \left(\frac{\rho}{\sigma}\right)^2} = \frac{r}{\sigma} \qquad (2\text{-}8) $$

and $r = \sqrt{\left(\dfrac{d\rho}{dS}\,\sigma\right)^2 + \rho^2}$ is the radius of the osculating sphere of the curve C.

[Figure]

Fig. 2. The curve C, $E_1$ (principal evolute) and the angle $\theta$

To find the curvature, we differentiate (2-7) with respect to $e_1$:

$$ \kappa_{e1} N_{e1} = \frac{dT_{e1}}{de_1} = \frac{dS}{de_1}\left(-\kappa\cos\theta\,T - \left(\frac{d\theta}{dS} + \frac{1}{\sigma}\right)\sin\theta\,N + \left(\frac{d\theta}{dS} + \frac{1}{\sigma}\right)\cos\theta\,B\right) \qquad (2\text{-}9) $$

We have the curvature of $E_1$:

$$ \kappa_{e1} = \frac{dS}{de_1}\,\sqrt{(\kappa\cos\theta)^2 + \omega^2} \qquad (2\text{-}10) $$

where

$$ \omega = \frac{d\theta}{dS} + \frac{1}{\sigma} = \frac{2r^2 - \rho\rho_1}{\sigma r^2} $$

and $\rho_1$ is the radius of curvature of the polar curve $P_1$.

It should be noted that for plane curves $\theta = 0$, $\omega = 0$, and therefore $\kappa_{e1} = \kappa\,\dfrac{dS}{de_1}$.

From (2-9) and (2-10), we have the normal of the principal evolute:

$$ N_{e1} = \frac{1}{\sqrt{(\kappa\cos\theta)^2 + \omega^2}}\,\left(-\kappa\cos\theta\,T - \omega\sin\theta\,N + \omega\cos\theta\,B\right) $$
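
The orthogonality noted after (2-6) is easy to verify numerically. The following sketch (ours; the twisted cubic is an arbitrary test curve) builds the principal evolute (2-5) from parametric derivatives and checks that its tangent lies in the normal plane of C:

```python
import numpy as np

def curve(t):                                   # a twisted cubic: varying curvature
    return np.array([t, t*t, t**3])

def d1(f, t, h=1e-3):
    return (f(t + h) - f(t - h)) / (2.0*h)

def evolute(t, h=1e-4):
    """Principal evolute point E1 = x + rho*N, eq. (2-5), from parametric data."""
    x1 = d1(curve, t, 1e-5)
    x2 = (curve(t + h) - 2*curve(t) + curve(t - h)) / h**2
    v = np.linalg.norm(x1)
    k = x2 / v**2 - x1 * np.dot(x1, x2) / v**4   # curvature vector kappa*N
    return curve(t) + k / np.dot(k, k)           # k/|k|^2 = N/kappa = rho*N

t = 0.4
Te1 = d1(evolute, t); Te1 /= np.linalg.norm(Te1)
T = d1(curve, t, 1e-5); T /= np.linalg.norm(T)
# Up to finite-difference noise, T_e1 is orthogonal to T (it lies in the normal plane)
print(abs(np.dot(Te1, T)) < 1e-3)   # True
```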

2.1.3 The Second Polar Curve

From the definition of the nth polar curve, we have the equation of the second polar
curve:

$$ P_2 = P_1 + \rho_1 N_1 + \frac{d\rho_1}{dS_1}\,\sigma_1 B_1 $$

Similarly to section 2.1.1, we obtain the curvature of $P_2$, $\kappa_2 = \kappa\,\dfrac{dS}{dS_2}$, and the torsion
of $P_2$, $\tau_2 = \tau\,\dfrac{dS}{dS_2}$, or the radius of torsion $\sigma_2 = \sigma\,\dfrac{dS_2}{dS}$.

The Frenet trihedron of the second polar curve is

$$ T_2 = T, \qquad N_2 = N, \qquad B_2 = B $$

2.1.4 The Second Principal Evolute


The equation of the second principal evolute $E_2$ is given as

$$ E_2 = P_1 + \rho_1 N_1 $$

The tangent of $E_2$ is

$$ T_{e2} = \frac{dE_2}{de_2} = \frac{dS_1}{de_2}\,\frac{dE_2}{dS_1} = \frac{dS_1}{de_2}\left(\frac{d\rho_1}{dS_1}\,N_1 + \frac{\rho_1}{\sigma_1}\,B_1\right) $$

where $e_2$ is the arc length of the second principal evolute, $\dfrac{de_2}{dS_1} = \dfrac{r_1}{\sigma_1}$, and
$r_1 = \sqrt{\left(\dfrac{d\rho_1}{dS_1}\,\sigma_1\right)^2 + \rho_1^2}$ is the radius of the osculating sphere of the curve $P_1$.

The curvature of $E_2$ is

$$ \kappa_{e2} = \frac{dS_1}{de_2}\,\sqrt{(\kappa_1\cos\theta_1)^2 + \omega_1^2}, \qquad \omega_1 = \frac{2r_1^2 - \rho_1\rho_2}{\sigma_1 r_1^2} $$

where $\rho_2$ is the radius of curvature of the second polar curve.
2.1.5 The nth Polar Curve


The same procedure can now be applied to higher order polar curves. We observe that
the nth polar curve has the Frenet trihedron {B, -N, T} if n is an odd number and
{T, N, B} if n is an even number. The expressions derived for the Frenet trihedrons of
the polar curves can be generalized in matrix form:

$$ \begin{bmatrix} T_n \\ N_n \\ B_n \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} T_{n-1} \\ N_{n-1} \\ B_{n-1} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{bmatrix}^{\,n} \begin{bmatrix} T \\ N \\ B \end{bmatrix} $$

The geometric properties of the nth polar curve are summarized in the following.

The equation of the nth polar curve $P_n$ is

$$ P_n = P_{n-1} + \rho_{n-1} N_{n-1} + \frac{d\rho_{n-1}}{dS_{n-1}}\,\sigma_{n-1} B_{n-1} $$

and

$$ \frac{dS_n}{dS_{n-1}} = \frac{d^2\rho_{n-1}}{dS_{n-1}^2}\,\sigma_{n-1} + \frac{d\rho_{n-1}}{dS_{n-1}}\,\frac{d\sigma_{n-1}}{dS_{n-1}} + \frac{\rho_{n-1}}{\sigma_{n-1}} \qquad (2\text{-}11) $$

where $S_n$ is the arc length of the nth polar curve and

$$ \rho_n = \begin{cases} \rho\,\dfrac{dS_n}{dS} & \text{if } n \text{ is an even number} \\[2mm] \sigma\,\dfrac{dS_n}{dS} & \text{if } n \text{ is an odd number} \end{cases} \qquad (2\text{-}12) $$

$$ \sigma_n\,\sigma_{n-1} = \rho_n\,\rho_{n-1} $$

2.1.6 The nth Principal Evolute
We follow the same procedure as in the previous section and generalize the results for
the nth principal evolute. The equation of the nth principal evolute $E_n$ is

$$ E_n = P_{n-1} + \rho_{n-1} N_{n-1} $$

The unit tangent is

$$ T_{e(n)} = \frac{dS_{n-1}}{de_n}\left(\frac{d\rho_{n-1}}{dS_{n-1}}\,N_{n-1} + \frac{\rho_{n-1}}{\sigma_{n-1}}\,B_{n-1}\right), \qquad \frac{de_n}{dS_{n-1}} = \frac{r_{n-1}}{\sigma_{n-1}} $$

The curvature of $E_n$ is

$$ \kappa_{e(n)} = \frac{dS_{n-1}}{de_n}\,\sqrt{\left(\frac{d\rho_{n-1}}{dS_{n-1}}\,\frac{\kappa_{n-1}\sigma_{n-1}}{r_{n-1}}\right)^2 + \omega_{n-1}^2} $$

2.2. The Spherical Indicatrices of the Binormal
2.2. The Spherical Indicatrices of the Binormal


The binormal indicatrix, in general, is a spherical curve which gives information about
the orientation of the binormal of a curve. To describe higher order torsion properties
of a space curve, we apply a concept similar to that of the nth polar curve. That is, the
binormal indicatrix itself has a unique binormal indicatrix, and we call it the second
binormal indicatrix of the curve (see Fig. 3).

[Figure]

Fig. 3. The Binormal Indicatrices

In this manner, we can define the nth binormal indicatrix of the space curve, which lets
us describe higher order torsion of the curve up to any order in terms of the geometric
properties of its successive binormal indicatrices. For a planar curve, the binormal
indicatrix degenerates into a point with zero torsion. Here, we study the differential
geometry of the binormal indicatrices to describe the higher torsional properties of a
space curve.
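
For a concrete picture (our sketch, continuing the helix example in self-contained form), the binormal indicatrix of a circular helix is a circle on the unit sphere, of constant height a/c and radius b/c; as b goes to 0 and the curve flattens, the circle shrinks to a point:

```python
import numpy as np

a, b = 2.0, 1.0
c = np.hypot(a, b)

def binormal(t):
    """Binormal of the circular helix (a*cos t, a*sin t, b*t)."""
    T = np.array([-a*np.sin(t), a*np.cos(t), b]) / c
    N = np.array([-np.cos(t), -np.sin(t), 0.0])
    return np.cross(T, N)

ts = np.linspace(0.0, 2*np.pi, 9)
Bs = np.array([binormal(t) for t in ts])
print(np.allclose(np.linalg.norm(Bs, axis=1), 1.0))  # lies on the unit sphere
print(np.allclose(Bs[:, 2], a / c))                  # constant height: a circle
```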

The equation of the binormal indicatrix is

$$ E_{s1} = R_{s1} = B $$

where $R_{s1}$ is a unit vector in the direction from the center O of the sphere to a point of
the curve $E_{s1}$.

The unit tangent of the binormal indicatrix is

$$ T_{s1} = \frac{dE_{s1}}{dl_1} = \frac{dR_{s1}}{dl_1} = \frac{dB}{dl_1} = \frac{dB}{dS}\,\frac{dS}{dl_1} = (-\tau N)\,\frac{dS}{dl_1} $$

where $l_1$ is the arc length of $E_{s1}$.

We may measure $l_1$ so that $T_{s1} = -N$ and $\dfrac{dl_1}{dS} = \tau$.

The vector field $G_{s1} = R_{s1} \times T_{s1} = T$ is called the central normal of the binormal
indicatrix.

Thus, we have the relationship between the geodesic trihedron $\{R_{s1}, T_{s1}, G_{s1}\}$ of $E_{s1}$
and the natural trihedron $\{T, N, B\}$ of the original curve. That is,

$$ \begin{bmatrix} R_{s1} \\ T_{s1} \\ G_{s1} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} T \\ N \\ B \end{bmatrix} $$

The natural curvature $\kappa_{s1}$ and geodesic curvature $g_1$ of $E_{s1}$ can be found by
differentiating $T_{s1}$ with respect to $l_1$:

$$ \frac{dT_{s1}}{dl_1} = -\frac{dN}{dS}\,\frac{dS}{dl_1} = -\frac{1}{\tau}\,(-\kappa T + \tau B) = \frac{\kappa}{\tau}\,T - B = -R_{s1} + g_1 G_{s1} = \kappa_{s1} N_{s1} \qquad (2\text{-}13) $$

Hence, we have

$$ \kappa_{s1} = \sqrt{1 + g_1^2} \qquad (2\text{-}14) $$

and

$$ N_{s1} = \frac{1}{\sqrt{1 + g_1^2}}\,\left(-R_{s1} + g_1 G_{s1}\right) $$

where $g_1 = \dfrac{\kappa}{\tau}$ is the geodesic curvature of $E_{s1}$.

The relative orientation of the natural trihedron and the geodesic trihedron of the
binormal indicatrix is defined by the angle $\varphi_1$ between the unit vector $R_{s1}$ and the
binormal $B_{s1}$. It is measured by rotating about the common axis $T_{s1}$ of the two
trihedrons in the negative sense, so that

$$ \begin{bmatrix} T_{s1} \\ N_{s1} \\ B_{s1} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ -S_1 & 0 & C_1 \\ C_1 & 0 & S_1 \end{bmatrix} \begin{bmatrix} R_{s1} \\ T_{s1} \\ G_{s1} \end{bmatrix} $$

where $S_1 = \sin\varphi_1$, $C_1 = \cos\varphi_1$ and $\varphi_1$ is the angle between $R_{s1}$ and $B_{s1}$.

The well known Frenet formulas are

$$ \frac{d}{dl_1}\begin{bmatrix} T_{s1} \\ N_{s1} \\ B_{s1} \end{bmatrix} = \begin{bmatrix} 0 & \kappa_{s1} & 0 \\ -\kappa_{s1} & 0 & \tau_{s1} \\ 0 & -\tau_{s1} & 0 \end{bmatrix} \begin{bmatrix} T_{s1} \\ N_{s1} \\ B_{s1} \end{bmatrix} $$

where $\kappa_{s1}$ and

$$ \tau_{s1} = \frac{\tau\,\dfrac{d\kappa}{dS} - \kappa\,\dfrac{d\tau}{dS}}{\tau\,(\kappa^2 + \tau^2)} $$

are the curvature and torsion of $E_{s1}$, respectively.

From (2-13) and the differentiation of $G_{s1}$, we obtain the geodesic Frenet formulas:

$$ \frac{d}{dl_1}\begin{bmatrix} R_{s1} \\ T_{s1} \\ G_{s1} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & g_1 \\ 0 & -g_1 & 0 \end{bmatrix} \begin{bmatrix} R_{s1} \\ T_{s1} \\ G_{s1} \end{bmatrix} $$

where $g_1 = \cot\varphi_1 = \dfrac{\kappa}{\tau}$.

Note that the torsion of $R_{s1}$ can be written in terms of its geodesic curvature (see Lee
1991). That is,

$$ \tau_{s1} = \frac{1}{\kappa_{s1}^2}\,\frac{dg_1}{dl_1}, \qquad \frac{dg_1}{dl_1} = \frac{dS}{dl_1}\,\frac{dg_1}{dS} = \sigma\,\frac{d}{dS}\!\left(\frac{\sigma}{\rho}\right) $$
The nth Binormal Indicatrix
Similarly, for the nth binormal indicatrix, we have the following:

$$ \frac{d}{dl_n}\begin{bmatrix} R_{s(n)} \\ T_{s(n)} \\ G_{s(n)} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & g_n \\ 0 & -g_n & 0 \end{bmatrix} \begin{bmatrix} R_{s(n)} \\ T_{s(n)} \\ G_{s(n)} \end{bmatrix} $$

$$ \begin{bmatrix} R_{s(n)} \\ T_{s(n)} \\ G_{s(n)} \end{bmatrix} = \begin{bmatrix} C_{n-1} & 0 & S_{n-1} \\ S_{n-1} & 0 & -C_{n-1} \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} R_{s(n-1)} \\ T_{s(n-1)} \\ G_{s(n-1)} \end{bmatrix} $$

$$ dl_n = \tau_{s(n-1)}\,dl_{n-1} = -d\varphi_{n-1} \qquad (2\text{-}15) $$

$$ \kappa_{s(n)} = \sqrt{g_n^2 + 1} $$

$$ \tau_{s(n)} = -\frac{d\varphi_n}{dl_n} = \frac{1}{g_n^2 + 1}\,\frac{dg_n}{dl_n} \qquad (2\text{-}16) $$

where $g_n = \cot\varphi_n$ is the geodesic curvature of the nth binormal indicatrix.

3. THE CHARACTERISTIC NUMBERS


In this section, we define two sets of dimensionless characteristic numbers, called
curvature ratios and torsion ratios, that carry the local geometric properties of space
curves up to the nth order. The definitions of the curvature ratios are based on the
curvature of the higher polar curves, and the definitions of the torsion ratios are based
on the torsion of the higher binormal indicatrices. These numbers are dimensionless,
geometric and generalize to any order. We will show that a space curve can be
characterized intrinsically up to the nth order, at a point, using these dimensionless
numbers.

3.1 The Curvature Ratios

We define the first set of characteristic numbers as follows:

$$ \lambda_i = \frac{\rho_i}{\rho} \quad\text{and}\quad \lambda_{i+1} = \frac{d\rho_i}{dS}, \qquad i = 0, 1, 2, \ldots, n $$

where $\rho_i$ is the radius of curvature of the ith polar curve and $\rho_0 = \rho$.

The explicit expressions of the $\lambda_n$'s can be obtained from equations (2-11) and (2-12);
examples are given in (3-1) and (3-2).

Since the first set of characteristic numbers is based on the curvature of the polar
curves, we call them the curvature ratios of the curve. This definition is analogous to
the curvature ratios of plane curves. In the case of plane curves

$$ \lambda_{i+1} = \frac{d\rho_i}{dS} = \frac{\rho_{i+1}}{\rho} $$

which are the same curvature ratios as used by Muller (1891) and Freudenstein
(1965).
3.2 The Torsion Ratios
In addition to the curvature ratios, we introduce a second set of characteristic numbers
to describe a general space curve with non-vanishing torsion. They are based on the
torsion of the higher binormal indicatrices of the original curve, namely:

$$ \mu_i = \rho\,\frac{dl_i}{dS}, \qquad i = 1, 2, \ldots, n $$

where $l_i$ is the arc length and $\tau_{s(i)}$ is the torsion of the ith binormal indicatrix.
From equations (2-15) and (2-16), the explicit expressions of the second set of
characteristic numbers can, therefore, be written as:

$$ \mu_1 = \rho\,\tau = \frac{\rho}{\sigma} \qquad\text{for } i = 1 \qquad (3\text{-}3) $$

$$ \mu_2 = \rho\,\tau\,\tau_{s1} \qquad\text{for } i = 2 \qquad (3\text{-}4) $$

$$ \mu_n = \rho\,\tau_{s(n-1)}\cdots\tau_{s1}\,\tau = -\rho\,\frac{d\gamma_{n-1}}{dS}\,\frac{1}{\gamma_{n-1}^2 + 1} \qquad\text{for } i = n $$

where $\gamma_i = 1/g_i$ and $\tau_{s(i)}$ are the radius of geodesic curvature and the torsion of the
ith binormal indicatrix, respectively.

We see from the above explicit expressions of the second set of characteristic numbers
that they characterize the torsion properties of the binormal indicatrices up to any order.
In the case of plane curves, the second set of characteristic numbers vanishes, since
the torsion is zero. Since the second set of characteristic numbers is based on the
torsion of the nth binormal indicatrices of the space curve, we call them the torsion
ratios of the curve.

The two sets of dimensionless numbers can now be used to characterize a general space
curve up to any order. This can be seen from the Taylor series expansion of the curve C:
$x = f(x_1(S), x_2(S), x_3(S))$ in the neighborhood of S = 0. That is,

$$ C = f(S) = f(0) + S\,f'(0) + \frac{S^2}{2}\,f''(0) + \frac{S^3}{6}\,f'''(0) + \cdots \qquad (3\text{-}5) $$

But

$$ f'(S) = \frac{dx}{dS} = T, \qquad f''(S) = \frac{d^2x}{dS^2} = \frac{1}{\rho}\,N, \qquad f'''(S) = \frac{d^3x}{dS^3} = -\frac{1}{\rho^2}\,T - \frac{1}{\rho^2}\,\frac{d\rho}{dS}\,N + \frac{1}{\rho\sigma}\,B $$

At S = 0, f(0) = 0; then equation (3-5) becomes

$$ x_1(S)\,T + x_2(S)\,N + x_3(S)\,B = S\,T + \frac{S^2}{2}\,\frac{1}{\rho}\,N + \frac{S^3}{6}\left(-\frac{1}{\rho^2}\,T - \frac{1}{\rho^2}\,\frac{d\rho}{dS}\,N + \frac{1}{\rho\sigma}\,B\right) + \cdots $$

Using the Frenet trihedron {T, N, B}, we rewrite the above equation, up to fourth
order, in terms of $\rho$ and $\sigma$; this gives equation (3-6). Substituting the characteristic
numbers into equation (3-6), we obtain (3-7).

4. THEORY OF CONTACT
Theories of contact for curves and surfaces have been discussed in differential geometry
(Kreyszig 1959; Struik 1950). Such theories have also been used for the definition of
smoothness of curves in CAGD since 1962 (see Boehm 1988a, 1988b). In this section we
utilize the differential geometric results developed in the previous sections to develop a
theory of contact for spline curves. We will show that higher order contact conditions
of up to nth order can be described either algebraically, in terms of the dimensionless
characteristic numbers, or geometrically, in terms of the principal evolutes and the
binormal indicatrices. This theory is geometric, independent of parametrization and
generalizes to any order. We first start with the well known mathematical definition of
contact from differential geometry.

Definition:
Let $x(s) = (x_1(s), x_2(s), x_3(s))$, s being its arc length, be a spline curve composed of
several segments in $R^3$. Let $x_-$ denote the right endpoint of a segment and $x_+$ denote the
left endpoint of the adjacent segment on x(s). The spline curve x(s) has contact of order
n (exactly) at $x_+$ and $x_-$ if

$$ x_+(s) = x_-(s), \qquad \frac{d^m x_+}{ds_+^m} = \frac{d^m x_-}{ds_-^m}, \quad m = 1, 2, \ldots, n $$

and if also the derivatives of order n+1 at $x_+$ and $x_-$ exist but

$$ \frac{d^{n+1} x_+}{ds_+^{n+1}} \neq \frac{d^{n+1} x_-}{ds_-^{n+1}} $$

Remark: From the above definition, we say that a spline curve x(s) has nth order
geometric continuity ($G^n$) if all the curve segments of x(s) have contact of order n
(exactly) at the joints. We can also say that x(s) has contact of at least order n if we
remove the inequality condition.

Now we show how the contact conditions can be written geometrically using the
tangents and curvature vectors of the principal evolutes and the torsions of the binormal
indicatrices. The results up to the third order are well known in the CAGD literature.

4.3.1 First Order Contact

The definition of contact gives two conditions for n = 1:

$$ x_+ = x_- \qquad (4\text{-}1) $$

$$ T_+ = T_- \qquad (4\text{-}2) $$

We say that a spline curve has contact of order one if it is position continuous and unit
tangent continuous.

4.3.2 Second Order Contact

For n = 2, the definition gives an additional condition to (4-1) and (4-2), which is

$$ \kappa_+ N_+ = \kappa_- N_- \qquad (4\text{-}3) $$

We say that a spline curve has contact of order two if it is position, tangent and curvature
vector continuous. Note that the conditions for second order contact also imply Frenet
frame continuity.
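
These first and second order conditions can be checked numerically at a joint. The sketch below (ours; the control points are arbitrary) evaluates two cubic Bezier segments at their common endpoint and compares the parametrization-independent quantities:

```python
import numpy as np

def cubic_bezier(P, t):
    """Point and first two parametric derivatives of a cubic Bezier segment."""
    P = np.asarray(P, float)
    B = (1-t)**3*P[0] + 3*(1-t)**2*t*P[1] + 3*(1-t)*t**2*P[2] + t**3*P[3]
    D1 = 3*((1-t)**2*(P[1]-P[0]) + 2*(1-t)*t*(P[2]-P[1]) + t**2*(P[3]-P[2]))
    D2 = 6*((1-t)*(P[2]-2*P[1]+P[0]) + t*(P[3]-2*P[2]+P[1]))
    return B, D1, D2

def invariants(D1, D2):
    """Unit tangent T and curvature vector dT/ds, both parametrization-free."""
    v = np.linalg.norm(D1)
    T = D1 / v
    k = D2 / v**2 - D1 * np.dot(D1, D2) / v**4
    return T, k

left  = [[0,0,0], [1,0,0], [2,1,0], [3,1,1]]
right = [[3,1,1], [4,1,2], [5,0,2], [6,0,1]]   # first leg parallel to left's last leg
pL, d1L, d2L = cubic_bezier(left, 1.0)
pR, d1R, d2R = cubic_bezier(right, 0.0)
TL, kL = invariants(d1L, d2L)
TR, kR = invariants(d1R, d2R)
print(np.allclose(pL, pR), np.allclose(TL, TR))  # True True: (4-1) and (4-2) hold
print(np.allclose(kL, kR))                       # False: (4-3) fails, so order one only
```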

4.3.3 Third Order Contact

For n = 3 the definition of contact gives an additional condition to those of second order
contact, which is

$$ \frac{d^3 x_+}{ds_+^3} = \frac{d^3 x_-}{ds_-^3} \qquad (4\text{-}4) $$

From (3-7), equation (4-4) is equivalent to

$$ \lambda_{1+} = \lambda_{1-} \quad\text{and}\quad \mu_{1+} = \mu_{1-} \qquad (4\text{-}5) $$

These are the algebraic conditions characterizing contact of the third order in terms of
the dimensionless characteristic numbers $\lambda$ and $\mu$. We can also observe from equations
(2-7) and (2-8) that the tangent of the principal evolute can be written as

$$ T_{e1} = \frac{1}{\sqrt{\mu_1^2 + \lambda_1^2}}\,(\lambda_1 N + \mu_1 B) $$

With the Frenet frame continuous, the condition (4-5) is equivalent to

$$ T_{e1+} = T_{e1-} \quad\text{and}\quad \tau_+ = \tau_- \qquad (4\text{-}6) $$

Therefore, a spline curve has contact of order three if, in addition to the conditions of
second order contact, the tangent of the principal evolute and the torsion of the original
curve are continuous, or, algebraically, if the curvature ratio $\lambda_1$ and the torsion ratio $\mu_1$
are the same for the two curves at the point of contact.
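
Since $\lambda_1 = d\rho/dS$ and $\mu_1 = \rho/\sigma$ are intrinsic, they can be estimated for any parametric curve by finite differences, which gives a practical way to test (4-5) at a joint. A minimal sketch (ours), checked against the circular helix, where $\lambda_1 = 0$ and $\mu_1 = b/a$:

```python
import numpy as np

a, b = 2.0, 1.0
def helix(t):
    return np.array([a*np.cos(t), a*np.sin(t), b*t])

def rho_sigma(x, t, h=1e-3):
    """Radii of curvature and torsion of a parametric curve, by differences."""
    d1 = (x(t+h) - x(t-h)) / (2*h)
    d2 = (x(t+h) - 2*x(t) + x(t-h)) / h**2
    d3 = (x(t+2*h) - 2*x(t+h) + 2*x(t-h) - x(t-2*h)) / (2*h**3)
    c = np.cross(d1, d2)
    kappa = np.linalg.norm(c) / np.linalg.norm(d1)**3
    tau = np.dot(c, d3) / np.dot(c, c)
    return 1.0/kappa, 1.0/tau

def lam1_mu1(x, t, h=1e-3):
    """lambda_1 = d(rho)/dS and mu_1 = rho/sigma at x(t)."""
    rp, _ = rho_sigma(x, t + h)
    rm, _ = rho_sigma(x, t - h)
    rho, sigma = rho_sigma(x, t)
    dS = np.linalg.norm(x(t+h) - x(t-h))   # arc length of the 2h step
    return (rp - rm) / dS, rho / sigma

lam1, mu1 = lam1_mu1(helix, 0.3)
# For a helix rho is constant, so lambda_1 = 0, and mu_1 = rho/sigma = b/a = 0.5
print(abs(lam1) < 1e-4, abs(mu1 - b/a) < 1e-4)   # True True
```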

4.3.4 Fourth Order Contact

For n = 4 the additional condition is

$$ \frac{d^4 x_+}{ds_+^4} = \frac{d^4 x_-}{ds_-^4} \qquad (4\text{-}7) $$

From (3-7), the condition (4-7) can be written in terms of the intrinsic characteristic
numbers as follows:

$$ \lambda_{2+} = \lambda_{2-} \qquad (4\text{-}8) $$

and

$$ \mu_{2+} = \mu_{2-} \qquad (4\text{-}9) $$

From (2-9), (2-10) and (3-2) we know that the curvature vector of the principal evolute,
in terms of the characteristic numbers, is

$$ \kappa_{e1} N_{e1} = \frac{1}{\delta}\,\left(-\kappa\lambda_1\,T - \omega\mu_1\,N + \omega\lambda_1\,B\right) $$

where $\delta = \mu_1^2 + \lambda_1^2$ and $\omega$ is obtained from (2-10) and (3-2).

With the Frenet frame and $\lambda_1$, $\mu_1$ continuous, the condition (4-8) is equivalent to

$$ \kappa_{e1+} N_{e1+} = \kappa_{e1-} N_{e1-} $$

and, with (3-4) and (4-6), the condition (4-9) is equivalent to

$$ \tau_{s1+} = \tau_{s1-} $$

Accordingly, a spline curve has contact of order four if the curvature vector of the
principal evolute and the torsion of the binormal indicatrix are continuous, or,
algebraically, if the curvature ratio $\lambda_2$ and the torsion ratio $\mu_2$ are the same for the two
curves at the point of contact.
4.3.5 Contact of Order n
From the above results, we can now generalize the conditions for nth order contact
(n ≥ 5) of spline curves (in addition to the conditions of the (n-1)th order contact). A
spline curve has contact of order n if

(a) the tangent of the $\frac{n-1}{2}$th principal evolute and the torsion of the (n-3)th binormal
indicatrix are continuous, for odd values of n;

(b) the curvature vector of the $\frac{n-2}{2}$th principal evolute and the torsion of the (n-3)th
binormal indicatrix are continuous, for even values of n.

The algebraic conditions in terms of the characteristic numbers are

$$ \lambda_{(n-2)+} = \lambda_{(n-2)-} \quad\text{and}\quad \mu_{(n-2)+} = \mu_{(n-2)-} $$
4.3.6 Design of Quartic Spline Curves


The geometric theory of contact developed so far can be used (at least in theory) to
design parametric spline curves with any order of continuity. Here we will use the
theory to construct a $G^3$ quartic spline curve. The result is shown in Fig. 4. Note that
the circled points (extremes) on the principal evolute show the relative maximum or
minimum curvature of the corresponding curve. The cusps, the sharp points of the
curve, appear as singular points on the principal evolute. The segments of the principal
evolute between two adjacent extremes correspond to strictly monotone curvature of the
curve, and they are tangent continuous (see Guggenheimer 1963). If we design a quartic
spline curve with only torsion continuity (which is a more relaxed contact condition),
then the tangent of the principal evolute is not necessarily continuous. This is shown in
Fig. 5.

[Figure: a quartic spline curve with its control polygon and its principal evolute; joints and extremes are marked on the evolute]

Fig. 4. $G^3$ spline curve with tangent of principal evolute continuity

r-" sp~":~~~"""'"""",: ~::me


,
f

,,
f

, f

f
f
¥
,f

,,
'" """""""" "" ,,,
\ ,,
,,
')(
Principal Evolute ,/' Control
___ ---X Polygon

Fig. 5 A quartic spline curve with torsion continuity and


discontinuity on the tangent of principal evolute

5. CONCLUSIONS
In this paper, we have developed a new theory for geometric continuity in CAGD that is
independent of parametrization and generalizes to any order. This theory is based on
geometric conditions on the higher evolutes, polar curves and binormal indicatrices of a
space curve. A set of characteristic numbers is introduced that captures the differential
properties of these curves in a dimensionless fashion. These numbers are then used to
derive geometric conditions for contact of up to the nth order.

Acknowledgement
The financial support of this work by the National Science Foundation under grant
number DMC-8796348 is gratefully acknowledged.
REFERENCES
Barner M (1961) Deuxième Colloque de géométrie différentielle, tenu à Liège les 19,
20 et. (Organisé par le Centre belge de recherches mathématiques), Librairie
universitaire, Louvain, 1962, pp. 29-44

Barsky BA (1981) The β-spline: A Local Representation Based on Shape Parameters and
on Fundamental Geometric Measures, Ph.D. dissertation, Univ. Utah, Salt Lake City, UT

Barsky BA, Beatty JC (1983) "Local Control of Bias and Tension in Beta-splines", ACM
Trans. on Graphics 2, April, pp. 109-134

Barsky BA, DeRose TD (1989) "Geometric Continuity of Parametric Curves: Three
Equivalent Characterizations", IEEE CG&A, Vol. 9, No. 6, Nov., pp. 60-68

Barsky BA, DeRose TD (1990) "Geometric Continuity of Parametric Curves:
Constructions of Geometrically Continuous Splines", IEEE CG&A, Vol. 10, No. 1,
Jan., pp. 60-68

Bezier PE (1970) Emploi des machines à commande numérique, Masson, Paris,
France. Translated by Forrest AR and Pankhurst AF as Numerical Control -
Mathematics and Applications, John Wiley, London, 1972

Boehm W, Farin G, Kahmann J (1984) "A Survey of Curve and Surface Methods in
CAGD", Computer Aided Geometric Design, 1, pp. 1-60

Boehm W (1985) "Curvature Continuous Curves and Surfaces", Computer Aided
Geometric Design, 2, pp. 313-323

Boehm W (1987) "Smooth Curves and Surfaces", Geometric Modeling: Algorithms and
New Trends, edited by G. Farin, SIAM, Philadelphia, PA

Boehm W (1988a) "Visual Continuity", Computer Aided Design, Vol. 20, No. 5, pp.
307-311

Boehm W (1988b) "On the Definition of Geometric Continuity", Computer Aided
Design, Vol. 20, No. 7, pp. 370-372

Cauchy AL (1826) Leçons sur les applications du calcul infinitésimal à la géométrie,
Paris, France

DeRose TD (1985a) Geometric Continuity: A Parametrization Independent Measure of
Continuity for Computer Aided Geometric Design, Ph.D. dissertation, University of
California, Berkeley, CA

DeRose TD, Barsky BA (1985b) "An Intuitive Approach to Geometric Continuity for
Parametric Curves and Surfaces", Computer Generated Images - The State of the Art,
edited by Magnenat-Thalmann N and Thalmann D, Springer-Verlag, Heidelberg,
pp. 159-175

Farin G (1982) "Visually C2 cubic splines", Computer Aided Design, Vol. 14, pp. 137-139

Farin G (1985) "Some Remarks on V2-splines", Computer Aided Geometric Design 2,
pp. 325-328

Farin G (1988) Curves and Surfaces for Computer Aided Geometric Design, Academic
Press

Freudenstein F (1965) "Higher Path-Curvature Analysis in Plane Kinematics", Journal
of Engineering for Industry, Transactions of ASME, Series B, Vol. 87, No. 2, May,
pp. 184-190

Goldman R, Barsky B (1989) "On beta-continuous functions and their application to the
construction of geometrically continuous curves and surfaces", Mathematical Methods
in Computer Aided Geometric Design, Academic Press, pp. 299-312

Graustein WC (1935) Differential Geometry, The Macmillan Company, New York

Guggenheimer HW (1963) Differential Geometry, McGraw-Hill

Hagen H (1986) "Bezier-curves with curvature and torsion continuity", Rocky Mountain
J. Math. 16, pp. 629-638

Herron G (1987) "Techniques for Visual Continuity", Geometric Modeling: Algorithms
and New Trends, edited by G. Farin

Kreyszig E (1959) Differential Geometry, University of Toronto Press, Toronto

Lagrange JL (1797) Traité des fonctions analytiques, Paris, France

Lee C (1991) A Geometric Contact Theory for Computer Aided Geometric Design and
Kinematics, Ph.D. dissertation, University of California-Davis, CA, in preparation

Loria G (1902) Spezielle algebraische und transzendente ebene Kurven, B. G. Teubner,
Leipzig

Manning JR (1974) "Continuity Conditions for Spline Curves", Computer Journal, Vol.
17, pp. 181-186

Mitrinovic DS, Ulear J (1969) Differential Geometry, Wolters-Noordhoff Publishing,
Groningen, Netherlands

Monge G (1850) Application de l'analyse à la géométrie, Paris, France

Muller R (1891) "Über die Krümmung der Bahnevoluten bei starren ebenen Systemen",
Z. Math. Phys., Vol. 36, pp. 193-205

Nielson GM (1974) "Some Piecewise Polynomial Alternatives to Splines under Tension",
in Barnhill RE and Riesenfeld RF (eds), Computer Aided Geometric Design,
Academic Press, pp. 209-236

Scheffers G (1915) Anwendung der Differential- und Integralrechnung auf
Geometrie I, Springer

Struik DJ (1950) Lectures on Classical Differential Geometry, Addison-Wesley,
Cambridge, Mass.

Weatherburn CE (1947) Differential Geometry, Volume I, Cambridge University
Press, Cambridge, England

Authors:
Chih Lee is currently a CAD/CAE engineer in the MCAE
Integration Department at IBM, San Jose. Before joining IBM
(1990), he was a research assistant in the Computer Integrated
Design and Manufacturing Laboratory at the University of
California-Davis. His research interests include Computational
Geometry, Solid Modeling, Computer Graphics, and Kinematics.
From 1984 to 1989, he worked as an associate instructor in the
Department of Mechanical Engineering.

Lee received his BS and MS in Mechanical Engineering from
National Cheng-Kung University in Taiwan and the University of
California at Davis in 1978 and 1983, respectively. Currently, he
is also a PhD Candidate in Mechanical Engineering at the University
of California-Davis.

Address: A74/124 IBM, MCAE Integration, 5600 Cottle Road,
San Jose, CA 95193

Bahram Ravani received his B.S. degree Magna Cum Laude
from Louisiana State University, Baton Rouge, LA in 1976; the
M.S. degree from Columbia University in New York in 1978; and
the PhD degree from Stanford University, Stanford, CA, in
1982, all in Mechanical Engineering.

From 1982 to 1987, he was on the faculty of Mechanical
Engineering at the University of Wisconsin-Madison, first as an
Assistant Professor and later as a tenured Associate Professor. He
then joined the University of California-Davis, where he is
presently an Associate Professor of Mechanical Engineering. In
1985, he was on leave from the University of Wisconsin and
worked for the Manufacturing Systems Product Division of IBM
Corporation in Boca Raton, Florida. He was also a visiting
Professor in the Department of Mechanical and Production
Engineering at the Katholieke Universiteit of Leuven in Belgium
during the summer of 1987. He is presently the director of the
Manufacturing Automation and Productivity (MAP) program at
the University of California-Davis. His current areas of interest are
CAD/CAM and Robotics, Mechanical Design and Manufacturing.
He is, presently, an Associate Editor for the ASME Transactions,
Journal of Mechanical Design, and on the editorial boards of the
Journal of Manufacturing Systems and the International Journal
of Robotics and Automation.

Address: Department of Mechanical Engineering, University
of California, Davis, CA 95616

An Tzu Yang is Professor of Mechanical Engineering at the
University of California-Davis. He received the D. Eng. Sc. in
Mechanical Engineering at Columbia University in 1963. He has
been a member of the Mechanical Engineering faculty at the
University of California-Davis since 1964, and is a member of
Sigma Xi and the New York Academy of Sciences and an ASME
Fellow. His research interests are Kinematic Geometry and
Mechanism Theory.

Address: Department of Mechanical Engineering, University
of California, Davis, CA 95616
Generalization of a Family of Gregory Surfaces
Kenji Ueda and Tsuyoshi Harada

ABSTRACT

The Gregory patch is the result of a modification of the Bezier patch using Gregory's solution for
the twist incompatibility in a Coons patch. The main feature of the Gregory patch is that the
four cross boundary derivatives are independent of each other. We present a generalized form
of a set of patches satisfying this and other properties of the Gregory patch. This generalized
surface patch is a Bezier form whose control points are expressed as an interpolation between
two points, similar to the interior points of the Gregory patch.

Key Words: Gregory patch, rational boundary Gregory patch, Brown's square, rational Bezier
patch, convex combination surface

1 INTRODUCTION

Many surfaces in geometric and solid modeling are constructed by means of a network of
rectangular patches. Bezier surfaces, Coons surfaces and B-spline surfaces are popular examples.
The Gregory patch (Chiyokura 1983) is one such surface patch. This patch was developed
by applying Gregory's square (Gregory 1974), which is a solution of the incompatibility at
the corner of the Coons patch (Coons 1967), to the Bezier patch. In this surface, a Bezier
point is expressed as a convex combination of two independent sub-control points, and the cross
boundary derivatives on the four sides can be defined independently.

Local smooth surface interpolation methods including the Gregory patch have been discussed
and classified (Barnhill 1978, 1983; Peters 1990), and a Gregory patch for rational boundary
curves (Chiyokura 1990) and a curvature continuous Gregory patch (Takai 1990) have been
proposed. It is desirable that the relation between the continuity conditions and the blending
functions of the sub-control points of these surfaces be made clear.

In this paper a family of Gregory surfaces that have the properties of the Gregory patch is
discussed. The family includes most of the proposed surface formulas originating from the
Gregory patch. A group of surfaces in this family can be expressed as a convex combination of
two Bezier patches as well as in tensor product form. It is also shown that these surfaces have
advantageous properties in surface conversion.


2 GREGORY SURFACES

2.1 Gregory Patch

Smooth surfaces in computer aided design systems are usually expressed as parametric surfaces.
A parametric surface is expressed as S(u,v) with parameters u and v. The Bezier patch is one of
the more widespread parametric surfaces. Bezier patches are intuitive and have good properties.
It is, however, difficult to join the surface patches smoothly.

The Gregory patch (Chiyokura 1983) has been developed as a surface formula which interpolates
a surface between boundaries to preserve the continuity on each boundary. The Gregory patch
is the result of a modification of the Bezier patch using Gregory's solution (Gregory 1974) for
the twist incompatibility in a Coons patch (Coons 1967). The surface is defined as follows:

$$ S(u,v) = \sum_{i=0}^{3}\sum_{j=0}^{3} B_i^3(u)\,B_j^3(v)\,P_{i,j}(u,v) \qquad (0 \le u \le 1,\ 0 \le v \le 1) \qquad (1) $$

$$ P_{i,j}(u,v) = P_{i,j} \qquad (i = 0, 3 \text{ or } j = 0, 3) $$

$$ P_{1,1}(u,v) = \frac{u\,P^v_{1,1} + v\,P^u_{1,1}}{u + v} \qquad P_{1,2}(u,v) = \frac{u\,P^v_{1,2} + (1-v)\,P^u_{1,2}}{u + (1-v)} \qquad (2) $$

$$ P_{2,1}(u,v) = \frac{(1-u)\,P^v_{2,1} + v\,P^u_{2,1}}{(1-u) + v} \qquad P_{2,2}(u,v) = \frac{(1-u)\,P^v_{2,2} + (1-v)\,P^u_{2,2}}{(1-u) + (1-v)} $$

where $P^u_{i,j}$ and $P^v_{i,j}$ are sub-control points that would be obtained from the cross boundary
derivatives in the u- and v-direction, respectively, and $B_i^n(t)$ is the Bernstein basis function
$\binom{n}{i}(1-t)^{n-i}\,t^i$.
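
For concreteness, a minimal evaluation sketch (ours, written under the superscript convention of (2); not the authors' code):

```python
import numpy as np

def bernstein3(t):
    """The four cubic Bernstein polynomials B_i^3(t)."""
    return np.array([(1-t)**3, 3*(1-t)**2*t, 3*(1-t)*t**2, t**3])

def gregory(P, Pu, Pv, u, v):
    """Evaluate a bicubic Gregory patch, eqs. (1)-(2), at interior 0 < u, v < 1.

    P      : 4x4x3 array; the boundary entries are the fixed control points.
    Pu, Pv : dicts keyed by the interior indices (1,1),(1,2),(2,1),(2,2),
             holding the sub-control points P^u_{i,j} and P^v_{i,j}.
    """
    Q = np.array(P, dtype=float)
    # Rational blends of (2): each interior point slides between its two
    # sub-control points as (u, v) varies.
    Q[1,1] = (u*Pv[1,1] + v*Pu[1,1]) / (u + v)
    Q[1,2] = (u*Pv[1,2] + (1-v)*Pu[1,2]) / (u + (1-v))
    Q[2,1] = ((1-u)*Pv[2,1] + v*Pu[2,1]) / ((1-u) + v)
    Q[2,2] = ((1-u)*Pv[2,2] + (1-v)*Pu[2,2]) / ((1-u) + (1-v))
    return np.einsum('i,j,ijk->k', bernstein3(u), bernstein3(v), Q)
```

When Pu and Pv agree at all four interior indices, the blends become constant and the result reduces to the ordinary bicubic Bezier patch on P, which is the first property listed below.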

The Gregory patch has the following properties:

• If $P^u_{i,j} = P^v_{i,j} \equiv P_{i,j}$ is satisfied for all control points, the surface is a Bezier
surface whose control points are the $P_{i,j}$'s.

• The surface lies within the convex hull defined by all the sub-control points $P^u_{i,j}$'s and
$P^v_{i,j}$'s.

• The value of the (first order) derivative on each boundary is calculated from the $P^u_{i,j}$'s
or $P^v_{i,j}$'s.

2.2 Gregory Surfaces

The Gregory patch can remove the incompatibility at a corner by defining the proper control
points of a Bezier surface as blending functions whose parameters are u and v. For example,
the incompatibility at (u,v) = (0,0) is removed by $P_{1,1}(u,v)$.

There are various blending functions that can remove the incompatibility.

[Figure: the 4x4 control net of a Gregory patch; each of the four interior points is split into the sub-control points $P^u_{i,j}$ and $P^v_{i,j}$]

Fig. 1: Control points of a Gregory patch

[Figure]

Fig. 2: $P_{i,j}(u,v)$ in the cases that the denominator is $u+v$, $u+v(1-v)$, $u(1-v)+(1-u)v$
or $u(1-u)+v(1-v)$

1. In case the denominator of a function is one of $u+v$, $u+(1-u)v$, $u(1-v)+v$ and so on,
the incompatibility at (u,v) = (0,0) can be removed.

2. In case the denominator of a function is one of $u+v(1-v)$, $u+(1-u)v(1-v)$ and so
on, the incompatibilities at $(u,v) \in \{(0,0), (0,1)\}$ are removed.

3. In case the denominator of a function is one of $u(1-v)+(1-u)v$ and so on, the
incompatibilities at $(u,v) \in \{(0,0), (1,1)\}$ are removed.

4. In case the denominator of a function is one of $u(1-u)+v(1-v)$ and so on, all the
incompatibilities at the four corners are removed.

This situation is shown in Figure 2. In all cases, it is shown that $P_{i,j}(0,v) = P^u_{i,j}$
and $P_{i,j}(u,0) = P^v_{i,j}$.

Selecting a blending function of the control points, it is possible to define a surface having the
same properties as the Gregory patch. For example, the following control points define a surface
similar to a Gregory patch; there are two kinds of blending functions in this surface instead of
the four kinds as in the Gregory patch. We call such surfaces the Gregory surfaces.

$$ P_{1,1}(u,v) = \frac{u(1-v)\,P^v_{1,1} + v(1-u)\,P^u_{1,1}}{u(1-v) + v(1-u)} \qquad P_{1,2}(u,v) = \frac{uv\,P^v_{1,2} + (1-u)(1-v)\,P^u_{1,2}}{uv + (1-u)(1-v)} \qquad (3) $$

$$ P_{2,1}(u,v) = \frac{(1-u)(1-v)\,P^v_{2,1} + uv\,P^u_{2,1}}{(1-u)(1-v) + uv} \qquad P_{2,2}(u,v) = \frac{v(1-u)\,P^v_{2,2} + u(1-v)\,P^u_{2,2}}{v(1-u) + u(1-v)} $$

3 A FAMILY OF GREGORY SURFACES

3.1 The Surface Formula of a Family of Gregory Surfaces

A family of Gregory surfaces can be formulated by the following general form. In the remaining
part of this article, only the surfaces defined by this form are the Gregory surfaces.

$$ S(u,v) = \sum_{i=0}^{m}\sum_{j=0}^{n} B_i^m(u)\,B_j^n(v)\,P_{i,j}(u,v) \qquad (0 \le u \le 1,\ 0 \le v \le 1) \qquad (4) $$

$$ P_{i,j}(u,v) = \frac{u^{p^v_{i,j}}(1-u)^{q^v_{i,j}}\,v^{r^v_{i,j}}(1-v)^{s^v_{i,j}}\,P^v_{i,j} + u^{p^u_{i,j}}(1-u)^{q^u_{i,j}}\,v^{r^u_{i,j}}(1-v)^{s^u_{i,j}}\,P^u_{i,j}}{u^{p^v_{i,j}}(1-u)^{q^v_{i,j}}\,v^{r^v_{i,j}}(1-v)^{s^v_{i,j}} + u^{p^u_{i,j}}(1-u)^{q^u_{i,j}}\,v^{r^u_{i,j}}(1-v)^{s^u_{i,j}}} \qquad (5) $$

If $(p^v_{i,j} + q^v_{i,j} + r^v_{i,j} + s^v_{i,j} > 0) \wedge (p^u_{i,j} + q^u_{i,j} + r^u_{i,j} + s^u_{i,j} > 0)$, then $P_{i,j}(u,v)$
moves on the line between $P^u_{i,j}$ and $P^v_{i,j}$ in accordance with the changes of the parameters
u and v, and the surface S(u,v) is interpolated smoothly.

3.2 Derivatives

First of all we provide formulas for the derivatives of the surface, which will be used to discuss
the continuity on the boundaries of the surface. The first and second order derivatives of the
Gregory surface are

$$ \frac{\partial}{\partial u}S = \sum_{i=0}^{m}\sum_{j=0}^{n}\frac{dB_i^m(u)}{du}\,B_j^n(v)\,P_{i,j}(u,v) + \sum_{i=0}^{m}\sum_{j=0}^{n}B_i^m(u)\,B_j^n(v)\,\frac{\partial P_{i,j}(u,v)}{\partial u} \qquad (6) $$

$$ \frac{\partial^2}{\partial u^2}S = \sum_{i,j}\frac{d^2B_i^m}{du^2}\,B_j^n\,P_{i,j} + 2\sum_{i,j}\frac{dB_i^m}{du}\,B_j^n\,\frac{\partial P_{i,j}}{\partial u} + \sum_{i,j}B_i^m\,B_j^n\,\frac{\partial^2 P_{i,j}}{\partial u^2} \qquad (7) $$

$$ \frac{\partial^2}{\partial u\,\partial v}S = \sum_{i,j}\frac{dB_i^m}{du}\,\frac{dB_j^n}{dv}\,P_{i,j} + \sum_{i,j}\frac{dB_i^m}{du}\,B_j^n\,\frac{\partial P_{i,j}}{\partial v} + \sum_{i,j}B_i^m\,\frac{dB_j^n}{dv}\,\frac{\partial P_{i,j}}{\partial u} + \sum_{i,j}B_i^m\,B_j^n\,\frac{\partial^2 P_{i,j}}{\partial u\,\partial v} \qquad (8) $$

The first and second order derivatives of each control point are expressed in the following
general forms:

$$ P(u,v) = \frac{f(u,v)\,P^v + g(u,v)\,P^u}{f(u,v) + g(u,v)} \qquad (9) $$

$$ \frac{\partial}{\partial u}P(u,v) = \left[f\,\frac{\partial g}{\partial u} - \frac{\partial f}{\partial u}\,g\right]\frac{P^u - P^v}{(f + g)^2} \qquad (10) $$

$$ \frac{\partial^2}{\partial u^2}P(u,v) = \left[f\,\frac{\partial^2 g}{\partial u^2} - \frac{\partial^2 f}{\partial u^2}\,g\right]\frac{P^u - P^v}{(f + g)^2} - 2\left[f\left(\frac{\partial g}{\partial u}\right)^2 + \frac{\partial f}{\partial u}\,\frac{\partial g}{\partial u}\,(f - g) - \left(\frac{\partial f}{\partial u}\right)^2 g\right]\frac{P^u - P^v}{(f + g)^3} \qquad (11) $$

3.3 Independence of Derivatives at Each Boundary

The following conditions are necessary to establish that the value of the k-th order derivatives
along the boundary u = 0 is only dependent on the $P^u_{i,j}$'s, that is, that $C^k$-continuity at the
boundary u = 0 is controllable only by the $P^u_{i,j}$'s:

$$ P_{i,j}(0,v) = P^u_{i,j} \quad (i \le k), \qquad \left.\frac{\partial^{k+1-i}}{\partial u^{k+1-i}}\,P_{i,j}(u,v)\right|_{u=0} = 0 \quad (i < k) \qquad (12) $$

To satisfy these conditions, $p^v_{i,j}$ must be greater than k+1-i. Hence, if $p^v_{i,j}$ is greater than or
equal to k+1-i for $i \le k$, it is proper that $p^u_{i,j}$ is zero. (This condition is always satisfied for
the control points whose sub-control points are the same.)

For the other boundaries, i.e. u = 1, v = 0 or v = 1, the continuity at the boundary is determined
independently by controlling the values of $(q^v_{i,j}, q^u_{i,j})$, $(r^v_{i,j}, r^u_{i,j})$ or $(s^v_{i,j}, s^u_{i,j})$.

3.4 Gregory Surfaces of Rational Form

If the boundary curves are rational Bezier curves, the Gregory surface must be expressed by the
following expression, similar to the rational Bezier surfaces:

$$ S(u,v) = \frac{\displaystyle\sum_{i=0}^{m}\sum_{j=0}^{n} B_i^m(u)\,B_j^n(v)\,Q_{i,j}(u,v)}{\displaystyle\sum_{i=0}^{m}\sum_{j=0}^{n} B_i^m(u)\,B_j^n(v)\,W_{i,j}(u,v)} \equiv \frac{Q(u,v)}{W(u,v)} \qquad (13) $$

The derivatives of the surface S(u,v) are expressed as combinations of the derivatives of the
numerator Q(u,v) and the denominator W(u,v) of the surface S(u,v):

$$ \frac{\partial S}{\partial u} = \frac{1}{W^2}\left[W\,\frac{\partial Q}{\partial u} - Q\,\frac{\partial W}{\partial u}\right] \qquad (16) $$

$$ \frac{\partial^2 S}{\partial u^2} = \frac{1}{W^2}\left[W\,\frac{\partial^2 Q}{\partial u^2} - Q\,\frac{\partial^2 W}{\partial u^2}\right] - 2\,\frac{\partial W/\partial u}{W^3}\left[W\,\frac{\partial Q}{\partial u} - Q\,\frac{\partial W}{\partial u}\right] \qquad (17) $$

$$ \frac{\partial^2 S}{\partial u\,\partial v} = \frac{1}{W^2}\left[W\,\frac{\partial^2 Q}{\partial u\,\partial v} - \frac{\partial Q}{\partial u}\,\frac{\partial W}{\partial v} - \frac{\partial W}{\partial u}\,\frac{\partial Q}{\partial v} - Q\,\frac{\partial^2 W}{\partial u\,\partial v}\right] + \frac{2}{W^3}\,\frac{\partial W}{\partial u}\,\frac{\partial W}{\partial v}\,Q \qquad (18) $$

The numerator and denominator have the same formulas as those of the usual Gregory surface.
If Q(u,v) and W(u,v) satisfy the following conditions, the value of the derivatives on the boundary
u = 0 is determined only by the $w^u_{i,j}$'s and $P^u_{i,j}$'s:

$$ Q_{i,j}(0,v) = w^u_{i,j}\,P^u_{i,j} \quad (i \le k), \qquad \left.\frac{\partial^{k+1-i}}{\partial u^{k+1-i}}\,Q_{i,j}(u,v)\right|_{u=0} = 0 \quad (i < k) \qquad (19) $$

$$ W_{i,j}(0,v) = w^u_{i,j} \quad (i \le k), \qquad \left.\frac{\partial^{k+1-i}}{\partial u^{k+1-i}}\,W_{i,j}(u,v)\right|_{u=0} = 0 \quad (i < k) \qquad (20) $$

If $p^v_{i,j}$ is greater than k+1-i, the conditions are satisfied. This is the same as the requirement in
the case of non-rational Gregory surfaces.
the case of non-rational Gregory surfaces.
4 VARIOUS GREGORY SURFACES

Several Gregory surfaces have been proposed, and some functions which are suitable for
blending the sub-control points have been developed. In this section various Gregory surfaces
are surveyed.

The following notation for a control point $P_{i,j}(u,v)$ is introduced:

$$ P_{i,j}(u,v) = \frac{u^{p^v_{i,j}}(1-u)^{q^v_{i,j}}\,v^{r^v_{i,j}}(1-v)^{s^v_{i,j}}\,P^v_{i,j} + u^{p^u_{i,j}}(1-u)^{q^u_{i,j}}\,v^{r^u_{i,j}}(1-v)^{s^u_{i,j}}\,P^u_{i,j}}{u^{p^v_{i,j}}(1-u)^{q^v_{i,j}}\,v^{r^v_{i,j}}(1-v)^{s^v_{i,j}} + u^{p^u_{i,j}}(1-u)^{q^u_{i,j}}\,v^{r^u_{i,j}}(1-v)^{s^u_{i,j}}} = \frac{[P^v_{i,j},\,P^u_{i,j}]}{[p_{i,j},\,q_{i,j},\,r_{i,j},\,s_{i,j}]} \qquad (21) $$

In this notation the values of $p^v_{i,j}$ and $p^u_{i,j}$ are represented by $p_{i,j}$. Positive $p_{i,j}$ means that
$p^v_{i,j}$ is $p_{i,j}$ and $p^u_{i,j}$ is zero. Negative $p_{i,j}$ means that $p^u_{i,j}$ is $-p_{i,j}$ and $p^v_{i,j}$ is zero. When
both $p^v_{i,j}$ and $p^u_{i,j}$ are zero, $p_{i,j}$ becomes zero. This convention is also applied to $q_{i,j}$, $r_{i,j}$
and $s_{i,j}$. On condition that $P_{i,j}(u,v) = P^v_{i,j} = P^u_{i,j} \equiv P_{i,j}$, $P_{i,j}(u,v)$ is expressed as
$[P_{i,j}, P_{i,j}]/[*,*,*,*]$. The asterisk (*) means an integer greater than or equal to zero.

4.1 Bicubic Gregory Patch

The bicubic Gregory patch is the surface presented in section 2.1, with the control points written
in the above notation. Since the surface controlled by these points must have $C^1$ continuity,
$p_{0,j}$, $q_{3,j}$, $r_{i,0}$ and $s_{i,3}$ would need to be greater than or equal to 2; but they are all *, because
the sub-control points of $P_{0,j}(u,v)$, $P_{3,j}(u,v)$, $P_{i,0}(u,v)$ and $P_{i,3}(u,v)$ coincide. Unless the
sub-control points coincide, the control points take a form analogous to Little's generalization
(Barnhill 1977) of Gregory's correction.

4.2 Rational Boundary Gregory Patch

The rational boundary Gregory patch is a $C^1$ Gregory patch for rational boundary curves. In
this surface the sub-control points of $P_{0,j}(u,v)$, $P_{3,j}(u,v)$, $P_{i,0}(u,v)$ and $P_{i,3}(u,v)$ coincide,
but the weights of the sub-control points may be different. So $p_{0,j}$, $q_{3,j}$, $r_{i,0}$ and $s_{i,3}$ are 2.
The non-zero $p_{i,j}$, $q_{i,j}$, $r_{i,j}$ and $s_{i,j}$ of the interior control points, which must be greater
than 0, are also 2.

4.3 C² Gregory Patch

The control points of the C² Gregory patch (Barnhill 1983; Takai 1990) are written in the same
notation. The reason why the surface is biquintic is that the second order derivatives on the two
boundaries u = 0 and u = 1 must be independent.
[Figure: the control points and weights $w_{i,j}P_{i,j}$, with sub-control points $w^u_{i,j}P^u_{i,j}$ and $w^v_{i,j}P^v_{i,j}$, of a rational boundary Gregory patch]
4.4 Brown Patch

The Brown patch is the result of applying Brown's blending function for two surfaces
(Barnhill 1978) to the blending of the sub-control points (Barnhill 1988). The denominator of
Brown's formula is $u^2(1-u)^2 + v^2(1-v)^2$, and Brown's formula for C² continuity is also shown
to be $u^3(1-u)^3 + v^3(1-v)^3$; see (Barnhill 1983). In the case that the sub-control points of the
outer control points coincide, as in the Gregory patch, the denominator can be replaced by the
form $u(1-u) + v(1-v)$ (see (Gregory 1983)), so that the control points of the bicubic Brown
patch are as follows:

$$ \begin{matrix} \dfrac{[P_{0,0},P_{0,0}]}{[+1,+1,-1,-1]} & \dfrac{[P_{0,1},P_{0,1}]}{[+1,+1,-1,-1]} & \dfrac{[P_{0,2},P_{0,2}]}{[+1,+1,-1,-1]} & \dfrac{[P_{0,3},P_{0,3}]}{[+1,+1,-1,-1]} \\[3mm] \dfrac{[P_{1,0},P_{1,0}]}{[+1,+1,-1,-1]} & \dfrac{[P^v_{1,1},P^u_{1,1}]}{[+1,+1,-1,-1]} & \dfrac{[P^v_{1,2},P^u_{1,2}]}{[+1,+1,-1,-1]} & \dfrac{[P_{1,3},P_{1,3}]}{[+1,+1,-1,-1]} \\[3mm] \dfrac{[P_{2,0},P_{2,0}]}{[+1,+1,-1,-1]} & \dfrac{[P^v_{2,1},P^u_{2,1}]}{[+1,+1,-1,-1]} & \dfrac{[P^v_{2,2},P^u_{2,2}]}{[+1,+1,-1,-1]} & \dfrac{[P_{2,3},P_{2,3}]}{[+1,+1,-1,-1]} \\[3mm] \dfrac{[P_{3,0},P_{3,0}]}{[+1,+1,-1,-1]} & \dfrac{[P_{3,1},P_{3,1}]}{[+1,+1,-1,-1]} & \dfrac{[P_{3,2},P_{3,2}]}{[+1,+1,-1,-1]} & \dfrac{[P_{3,3},P_{3,3}]}{[+1,+1,-1,-1]} \end{matrix} \qquad (27) $$

5 UNIFORM GREGORY SURFACES

5.1 Uniform Gregory Surfaces

We call the subset of Gregory surfaces in which all blending functions of the control points have
the same form as that of the Brown patch the uniform Gregory surfaces. Since this kind of
blending function must remove the incompatibility at the four corners, the control points
$P_{i,j}(u,v)$ are expressed as follows. In the case of p = q = r = s, we call them the Brown patches
(of order p).

$$ P_{i,j}(u,v) = \frac{u^p(1-u)^q\,P^v_{i,j} + v^r(1-v)^s\,P^u_{i,j}}{u^p(1-u)^q + v^r(1-v)^s} = \frac{[P^v_{i,j},\,P^u_{i,j}]}{[+p,+q,-r,-s]} \qquad (28) $$

The uniform Gregory surface is transformed as follows:

$$ S(u,v) = \frac{u^p(1-u)^q\displaystyle\sum_{i=0}^{m}\sum_{j=0}^{n}B_i^m(u)B_j^n(v)\,P^v_{i,j} + v^r(1-v)^s\displaystyle\sum_{i=0}^{m}\sum_{j=0}^{n}B_i^m(u)B_j^n(v)\,P^u_{i,j}}{u^p(1-u)^q + v^r(1-v)^s} $$

$$ = \frac{u^p(1-u)^q\,S^v(u,v) + v^r(1-v)^s\,S^u(u,v)}{u^p(1-u)^q + v^r(1-v)^s} \qquad (29) $$

The uniform Gregory surface is thus represented as a convex combination of two Bezier surfaces
$S^u(u,v)$ and $S^v(u,v)$ with the same degrees. The properties of this surface are summarized in
the following sections.
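
This convex-combination form is straightforward to evaluate. A minimal sketch (ours, under the weight pairing used in (28)-(29), which keeps the boundaries u = 0 and u = 1 dependent only on the $P^u$ net as required in section 3.3):

```python
import numpy as np
from math import comb

def bezier_patch(P, u, v):
    """Tensor-product Bezier patch on an (m+1) x (n+1) x 3 control net P."""
    P = np.asarray(P, dtype=float)
    m, n = P.shape[0] - 1, P.shape[1] - 1
    Bu = [comb(m, i) * (1-u)**(m-i) * u**i for i in range(m+1)]
    Bv = [comb(n, j) * (1-v)**(n-j) * v**j for j in range(n+1)]
    return np.einsum('i,j,ijk->k', Bu, Bv, P)

def uniform_gregory(Pu, Pv, p, q, r, s, u, v):
    """Convex-combination form (29), at interior parameters 0 < u, v < 1."""
    wu = u**p * (1-u)**q          # vanishes on the boundaries u = 0 and u = 1
    wv = v**r * (1-v)**s          # vanishes on the boundaries v = 0 and v = 1
    Su = bezier_patch(Pu, u, v)   # Bezier patch on the P^u net
    Sv = bezier_patch(Pv, u, v)   # Bezier patch on the P^v net
    # On u = 0 and u = 1 only S^u survives; on v = 0 and v = 1 only S^v.
    return (wu * Sv + wv * Su) / (wu + wv)
```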

5.2 Uniform Gregory Surfaces of Rational Form

The rational form of the uniform Gregory surface is transformed as follows:

$$ S(u,v) = \frac{\displaystyle\sum_{i=0}^{m}\sum_{j=0}^{n}B_i^m(u)B_j^n(v)\,\frac{u^p(1-u)^q\,w^v_{i,j}P^v_{i,j} + v^r(1-v)^s\,w^u_{i,j}P^u_{i,j}}{u^p(1-u)^q + v^r(1-v)^s}}{\displaystyle\sum_{k=0}^{m}\sum_{l=0}^{n}B_k^m(u)B_l^n(v)\,\frac{u^p(1-u)^q\,w^v_{k,l} + v^r(1-v)^s\,w^u_{k,l}}{u^p(1-u)^q + v^r(1-v)^s}} $$

$$ = \sum_{i=0}^{m}\sum_{j=0}^{n}B_i^m(u)B_j^n(v)\,\frac{u^p(1-u)^q\,w^v_{i,j}P^v_{i,j} + v^r(1-v)^s\,w^u_{i,j}P^u_{i,j}}{\displaystyle\sum_{k=0}^{m}\sum_{l=0}^{n}B_k^m(u)B_l^n(v)\left(u^p(1-u)^q\,w^v_{k,l} + v^r(1-v)^s\,w^u_{k,l}\right)} \qquad (30) $$

Thus the rational form of the uniform Gregory surface is recognized as the Gregory surface
having these weighted sub-control points. If the derivatives of these control points are
calculated, the derivatives of the surfaces are calculated using the formulas in section 3.2
instead of those found in section 3.4.

5.3 Conversion of Gregory Surface to Rational Bezier Surface

Since the Gregory surface is originally a rational polynomial surface, it can be converted to a
rational Bezier surface. The degree of the converted surface is raised by the degree of the least
common multiple of all the denominators of the original control points. For example, the bicubic
Gregory patch and the rational boundary Gregory patch can be converted to rational Bezier
surfaces of degrees 7 x 7 and 11 x 11, respectively (Takamura 1990). The Gregory surface
cannot be subdivided into two Gregory surfaces; the rational Bezier surface can be divided
into two rational Bezier surfaces.

Generally, complicated calculations are required to convert non-uniform Gregory surfaces to
rational Bezier surfaces, including the basis conversion from the power basis to the Bernstein
basis. A Gregory surface with less variation among the blending functions of its control points
is converted to a rational Bezier surface of lower degree.

In the case of the uniform Gregory surface, the least common multiple of all the denominators
of the control points is the uniform denominator itself. The conversion of a uniform Gregory
surface to a rational Bezier patch is achieved as shown below, without the basis conversion.

Each weighted term, e.g. $u^p(1-u)^q\,S^v(u,v)$, can be converted to a Bezier surface:

$$ u^p(1-u)^q\sum_{i=0}^{m}\sum_{j=0}^{n}B_i^m(u)B_j^n(v)\,P_{i,j} \;\Rightarrow\; \sum_{i=0}^{m+p+q}\sum_{j=0}^{n+r+s}B_i^{m+p+q}(u)\,B_j^{n+r+s}(v)\,P''_{i,j} \qquad (32) $$

Here, the following relations hold between $P_{i,j}$, $P'_{i,j}$ and $P''_{i,j}$:

$$ \binom{m+p+q}{i+p}\,P'_{i+p,\,j} = \binom{m}{i}\,P_{i,j} \qquad (33) $$

$$ \binom{n+r+s}{j}\,P''_{i,j} = \sum_{k=\max(0,\,j-r-s)}^{\min(n,\,j)}\binom{r+s}{j-k}\binom{n}{k}\,P'_{i,k} \qquad (34) $$

The other term, $v^r(1-v)^s\,S^u(u,v)$, can be converted in the same way. The numerator of
the surface S(u,v) thus yields the sum of two Bezier surfaces. The denominator can be converted
to the form of a Bezier surface by replacing all the control points $P^v_{i,j}$ and $P^u_{i,j}$ in the
numerator with 1.
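
The index-shift relation underlying (33) is the identity $u^p(1-u)^q\,B_i^m(u) = \left[\binom{m}{i}/\binom{m+p+q}{i+p}\right]B_{i+p}^{m+p+q}(u)$. A minimal one-dimensional sketch (ours) of the corresponding coefficient transformation:

```python
from math import comb

def weight_and_elevate(c, p, q):
    """Coefficients of u^p (1-u)^q * sum_i c_i B_i^m(u) in the Bernstein basis
    of degree M = m + p + q: term i moves to index i + p with a binomial rescale."""
    m = len(c) - 1
    M = m + p + q
    out = [0.0] * (M + 1)
    for i, ci in enumerate(c):
        out[i + p] = ci * comb(m, i) / comb(M, i + p)
    return out

# Example: u(1-u) * B_0^1(u) = u(1-u)^2 = (1/3) B_1^3(u)
print(weight_and_elevate([1.0, 0.0], 1, 1))   # [0.0, 0.333..., 0.0, 0.0]
```

Applying this in u, and its analogue (34) in v, to both the numerator terms and the all-ones denominator yields the rational Bezier control points and weights of (35).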

$$ S(u,v) = \frac{u^p(1-u)^q\,S^v(u,v) + v^r(1-v)^s\,S^u(u,v)}{u^p(1-u)^q + v^r(1-v)^s} \;\Rightarrow\; \frac{\displaystyle\sum_{i=0}^{m+p+q}\sum_{j=0}^{n+r+s}B_i^{m+p+q}(u)B_j^{n+r+s}(v)\,w_{i,j}P_{i,j}}{\displaystyle\sum_{i=0}^{m+p+q}\sum_{j=0}^{n+r+s}B_i^{m+p+q}(u)B_j^{n+r+s}(v)\,w_{i,j}} \qquad (35) $$

Thus a uniform Gregory surface of degrees m x n is convertible to a rational Bezier surface
with degrees (m+p+q) x (n+r+s). It is obvious that the uniform Gregory surface of rational
form is convertible to a rational Bezier surface in the same manner. For example, the bicubic
Brown patches of order 1, which correspond to the bicubic Gregory patches, are convertible to
5 x 5 rational Bezier patches, and the bicubic Brown patches of order 2, which correspond to
the bicubic rational boundary Gregory patches, are convertible to 7 x 7 rational Bezier patches.
However, there are zero control points and weights at the four corners, according to p, q, r
and s. These zeros result from the singularity at the corners of Gregory surfaces:

$$ \begin{pmatrix} O_{pr} & \cdots & O_{ps} \\ \vdots & & \vdots \\ O_{qr} & \cdots & O_{qs} \end{pmatrix} \qquad (36) $$

where $O_{pq}$ is the zero matrix of degrees p x q.

5.4 Other Properties

Two more properties of uniform Gregory surfaces are introduced.

If the degree of a non-uniform Gregory surface is raised, new formulas for the blending
functions are required to express the new control points; it is impossible to change the degree of
non-uniform Gregory surfaces using the original blending functions. In the uniform case, the
degree of the surface can be raised simply as follows:

$$ S(u,v) = \frac{u^p(1-u)^q\displaystyle\sum_{i=0}^{m}\sum_{j=0}^{n}B_i^m(u)B_j^n(v)\,P^v_{i,j} + v^r(1-v)^s\displaystyle\sum_{i=0}^{m}\sum_{j=0}^{n}B_i^m(u)B_j^n(v)\,P^u_{i,j}}{u^p(1-u)^q + v^r(1-v)^s} $$

$$ = \frac{u^p(1-u)^q\displaystyle\sum_{i=0}^{m'}\sum_{j=0}^{n'}B_i^{m'}(u)B_j^{n'}(v)\,P^{v\prime}_{i,j} + v^r(1-v)^s\displaystyle\sum_{i=0}^{m'}\sum_{j=0}^{n'}B_i^{m'}(u)B_j^{n'}(v)\,P^{u\prime}_{i,j}}{u^p(1-u)^q + v^r(1-v)^s} \qquad (37) $$

[Figure]

Fig. 4: The common control points of the surfaces in Figures 5 and 6

In the case that p, q, r and s in the blending function are equal, as in a Brown patch, the curves
S(u,u) and S(u,1-u) are integral Bezier curves instead of rational curves, because the control
points are free of the parameters when v = u or v = 1-u:

$$ P_{i,j}(u,v)\Big|_{v=u} = \frac{u^p(1-u)^p\,\left(P^v_{i,j} + P^u_{i,j}\right)}{u^p(1-u)^p + u^p(1-u)^p} = \frac{P^v_{i,j} + P^u_{i,j}}{2} \qquad (38) $$

5.5 Comparison Between Gregory Surfaces

The shape of a Gregory surface with sub-control points $P^u_{i,j}$'s and $P^v_{i,j}$'s depends upon its
blending functions; different blending functions interpolate the surface differently. In Figures 4,
5 and 6, the common control points, a Gregory patch, and the contour lines of bicubic uniform
and non-uniform Gregory surfaces are shown. The interior control points of the surfaces in
Figure 6 are as follows:

$$ \text{(a):}\ \begin{matrix} \dfrac{[P^v_{1,1},P^u_{1,1}]}{[+1,+1,-1,-1]} & \dfrac{[P^v_{1,2},P^u_{1,2}]}{[+1,+1,-1,-1]} \\[3mm] \dfrac{[P^v_{2,1},P^u_{2,1}]}{[+1,+1,-1,-1]} & \dfrac{[P^v_{2,2},P^u_{2,2}]}{[+1,+1,-1,-1]} \end{matrix} \qquad \text{(b):}\ \begin{matrix} \dfrac{[P^v_{1,1},P^u_{1,1}]}{[+1,0,-1,0]} & \dfrac{[P^v_{1,2},P^u_{1,2}]}{[+1,0,0,-1]} \\[3mm] \dfrac{[P^v_{2,1},P^u_{2,1}]}{[0,+1,-1,0]} & \dfrac{[P^v_{2,2},P^u_{2,2}]}{[0,+1,0,-1]} \end{matrix} \qquad (39) $$

$$ \text{(c):}\ \begin{matrix} \dfrac{[P^v_{1,1},P^u_{1,1}]}{[+2,+2,-2,-2]} & \dfrac{[P^v_{1,2},P^u_{1,2}]}{[+2,+2,-2,-2]} \\[3mm] \dfrac{[P^v_{2,1},P^u_{2,1}]}{[+2,+2,-2,-2]} & \dfrac{[P^v_{2,2},P^u_{2,2}]}{[+2,+2,-2,-2]} \end{matrix} \qquad \text{(d):}\ \begin{matrix} \dfrac{[P^v_{1,1},P^u_{1,1}]}{[+2,0,-2,0]} & \dfrac{[P^v_{1,2},P^u_{1,2}]}{[+2,0,0,-2]} \\[3mm] \dfrac{[P^v_{2,1},P^u_{2,1}]}{[0,+2,-2,0]} & \dfrac{[P^v_{2,2},P^u_{2,2}]}{[0,+2,0,-2]} \end{matrix} $$

Non-uniform Gregory surfaces, which remove the incompatibility locally like a Gregory patch,
may generally be more sensitive to variation of their sub-control points than uniform
Gregory surfaces like the Brown patch.
[Figure]

Fig. 5: A Gregory patch

[Figure]

Fig. 6: Uniform Gregory patches (a, c) and non-uniform Gregory patches (b, d)

[Figure]

Fig. 7: Two adjacent Gregory patches

6 CONCLUSION

A investigation of the theory of Gregory surfaces has been made and the expressions -and proper-
ties of various Gregory surface formulations has been provided. Using Gregory surfaces reduces
the joining problem of surface patches to the problem of joining two Bezier surfaces on their
common boundary curve (Figure 7). Once the smoothness conditions (Kahmann 1983) at each
boundary are satisfied, the interior of the surface is interpolated smoothly by the Gregory sur-
face.

The uniform Gregory surface is a convex combination surface and can easily be converted to a rational Bezier surface. It is therefore possible to construct a CAGD system based on rational Bezier surfaces with the aid of the Gregory surface. Note that some Bezier control points and their weights are zero.

Clearly, various Gregory surfaces satisfying the same smoothness conditions can be defined on the same sub-control points (Figure 6); such surfaces may have blending functions different from those presented in this work. The optimal surface depends upon the system implementation or its application, and the difference in the shapes of Gregory surfaces with the same sub-control points affects surface conversion and approximation. Further research on the shapes generated by such Gregory surfaces, and an evaluation of their calculation costs, is required.

Acknowledgement

We would like to thank Dr. Hiroaki Chiyokura, Assistant Professor at Keio University, for
advising us in our research, and Dr. Hideko S. Kunii, General Manager of Software Division of
RICOH Co., Ltd., for encouraging us and giving advice.

REFERENCES
Barnhill RE (1977) Representation and Approximation of Surfaces, Mathematical Software III, Ed. by Rice JR, Academic Press, pp. 69-120
Barnhill RE, Brown JH, Klucewicz IM (1978) A New Twist in Computer Aided Geometric Design, Computer Graphics and Image Processing, Vol. 8, pp. 78-91
Barnhill RE (1983) Computer Aided Surface Representation and Design, Surfaces in Computer
Aided Geometric Design, North-Holland, pp. 1-24
Barnhill RE, Farin G, Fayard L, Hagen H (1988) Twist, Curvatures and Surface Interrogation,
Computer Aided Design, Vol. 20, no. 6, pp. 341-346
Chiyokura H, Kimura F (1983) Design of Solids with Free-form Surfaces, Computer Graphics, Vol. 17, No. 3, pp. 289-298
Chiyokura H, Takamura T, Konno K, Harada T (1990) G1 Surface Interpolation over Irregular Meshes with Rational Curves, Frontiers in Geometric Modeling, SIAM
Coons SA (1967) Surfaces for Computer Aided Design of Space Form, Report MAC-TR-41,
Project MAC, MIT
Gregory JA (1974) Smooth Interpolation Without Twist Constraints, Computer Aided Geometric Design, Academic Press, pp. 71-87
Gregory JA (1983) C 1 Rectangular and Non-Rectangular Surface Patches, Surfaces in Com-
puter Aided Geometric Design, North-Holland, pp. 25-33
Kahmann J (1983) Continuity of Curvature Between Adjacent Bezier Patches, Surfaces in
Computer Aided Geometric Design, North-Holland, pp. 65-75
Peters J (1990) Local Smooth Surface Interpolation: a Classification, Computer Aided Geo-
metric Design, Vol. 7, pp. 191-195
Takai K, Wang KK (1990) Curvature Continuous Gregory Patch: a Modification of Gregory
Patch for Continuity of Curvature, Japan-U.S.A. Symposium on Flexible Automation
Takamura T, Ohta M, Toriya H, Chiyokura H (1990) A Method to Convert a Gregory Patch and a Rational Boundary Gregory Patch to a Rational Bezier Patch and Its Applications, CG International '90 - Computer Graphics Around the World, Springer-Verlag, pp. 543-562

Kenji UEDA is a member of the 3D CAD project at RICOH's Software Division. His research interests include solid modeling, geometric modeling, computer graphics, and their applications. He received BS and MS degrees in computer science from Keio University in 1975 and 1977 respectively. He is a member of the Information Processing Society of Japan.
Address: RICOH Co., Ltd. Software Division, 1-1-17, Koishikawa
Bunkyo-ku, Tokyo, 112, JAPAN
Tsuyoshi HARADA, a member of the 3D CAD project at RICOH's Software Division, is interested in solid modeling, geometric modeling and their applications. His current research includes the continuity of rational free-form surfaces for rounding operations. He received a BS and an MS in precision machinery engineering from the University of Tokyo in 1986 and 1988 respectively. He entered the solid modeling project at RICOH in 1988, which has now developed into the product DESIGNBASE.
Address: RICOH Co., Ltd. Software Division, 1-1-17, Koishikawa
Bunkyo-ku, Tokyo, 112, JAPAN
A New Control Method for Free-Form Surfaces
with Tangent Continuity and its Applications
Kouichi Konno, Teiji Takamura, and Hiroaki Chiyokura

ABSTRACT

A popular method of representing a free-form surface is to interpolate the curve mesh which de-
scribes the boundary of the surface. When complex shapes with free-form surfaces are designed,
sometimes irregular meshes are created. Generally such irregular meshes are interpolated by
more than two subpatches, and distortion may occur. One reason why distorted surfaces are
generated is that the cross boundary derivatives are always approximated linearly. We present a
new surface interpolation method which can control the cross boundary derivatives, and there-
fore which can modify distorted surfaces. In this method, a biquartic Gregory patch is generated
if the boundary mesh consists of cubic Bezier curves. We define a new layout for the control
points of the Gregory patch which is based on the derivative vectors for controlling the surface
shape. Therefore the user can model the surface shape, keeping G1 continuity, while moving
these control points intuitively. In addition, we apply this surface control to the generalized
Gregory patch which can interpolate arbitrary piecewise curve meshes that consist of n curve
segments.
Keywords: Gregory patch, generalized Gregory patch, Bezier patch, cross boundary derivative

1 Introduction
One of the most important requirements of solid object design in CAD systems is to be able
to model complicated objects easily. If an object which is to be created has free-form surfaces,
its design method must be easy and intuitive for users. Generally, the method of defining
free-form surfaces for an object is to input the characteristic lines of its shape. In this method,
the user first defines a curve mesh as the surface boundary, and then interpolates surface patches
within the curve mesh. Various representations of surface patches exist, and each patch has its
own advantages and disadvantages.

The representation of surfaces in the computer was pioneered by Coons and Bezier[Barnhill 85], who proposed the Coons patch[Coons 64] and the Bezier patch[Bezier 66][Bezier 67], respectively. Many extensions and generalizations of these surface representations have been studied to produce high quality surfaces[Farin 88][Faux 79][Rogers 90][Sarraga 87][Shirman 87][Takai 90]. For example, Gregory applied a compatibility correction[Barnhill 78] to the Coons patch, so that the user does not have to specify consistent twist vectors for this surface representation[Gregory 74]. On the other hand, Chiyokura and Kimura proposed the Gregory patch, in which this compatibility correction is applied to the Bezier patch[Chiyokura 88]. One characteristic of the Gregory patch is the ability to define a first order partial derivative vector for each u-v direction along the boundaries independently. This means that even if the curve mesh is irregular[Chiyokura 86], it can be interpolated with G1 continuity[Bartels 87]. In Chiyokura's method, a surface is interpolated by using quadratic CBD (Cross Boundary Derivative) functions, which are defined

Figure 1: Curve meshes.

by curve meshes[Chiyokura 86]. However, sometimes problems arise with this interpolation, and this causes distorted surfaces. These problems come from the fact that the CBD function is always set by the tangent vectors of the boundary curves. The cross boundary derivatives greatly affect the surface shape. Shirman and Sequin proposed a method of controlling the surface shape by using a CBD function for setting some shape parameters[Shirman 90]. In their method, the CBD function can be specified with less restriction. But when two surface patches which meet with G1 continuity are joined, only surfaces symmetrical at the common boundary can be generated. Due to that fact, the design of solid objects which include free-form surfaces is limited.

Therefore we propose a new surface interpolation method to overcome the above problems.
In our method, a Gregory patch of higher degree than the curve mesh is
generated. For example, we generate a surface consisting of a biquartic Gregory patch if the curve
mesh consists of cubic Bezier curves. The distorted surface shape can be modified by using this
surface representation. We propose a method of controlling the surface shape intuitively. The
control points of the surface are redefined, based on the cross boundary derivatives. Designers
can modify the surface shape freely, as the relationship between the surface shape and the control
points is clear. Furthermore we have adapted this shape control method to the generalized
Gregory patch, which is generated from applying a compatibility correction to the Coons patch.
Using this patch representation and shape control, we can locally modify the surfaces which are
generated on curve meshes consisting of piecewise curves.

2 Problems in interpolating curve meshes

Figure 1 shows two examples of typical curve meshes, and Figure 2 shows the cross sections of
the interpolated surfaces on the curve meshes using Chiyokura's method. In both these cases,
the surfaces are distorted with waves. In the following section, we describe Chiyokura's method
and the reasons why these surface waves are generated.

In Chiyokura's method, cubic Bezier curves are used for the boundary curves. The curve meshes
are interpolated by bicubic Gregory patches. The representation of a bicubic Gregory patch is
as follows (Figure 3):
G(u,v) = \sum_{i=0}^{3} \sum_{j=0}^{3} B_i^3(u) B_j^3(v) P_{ij}(u,v)   (1)

(a) (b)

Figure 2: Cross sections.

Figure 3: A bicubic Gregory patch.

where

P_{11}(u,v) = \frac{u P_{110} + v P_{111}}{u + v},
P_{12}(u,v) = \frac{u P_{120} + (1-v) P_{121}}{u + (1-v)},
P_{21}(u,v) = \frac{(1-u) P_{210} + v P_{211}}{(1-u) + v},
P_{22}(u,v) = \frac{(1-u) P_{220} + (1-v) P_{221}}{(1-u) + (1-v)}.

B_i^n(t) is a Bernstein polynomial:

B_i^n(t) = \binom{n}{i} t^i (1-t)^{n-i}   (2)

The bicubic Gregory patch is defined by 20 control points, as shown in Figure 3. The 12 control points P_{ij} (ij ≠ 11, 12, 21, 22) represent the surface boundary, and the remaining 8 points P_{ijk}

Figure 4: Joining two Gregory patches.

(ij = 11, 12, 21, 22; k = 0, 1) represent the cross boundary derivatives. The control points which represent the patch boundaries coincide with the control points of the curve meshes; we therefore show how the interior control points are determined.
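For concreteness, a sketch (ours, not the paper's code) of evaluating equation (1): each interior point is the rational blend of its two definition points, after which the patch is an ordinary tensor-product Bernstein sum. The 0/0 at the corners is resolved arbitrarily, since the limit depends on the approach direction:

    from math import comb

    def bernstein(n, i, t):
        return comb(n, i) * t**i * (1.0 - t)**(n - i)

    def rational_blend(P0, P1, w0, w1):
        # e.g. P11(u, v) = (u * P110 + v * P111) / (u + v)
        if w0 + w1 == 0.0:
            return P0  # corner singularity
        return [(w0 * a + w1 * b) / (w0 + w1) for a, b in zip(P0, P1)]

    def eval_bicubic_gregory(P, interior, u, v):
        # P[i][j]: 4 x 4 net of 3-vectors; the four interior entries are
        # replaced by blends of the pairs interior[(i, j)] = (P_ij0, P_ij1).
        Q = [row[:] for row in P]
        Q[1][1] = rational_blend(*interior[(1, 1)], u, v)
        Q[1][2] = rational_blend(*interior[(1, 2)], u, 1.0 - v)
        Q[2][1] = rational_blend(*interior[(2, 1)], 1.0 - u, v)
        Q[2][2] = rational_blend(*interior[(2, 2)], 1.0 - u, 1.0 - v)
        return [sum(bernstein(3, i, u) * bernstein(3, j, v) * Q[i][j][k]
                    for i in range(4) for j in range(4)) for k in range(3)]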

Suppose that two patches S^a and S^b meet at the common boundary curve E_0 as shown in Figure 4. Vectors a_i, b_i (i = 0, ..., 3) and c_j (j = 0, 1, 2) represent the vectors between control points. The derivative vectors at any point on the boundary E_0 are derived from polynomials in the vectors a_i, b_i (i = 0, ..., 3) and c_j (j = 0, 1, 2). When the two patches are joined with G1 continuity, vectors b_1 and b_2 can be derived by solving the following equation:

\frac{\partial S^b(0,v)}{\partial u} = k(v) \frac{\partial S^a(1,v)}{\partial u} + h(v) \frac{\partial S^b(0,v)}{\partial v}.   (3)

k(v) and h(v) are linear functions as follows:

k(v) = k_0 (1-v) + k_1 v,   (4)

h(v) = h_0 (1-v) + h_1 v,   (5)


where k_0, k_1, h_0 and h_1 are real numbers. If a curve mesh is interpolated by a cubic Gregory patch, the order of the left hand side of equation (3) becomes cubic; in this case, the derivative of patch S^a must be quadratic. This means that asymmetric patches are generated when symmetric curve meshes are used. The equations for b_1 and b_2 are represented by polynomials in a_i (i = 0, ..., 3) and c_j (j = 0, 1, 2). The tangent vectors of the boundaries are known, but the two vectors a_1 and a_2 are unknown; therefore we cannot solve for vectors b_1 and b_2.

These problems are solved by the use of a basis patch[Chiyokura 88]. When patches S^a and S^b are to be joined, a basis patch is virtually defined, and patches S^a and S^b are joined by using it. A CBD function, which represents the cross boundary derivatives along the boundary, is used to define the basis patch. For example, in Figure 5, the CBD function g_0(t) of the boundary curve E_0 is defined by E_1 and E_2. If two surfaces along the common boundary meet with G1 continuity, the CBD function g_0(t) will be defined by boundary curves E_1, E_2, E_3 and E_4. This CBD function represents the first order partial derivatives, along the boundary, of the surface to be interpolated, and it greatly affects the surface shape. Chiyokura represented this CBD function by a quadratic Bezier function, as follows:
g_0(t) = \sum_{i=0}^{2} B_i^2(t) a_i   (6)

where B_i^2(t) is a quadratic Bernstein polynomial as shown in equation (2).



Figure 5: Determining the CBD function.


Figure 6: Basis patch.

Now we describe the method of determining the CBD function. In Figure 5, the three vectors p_0, q_0 and r_0 and the three vectors p_1, q_1 and r_1 come from the control points of the Bezier curves of the boundaries. These vectors represent the directions of the tangent vectors of the curves connected to the two end points V_0 and V_1. If the three vectors p_0, q_0 and r_0 lie on the same plane, and the three vectors p_1, q_1 and r_1 lie on the same plane, then the two surfaces which have a common boundary will meet with G1 continuity. Usually the elements a_i (i = 0, ..., 2) of this CBD function are set as follows:

a_0 = \frac{p_0 + q_0}{|p_0 + q_0|},   (7)

a_2 = \frac{p_1 + q_1}{|p_1 + q_1|},   (8)

a_1 = \frac{a_0 + a_2}{2}.   (9)
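A direct transcription of equations (7)-(9), as a sketch assuming numpy arrays for the 3-vectors:

    import numpy as np

    def cbd_elements(p0, q0, p1, q1):
        # Equations (7)-(9): unit bisector directions at the two end
        # points, with the middle element taken as their average.
        a0 = (p0 + q0) / np.linalg.norm(p0 + q0)
        a2 = (p1 + q1) / np.linalg.norm(p1 + q1)
        a1 = (a0 + a2) / 2.0
        return a0, a1, a2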

From the above equations, we see that the CBD function is linearly approximated. The 1 x 2 degree patch which consists of the elements a_i (i = 0, 1, 2) can be defined virtually; this patch is called the basis patch (Figure 6). The control vectors b_1 and b_2 can then be derived from equations (3), (4) and (5) as follows[Chiyokura 83]:

(10)

(11)

In Figure 2, the generated surfaces are distorted because the elements of the CBD function are approximated linearly. The CBD function has length and direction as its elements, and both have to be appropriately set to generate a non-distorted surface. For example, in the case of Figure 2(a), the direction of the cross boundary derivatives is inappropriate, and in the case of Figure 2(b), the length of the cross boundary derivatives is too long; distorted surfaces are generated in both cases. Therefore, we must consider a method of specifying a CBD function that will generate smooth surfaces.

3 Control point layout of the Gregory patch

The bicubic Gregory patch is defined by 20 control points (Figure 3). The 12 control points P_{ij} (ij ≠ 11, 12, 21, 22) represent the surface boundary, and the remaining 8 points P_{ijk} (ij = 11, 12, 21, 22; k = 0, 1) represent the cross boundary derivatives. The method of modifying the surface shape by moving the boundary control points is intuitive and effective, because the relationship between the surface shape and the boundary control points is clear. For the interior control points, on the other hand, the surface shape can be modified by moving these points, but the relationship between the surface shape and the control points is not so clear. Moreover, the designer has to take the adjoining surfaces into consideration if he/she wishes to move the interior control points while keeping G1 continuity. Since these interior control points generally cannot be moved freely, we will rename them definition points, and we will generate points that can control the surface and call them control points.

Figure 7: New control point layout of the Gregory patch.

We propose the new control point layout of the Gregory patch indicated in Figure 7. In this figure, the boundary control points coincide with the original boundary definition points (Figure 3), but the interior control points are represented by only 4 points. The interior control points are either initially calculated from the definition points when the patch is interpolated by Chiyokura's method, or they can be specified by a user. For example, the point Q_1 is on the boundary curve at surface parameter v = 0.5. The direction of vector Q_1Q_2 is defined from the partial derivative vector at point Q_1. The other points Q_i (i = 3, ..., 8) are represented in the same manner. Next, the positions of the interior control points Q_i (i = 2, 3, 6, 7) have to be defined. We consider the four control points (e.g., Q_i (i = 1, ..., 4)) which have the same

(a) (b)

Figure 8: Control point layout for distorted surfaces.

parameter direction. These points represent two end points and their cross boundary derivatives. A cubic Bezier curve can be defined when the lengths of the vectors Q_1Q_2 and Q_3Q_4 are set to 1/3 of the cross boundary derivatives. In this representation, the relationship between surface shape and control points becomes clearer. For example, the points Q_1, Q_2, Q_3 and Q_4 represent the Bezier curve S(u, 0.5). A cross section of the surface shape becomes similar to a Bezier curve, and if this curve shape becomes distorted, the surface shape also becomes distorted. Thus, this curve shape can help in the modification of the surface shape. The relationship between this curve and its control points is clear because it is just cubic Bezier curve control, and therefore the surface shape can be modified by moving the interior control points easily and intuitively. When the surface shape is modified by moving any of the control points, the surface has to be interpolated again while keeping the new cross boundary derivatives constant.
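In code, one row of this layout is an ordinary Hermite-to-Bezier conversion: the interior points sit at one third of the cross boundary derivative vectors. A minimal sketch (function and variable names are ours):

    def cross_section_bezier(Q1, d1, Q4, d4):
        # Cubic Bezier control points of a cross section such as S(u, 0.5):
        # Q2 = Q1 + d1 / 3 and Q3 = Q4 - d4 / 3, where d1 and d4 are the
        # cross boundary derivative vectors at the two boundary points.
        Q2 = tuple(a + b / 3.0 for a, b in zip(Q1, d1))
        Q3 = tuple(a - b / 3.0 for a, b in zip(Q4, d4))
        return Q1, Q2, Q3, Q4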

The new control point layouts for the two distorted surface cases of Figure 2 are indicated in Figure 8. In the case of Figure 8(a), we see that the Bezier curve defined from the control points Q_i (i = 1, ..., 7) is distorted; therefore, the generated surface is also distorted. The cause of this distortion is that the CBD functions along the boundaries E_0, E_1 and E_2 are linearly approximated. In the case of Figure 8(b), the surface is also distorted, because the lengths of the vectors Q_1Q_2 and Q_3Q_4 distort the Bezier curve defined from Q_1, Q_2, Q_3 and Q_4.

4 Joining two surface patches

In the previous section, we showed the new control point layout of the Gregory patch. The
control points represent the cross boundary derivatives at the middle points of the boundaries.
In this section, an algorithm is described to connect Gregory patches while having the constraint
of keeping the interior control points derived from the common boundary constant.

In our method, cubic Bezier curves are used for the boundary curve meshes. When we interpolate
the curve meshes, we raise the degree of the boundaries, and represent the meshes by biquartic
Gregory patches. The representation of the biquartic Gregory patch is as follows (Figure 9):

Figure 9: A biquartic Gregory patch.

G(u,v) = \sum_{i=0}^{4} \sum_{j=0}^{4} B_i^4(u) B_j^4(v) P_{ij}(u,v)   (12)

where

P_{11}(u,v) = \frac{u P_{110} + v P_{111}}{u + v},
P_{13}(u,v) = \frac{u P_{130} + (1-v) P_{131}}{u + (1-v)},
P_{31}(u,v) = \frac{(1-u) P_{310} + v P_{311}}{(1-u) + v},
P_{33}(u,v) = \frac{(1-u) P_{330} + (1-v) P_{331}}{(1-u) + (1-v)}.

B_i^4(u) is a quartic Bernstein polynomial as shown in equation (2).

In Figure 10(a), two surfaces S^a and S^b have the common boundary curve C(v), which is a cubic Bezier curve. Denote the first order partial derivatives of a patch S(u,v) (0 ≤ u, v ≤ 1) with respect to each parameter u, v as:

S_u(u,v) = \frac{\partial S(u,v)}{\partial u}, \qquad S_v(u,v) = \frac{\partial S(u,v)}{\partial v}.   (13)
We define the two general derivative vectors along the common boundary C(v) as S_u^a(1,v) and S_u^b(0,v). We consider the joining of the two Gregory patches S^a and S^b with G1 continuity by using basis patches. In Figure 10(b), the real patch S^a is joined to the basis patch S^c, and in Figure 10(c) the real patch S^b is joined to the basis patch S^d. The partial derivatives of each basis patch, S_u^c and S_u^d, along the common boundary have the same direction at all the common boundary points. If patches S^a and S^c meet with G1 continuity, and patches S^b and S^d meet with G1 continuity, then the two real patches S^a and S^b will meet with G1 continuity. In this example, we will consider the joining of the real patch S^b and the basis patch S^d.

The joining condition between the two patches with G1 continuity is shown in equation (3), where k(v) and h(v) are arbitrary scalar functions. Suppose that the basis patch derivative S_u^d is a quadratic

Figure 10: Joining two surface patches.

function; then the CBD function is defined as in equation (6). The vectors p_0, q_0, p_2 and q_2 in Figure 10(d) represent the vectors between the control points which are connected to the ends of the common boundary. The elements a_0 and a_2 of the CBD function are set as in equations (7) and (8).

The vectors p_1 and q_1 are determined by the new control points of the surfaces. Suppose that the CBD function g(t) is pre-defined as follows:

g(0.5) = \frac{p_1 + q_1}{|p_1 + q_1|}.   (14)

Then, vector a_1 is calculated as:

a_1 = \frac{g(0.5) - [B_0^2(0.5)\, a_0 + B_2^2(0.5)\, a_2]}{B_1^2(0.5)}.   (15)

Now we have defined the 1 x 2 degree basis patch.
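Since B_0^2(0.5) = B_2^2(0.5) = 1/4 and B_1^2(0.5) = 1/2, equation (15) reduces to a_1 = 2 g(0.5) - (a_0 + a_2)/2. A one-line sketch, assuming numpy arrays (or scalars) for the vectors:

    def middle_cbd_element(g_mid, a0, a2):
        # Equation (15) at t = 0.5, where B0 = B2 = 1/4 and B1 = 1/2.
        return 2.0 * g_mid - 0.5 * (a0 + a2)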

The three cross boundary derivatives at the surface parameters v = 0.0, v = 0.5 and v = 1.0 are now defined. To satisfy equation (3), the scalar functions k(v) and h(v) must be quadratic, and they are defined as the following quadratic functions:

k(v) = k_0 (1-v)^2 + 2 k_1 (1-v) v + k_2 v^2,   (16)

h(v) = h_0 (1-v)^2 + 2 h_1 (1-v) v + h_2 v^2.   (17)

The coefficients k_0, k_1, k_2, h_0, h_1 and h_2 are real numbers. As shown above, equation (3) is then a quartic polynomial equation; therefore, the degree of the generated patch becomes biquartic.

Figure 11: The vectors between control points.

We will now derive the equations connecting patch S^b and the basis patch S^d. Figure 11(a) shows the vectors a_i (i = 0, 1, 2), b_i (i = 0, ..., 4) and c_i (i = 0, ..., 3), which come from the control points. The derivatives S_u^d(1,v), S_u^b(0,v) and S_v^b(0,v) can be represented by polynomials in the above vectors. These polynomials are substituted in equation (3) to give:

\sum_{i=0}^{4} B_i^4(v)\, b_i = k(v) \sum_{i=0}^{2} B_i^2(v)\, a_i + h(v) \sum_{i=0}^{3} B_i^3(v)\, c_i   (18)

The term \sum_i B_i(v)\, a_i would generally be cubic if the degree of the patch is quartic, but we restrict this term to be a quadratic Bezier function. The boundary curve C(v) and the surface boundary S^b(0,v) are coincident. Since the boundary curve C(v) is a cubic Bezier curve, the derivative S_v^b(0,v) can be represented by a quadratic Bezier function; therefore the term \sum_{i=0}^{3} B_i^3(v)\, c_i can be reduced by 1 degree. Denote the vectors between the control points of the boundary curve C(v) as c'_j (j = 0, 1, 2) (Figure 11(b)). The derivative S_v^b(0,v) then becomes:

4 \sum_{i=0}^{3} B_i^3(v)\, c_i = 3 \sum_{i=0}^{2} B_i^2(v)\, c'_i.   (19)

Now the coefficients of the scalar functions k(v) and h(v) can be determined from 3 values of v. If v = 0.0, then

b_0 = k_0 a_0 + \frac{3}{4} h_0 c'_0.   (20)

For v = 1.0,

b_4 = k_2 a_2 + \frac{3}{4} h_2 c'_2.   (21)

As shown in Figure 11(c), let the derivative vectors of the surfaces at the common boundary point v = 0.5 be a, b and c. We get the following equation:

b = k(0.5)\, a + h(0.5)\, c.   (22)

We can get the coefficients k_0, k_2, h_0 and h_2 from the two equations (20) and (21), and the values for k_1 and h_1 from equations (16), (17) and (22).
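Equations (20) and (21) each equate a known 3-vector to a combination of two known 3-vectors with two scalar unknowns; under the G1 condition the known vector is coplanar with the other two, so a least-squares solve recovers the coefficients exactly. A sketch for equation (20) (names ours):

    import numpy as np

    def solve_k0_h0(b0, a0, c0p):
        # Solve b0 = k0 * a0 + (3/4) * h0 * c0p for the scalars k0 and h0.
        # Exact when b0 lies in the plane of a0 and c0p (the G1 condition);
        # otherwise this returns the least-squares values.
        A = np.column_stack([np.asarray(a0), 0.75 * np.asarray(c0p)])
        (k0, h0), *_ = np.linalg.lstsq(A, np.asarray(b0), rcond=None)
        return k0, h0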

Finally, we can solve equation (18), and the interior control vectors can be calculated as follows:

(23)

(24)

(25)

Therefore, a biquartic Gregory patch with G1 continuity can be generated when the cross boundary derivatives at the parameter 0.5 are specified. In this interpolation, the element a_1 of the CBD function is specified by the cross boundary derivative at the middle of the boundary. If the cross boundary derivatives at the middle of the boundary are appropriate, non-distorted surfaces are generated.

5 Controlling the surface shape

In the above section, we derived the equations for the control points when joining two patches by using a quadratic CBD function; the biquartic Gregory patch is generated from these equations. In this section, we present a method of controlling the surface shape of this patch.

Two Gregory patches S^a and S^b which have a common boundary curve C_0 are shown in Figure 12(a). The interior control points Q_i (i = 1, ..., 9) of the two patches, calculated from the definition points generated by Chiyokura's method, and their cross sections are displayed. Each interior control point represents the cross boundary derivative at the boundary (parameter 0.5). Assume that the two patches meet with G1 continuity at the common boundary C_0. Now if a user wishes to change the direction of the tangent plane at Q_1, the user selects the interior control point Q_2 and moves it to an arbitrary position as shown in Figure 12(b). Then, the surface shape is modified according to the vector Q_1Q_2. At this time, we have to change the direction of the adjacent control point Q_3, so as to keep G1 continuity. The cross section of the modified patch is displayed in Figure 12(b).

Figure 13 shows what happens if a user wishes to change the length of the derivative vector between Q_1 and Q_3 of Figure 12(b). In this case, the control point Q_2 is not changed, because the direction of the tangent plane is not modified. The cross section of the modified patch is displayed in Figure 13. The shape of the surface has changed, but G1 continuity is kept.

Therefore, from the above cases, if two patches meet with G1 continuity, it is necessary to change the control point Q_3 when the direction of control point Q_2 is changed; but when the length is changed between control points Q_1 and Q_3, there is no need to touch the control point Q_2.

Using the above rules, we can modify a distorted surface manually. In this modification, we locally change the control points which connect to the common boundary. It is usually effective if the surface is modified locally, but if the shape is distorted over a wide area, it is sometimes

Figure 12: The direction vector is changed.

Figure 13: The vector length is changed.



Figure 14: The algorithm of the global control (1).

Figure 15: Global control of the surface (1).

difficult to modify the surface shape manually. Therefore, we need a method of controlling the surface shape automatically and globally. We propose two kinds of global operations. The first kind is used when G1 continuity is not required at either end of the boundary; the second kind is used when G1 continuity must be kept at one end of the boundary. In the following, we describe these two methods for the global control of a surface shape.

The algorithm for the first case is now described. The original control points are shown in Figure 14(a). In this case, the continuity at Q_1 and Q_10 does not change, even if control points Q_2 and Q_9 are moved.

1. Generate an arc of a cubic from the four boundary points Q_1, Q_4, Q_7 and Q_10 (see Figure 14(b)). If these points lie on the same line, generate a straight line.
2. Divide this arc (line) at the points Q_4 and Q_7, so that three arcs (lines) A_1, A_2 and A_3 are generated (see Figure 14(c)).
3. Calculate the derivative vectors at the end points of each arc (line) A_1, A_2 and A_3.

Figure 16: The algorithm of the global control (2).

4. The control point Q_2 is calculated from the vector whose direction is the same as the derivative vector of the arc A_1 at the point Q_1, and whose length is 1/3 of the derivative vector. The control point Q_3 is calculated from the vector whose direction is opposite to the derivative vector of the arc A_1 at the point Q_4, and whose length is 1/3 of the derivative vector. In the same way, the control points Q_5, Q_6, Q_8 and Q_9 are calculated from the arcs A_2 and A_3 (see Figure 14(d)); a code sketch of these steps is given below.
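The sketch below follows one plausible reading of steps 1-4: fit an interpolating cubic through the four boundary points at parameters 0, 1/3, 2/3 and 1, treat each third as one arc (so an arc's end derivative is the curve derivative scaled by the interval length), and place each interior control point at 1/3 of that derivative. The parameterization and all names are our assumptions:

    import numpy as np

    def global_control_1(Q1, Q4, Q7, Q10):
        # Returns [Q2, Q3, Q5, Q6, Q8, Q9] from a cubic through the four
        # boundary points (step 1), divided at Q4 and Q7 (step 2).
        pts = np.array([Q1, Q4, Q7, Q10], dtype=float)
        ts = np.array([0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0])
        coeffs = [np.polyfit(ts, pts[:, k], 3) for k in range(pts.shape[1])]
        derivs = [np.polyder(c) for c in coeffs]
        point = lambda t: np.array([np.polyval(c, t) for c in coeffs])
        dC = lambda t: np.array([np.polyval(d, t) for d in derivs])
        out = []
        for t0, t1 in [(0.0, 1 / 3), (1 / 3, 2 / 3), (2 / 3, 1.0)]:
            h = t1 - t0                    # arc derivative = h * dC(t)
            out.append(point(t0) + h * dC(t0) / 3.0)   # start control point
            out.append(point(t1) - h * dC(t1) / 3.0)   # end control point
        return out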

An example of an application of the above algorithm is shown in Figure 15. Cross sections of the three surfaces S^a, S^b and S^c are distorted as shown in Figure 15(a). The control points Q_i (i = 1, ..., 10) of the distorted surfaces are shown in Figure 15(b). When the control points are moved as shown in Figure 15(c), smooth surfaces are generated as in Figure 15(d).

Next, we describe the algorithm for the second case. The original control points are shown in Figure 16(a). In this case, the shape of the next patch is already determined, and the direction of the vector Q_1Q_2 cannot be changed.

1. Generate an arc of a cubic from the control points Q_1 and Q_4 and the tangent vector Q_1Q_2 (Figure 16(b)).
2. The control points Q_2 and Q_3 are calculated from vectors whose directions are the same as the derivative vectors of this arc at the points Q_1 and Q_4, and whose lengths are 1/3 of the derivative vectors (Figure 16(c)).

An example of the above algorithm is shown in Figure 17. Cross sections of the two surfaces S^a and S^b are distorted as shown in Figure 17(a). The control points Q_i (i = 1, ..., 4) of the distorted surface are shown in Figure 17(b). When the control points are moved as shown in Figure 17(c), smooth surfaces are generated as in Figure 17(d).

Any distorted surface can be easily corrected by using the above algorithms. Referring back to Figure 8(a), the direction of the control points Q_i (i = 1, ..., 7) is inappropriate, so that the generated surface patch is distorted. When the control points are moved by the first method, as shown in Figure 18(a), smooth surfaces are generated, as in Figure 19(a).

Figure 8(b) is another example, where the interior curve defined by the control points Q_1, Q_2, Q_3 and Q_4 is distorted, so that the generated surface is also distorted. Hence, when the control

Figure 17: Global control of the surface (2).


(a) (b)

Figure 18: Modify the control points automatically.



(a) (b)

Figure 19: Shading images.

points Q_2 and Q_3 are modified by the second method as shown in Figure 18(b), the surface becomes smooth.

From the shaded images, it is obvious that no distorted surfaces exist (Figure 19).

6 Applications for the generalized Gregory patch

In the surface interpolation of irregular curve meshes, interior curves are generated to subdivide the patch[Chiyokura 86] so that topologically rectangular patches can be generated. Interior curves are also generated if the number of curves in the curve mesh is more than 4, even when the shape of the curve mesh is rectangular. For example, Figure 20(a) shows a free-form surface S_1, and Figure 20(b) shows an approximation of the offset surface S_2 that contains many piecewise curves. When the offset surface S_2 is created from the surface S_1, the side surfaces S_i (i = 3, ..., 6) are also created, as shown in Figure 20(b). These side surfaces have n curve segments as their boundaries; therefore, interior curves are created when the side surfaces are interpolated. Sometimes these interior curves are distorted, and this leads to distorted generated surfaces.

The bicubic generalized Gregory patch, which is created by applying a compatibility correction to a bicubic generalized Coons patch, is useful in this case, because multiple curve segments can be treated as one curve, and so a non-four-sided face which is geometrically four sided can be interpolated by one patch. The bicubic generalized Gregory patch consists of the derivative vectors in each parameter direction and the boundary mesh curves. These derivative vectors are represented by the Gregory patch's interior control points. Since the bicubic generalized Gregory patch has a form similar to the Gregory patch, the shape control method explained in section 5 can be applied.

(a) (b)

Figure 20: Surfaces consisting of piecewise curves.

Figure 21: A generalized Coons patch.

The generation process of the generalized Gregory patch is described next.

6.1 The generation of a generalized Gregory patch

A compatibility corrected generalized Coons patch is represented by boundary curves and the derivative vectors on the boundary. Let the boundary curves be S(u,0), S(u,1), S(0,v) and S(1,v), and let the derivative vectors be S_u and S_v as follows:

(26)

Figure 22: A construction of a generalized Coons patch.

Figure 23: Process to generate a generalized Gregory patch: (a) derivative vectors of S^a; (b) bicubic Bezier patch S^{a'}; (c) bicubic Bezier patch S^{b'}; (d) surface S'.

(27)

The derivative vectors along the boundary are S_v(u,0), S_v(u,1), S_u(0,v) and S_u(1,v). The generalized Coons patch S(u,v) (0 ≤ u, v ≤ 1) becomes as shown in Figure 21, and is represented by composing three polynomials[Barnhill 78], S^a, S^b and S^c, as follows (Figure 22):

S(u,v) = S^a + S^b - S^c.   (28)

S^a is defined by the two curves S(0,v) and S(1,v) and the curves' derivative vectors S_u(0,v) and S_u(1,v). S^b is defined by the two curves S(u,0) and S(u,1) and the curves' derivative vectors S_v(u,0) and S_v(u,1). And S^c is defined by the duplicate parts of S^a and S^b.
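The composition of equation (28) is the classical Coons construction. As a minimal illustration only, here is the first-order (bilinearly blended) version, in which S^a and S^b are ruled surfaces between opposite boundaries and S^c is the bilinear patch through the corners; the paper's generalized patch blends the derivative vectors as well, which this sketch omits:

    def coons(c_u0, c_u1, c_0v, c_1v, u, v):
        # Bilinearly blended Coons patch S = S^a + S^b - S^c, where
        # c_u0(u) = S(u,0), c_u1(u) = S(u,1), c_0v(v) = S(0,v), c_1v(v) = S(1,v).
        Sa = [(1 - u) * x + u * y for x, y in zip(c_0v(v), c_1v(v))]
        Sb = [(1 - v) * x + v * y for x, y in zip(c_u0(u), c_u1(u))]
        P00, P10, P01, P11 = c_u0(0.0), c_u0(1.0), c_u1(0.0), c_u1(1.0)
        Sc = [(1 - u) * (1 - v) * w + u * (1 - v) * x
              + (1 - u) * v * y + u * v * z
              for w, x, y, z in zip(P00, P10, P01, P11)]
        return [a + b - c for a, b, c in zip(Sa, Sb, Sc)]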

The derivative vectors S_u and S_v of the compatibility corrected generalized Coons patch can be derived from the boundary piecewise curve mesh, by the same interpolation method used for the interior control points of the Gregory patch. Therefore, the generalized Coons patch can be represented by the interior control points of the Gregory patch and the boundary piecewise curves. We call the patch which consists of this representation the generalized Gregory patch. In the following, a method of generating a generalized Gregory patch is shown. If we consider equation (28), we see that the process consists of generating the surfaces S^a, S^b and S^c. Let us consider the surface S^a. Vectors a, b, c and d are calculated by linearly interpolating the 1/3 length derivative vectors S_u(0,0), S_u(0,1), S_u(1,0) and S_u(1,1) of the boundary curves at the end points; refer to Figure 23(a). Similarly, the control points P_0, P_1, P_2 and P_3 are calculated from the 1/3 length derivative vectors S_v(0,0), S_v(0,1), S_v(1,0) and S_v(1,1). By using these vectors and control points, a bicubic Bezier patch is generated as shown in Figure 23(b); let this patch be S^{a'}. A cross section of points on the patch S^a at parameter value v can be represented by a cubic Bezier curve by using the

boundary curves S(0,v) and S(1,v) as follows:

S^a(u,v) = \sum_{i=0}^{3} B_i^3(u)\, Q_i   (29)

where Q_i (i = 0, ..., 3) is a control point of the Bezier curve. From this we define the Q_i as follows:

Q_0 = S(0,v),   (30)

Q_1 = S(0,v) + \frac{1}{3} S_u^{a'}(0,v),   (31)

Q_2 = S(1,v) - \frac{1}{3} S_u^{a'}(1,v),   (32)

Q_3 = S(1,v).   (33)

Similarly, if we consider the v direction of the surface parameter, the Bezier patch S^{b'} can be defined for the surface S^b. Therefore, a curve S^b(u,v) on the patch S^b is defined as follows:

S^b(u,v) = \sum_{i=0}^{3} B_i^3(v)\, Q_i   (34)

where Q_i (i = 0, ..., 3) is a control point of the Bezier curve. From this we define the Q_i as follows:

Q_0 = S(u,0),   (35)

Q_1 = S(u,0) + \frac{1}{3} S_v^{b'}(u,0),   (36)

Q_2 = S(u,1) - \frac{1}{3} S_v^{b'}(u,1),   (37)

Q_3 = S(u,1).   (38)

This process is shown in Figure 23(c).

As the surface S^c represents the duplicate parts of the surfaces S^a and S^b, it can be represented by the bicubic Gregory patch which is defined by the bicubic Bezier patches S^{a'} and S^{b'}. The control points P_i (i = 0, ..., 11) of the patch S^{a'} and the control points P_i (i = 12, ..., 19) of the patch S^{b'} are assigned to S^c as shown in Figure 23(d).

6.2 Examples of modifying the shape

The process of modifying curve meshes using the generalized Gregory patch described in the above section is now shown.

Figure 24(a) shows a curve mesh which contains n curve segments. At first, G1 continuous curve segments at each end point are selected, and these curve segments are treated as one curve. In the case of Figure 24(b), four curves C_0, C_1, C_2 and C_3 are selected. By using these curves, the generalized Gregory patch is generated. Figure 24(b) shows the control points and the cross-section curves of the generalized Gregory patch. Figure 24(c) shows an example where the control point Q_2 of the generalized Gregory patch is moved, and Figure 24(d) shows an example where the length of the vector Q_1Q_2 of the generalized Gregory patch is changed. As shown in these figures, the shape of the generalized Gregory patch can be controlled in a manner similar to the Gregory patch.

Figure 24: Examples of shape modifications.

7 Conclusion

An interpolation process of cubic Bezier curve meshes by biquartic Gregory patches has been described. In this method, a biquartic Gregory patch is generated by using a quadratic CBD function. The three cross boundary derivative vectors, which are located at the two end points and the middle point of the boundary, can be set to generate a surface. The two derivative vectors at the end points are restricted to come from the boundary mesh, but the middle derivative vector is set freely. Therefore, the user can specify a middle cross boundary derivative vector freely while designing a free-form surface.

Using the above characteristics, the surface shape can be controlled by modifying the middle cross boundary derivative interactively. We therefore defined a new control point layout for the Gregory patch, which is based on the cross boundary derivative vectors. The user can control the surface shape by modifying these control points, because the relationship between these control points and the surface shape is clear. If the resulting patches are distorted, the shape of the patches can be changed to smooth surfaces.

When curve meshes consisting of n curve segments appear, such as when an offset surface is generated, the generalized Gregory patch is useful. Using this patch, the curve meshes are treated as four curves and interpolated by one patch; therefore non-distorted surfaces are generated. This patch has a form similar to the Gregory patch, and the user can control its surface shape.

This shape control method and the generalized Gregory patch have been implemented on the solid modeler DESIGNBASE, which RICOH Co., Ltd. developed, and their effectiveness has been verified experimentally.

8 Acknowledgement

We would like to thank Dr. Hideko S. Kunii, General Manager of RICOH's Software Division,
for her encouragement and advice; Mr. Aidan O'Neill of RICOH Co. for his valuable comments
and for assistance with this text; and the members of the DESIGNBASE group of RICOH Co.
for their help and discussion.

References
[Barnhill 78] BARNHILL, R.E. AND BROWN, J.H. AND KLUCEWICZ, I.M., "A new twist in
computer aided geometric design," Computer Graphics and Image Processing,
Vol. 8, 1978, pp. 78-91.
[Barnhill 85] BARNHILL, R.E., "Surfaces in computer aided geometric design:A survey with
new results," Computer Aided Geometric Design, Vol. 2, No 1-3, 1985, pp. 1-17.
[Bartels 87] BARTELS, R.H. AND BEATTY, J.C. AND BARSKY, B.A., "An introduction to
splines for use in computer graphics & geometric modeling," Morgan Kaufmann,
Los Altos, California, 1987.
[Bezier 66] BEZIER, P., "Definition numerique des courbes et surfaces" , Automatisme,
Vol. 11, 1966, pp. 625-632.
[Bezier 67] BEZIER, P., "Definition numerique des courbes et surfaces (II)", Automatisme,
Vol. 12, 1967, pp. 17-21.
[Chiyokura 83] CHIYOKURA, H. AND KIMURA, F., "Design of solids with free-form surfaces,"
Computer Graphics (Proc. SIGGRAPH 83), Vol. 17, No.3, 1983, pp. 289-298.
[Chiyokura 86] CHIYOKURA, H., "Localized surface interpolation for irregular meshes," Ad-
vanced Computer Graphics (Proc. Computer Graphics Tokyo '86), T.L. Kunii,
Ed., Springer-Verlag, Tokyo, pp. 3-19.
[Chiyokura 88] CHIYOKURA, H., Solid modelling with DESIGNBASE, Addison Wesley, Read-
ing, MA, 1988.
[Coons 64] COONS, S.A., "Surface for computer-aided design of space figures", MIT, 1964.
[Farin 88] FARIN, G., Curves and surfaces for computer aided geometric design, Academic
Press, San Diego, California, 1988.
[Faux 79] FAUX, I.D. AND PRATT, M.J., Computational geometry for design and manu-
facture, Ellis Horwood Limited, London, 1979.
[Gregory 74] GREGORY, J.A., "Smooth interpolation without twist constraints", Barnhill, R.E. and Riesenfeld, R.F., eds., Computer Aided Geometric Design, Academic Press, New York, 1974, pp. 71-87.
[Rogers 90] ROGERS, D.F. AND ADAMS, J.A., Mathematical elements for computer graph-
ics, Second Ed., McGraw-Hill, New York, 1990.
[Sarraga 87] SARRAGA, R.F., "Gl interpolation of generally unrestricted cubic Bezier
curves," Computer Aided Geometric Design, Vol. 4, No 1-2, 1987, pp. 23-39.
[Shirman 87] SHIRMAN, L.A. AND SEQUIN, C.H., "Local surface interpolation with Bezier
patches," Computer Aided Geometric Design, Vol. 4, No 4, 1987, pp. 279-295.

[Shirman 90] SHIRMAN, L.A. AND SEQUIN, C.H., "Local surface interpolation with shape parameters between adjoining Gregory patches," Computer Aided Geometric Design, Vol. 7, No 5, 1990, pp. 375-388.
[Takai 90] TAKAI, K. AND WANG, K.K., "Curvature continuous Gregory patch: A modification of Gregory patch for continuity of curvature," Proceedings of the Japan-U.S.A. Symposium on Flexible Automation, ASME, 1990, pp. 1205-1211.

BIOGRAPHY

Kouichi KONNO, a member of the 3D CAD project at RICOH's Software Division, is interested in solid modeling, free-form surface interpolation and rendering algorithms. His current research includes the controlling and generating of surfaces. He received a BS in Information Science in 1985 from the University of Tsukuba. He entered the solid modeling project at RICOH in 1985, which has now developed into the product DESIGNBASE.
Address: RICOH Co., Ltd. Software Division, 1-17, Koishikawa-cho 1-Chome, Bunkyo-ku, Tokyo, 112, Japan Tel: (03) 3815-7261; Fax: (03) 3818-0348
E-mail: konno@src.ricoh.co.jp

Teiji TAKAMURA, a member of the 3D CAD project at RICOH's Software Division, is interested in solid modeling, free-form surface interpolation and computer graphics. His current research includes general free-form surface intersections and rational parametric surface interpolations by using rational boundary Gregory patches. He received a BS in Information Science in 1982 from the University of Tokyo. He entered the solid modeling project at RICOH in 1984, which has now developed into the product DESIGNBASE. He is a member of ACM SIGGRAPH and the IEEE Computer Society. Several of his papers have been selected by NICOGRAPH.
Address: RICOH Co., Ltd. Software Division, 1-17, Koishikawa-cho 1-Chome, Bunkyo-ku, Tokyo, 112, Japan Tel: (03) 3815-7261; Fax: (03) 3818-0348
E-mail: takamura@src.ricoh.co.jp

Hiroaki CHIYOKURA is an associate professor of the Faculty of Environment Information at Keio University. His research interests are solid modeling, computer graphics, and their applications to computer-aided design and manufacturing. He received his BS and MS in mathematics from Keio University in 1979 and 1980, respectively. He earned his Dr.Eng. in precision machinery engineering from the University of Tokyo in 1984. He has written a book, "Solid Modelling with DESIGNBASE: Theory and Implementation", published by Addison-Wesley. He is a member of ACM SIGGRAPH.
Address: Keio University, 5322, Endo, Fujisawa-shi, Kanagawa, 252, Japan Tel: (0466) 47-5111; Fax: (0466) 47-5041
E-mail: chiyo@sfc.keio.ac.jp
Hybrid Models and Conversion Algorithms
for Solid Object Representation
Leila De Floriani and Enrico Puppo

Abstract
Traditional object representation schemes are often inadequate for supporting the variety of
object manipulation tasks required in a modern solid modeling system. Hybrid schemes try to
achieve a higher representative power by embedding the characteristics of several traditional
models. The schemes we consider are Modular Boundary Models (MBMs), which describe the
boundary of a solid object as the combination of face-abutting object parts represented in a
boundary form, PM-Octrees, which are a combination of octrees and boundary representation,
and PM-CSG trees, which are essentially octrees whose leaves represent CSG primitives. In
particular, we will focus our attention on an MBM, called the Face-to-Face Composition (FFC)
model, which also stores interference information useful to perform Boolean operations. We dis-
cuss the problems involved in conversion algorithms which operate between traditional models,
and we review conversion algorithms on PM-Octrees and PM-CSG trees. We present a new
algorithm for boundary evaluation of an FFC model, and discuss the problem of producing an
FFC model from a boundary representation.

Keywords: Solid Modeling, Hybrid solid models, Conversion Algorithms, CAD/CAM, Data
Structures.

1 Introduction

In the last few years the problem of providing a description of geometric objects within computer
systems has become more and more important. The significant development of application fields
dealing with solid objects, like CAD/CAM and robotics, is an excellent motivation to look for
effective and efficient solutions. Solid modeling is an important discipline whose goal is to be
able to express the entire nature of three-dimensional objects [Sam90a], and to make computer
systems capable of answering geometric questions algorithmically [Män87]. Key points in design-
ing a solid model are the capability of satisfying some general requirements, like completeness,
integrity, regularity etc., while giving a representation which is expressive and efficient enough
to be used in practical applications.

CAD/CAM systems must allow users to perform different operations on the solid objects
they represent. Two non-trivial, yet basic, operations are the creation (which involves an in-
teraction between the designer and the machine) and the display/drawing of an object. Other
important operations involve computations of integral properties of an object (e.g., volume or
mass computation). Furthermore, a modern CAD/CAM system should allow an efficient interac-
tion between different objects (i.e., perform Boolean and interference computations), supporting

457
458

graphic interactive modifications, and simulating/driving machining and assembly operations.

No representation scheme among the ones known at the present state of the art is the best
one for all operations. Most commercial solid modelers developed in the past ten years are based
either on Constructive Solid Geometry (CSG) [Voe77] or on boundary representation (BRep).
CSG describes an object as a Boolean combination of primitive components or half-spaces,
while in a BRep objects are defined by their enclosing surfaces. CSG schemes are especially
well-suited for the object creation task (interactive input), while a BRep is possibly the best
model for display purposes. It also allows an efficient computation of integral properties (only
in the case of objects with planar faces), and the explicit representation of tolerance and surface
finish information.
Spatial enumerations and, more recently, hierarchical space decomposition models (like
the Octree [Mea80] and its variants) have been proposed mainly as secondary schemes in solid
modelers. Such models are especially well-suited for computing integral properties and Boolean
operations, and can be easily obtained from both CSG and BRep descriptions, but they give
only an approximate description of an object.

Since the early eighties the need for systems which are able to support several represen-
tations has been highlighted [Req82]. As a compromise between the two major traditional
schemes, many systems have been built which accept their input in a CSG form, and automat-
ically convert it to a boundary representation, which is used as a primary scheme for all other
system operations. However, such systems suffer from disadvantages, such as inefficiency in evaluating Boolean operations and the high complexity of the conversion algorithms [Req85,Män87].
Modeling systems using more than one representation clearly require conversion algorithms
capable of translating data among the different schemes. Conversion algorithms have to satisfy
both theoretical and practical requirements. In other words, the algorithm must guarantee
that the output model is always correct and consistent with the input one; at the same time,
the algorithm must be efficient enough to be usable in a practical system. Consistency is
an important issue if one wants to keep different models of the same object, and perform an
automatic update of all the models when one of them is modified.
The general problem of converting between different representations has been discussed by
Requicha [Req80] and Mäntylä [Män87]. In principle, exact totally invertible conversions would
be desirable, that could guarantee the full consistency of two models. Unfortunately, the total
consistency is possible only if the modeling system is limited to support those operations on
solids that can be mapped on each representation. As a consequence, if the above requirement
is fulfilled, the more representation schemes we include in a system, the more limitations we get
on the objects we can represent and on the operations we can perform on them.

Another aspect of conversion is when one wants to build a representation of a solid object
from another representation, without the need of converting back to the original model after
the second one has been modified. This is the case when an object model suitable for manu-
facturing purposes must be constructed from the model produced through the design phase. A
representation for manufacturing purpose should describe the object in terms of the so-called
form features [Wil85] (e.g., through holes, pockets, slots, chamfers) which are related to specific machining and assembly processes. The algorithms which construct from a BRep a CSG
model or a Modular Boundary Model through a feature recognition and organization process
are examples of such irreversible conversions.

In section 3, we discuss the problems related to the conversion between boundary repre-
sentations and constructive models. While the conversion from CSG to BRep (called boundary
evaluation) is a computationally intensive task, the inverse problem, i.e., the creation of a CSG
model from a boundary representation, is still an open problem. CSG models are not unique,
and thus prevent a priori invertible conversions. Hence, it is still impossible to convert data
from CSG to a boundary representation, modify them (with any tool/algorithm working on the
BRep) and give them back (algorithmically) to the designer for further interactive intervention.
Recently, some hybrid models have been proposed, which are capable of incorporating
useful features of classical models or can play a role as intermediate structures in conversions.
The PM-Octree [Bru85,Car85,Fuj85,Sam90a] can be interpreted as a combination of Octree
and BRep, and can be easily obtained from a CSG description and converted from and to a
BRep. A different combination of models is the PM-eSG [Wyv85], which is essentially an octree
structure whose leaves refer to CSG primitives. Modular Boundary Models [DeF88,DeF89b] try
to incorporate the advantages of CSG and BRep models, while encoding information useful for
Boolean and interference computation. Such models seem especially well-suited for feature-
based design, and for describing machining and assembly operations [DeF89a].
In section 4 we present algorithms for converting from and to a hybrid model. This shows
that hybrid models, like PM-Octrees or PM-CSG trees, can be used either as basic representation
scheme in a solid modeling system or in connection with a traditional scheme like BRep or CSG.
MBMs are more suitable to be the single representation on which a solid modeling system is
based than to be a support structure for other representations. The algorithm for evaluating
an MBM presented in section 4.5 shows that a boundary representation can be easily produced
from an MBM. This ensures the compatibility of a system based on an MBM with traditional systems using a BRep.

2 Object Representation Schemes

A broad classification of representations for CAD/CAM applications is into volumetric, boundary


and hybrid schemes. Volumetric schemes describe an object in terms of solid primitives covering
its volume, while boundary schemes describe an object in terms of the surfaces enclosing it. Hy-
brid schemes, like the PM-Octree and the modular boundary representations, are combinations
of the two approaches.
Volumetric representation schemes can be classified into decomposition models, which de-
scribe an object as a collection of simple primitive cells combined with a single "glueing" op-
eration, and into constructive models, which describe an object as the Boolean combination of
primitive point sets.
Decomposition schemes most commonly used within CAD systems are space-based adaptive
schemes, like the Octree and the Bintree [Mea80,Sam90a]. Such schemes describe an object in
terms of the 3D space it occupies by recursively subdividing the space into regular volume
elements. The object representation depends on the position in the space occupied by the
object, and is generally an approximated representation, dependent on the resolution of the
spatial subdivision.
Constructive Solid Geometry (CSG) defines a family of schemes for representing solids as
Boolean combination of primitive components. The most natural way to represent a CSG model

is the so-called CSG tree, which is a binary tree in which internal nodes represent operators,
which can be either rigid motions or regularized union, intersection or difference, while terminal
nodes are either primitive leaves which represent subsets of E^3, or transformation leaves con-
taining the defining arguments of rigid motions. CSG schemes are unambiguous, but not unique:
different CSG trees may describe the same object. The domain of a CSG scheme depends on the
set of primitive solids and on the motional and Boolean operators available. If the primitives of
a CSG scheme are bounded, then any CSG tree is a valid representation if the primitive leaves
are valid.
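As a toy illustration of the scheme (ours, not from the paper), a minimal CSG tree with point membership classification; regularization and rigid-motion leaves are omitted:

    from dataclasses import dataclass

    @dataclass
    class Sphere:                          # primitive leaf
        center: tuple
        radius: float
        def contains(self, p):
            return sum((a - b) ** 2
                       for a, b in zip(p, self.center)) <= self.radius ** 2

    @dataclass
    class Boolean:                         # internal node
        op: str                            # 'union' | 'inter' | 'diff'
        left: object
        right: object
        def contains(self, p):
            l, r = self.left.contains(p), self.right.contains(p)
            return {'union': l or r, 'inter': l and r,
                    'diff': l and not r}[self.op]

    # Two overlapping unit spheres, minus their common lens-shaped region.
    s1, s2 = Sphere((0, 0, 0), 1.0), Sphere((1, 0, 0), 1.0)
    solid = Boolean('diff', Boolean('union', s1, s2), Boolean('inter', s1, s2))
    print(solid.contains((0.5, 0.0, 0.0)))  # False: the point lies in both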
A boundary representation (BRep) of an object is a geometric and topological description
of its enclosing faces. The object boundary is segmented into a finite number of bounded sub-
sets, called faces. Each face is, in turn, represented by its bounding edges and vertices. In order
to describe objects with multiply connected faces or internal cavities, two derived topological
entities are defined, the loop and the shell. A loop on a face f is a closed chain of edges bounding
f. A shell of an object S is defined as any maximal connected set of object faces. In a BRep
a clear separation is made between the topological and geometric information. Topological
information is concerned with the adjacency relations between pairs of individual topological
entities (e.g., edges and vertices). The data structures proposed in the literature to encode a
BRep are characterized by the number and kinds of relations they store [Bau72,Ans85,Woo85].
The geometric description consists of the shape and location in space of each of the primitive
topological entities. Boundary schemes can represent a wide variety of solid objects at arbitrary
levels of detail. Such schemes are unambiguous, if faces are represented unambiguously, but gen-
erally they are not unique. Storage requirements for boundary representations are usually quite
large, especially when curved-faced objects are approximated by polyhedral models. Boundary
representations are especially useful to generate graphical outputs, because of the availability
of boundary information, and for describing tolerances and surface finish information. Integral
properties can be easily and efficiently computed from a BRep when operating in a planar-faced
environment.
Hybrid representation schemes have been defined as combinations of the approaches dis-
cussed above. There are two major classes of hybrid models, PM-Octrees and PM-CSG trees,
which combine a boundary representation or a CSG model with an octree, and Modular Bound-
ary Models (MBMs), which combine a BRep with a restricted set of Boolean operations. Such
models are discussed in detail in the next three subsections.

2.1 PM-Octree

The major drawback of classical adaptive space-based decomposition models, like the Octree or
the Bintree [Mea80,Sam90a], is that they provide an approximate description of a solid object. To overcome this problem, several authors have proposed schemes which combine the octree with a boundary representation [Aya85,Car85,Fuj85,Dür89,Sam90a]. These schemes have different names, but their underlying ideas are very similar. Samet [Sam90a] discusses such structures, introducing the name PM-Octree, which we use here.
Like an Octree, a PM-Octree is based on the recursive regular subdivision of a finite cubic
universe containing the object into octants. The root of the PM-Octree describes the universe,
while internal nodes represent octants. Each non-terminal node is divided into eight equally
sized octants. Terminal nodes can be full or void, as in the octree, or can be face, edge or vertex
nodes (Dürst and Kunii introduce a more complex classification of nodes to handle special cases).

Figure 1: Face, edge and vertex nodes in a PM-Octree [Nav89].

The latter three kinds of nodes are partially occupied; they are introduced to maintain exact
information: the portion of a node which is contained in the object can be computed from the
geometric information associated with the node itself. Face nodes are crossed by a single object
face; edge nodes contain exactly two edge-adjacent faces together with a part of their common
edge; vertex nodes contain exactly one vertex and portions of all the edges and faces incident
on it (see Figure 1).

The underlying philosophy is to use the recursive subdivision in order to reduce the
boundary information locally to a simple form: partial nodes will generally involve just one to three
planes crossing them, though the planes could be more than three for some vertex nodes. For
the proper execution of many operations, the fact that a node is of type vertex, edge or face is
not sufficient to characterize the object completely. Carlbom et al. [Car85] store with each node
the polygons determined by the boundary of the object that intersects the node. This requires a
considerable amount of extra information. Navazo [Nav89] uses a more efficient representation
by including sufficient information to allow the classification of any point with respect to that
node (i.e., whether a point inside the node is inside, outside or on the boundary of the object).
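As a rough illustration of how a partial node can carry exact information, the Python sketch below classifies a point against the planes stored in a partial leaf. It assumes the simple convex configuration in which the material inside the node is the intersection of the inner half-spaces (face normals directed outward); Navazo's scheme handles further configurations, and all names here are hypothetical.

# Point classification inside a partial PM-Octree leaf from its stored
# plane equations; convex configuration assumed, illustrative only.
from dataclasses import dataclass
from typing import List, Tuple

Plane = Tuple[float, float, float, float]    # (a, b, c, d) with ax+by+cz+d = 0

@dataclass
class PartialLeaf:
    planes: List[Plane]    # 1 plane: face node, 2: edge node, >= 3: vertex node

def classify_point_in_leaf(leaf: PartialLeaf, p: Tuple[float, float, float]) -> str:
    """Return 'in', 'on' or 'out' for a point lying in the leaf's block."""
    eps = 1e-9
    values = [a*p[0] + b*p[1] + c*p[2] + d for (a, b, c, d) in leaf.planes]
    if all(v < -eps for v in values):
        return 'in'                    # strictly on the inner side of every plane
    if all(v <= eps for v in values):
        return 'on'                    # on the boundary of the object
    return 'out'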

PM-Octrees were first defined for planar-faced objects; they have since been extended
to deal with faces described by biquadratic patches. PM-Octree schemes are unambiguous,
but not unique, because of positional nonuniqueness and of the nonuniqueness of boundary schemes.
The main advantage of PM-Octrees is the simplicity of the algorithms which implement Boolean
operations. These algorithms are based on a parallel traversal of the two input trees and are
conceptually quite similar to algorithms for performing Boolean operations on octrees. Also
visualization and computation of integral properties can be performed quite efficiently on PM-
Octrees [Bru89,Nav89,Sam90a].

2.2 PM-CSG tree

Wyvill and Kunii [Wyv85] propose a variant of the CSG tree (with bounded primitives) that
we call a CSG-DAG. This scheme describes an object as the combination of previously defined
objects, or primitive objects (e.g., cylinders, blocks). The Boolean operations allowed are set
addition (disjoint union) and set subtraction. The CSG-DAG is a directed acyclic graph in
which the internal nodes are disjoint union or set subtraction, and the leaf nodes correspond to

Figure 2: An example of PM-CSG octree [Wyv85].

objects, which can be primitive or non-primitive. If a leaf node is a non-primitive object, then
the node contains a pointer to the CSG-DAG describing it. A DAG is used instead of a tree
because the same subobject (primitive or non-primitive) can be used several times in the same
structure without replicating its description. Each occurrence of a subobject in the CSG-DAG
also contains two matrices describing the relative spatial location and shape of the instance of
the subobject referred to by this node.
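The structure can be sketched in Python as follows; the names are invented for illustration, and the two 4x4 matrices per occurrence stand in for whatever placement and shaping transformations an actual system would store.

# A sketch of a CSG-DAG: internal nodes are disjoint unions or subtractions,
# leaves reference (possibly shared) subobjects. Illustrative only.
from dataclasses import dataclass
from typing import List, Union

Matrix4 = List[List[float]]                  # 4x4 homogeneous transformation

@dataclass
class PrimitiveObject:
    name: str                                # e.g. 'cylinder', 'block'

@dataclass
class Occurrence:
    node: "DagNode"                          # shared subobject, primitive or not
    placement: Matrix4                       # relative spatial location of the instance
    shaping: Matrix4                         # shape (e.g. scaling) of the instance

@dataclass
class OpNode:
    op: str                                  # 'disjoint_union' or 'subtraction'
    left: "Occurrence"
    right: "Occurrence"

DagNode = Union[PrimitiveObject, OpNode]

IDENT = [[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]
SHIFT = [[1.0, 0, 0, 2.0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]

# The same subobject is referenced twice without replicating its description:
cyl = PrimitiveObject('cylinder')
pair = OpNode('disjoint_union',
              Occurrence(cyl, IDENT, IDENT),
              Occurrence(cyl, SHIFT, IDENT))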

The CSG-DAG has the same properties as classical CSG. This representation has been
defined to facilitate the construction of a spatial index (called a PM-CSG tree) on top of the
representation. The PM-CSG tree [Wyv85] is a combination of an Octree with a CSG. In
essence, the underlying concept is similar to the one the PM-Octree is based on: the
definition of a leaf node is varied so that it refers to a primitive object instead of a vertex, edge or face,
as in the case of the PM-Octree. The decomposition criterion is such that only one primitive
object is allowed to occupy each cell. In a PM-CSG tree we can have five types of leaf nodes:

(i) a full node, which is entirely inside a primitive object;

(ii) an empty node, which is entirely outside the object;

(iii) a positive boundary node, which describes the boundary between empty space and exactly
one positive object;

(iv) a negative boundary node, which contains the boundary between a primitive object S1 and
another primitive object S2, where S2 is subtracted from S1, in such a way that the space
corresponding to S2 is actually empty;

(v) a nasty node, which is a node at the finest level of resolution (and thus cannot be further
decomposed) that cannot be classified in the previous types.

Nasty nodes can contain more than one primitive object; they usually occur along edges where
primitive (positive) objects meet. The model cannot be considered completely exact, be-
cause nasty cells occur wherever two different primitive objects meet. In practical applications

Figure 3: Component C1 is combined with C2 by restricted difference, and the result is combined with
C3 by disjoint union. The three components are all face-adjacent at the hatched connection face.

[Wyv85], however, nasty cells can be ignored since the volume of such cells at a sufficiently high
level of resolution is very small. Figure 2 shows an example of a PM-CSG tree.

The decomposition criterion is realizable because the CSG-DAG allows only a restricted
set of Boolean operations: a pure CSG representation would allow general union of objects,
allowing relevant portions of space to be occupied by more than a single primitive object.

2.3 Modular Boundary Models: the FFC Model

Modular Boundary Models (MBMs) describe the boundary of a solid object as the combination
of face-abutting object parts, each of which is described by a boundary scheme. The information
required to combine the boundaries of the various parts is stored as a graph [DeF88,DeF89b].
The purpose of MBMs is to combine the ease of design of CSG schemes with the ease of display and
surface representation of conventional boundary schemes.

One of the major limitations of classical CAD models (CSG, BRep, Octrees) is their
inability to describe form features and their relations. "Pure" solid modelers cannot
be used for assembly and machining planning because they do not contain information (e.g.,
tolerances and dimensions, form features, materials) required for these tasks [Fal89]. Modular
boundary models are an attempt to fill the gap between the design and manufacturing phases
because of their capability of representing form features as model components. Moreover, the
MBM description of the object produced by the designer, in which components represent design
features, can be locally modified (because of the modular nature of the model) to produce an
MBM description in terms of manufacturing features.

Each part forming an MBM is called a component. A component is a solid object bounded
by a compact, orientable two-manifold surface. There are two kinds of components in an MBM:
positive and negative components. The kind of a component is defined by the directions of its
surface normals: in a positive component they are directed outward, in a negative component
they are directed inward. Positive and negative components are combined along faces through
a glue operator, according to the following composition rules (see Figure 3; a small code sketch follows the rules):

(i) if two components Ci and Cj are both positive or both negative, then they must intersect
only along their boundaries and share portions of faces (in this case, the glue operation is
a disjoint union);
(ii) if Ci and Cj have opposite signs, then one must be contained in the other, and the
two components must share portions of faces (in this case, the glue operation is a restricted
difference).
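The two rules can be condensed into a small selection function; in the Python sketch below the Boolean predicates are assumed to be supplied by geometric tests in the modeler, and all names are hypothetical.

# Selecting the glue operator for a pair of MBM components from their
# signs; a sketch of the composition rules, nothing more.
def glue_operator(positive_i: bool, positive_j: bool,
                  share_face_portions: bool, one_contains_other: bool) -> str:
    if positive_i == positive_j:
        # rule (i): equal signs -> the components may meet only along
        # shared portions of faces, and the glue is a disjoint union
        if not share_face_portions:
            raise ValueError("same-sign components must share face portions")
        return 'disjoint union'
    # rule (ii): opposite signs -> one component is contained in the other
    # and they share face portions: the glue is a restricted difference
    if not (one_contains_other and share_face_portions):
        raise ValueError("opposite-sign components must nest and share faces")
    return 'restricted difference'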

Since it is based on a Boolean combination of components represented in boundary form, an
MBM could be considered a hybrid boundary-CSG representation. The conceptual difference
is that, due to the restricted operations allowed, an MBM describes connection information
between face-adjacent components explicitly, while the two operators are described implicitly.
Thus, an MBM is an unevaluated representation from which the reconstruction of the object
boundary is a task involving only two-dimensional representations.

The first MBMs, like the Hierarchical Face Adjacency Hypergraph (HFAH) [DeF88] and the
Object Decomposition Graph (ODG) [DeF89b,DeF89c], describe only face-adjacency relations
between pairs of components in the form of a directed graph. The Face-to-Face Composition
(FFC) model [DeF89a] is an improvement over first-generation MBMs, since it also stores
spatial interference information among components. We call any face of a component which has
a non-empty bi-dimensional intersection with a face of at least another component a connection
face. If we consider both the spatial interference among components and the partition of the
component connection faces into subfaces shared with other face-adjacent components, we obtain
a partition of the boundary of each component Ci into portions of its original faces, called facets.
A facet f of Ci is a maximal connected portion of a face of Ci such that the regularized intersection
of f with any other component is either void or equal to f. A facet of a component is either a
connection facet, when it is a subset of a connection face adjacent to at least a face of another
component, or a boundary facet otherwise (see Figure 3).
Let S be a solid object and MD be a family of positive and negative components defining
a modular decomposition of S into face-adjacent components, which, when combined through a
glue operator, give S. The collection of the connection and boundary facets of all components
of MD defines a fragmentation F of the union of the boundaries of the components of S (see
Figure 4). Each facet fj in a fragmentation F can be classified with respect to a component Ci
as follows:

- fj is a connection facet for Ci (i.e., fj belongs to the boundary of Ci and is shared by at
least another component face-adjacent to Ci);
- fj is a boundary facet for Ci (i.e., fj belongs to the boundary of Ci and is not shared by
any other component);
- fj is internal to Ci (i.e., fj belongs to the boundary of another component and is contained
in Ci);
- fj is external to Ci (i.e., fj belongs to the boundary of another component and has no
interference with Ci).

A connection facet is called homogeneous if it is a connection facet for an even number
of components; otherwise it is called non-homogeneous. Homogeneous connection facets do not
belong to the boundary of the object expressed by the FFC model.

Figure 4: An object S (a), its modular decomposition (b), and the fragmentation of the faces in the
modular decomposition (c).


Figure 5: A modular decomposition (a) and its FFC graph (b).

A modular decomposition MD of an object S and the fragmentation F of the faces of the
components in MD define the FFC model, denoted M = (MD, F). A high-level description of
an FFC model is given by a hypergraph, called the FFC graph. The nodes of the FFC graph
describe the components in MD; the hyperarcs describe the internal and the connection facets.

More formally, the FFC graph is a pair G = (N, A), where each node in N corresponds
to a component Ci in MD and each hyperarc hr corresponds to a facet fr in F; hyperarc hr is
an ordered k-tuple (Cr1, Cr2, ..., Crk), where the Crs, s = 1, ..., k, are the components in MD
sharing or containing fr. Facet fr must be either a connection facet for at least two components
in hr, or an internal facet for at least two components in hr, and not external with respect to
any component in hr. An attribute is associated with each hyperarc hr, which is an ordered
k-tuple (ar1, ar2, ..., ark), where ars can be I (internal), C (connection) or B (boundary). Figure
5 shows an example of an FFC graph.

Note that the external relation is not represented in the FFC graph. Also, a facet which is
only a boundary facet, and is not internal to any other component, is not described in the FFC
graph. A hyperarc in the FFC graph is called a connection arc if at least one of its attributes is
of type connection. The spanning subgraph of the FFC graph containing all the connection arcs
is called the connection subgraph of the FFC graph (since it describes the connection relations
among the FFC components).
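In Python, the FFC graph might be encoded along the following lines; the field names are ours, and each facet is reduced to an identifier, since only the combinatorial structure matters here.

# A sketch of the FFC graph as a hypergraph: nodes are components, and
# each hyperarc pairs the ordered tuple of components with the parallel
# tuple of I/C/B attributes. Illustrative names only.
from dataclasses import dataclass
from typing import List

@dataclass
class Component:
    name: str
    positive: bool                  # positive or negative component

@dataclass
class Hyperarc:
    facet_id: int                   # the facet fr described by this hyperarc
    nodes: List[Component]          # ordered k-tuple (Cr1, ..., Crk)
    attrs: List[str]                # parallel k-tuple of 'I', 'C' or 'B'

    def is_connection_arc(self) -> bool:
        return 'C' in self.attrs    # at least one attribute of type connection

@dataclass
class FFCGraph:
    nodes: List[Component]
    arcs: List[Hyperarc]

    def connection_subgraph(self) -> "FFCGraph":
        # spanning subgraph: all the nodes, only the connection arcs
        return FFCGraph(self.nodes, [h for h in self.arcs if h.is_connection_arc()])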

An FFC model described by a modular decomposition MD can be represented as a subdi-
vision of the portion of 3D space defined by the union of the components in MD into pairwise
quasi-disjoint 3D cells, called a cellular representation of the FFC model [DeF91]. An encod-
ing of this representation as a modification of the radial edge data structure [Wei86] is used
as the data structure for the FFC model in a geometric modeler currently under development.

3 Conversion Problems on Classical Models

3.1 Conversion from CSG to boundary representation

The problem of converting from a CSG model to a BRep is well known in solid modeling under
the name of boundary evaluation. During the eighties, the increasing popularity of combined
CSG and BRep schemes made boundary evaluation algorithms more and more
important as a core tool of many solid modelers. Despite the importance of the problem,
the literature on this topic is surprisingly poor. For quite some time only a few descriptions of
boundary evaluation algorithms were available, mainly in reports with limited circulation,
where the attention was essentially focused on implementation details and efficiency heuristics.

A theory underlying boundary evaluation (and other important problems in compu-
tational geometry) was presented in 1980 by Tilove [Til80b], who developed the concept of set
membership classification. Only in 1985 did Requicha and Voelcker publish a paper [Req85] based on
this theory which addressed the problem in a general way and offered a high-level description
of the approach and of the algorithms. Since then, little progress has been made in designing
efficient algorithms that convert from CSG to BRep, though much effort has been spent in
finding ways to reduce the complexity of the edge/face detection task (see, for instance, [Ros89]).
The boundary evaluation problem is inherently hard, because two complex yet deeply dif-
ferent structures must be handled, and a great amount of information which is only implicitly
encoded in the input model must be made explicit. All known algorithms for boundary evalu-
ation are based on the set membership classification mentioned above. The classification M(X, S)
of a candidate set X with respect to a reference set S is a segmentation of X into three subsets,
XwrtS = (XinS, XonS, XoutS), where the subsets contain the parts of X which are inside, on
the boundary of, or outside S, respectively. Classifications can be combined, which makes set mem-
bership classification a powerful tool. If one is able to combine classifications through the regularized
set operators [Til80a], the boundary of an object described by a CSG tree can be evaluated by
recursive application of the combination algorithm (boundary merging) to pairs of components in
the tree.
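For the simplest candidate set, a single point, the classification and the combination step can be sketched in a few lines of Python. The combination tables below are valid only in the non-singular case, where the boundaries of the two arguments do not overlap; the augmented classifications discussed next are needed otherwise. Everything here is illustrative.

# Point/solid set membership classification over a CSG tree, combining
# 'in'/'on'/'out' answers through per-operator tables; a sketch that
# assumes no overlapping boundaries (no singularities).
def classify_point(p, node):
    """node is either a callable leaf returning 'in'/'on'/'out' for a
    point, or a triple (op, left, right) with op a key of COMBINE."""
    if callable(node):                       # primitive leaf classifier
        return node(p)
    op, left, right = node
    return COMBINE[op][(classify_point(p, left), classify_point(p, right))]

def _table(rule):
    states = ('in', 'on', 'out')
    return {(a, b): rule(a, b) for a in states for b in states}

COMBINE = {
    'union':        _table(lambda a, b: 'in' if 'in' in (a, b)
                           else 'out' if (a, b) == ('out', 'out') else 'on'),
    'intersection': _table(lambda a, b: 'out' if 'out' in (a, b)
                           else 'in' if (a, b) == ('in', 'in') else 'on'),
    'difference':   _table(lambda a, b: 'in' if (a, b) == ('in', 'out')
                           else 'out' if a == 'out' or b == 'in' else 'on'),
}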

The problem of computing a combination of two classifications is defined as follows: given
classifications XwrtA and XwrtB, find XwrtS, where S = A ⊗ B, and ⊗ denotes one of the
regularized set operators [Til80a]. Combined classifications are obtained very simply if objects
do not have overlapping boundaries (e.g., XinS = XinA ∪ XinB, XoutS = X −* XinS). In
the case of overlapping boundaries, singularities can arise, and more information is needed in
order to compute combinations: a classification must be augmented, including neighborhood

(a)
paradigm FACE_GENERATE_AND_TEST(S, B);
// S is the input object; B is the output BRep //
  < generate a sufficient set TF of tentative faces for S >;
  for every F ∈ TF do
    FwrtS ← CLASSIFY_FACE(F, S);
    ADD_TO_BREP(FonS, B);
  end for
end FACE_GENERATE_AND_TEST;

(b)
function CLASSIFY(X, S): classification;
// X is the candidate set; S is the reference set encoded by CSG;
   the returned classification will be (XinS, XonS, XoutS). //
begin
  if IS_A_PRIMITIVE(S) then
    return(CLASSIFY_WRT_PRIMITIVE(X, S))
  else
    return(COMBINE(CLASSIFY(X, LEFT_SUBTREE(S)),
                   CLASSIFY(X, RIGHT_SUBTREE(S)), OPERATOR(S)))
  end if
end CLASSIFY;

Figure 6: Generate-and-Test paradigm (a) and Divide-and-Conquer classification function (b) for boundary evaluation.

information for the elements of XonS. Augmented classifications are discussed both in [Til80b]
and in [Req85], and are fundamental in practical applications; neighborhood information is
somewhat complex to represent and is essentially related to the treatment of geometric singularities.
Classification algorithms differ deeply according to the nature of the sets under considera-
tion: the dimensionality of the geometric entities determines the structure of an algorithm. Several
algorithms for point/face, line/face, point/solid and line/solid classification have been developed and
discussed briefly by Requicha and Voelcker [Req85]. Sophisticated point/face and point/point
algorithms working directly on BReps are also described in detail by Mäntylä [Män87]. All such
algorithms can be part of a boundary evaluation (or a boundary merging) algorithm, depending
on the strategy used to design the algorithm itself.
Generally speaking, set membership classification is applied to the boundary evaluation
problem by exploiting the following fact: given an object S, if a superset X of the boundary ∂S
of S is known, then ∂S = XonS. In the (conceptually) simplest case, a superset F of the faces
of S is considered, in such a way that FonS will contain exactly the collection of the faces of
S. In this case, a Generate-and-Test paradigm is applied, which is described in Figure 6a.

Tentative sets of faces can be easily generated by the recursive application of the following
relation: for every pair of solid objects A and B, ∂(A ⊗ B) ⊆ (∂A ∪ ∂B). The classification
of geometric entities with respect to a solid object described by a CSG is instead a delicate
subject. The hierarchical structure of CSG leads to a Divide-and-Conquer approach, which is
schematized in Figure 6b (the names of the subfunctions are self-explanatory).
In principle, face/solid classification could be implemented directly through the above
paradigm, in order to obtain the procedure CLASSIFY_FACE used in the Generate-and-Test pa-
radigm. In practice, explicit face/solid classification can be very complex to perform, because
Boolean operations on face subsets are involved; thus, such an approach is seldom used.
Since the faces of a solid are bounded by closed loops of edges, an alternative edge-based
approach is usually preferred, which follows an appropriate Generate-and-Test paradigm. In
this case, edge generation is a fundamental task that usually involves plane intersections and
two-dimensional line/face classification algorithms.
Requicha and Voelcker described in [Req85] an edge-based boundary merging procedure,
which is based on the arguments discussed so far. The procedure involves two- and three-
dimensional edge/face classifications, Boolean operations on edge sets, various geometric line/-
plane and line/line tests, as well as a rather complex part concerning the treatment of singular-
ities through neighborhood information.
The definition of an incremental boundary evaluation algorithm which uses recursively
such a procedure is straightforward by applying a Divide-and-Conquer paradigm, provided that
BReps for primitives are available or can be computed easily (e.g., through intersection of
enclosing planes).
Requicha and Voelcker do not give an explicit analysis of the computational complexity
of the boundary evaluation, boundary merging, set membership classification, and combination
algorithms. All algorithms are polynomial with high exponents, but the worst-case complexity
seems to be quite pessimistic with respect to experimental results. The average complexity has
not been analyzed at all, and in any case experiments show that such algorithms are still quite
inefficient.
Several heuristic shortcuts have been proposed to improve performance by trying to avoid
useless and cumbersome computations. Only recently has a formal approach been taken, by Rossignac
and Voelcker, which addresses the problem of computing only on entities and over regions of space
that can affect the desired final result [Ros89]. Rossignac and Voelcker introduce and formalize
the concept of active zone in CSG. Roughly speaking, the active zone Z of a node A in a CSG
representation of a solid S is the spatial region in which changes to A affect S and hence ∂S.
Active zones can be computed as combinations of certain nodes of the CSG tree (depending on the
position of A in S) and can be used for boundary evaluation as follows. If A is a subcomponent
of object S, and X ⊆ ∂A is a candidate set we want to classify with respect to S, then if Z is
the active zone of A in S, the classical set membership classification can be replaced by an
approach which trims X with respect to Z. The use of active zones allows, on average, early
rejection of classified elements, thus avoiding redundant computations.
Alternative solutions to boundary evaluation have been proposed by some authors,
which avoid a complete reconstruction of the BRep. Such solutions are mainly concerned with
specific tasks and sometimes offer efficient evaluations of the CSG model, but they do not
support systems based on combined CSG and BRep. Thibault and Naylor [Thi87],
for instance, propose conversion from CSG to an intermediate, hierarchical structure, called
a Binary Space Partitioning Tree [Sam90a], which is a binary tree representing a recursive
partition of 3D space by hyperplanes. This structure allows a fast evaluation of set-theoretic
Boolean expressions and the development of efficient ray-tracing algorithms.
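Point queries against such a structure are particularly simple; the Python sketch below shows point location in a BSP tree with leaves reduced to 'in'/'out' cells (on-boundary handling omitted), and is not a rendering of Thibault and Naylor's full algorithms.

# Point location in a BSP tree: internal nodes store a partitioning
# plane, leaves are 'in' or 'out' cells; a minimal sketch.
from dataclasses import dataclass
from typing import Tuple, Union

Plane = Tuple[float, float, float, float]    # ax + by + cz + d = 0

@dataclass
class BSPNode:
    plane: Plane
    front: "BSP"                             # subtree on the positive side
    back: "BSP"                              # subtree on the negative side

BSP = Union["BSPNode", str]                  # a leaf is just 'in' or 'out'

def locate(p: Tuple[float, float, float], tree: BSP) -> str:
    while isinstance(tree, BSPNode):
        a, b, c, d = tree.plane
        tree = tree.front if a*p[0] + b*p[1] + c*p[2] + d >= 0 else tree.back
    return tree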

3.2 Converting to CSG Schemes

The problem of converting any representation to a CSG description is quite hard, because of
the intrinsic nonuniqueness of CSG schemes: the same object can be expressed as a Boolean
combination of primitives in many different ways. Also, CSG is essentially a method for ex-
pressing the way an object is designed: using a CSG scheme makes sense only if the designer
can interact with it.
Not much work has been done on conversion algorithms to CSG, and the existing algorithms
produce descriptions which are of little use for practical purposes. The existing papers
dealing with this problem convert from a boundary representation or from a PM-Octree.
Woo [Woo82] and Juan [Jua88,Jua89] developed algorithms based on convex hull tech-
niques that convert from BRep to CSG. The algorithm by Woo produces a CSG which is a
Boolean combination (through disjoint union and subtraction only) of convex parts, while the
algorithm by Juan produces a CSG based on half-spaces, and whose internal nodes are regular-
ized unions and intersections.
An algorithm which converts from a PM-Octree to a CSG based on half-spaces has also
been presented by Juan [Jua89]. The algorithm essentially computes the CSG tree describing
the portion of the object within each node in the PM-Octree, then it tries to condense together
CSG trees related to adjacent blocks, according to some consensus rules. The resulting CSG tree
can be quite complex and somewhat "unnatural", because the same half-space can be replicated
many times inside it.
A conversion from the PM-CSG model [Wyv85] to the CSG-DAG has not been attempted so
far. Yet, the PM-CSG seems better suited to this task than any other model. In fact, the
way the PM-CSG description of an object relates to its corresponding CSG-DAG is similar to
the way the PM-Octree relates to the BRep. In other words, all information present in the CSG-
DAG is maintained in the PM-CSG tree (in particular, all primitive objects are conserved). It would
be interesting to investigate inverse transformation algorithms from the PM-CSG. In particular, it
would be important to understand whether the application of such algorithms to objects which have
been modified while encoded as a PM-CSG would produce the correct modifications
on the original CSG.
Among the algorithms described in the first part of this section, the method proposed by
Woo has been possibly the only attempt towards the use of CSG representations as solid models
for manufacturing: the algorithm produces a description of an object S where each cavity of S
can be expressed as the convex hull of the cavity plus its protrusions, and so on. A disadvantage
of that method is that the resulting convex volumes are neither necessarily related to specific
machining tasks, nor do they correspond to predefined primitive components. No information is
produced for classifying these volumes as specific form features, and thus no guidance can be
provided for choosing the manufacturing method appropriate for each part.
This subject is quite wide and concerns not only the conversion problem, but also the gen-
eral use of CSG models within a CAD/CAM system and the way design and manufacturing
techniques can be interfaced with computerized systems. Roughly speaking, an object should
be designed in a way as closely related as possible to a manufacturing sequence that produces
it. Also, if a CSG is used to encode the design, the compositions of primitives within the model
should also correspond to assembly or machining operations. While some machining operations

are essentially material removal and thus can be expressed as subtraction operations, assembly
operations are more effectively expressed as face-to-face part combinations.
It might be useful to compute a primitive-based CSG from a BRep as a description for
numerical control machining. In this sense, a CSG can be automatically derived from a BRep
by methods for recognizing the so-called form features of an object (holes, pockets, slots, etc.).
Several methods have been proposed in the last few years for automatic feature recognition.
Kyprianou [Kyp80] uses a feature grammar to recognize features characterized by protrusions
and depressions from a boundary description of the solid (see also [Jar84]). Henderson [Hen84]
extracts cavities from a BRep by using expert system rules. Joshi and Chang [Jos88] extract
polyhedral features, such as pockets, slots, steps and holes, from a boundary description, by using
a graph matching approach. The features to be recognized are described by graphs represent-
ing the adjacency relations among the part faces. A method for feature recognition based on
topological information is described in [DeF89b], which identifies a certain class of form features
by using connectivity information extracted from a graph description of the object boundary.
A conceptually similar, but simpler, approach is described in [DeF89c]. A combination of this
method with Kyprianou's approach is described in [Fal89].
These latter methods produce a description of the object in which the form features
are extracted in terms of a modular boundary model. An MBM gives a more flexible and
powerful representation of form features, since form features are attached to object faces and
have tolerance information associated with them. Thus, MBMs are considered better suited to
describing form features than CSG schemes.

4 Conversion Algorithms on Hybrid Models

4.1 Conversion from Boundary Representation to PM-Octree

In [Bru85] Brunet and Navazo propose a conversion algorithm which takes a boundary represen-
tation of an object with planar faces and converts it to a PM-Octree. The algorithm follows a
block classification paradigm: the space is recursively subdivided into octants; at each recursion
step a list of active faces for each octant is selected from the list of active faces at the upper
level. The recursive subdivision is performed until such information can be encoded in a leaf
node. The authors use a linear DF-expression [Kaw80] for the output tree, where six different
types of nodes are used, namely: B (full, or black), W (empty, or white), F (face), E (edge), V
(vertex), and G (partial, or grey).
Nodes of type F, E and V must contain a pointer to a structure describing the face, edge or
vertex contained in the node, respectively. Given the list of plane equations of all the object faces
(from the boundary model), it is sufficient to store pointers into such a list: one face is described
by one plane, one edge by two planes, and one vertex by three or more planes intersecting at
the vertex itself. If planes are encoded in the list in such a way that the normal vectors of the
corresponding faces are always directed towards the interior [the exterior] of the object, such
normal vectors define the configuration of the block, i.e., the portion of the block which is void
and the one which is full (see Figure 7).

The main procedure, shown in Figure 8, performs the recursive subdivision and
outputs the PM-Octree. The output structure is considered global and is updated by procedure

PUTOUT which adds the new nodes to the DF-expression and sets links to the list of planes.
At its first call, the procedure is passed the whole universe and the complete list of faces from
the boundary representation.

In their implementation, the authors also maintain a list of active vertices within the block,
which is used and recomputed by procedure CLIPPING and is consulted by the branch tests to
improve efficiency. Procedure CLIPPING must select the list of faces [of vertices] contained in
the current block from the list of faces [of vertices] active in the parent block. This task a priori
involves cumbersome and numerically unstable intersection computations, as outlined
in the previous sections. The authors propose to apply to each face the following cascade of
heuristic tests to avoid, if possible, such computations:

- detect whether any vertex of the face lies inside the block (to accept the face);
- use a max-min test between the block and the parallelepiped containing the face (to discard
the face);
- check whether the plane of the face fails to intersect the block (to discard the face);
- try to intersect the edges of the face with the faces of the block (to accept the face).

If all the previous tests do not give sufficient information to accept or discard the face, an explicit
intersection between the face and the block is computed.
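The two cheapest tests of the cascade are easy to state precisely; the Python sketch below assumes an axis-aligned cubic block given by its origin and edge length, and a face given by its vertex list, with all names invented for illustration.

# The first two heuristic tests: vertex-in-block acceptance and the
# max-min bounding-box rejection; a sketch under the assumptions above.
def vertex_inside_block(face_vertices, origin, edge):
    """Accept the face if any of its vertices lies inside the block."""
    ox, oy, oz = origin
    return any(ox <= x <= ox + edge and oy <= y <= oy + edge
               and oz <= z <= oz + edge for (x, y, z) in face_vertices)

def maxmin_reject(face_vertices, origin, edge):
    """Discard the face if its bounding box misses the block (max-min test)."""
    for axis in range(3):
        lo = min(v[axis] for v in face_vertices)
        hi = max(v[axis] for v in face_vertices)
        if hi < origin[axis] or lo > origin[axis] + edge:
            return True                      # disjoint along this axis
    return False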
When the condition for a full or void block holds, the color of the output node can be
decided efficiently on the basis of its adjacent face, edge and vertex nodes. The color can initially
be left undefined, and set to B or W by visiting the output tree after all "boundary"
nodes have been detected. Alternatively, a connected component algorithm can be applied to label
black nodes; this latter approach was first developed in an algorithm for converting a
BRep to a bintree by Tamminen and Samet [Tam84].
Some tests in the if cascade inside procedure BUILD_OCTREE require computing inter-
sections of planes, and possibly locating the resulting point or edge with respect to the current block.
Special cases, like a block containing two planes which do not intersect inside it, must be treated.
The complexity of the algorithm is linear in the number of nodes of the output tree, though
each step requires more computation than the corresponding block classification algorithm for
computing a standard Octree. On the other hand, Brunet and Navazo show a comparison
between the Octree and the PM-Octree of the same solid object: for the PM-Octree, the byte length
of the output is reduced by a factor of 1,000 and the number of leaf nodes by as much as a factor
of 10,000.

Figure 7: Two possible configurations, (a) and (b), for an edge node in a PM-Octree [Sam90a].

procedure BUILD_OCTREE(x, y, z, scale, facelist);
// x, y, z and scale define the block that is analyzed (its origin and the length of its edge);
   facelist is the list of active faces in the block. //
begin
  sq ← scale/2;
  for i ← 1 to 8 do
    compute the origin xi, yi, zi of the ith octant;
    CLIPPING(xi, yi, zi, sq, facelist, facelist1);
    if facelist1 is empty then
      if the node is interior then
        PUTOUT("B")
      else
        PUTOUT("W")
      end if
    else if only one face in facelist1 then
      PUTOUT("F", facelist1)
    else if only two faces meeting at an edge in facelist1 then
      PUTOUT("E", facelist1)
    else if all faces in facelist1 meet at a point inside the block then
      PUTOUT("V", facelist1)
    else if sq > minscale then
      PUTOUT("G");
      BUILD_OCTREE(xi, yi, zi, sq, facelist1)
    else
      PUTOUT("G", facelist1)
    end if
  end for
end BUILD_OCTREE;

Figure 8: Recursive procedure that converts from BRep to PM-Octree.

The algorithm proposed by Carlbom et al. [Car85] differs from the previous one in using
clipped polygons instead of complete faces as active information at each recursion step. The
main effort in the subdivision task is devoted to the clipping algorithm: when subdividing a
block into octants, three orthogonal clipping planes are activated in sequence. Each polygon
is traversed sequentially along its edges, and intersections with the current clipping plane are
computed. The vertices of the polygon are partitioned into three classes, one for the vertices
on each side of the plane, and a third for the vertices on the plane. The subdivision step
(computing clipped polygons) seems more complex than the one of the previous
algorithm. On the other hand, when the leaf level is reached, the information related to each
block can be stored explicitly (i.e., through vertices and edges) without any further computation,
whereas Brunet and Navazo store it implicitly (i.e., through plane equations).
As in the previous algorithm, in the first step only the boundary of the object (i.e., face,
edge and vertex nodes) is built. Carlbom et al. propose to decide the color of void and full
blocks using a ray-tracing technique like the one described in [Rot82].
Finally, Dürst and Kunii [Dür89] propose another algorithm which, while following the
classical subdivision paradigm, tries to improve performance by exploiting all the geometric
information available in the BRep: namely, vertices, edges and faces. At each subdivision step,
active vertices are checked first against the current block. If more than one vertex is still active,
the block is recursively split without any further computation; if just one vertex is active, a
vertex leaf is returned. Similarly, the algorithm checks in an if cascade whether more than one,
exactly one, or no edge, and then whether more than one, exactly one, or no face, is still active,
in order to perform branches. The algorithm is slightly more complex than the previous ones
because the output structure is more sophisticated.

4.2 Conversion from CSG to PM-Octree

Algorithms that convert CSG schemes to exact octree models are quite similar to the corresponding
algorithms for converting from CSG to approximate octree models. Navazo, Fontdecaba and
Brunet propose an algorithm [Nav87] which follows an approach developed first for standard
octrees by Samet and Tamminen in [Sam85a,Sam85b].
The algorithm takes a CSG based on planar half-spaces (which the authors call an expanded
CSG tree) as input. First the model is scaled and translated into a cubic universe with an edge
of length 2^n. The algorithm traverses the CSG tree and builds the PM-Octree recursively, by
checking at each recursion step the information active within the current block (subuniverse).

The original input is a CSG tree where internal nodes are intersection and union operations
and leaves are half-spaces. As in the algorithm for conversion to the Octree, full or void leaves
can appear when pruning the CSG tree to restrict it to a subuniverse. Again, the subdivision
process stops when the current block can be encoded as a PM-Octree node, or when the block is at
the voxel level.
The main recursive procedure is initially passed the entire universe, the complete CSG
tree, and the list of all half-spaces in the tree. The output format is a linear octree, as in the
algorithm by Brunet and Navazo presented in the previous section, and is again updated by
procedure PUTOUT. This procedure also sets the pointers to the appropriate plane equations
and computes the node configurations for nodes of type face, edge and vertex. The pseudo-code
description of the main procedure is shown in Figure 9. The names of the procedures and functions
are self-explanatory.
The PM-Octree produced by this procedure can be pruned by traversing it and
condensing each group of eight similar adjacent nodes at the same level into a unique
node at the higher level. The authors propose to prune the tree dynamically at each new
insertion. The pruning procedure is called by procedure PUTOUT and checks the last nine
elements of the output list: if there is one G node followed by eight similar sons, they
are condensed, and the procedure is called recursively on the updated list. The concept of
similarity among nodes is not straightforward when nodes of type E or V are present.
Procedure SUBDIVIDE handles the special case of a block intersecting just two planes
which do not intersect inside the block itself. In this case, the block is recursively split until all
the leaf blocks obtained are of type F, B or W.
Procedure CLIPPING then determines which half-spaces in the list are still active in passing
from a father node to a son node. The procedure substitutes each half-space that is no longer
active (i.e., whose defining plane does not intersect the block) with a B or a W node. Then, the
CSG tree is pruned by traversing it in depth-first order and removing inactive CSG
nodes (the pruning procedure is analogous to the one developed by Samet and Tamminen in

procedure BUILD_OCTREE(x, y, z, scale, CSGT, spacelist);
// x, y, z and scale define the block that is analyzed (its origin and the length of its edge);
   CSGT is the CSG tree related to the current block;
   spacelist is the list of active half-spaces wrt the current block.
   Local symbols π1, π2 are used to denote half-spaces.
   Symbol [BO] denotes a generic Boolean operation. //
begin
  if CSGT = "B" then
    PUTOUT("B")
  else if CSGT = "W" then
    PUTOUT("W")
  else if CSGT = π1 then
    PUTOUT("F", CSGT)
  else if CSGT = π1 [BO] π2 then
    if EDGE_IN_CUBE(x, y, z, scale, spacelist) then
      PUTOUT("E", CSGT, spacelist)
    else
      SUBDIVIDE(x, y, z, scale, CSGT, spacelist)
    end if
  else if ALL_PLANES_THROUGH_A_POINT(spacelist, V) and
          POINT_IN_CUBE(V, x, y, z, scale) then
    PUTOUT("V", CSGT, spacelist)
  else
    newsc ← scale/2;
    if newsc = minscale then
      PUTOUT("G", CSGT, spacelist)
    else
      PUTOUT("G");
      for i ← 1 to 8 do
        COMPUTE_COORD(i, x, y, z, newsc, xn, yn, zn);
        CLIPPING(xn, yn, zn, newsc, CSGT, spacelist, NCSGT, nspacelist);
        BUILD_OCTREE(xn, yn, zn, newsc, NCSGT, nspacelist);
      end for
    end if
  end if
end BUILD_OCTREE

Figure 9: Recursive procedure that converts from CSG to PM-Octree.

[Sam85a]).
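The combined effect of clipping and pruning on the CSG tree can be sketched in Python for half-space leaves: a half-space whose plane misses the block degenerates to a full or void leaf, and the Boolean structure is then simplified bottom-up. This is a simplified reading of the approach, with hypothetical names and without the surrounding octree machinery.

# Restricting a half-space CSG tree (unions/intersections over leaves
# ('h', (a, b, c, d)) meaning ax+by+cz+d <= 0) to a cubic block
# ((ox, oy, oz), edge); a sketch, not the published algorithm.
from itertools import product

def corners(block):
    (ox, oy, oz), e = block
    return [(ox + dx, oy + dy, oz + dz) for dx, dy, dz in product((0.0, e), repeat=3)]

def prune(node, block):
    if node[0] == 'h':                            # half-space leaf
        a, b, c, d = node[1]
        vals = [a*x + b*y + c*z + d for (x, y, z) in corners(block)]
        if all(v <= 0 for v in vals): return 'B'  # block inside the half-space
        if all(v >= 0 for v in vals): return 'W'  # block outside it
        return node                               # plane crosses: still active
    op, left, right = node
    l, r = prune(left, block), prune(right, block)
    if op == 'union':
        if 'B' in (l, r): return 'B'              # B union x = B
        if l == 'W': return r                     # W union x = x
        if r == 'W': return l
    else:                                         # intersection
        if 'W' in (l, r): return 'W'              # W intersect x = W
        if l == 'B': return r                     # B intersect x = x
        if r == 'B': return l
    return (op, l, r)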

4.3 Conversion from CSG-DAG to PM-CSG

In [Wyv85], Wyvill and Kunii propose an algorithm for converting from a CSG-DAG to a PM-
CSG. The algorithm follows a bottom-up approach, and it is an interesting application which
clarifies the data structures encoding both the CSG-DAG and the PM-CSG.
The first step of the algorithm is devoted to an independent evaluation of the positions of
all instances of primitive objects in the input model. As the CSG-DAG is a structure whose

procedure CREATE_LISTS(x, y, z, scale, facelist, vertexlist);
// x, y, z and scale define the block that is analyzed (its origin and the length of its edge);
   facelist is the list of faces of the object; vertexlist is the output list of vertices of the object.
   tree is a global array storing the linear PM-Octree.
   pa is a global index of the current position in the tree; pa is set to the root (i.e., the first
   position) at the first call. //
begin
  sq ← scale/2;
  for i ← 1 to 8 do
    pa ← pa + 1;
    if TYPE(tree[pa]) = "G" then
      CREATE_LISTS(x + Δxi*sq, y + Δyi*sq, z + Δzi*sq, sq, facelist, vertexlist)
    else if TYPE(tree[pa]) = "V" then
      compute the new vertex v of the BRep as the intersection of its faces;
      insert v into vertexlist;
      for every face f ∈ facelist incident on v do
        store a pointer from f to v;
        store pointers from f to the two faces incident on v and adjacent to f;
      end for
    end if
  end for
end CREATE_LISTS;

Figure 10: Recursive procedure implementing the first step of the conversion from PM-Octree to BRep.

leaves refer to primitive objects in a library, instances of such objects must be located in
three-dimensional space by traversing the DAG and evaluating the transformation matrices
stored in the internal nodes. Once all primitives are located, an elementary PM-CSG tree is
built for each of them, consisting of a single leaf node (recall that containing a single
primitive is a sufficient condition for a block to be a leaf node in a PM-CSG tree). The leaf node
contains a pointer to all semantic information about the primitive (e.g., position, functional
definition), and a flag indicating whether the primitive is full (positive) or empty (negative).

The second step of the algorithm traverses the CSG-DAG again and recursively composes
the PM-CSG trees. Both steps can be implemented by a single recursive procedure which
carries out both the location and the composition tasks; in fact, the recursion must first expand
to the leaves of the DAG (completing the location), and then rise up the hierarchy to compute
the space division (i.e., the PM-CSG tree compositions). As all Boolean operations in a CSG-
DAG are disjoint unions and differences, the space division is performed by algorithms which
compute disjoint unions and differences of PM-CSG trees. We do not specify here the rules
applied to perform such Boolean operations. Lists of the branch tests performed to compute
each operation can be found in [Wyv85], though the general paradigms followed by such procedures
are just extensions of the Boolean operations on quadtrees described in [Sam90b]. Of course, such
operations involve recursive subdivision of the space, and branch tests on blocks and primitive
objects. The basic inclusion test of a primitive against a block is performed by evaluating
the function(s) defining the object itself at the eight vertices of the block, transformed to
the original location of the primitive model through inverse matrices. This test returns three
possible values, indicating whether the block is completely inside, completely outside, or
crosses the primitive. In [Wyv85], Wyvill and Kunii propose a fast technique for computing this test,

which exploits a projection of the transformed block on a plane: geometric computations in 2D
are simpler than in 3D, and simple considerations allow the 2D results to be carried back to the 3D case.
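A Python sketch of the basic corner test follows (without the 2D projection speed-up); apply_inverse and the function signature are our own illustrative choices, and corner sampling alone can of course miss a primitive much smaller than the block, which is why the test is embedded in a recursive subdivision.

# Classifying a block against a primitive by evaluating its implicit
# function at the eight block corners, mapped back to the primitive's
# model space through the inverse instance matrix; illustrative only.
from itertools import product

def apply_inverse(m, p):
    """Apply a 4x4 homogeneous matrix m to the point p = (x, y, z), w = 1."""
    x, y, z = p
    return tuple(m[i][0]*x + m[i][1]*y + m[i][2]*z + m[i][3] for i in range(3))

def classify_block(origin, edge, inv, f):
    """Return 'inside', 'outside' or 'crosses' for the cubic block at
    `origin` with edge length `edge`, wrt the primitive whose implicit
    function f is negative inside and positive outside."""
    ox, oy, oz = origin
    signs = {f(apply_inverse(inv, (ox + dx, oy + dy, oz + dz))) < 0
             for dx, dy, dz in product((0.0, edge), repeat=3)}
    if signs == {True}:
        return 'inside'
    if signs == {False}:
        return 'outside'
    return 'crosses'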

4.4 Conversion from PM-Octree to boundary representation

The PM-Octree structure can be regarded as the superimposition of a spatial index on a boundary
representation: all the geometric information present in the BRep is implicitly maintained in the
PM-Octree. For this reason, converting a PM-Octree structure to a BRep is a simple task.

In [Bru85] Brunet and Navazo describe an algorithm which performs the conversion from
a PM-Octree to a BRep in two steps. First, the PM-Octree is traversed, and only nodes of type
vertex are considered. For every vertex, its coordinates are computed; moreover, for every face
incident on the vertex, a pointer to the vertex itself and two pointers to the two neighboring
faces are stored. This is simple, because a list containing all faces of the object (more precisely,
their equations) is part of the data structure encoding the PM-Octree, and pointers to the faces
are stored in cyclic order in a vertex node. After the first step, all vertices, faces, and the
face-vertex, vertex-face and face-face relations are obtained. Figure 10 presents the pseudo-code of a
procedure implementing the first step of the algorithm. The time complexity of this procedure
is linear in the number of nodes of the PM-Octree.
In a second step, the algorithm scans the list of faces, and uses the information stored in it
to detect the closed polygons bounding the faces. This corresponds to computing all the edges of the
object, and the edge-vertex and face-edge relations, thus completing all topological information.

4.5 Conversion from the FFC model to boundary representation

The problem of converting an FFC model into a boundary representation is termed boundary
evaluation (in analogy with the problem of converting from CSG to BRep). While the evaluation
of a CSG model is a difficult task, because a large amount of information about the object
boundary is implicitly encoded in a CSG, the evaluation of an FFC model is considerably
easier, since the information about the object boundary is explicitly represented. The evaluation
task is further simplified if we do not require that the descriptions produced at intermediate
steps be valid boundary models.

The evaluation of an FFC model M = (MD, F) of an object S consists of the iterative
application of a glue operation to pairs of face-adjacent components. At each step, two face-
adjacent components Ci and Cj are joined together along their common connection facets. If a
connection facet is homogeneous, then it is eliminated as soon as any two components sharing it
are merged. If a connection facet is non-homogeneous, then it remains as part of the boundary
of S. The boundary evaluation process can thus be regarded as the elimination of the homogeneous
connection facets in F. Note that the resulting boundary description of the object may contain
edge-adjacent facets belonging to the same surface because of the effect of the fragmentation F. In
a planar-faced environment, a segmentation of the boundary of S into maximal connected faces
could be produced by applying a face-growing algorithm to the segmentation produced by the
evaluation process.

The boundary evaluation process can produce invalid representations of object parts at
intermediate steps. To guarantee valid intermediate results, the algorithm should follow
algorithm FFC_EVALUATION(G');
// G' is the connection subgraph of the FFC graph of object S //
begin
  repeat
    let Ci be a node of G';
    for every node Cj adjacent to Ci in G' do
      for every hyperarc f in G' incident on Ci and Cj do
        DELETE_EXTREME_NODES(f, Ci, Cj);
        if EXTREME_NODE_SET(f) = ∅ then
          DELETE_HYPERARC(f);
          if not IS_HOMOGENEOUS(f) then
            ADD_TO_BOUNDARY(Ci, f);
          end if
        end if
      end for
    end for
  until G' has a single node
end FFC_EVALUATION.

Figure 11: Algorithm that computes the conversion from FFC to BRep.

the design sequence, i.e., the sequence according to which the object has been created by the
designer from the original components.

The boundary evaluation of the FFC model can be expressed as a merging algorithm
applied to the connection subgraph of its FFC graph. Performing the union of two components
Ci and Cj is equivalent to merging the corresponding two nodes in the connection graph. This
also leads to a modification of the hyperarcs incident on both nodes: Ci and Cj must be
eliminated from the set of extreme nodes of such hyperarcs. A hyperarc h is eliminated in this
process when all of its extreme nodes have been merged together. The corresponding algorithm
is described in Figure 11. The names of the functions and procedures used are self-explanatory.

At the end of the execution of algorithm FFC_EVALUATION, the single node in G'
contains a description of the boundary of S. Note that each time we eliminate a hyperarc
describing a non-homogeneous connection facet, that facet is added as a boundary facet to the
component Ci into which a face-adjacent component Cj is merged. The time complexity of the
algorithm is linear in the number of hyperarcs in G', i.e., in the number of facets belonging to the
fragmentation F.

The above merging algorithm can be applied for local modifications of the FFC model, for
example to form complete components from elementary components. When a merge is applied
to a single pair of face-adjacent components, we want to be sure that the boundary description of
the resulting component is valid. Validity checks on such descriptions can be performed locally
by looking at boundary and connection facets as described in [DeF91].

5 Concluding Remarks

We have considered the problem of converting between representations of solid objects, which
is a basic issue in building efficient and robust solid modeling systems. No single representation is
good for all the different operations that must be performed in a solid modeler. Thus, there is
a need either to use several representations in a single modeler or to have a representation scheme
capable of embedding the useful features of several existing schemes. In this direction, we have
considered hybrid models, which are an attempt to combine existing models. A distinction must
be made between PM-Octrees and PM-CSG trees on one side, and modular boundary models,
like the FFC model, on the other. The former essentially operate as a spatial index on top of a boundary
or CSG representation, while the latter combine the principles of CSG and BRep to provide a
feature-based object description.
Since many modeling systems are based on both boundary and CSG representations, we
have discussed the problems involved in conversions between the two schemes. PM-Octrees
can also be used as an intermediate structure for the boundary evaluation of a CSG tree, since
efficient algorithms exist which convert from CSG to PM-Octree and from PM-Octree to BRep.
The PM-Octree seems to be a very interesting hybrid model to be maintained in connection with a
boundary representation, since the conversion between a BRep and a PM-Octree is the only
invertible one.
Conversion problems on an MBM have been considered only as far as conversion to and
from a boundary representation is concerned. The problem of converting from a boundary
representation to an MBM involves form feature recognition. The algorithms mentioned in
Section 3.2 all produce a description of the extracted form features in terms of their boundary.
The algorithms described in [DeF89b,DeF89c,Fal89] also organize the extracted features in an
MBM: they are methods for building an MBM from a BRep. The purpose of producing an
MBM from a BRep is not to facilitate operations on the BRep, as in the case of conversions to
Octrees or PM-Octrees, but to generate a solid model suitable for manufacturing and assembly.
Much work remains to be done in investigating and experimenting with all the potential capabilities of
an MBM. In our opinion, it is worth investigating the combined use of an MBM and a PM-
Octree in a solid modeler. The PM-Octree would be useful in improving the efficiency of Boolean
operations and integral property computations on an MBM.

References

[Ans85] Ansaldi S., De Floriani L., Falcidieno B., (1985), "Geometric modeling of solid objects
by using a face adjacency graph representation", Computer Graphics, 19, 3, pp. 131-139.
[Aya85] Ayala D., Brunet P., Juan R., Navazo I., (1985), "Object representation by means of
nonminimal division quadtrees and octrees", ACM Transactions on Graphics, 4, 1, pp.
41-59.
[Bau72] Baumgart B.G., (1972), "Winged-edge polyhedron representation", Technical Report
STAN-CS-320, Computer Science Department, Stanford University, Stanford, CA.
[Bru85] Brunet P., Navazo I., (1985), "Geometric modeling using exact octree representation of
polyhedral objects", Proceedings EUROGRAPHICS'85, pp. 159-169.
[Bru89] Brunet P., Navazo I., (1989), "Solid representation and operation using extended oc-
trees", ACM Transactions on Graphics, 8.

[Car85] Carlbom I., Chakravarty I., Vanderschel D., (1985), "A hierarchical data structure for
representing spatial decomposition of 3-D objects", IEEE Computer Graphics and Appli-
cations, 5, 4, pp. 24-31.
[DeF88] De Floriani L., Falcidieno B., (1988), "A hierarchical boundary model for solid object
representation", ACM Transactions on Graphics, 7, 1, pp. 42-60.
[DeF89a] De Floriani L., Maulik A., Nagy G., (1989), "Manipulating a modular boundary model
with a face-based graph structure", Geometric Modeling for Product Engineering, IFIP WG
5.2, Wozny M.J., Turner J.U., Preiss K., Eds., North-Holland, pp. 131-143.
[DeF89b] De Floriani L., (1989), "Feature extraction from boundary models of three-dimen-
sional objects", IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 8,
pp. 785-798.
[DeF89c] De Floriani L., Bruzzone E., (1989), "Building a feature-based object description from
a boundary model", Computer Aided Design, 21, 10, pp. 602-610.
[DeF91] De Floriani L., Maulik A., Nagy G., (1991), "Representation of solid objects by a
modular boundary model", Computer-Aided Mechanical Assembly Planning, Homem de
Mello L.S., Lee S., Eds., (to be published).
[Dür89] Dürst M.J., Kunii T.L., (1989), "Integrated polytrees: a generalized model for the
integration of spatial decomposition and boundary representation", Theory and Practice of
Geometric Modeling, Straßer W., Seidel H.P., Eds., Springer-Verlag, pp. 329-348.
[Fal89] Falcidieno B., Giannini F., (1989), "Automatic Recognition and Representation of
Shape-Based Features in a Geometric Modeling System", Computer Vision, Graphics and
Image Processing, 48, 1, pp. 93-123.
[Fuj85] Fujimura K., Kunii T.L., (1985), "A hierarchical space indexing method", Proceedings
of Computer Graphics'85, Tokyo, TI-4, pp. 1-14.
[Hen84] Henderson M.R., (1984), "Extraction of feature information from three-dimensional
CAD data", PhD Thesis, Purdue University.
[Jar84] Jared G.E., (1984), "Shape features in geometric modeling", Solid Modeling by Com-
puters: from Theory to Applications, Plenum, New York.
[Jos88] Joshi S., Chang T.C., (1988), "Graph-based heuristics for recognition of machined fea-
tures from a 3D solid model", Computer Aided Design, 20, 2.
[Jua88] Juan R., (1988), "Boundary to Constructive Solid Geometry: a step towards 3D con-
version", Proceedings EUROGRAPHICS'88 Conference, Duce D., ed., North-Holland, Am-
sterdam, pp. 129-139.
[Jua89] Juan R., (1989), "On Boundary to CSG and extended octree to CSG conversions",
Theory and Practice of Geometric Modeling, Straßer W., Seidel H.P., Eds., Springer-Verlag,
pp. 349-367.
[Kaw80] Kawaguchi E., Endo T., (1980), "On a method of binary picture representation and
its application to data compression", IEEE Transactions on Pattern Analysis and Machine
Intelligence, 2, 1, pp. 27-35.

[Kyp80] Kyprianou L.K., (1980), "Shape classification in Computer-Aided Design", PhD Dis-
sertation, Computer Laboratory, University of Cambridge, England.
[Män87] Mäntylä M., (1987), An Introduction to Solid Modeling, Computer Science Press,
Rockville, MD.
[Mea80] Meagher D., (1980), "Octree encoding: a new technique for the representation, the ma-
nipulation, and display of arbitrary 3-D objects by computer", Technical Report IPL-TR-80-111,
Electrical and System Engineering, Rensselaer Polytechnic Institute, Troy, NY.
[Nav87] Navazo I., Fontdecaba J., Brunet P., (1987), "Extended octrees, between CSG trees and
boundary representations", Proceedings EUROGRAPHICS'87, North-Holland, pp. 239-247.
[Nav89] Navazo I., (1989), "Extended octree representation of general solids with plane faces:
model structure and algorithms", Computers & Graphics, 13, 1, pp. 5-16.
[Req80] Requicha A.A.G., (1980), "Representations of rigid solids: theory, methods, and sys-
tems", ACM Computing Surveys, 12, 4, pp. 437-464.
[Req82] Requicha A.A.G., Voelcker H.B., (1982), "Solid modeling: a historical summary and
contemporary assessment", IEEE Computer Graphics and Applications, 2, 2, pp. 9-24.
[Req85] Requicha A.A.G., Voelcker H.B., (1985), "Boolean operations in solid modeling: bound-
ary evaluation and merging algorithms", Proceedings of the IEEE, 73, 1.
[Ros89] Rossignac J.R., Voelcker H.B., (1989), "Active zones in CSG for accelerating boundary
evaluation, redundancy elimination, interference detection, and shading algorithms", ACM
Transactions on Graphics, 8, 1, pp. 51-87.
[Rot82] Roth S.D., (1982), "Ray casting for modeling solids", Computer Graphics and Image
Processing, 18, pp. 109-114.
[Sam85a] Samet H., Tamminen M., (1985), "Bintrees, CSG trees, and time", Computer Graph-
ics, 19, 3, pp. 121-130.
[Sam85b] Samet H., Tamminen M., (1985), "Approximating CSG trees of moving objects",
Technical Report, CS-TR-1472, University of Maryland, College Park, MD.
[Sam90a] Samet H., (1990) The Design and Analysis of Spatial Data Structures, Addison-
Wesley, Reading, MA, 1990.
[Sam90b] Samet H., (1990) Applications of Spatial Data Structures, Addison-Wesley, Reading,
MA.
[Tam84] Tamminen M., Samet H., "Efficient octree conversion by connectivity labeling", Com-
puter Graphics, 18, 3, pp. 43-51.
[Thi87] Thibault W.C., Naylor B.F., (1987), "Set operations on polyhedra using space parti-
tioning trees", ACM Computer Graphics, 21,4, pp. 153-162.
[TiI80a] Tilove R.B., Requicha A.A.G., (1980), "Closure of Boolean operations on geometric
entities", Computer Aided Design, 12, 5 pp. 219-220.
[TiI80b] Tilove R.B., (1980), "Set membership classification: a unified approach to geometric
intersection problems", IEEE Transactions on Computers, C-29, 10, pp. 874-883.
482

[Voe77] Voelcker H.B., Requicha A.A.G., (1977), "Geometric modeling of mechanical parts and
processes", IEEE Computer, 10, 12, pp. 48-57.
[Wei86] Weiler K., (1986), "Topological structures for geometric modeling", Ph.D. dissertation,
Department of Computer and System Engineering, Rensselaer Polytechnic Institute, Troy,
NY.
[Wil85] Wilson P.W., Pratt M., (1985), "Requirements for support of form features in a solid
modelling system", Technical Report, Geometric Modelling Project, CAM-I.
[Woo82] Woo T.C., (1982), "Feature extraction by volume decomposition", Proceedings Con-
ference on CAD/CAM in Mechanical Engineering, MIT, Cambridge, MA.
[Woo85] Woo T.C., (1985), "A combinatorial analysis of boundary data structure schemata,
IEEE Computer Graphics and Applications, 5, 3, pp. 19-27.
[Wyv85] Wyvill G., Kunii T.L., (1985), "A functional model for constructive solid geometry"
Visual Computer, 1, 1, pp. 3-14.
[Wyv85] Wyvill G., (1990), "Geometry and modelling for CAD systems", Tutorial Notes - CG
International'90 Conference, Singapore.
483

Leila De Floriani is Professor of Computer Science at the University
of Genova, Genoa, Italy. She received an advanced degree in Math-
ematics from the University of Genova in 1977. From 1977 to 1981
she was a Research Associate at the Institute of Applied Mathemat-
ics of the Italian National Research Council, Genoa, Italy, and from
1981 to 1982 an Assistant Professor at the Department of Mathemat-
ics of the University of Genova. From 1982 to 1990 she was a
Senior Scientist at the Institute of Applied Mathematics of the Ital-
ian National Research Council. In 1981 she was a visiting professor at
the Department of Computer Science of the University of Nebraska.
Between 1985 and 1989 she was a visiting professor at Rens-
selaer Polytechnic Institute, Troy, New York. Leila De Floriani has
written over 80 technical publications on the subjects of algorithms
and data structures, geometric modeling, computational geometry, and
graph theory. Her present research interests include computational ge-
ometry, geometric modeling, and computer graphics. Leila De Flori-
ani is a member of ACM, IEEE Computer Society, Computer Graph-
ics Society (CGS), International Association for Pattern Recognition
(IAPR), and the Italian Association for Automatic Computation (AICA).
Address: Department of Mathematics, University of Genova, via L.B.
Alberti, 4, 16155, Genoa, Italy.

Enrico Puppo is Senior Scientist at the Institute for Applied Math-
ematics of the Italian National Research Council. He received an ad-
vanced degree in Mathematics from the University of Genova (Italy) in
1986. From 1986 to 1988 he was a Research Associate at the In-
stitute of Applied Mathematics of the Italian National Research Coun-
cil (C.N.R.). In 1989 and 1990 he was a Visiting Research Scien-
tist at the Center for Automation Research (University of Maryland).
His research interests include computational geometry, geometric mod-
eling, computer vision, and the design of algorithms and data structures.
Address: Institute for Applied Mathematics, Italian National Re-
search Council (C.N.R.), via L.B. Alberti, 4, 16155, Genoa, Italy.
Surface Generation Using Implicit Cubics
Baining Guo

Abstract
Modeling physical objects with low-degree algebraic surfaces shows promise for applications
where manipulating and reasoning about physical objects are important. In this paper, we
present an algorithm for free-form surface constructions using implicitly defined cubic surface
patches. The input data for the algorithm is an arbitrary polyhedron with a normal prescribed
at each vertex of the polyhedron. Using a Clough-Tocher like splitting scheme, the algorithm
constructs a smooth piecewise cubic surface interpolating the vertices of the polyhedron and the
prescribed normal at each vertex. The free-form surface construction in the algorithm is local
and quadratically precise. In addition, the shape of the free-form surfaces can be manipulated
through a set of intuitive shape parameters without knowing the details of the algorithm. The
implementation results are reported.

Keywords: Geometric modeling, object representation, free-form surface, Bernstein-Bezier
representation, implicit patch, design.

1 Introduction
While developing a geometric modeling system for representing, manipulating and reasoning about
physical objects, we derived and implemented an algorithm for constructing geometric models for
smooth objects of arbitrary shapes and topologies. Such geometric models are important for solid
modeling, computer-aided design, visualization, computer graphics, and robotics.
The geometric models of arbitrary smooth objects are represented by closed free-form surfaces.
The algorithm we derive generates a free-form surface from the input data of an arbitrary polyhedron
with a normal prescribed at each vertex of the polyhedron. Using a Clough-Tocher like splitting
scheme, the algorithm constructs a smooth piecewise cubic surface interpolating the vertices of the
polyhedron and the prescribed normals.
The algorithm we derive has the following features. First, the algorithm is local, so modifying a
piece of input data affects only nearby surface patches. Second, the algorithm has quadratic preci-
sion, which means that if the input data is taken from a quadric surface, the algorithm reproduces
the quadric surface. Finally, the shape of the free-form surfaces produced by the algorithm can be
controlled through a set of intuitive shape parameters without knowing the details of the algorithm.
An important motivation of our work is to construct geometric models that facilitate manipulat-
ing and reasoning about physical objects (Hopcroft and Krafft 1986; Hoffmann 1989). Traditionally,
the building blocks for free-form surface constructions are parametric patches. As far as design and
display are concerned, parametric patches are very successful. But when it comes to manipulating
and reasoning about physical objects, parametric patches run into serious problems. Parametric
patches are not closed under some elementary operations in geometric modeling, such as sweeping
and Minkowski sum (Bajaj and Kim 1987). The intersection of two parametric patches is extremely


difficult to represent and evaluate (Hoffmann 1989) because the algebraic degree of the intersection
is prohibitively high. As an example, we notice that in general the intersection of two commonly
used bicubic patches is a space curve of degree 324 (Hopcroft and Krafft 1986).
These problems can be avoided by building free-form surfaces from low-degree implicit patches.
Implicit patches are closed under all common operations required by a geometric modeling system
(Bajaj 1989), and the intersection of two degree n implicit patches has degree n^2, which is small if
n is. Low-degree implicit patches also allow the use of algebraic techniques as opposed to numerical
techniques in reasoning about physical objects (Hopcroft and Krafft 1986). These features make
implicit patches a superior choice for applications where manipulating and reasoning about physical
objects are important. In addition, from a practical point of view, implicit patches are compact to
store and relatively easy to ray trace.
An inviting class of implicit patches for free-form surface constructions is the class of quadric
patches. When the input data is a polyhedron without normals prescribed at its vertices, a free-
form surface can be constructed using quadric patches. However, quadric patches have fundamental
limitations that make it impossible to allow prescribing normals in the input data. Roughly speak-
ing, when a free-form surface is constructed by replacing the facets of the input polyhedron with
quadric patches, they introduce a correlation between the normals at the adjacent vertices of the
input polyhedron. We have investigated the role of quadric patches as primitives for free-form
surface constructions, and we hope to report the results elsewhere.
Being able to prescribe the normals in the input data is important. Prescribing normals is a
measure to control the patches in the free-form surfaces so that only a few patches are needed for
representing a smooth object that would otherwise require thousands of polygons to approximate.
One way to overcome the limitations of quadric patches is to split the edges of the input polyhedron,
as was done by Dahmen (Dahmen 1989). However, from a theoretical point of view, Dahmen's
method cannot handle arbitrary input polyhedra because his method requires the existence of
"transversal systems", which no one knows how to construct in general; from a practical point
of view, splitting the edge of the input polyhedron causes oscillations in the free-form surfaces,
making it impossible to produce free-form surfaces of pleasing shapes. In this paper, we show that
the limitations of quadric patches can be overcome by cubic patches.

1.1 Previous work


There is a rich literature on surface constructions using parametric patches, and a recent survey can
be found in (Mann et al. 1990). Modeling complex objects with implicit patches was introduced
in recent years and is becoming an increasingly prominent area of research. General techniques
for implicit modeling are developed by researchers worldwide (Nishimura et al. 1985; Bloomenthal
and Wyvill 1990; Dahmen 1989). In particular, many authors have demonstrated the power of
implicit patches in deriving blending surfaces (Blinn 1982; Middleditch and Sears 1985; Hoffmann
and Hopcroft 1987; Rockwood and Owen 1987) and in surface fitting and approximation (Bajaj
and Ihm 1989; Patrikalakis and Kriezis 1989).
Sederberg proposed using the Bernstein-Bezier representation of implicit patches in free-form sur-
face constructions (Sederberg 1985). Subsequently, various techniques have been developed for construct-
ing free-form surfaces using implicit patches (Patrikalakis and Kriezis 1989; Bajaj and Ihm 1989;
Moore and Warren 1990; Sederberg 1990). In particular, Patrikalakis et al., Sederberg, and Bajaj
et al. demonstrated the complications and pitfalls of modeling with implicit patches (Patrikalakis
and Kriezis 1989; Bajaj and Ihm 1989; Sederberg 1990).


Dahmen (Dahmen 1989) gave an algorithm for constructing free-form surfaces from quadric
patches. But the algorithm cannot handle arbitrary polyhedra, and the splitting scheme in the
algorithm prevents it from producing pleasing shapes. There are also algorithms for constructing free-
form surfaces with implicit patches of degree six and degree five (Moore and Warren 1990; Bajaj
1990).

2 Conceptual overview
The algorithm described in this paper builds free-form surfaces from the input data of a polyhedron
with a normal vector prescribed at each vertex of the polyhedron. The input data is denoted by
(P, N), where P is an arbitrary polyhedron with vertex set {x_1, ..., x_k}, and N is a set of normals
{n_1, ..., n_k} with n_i being the normal vector prescribed at x_i. The facets of P are assumed to be
triangular.
The basic idea of the algorithm is very simple. The free-form surface to be built must be
in the neighborhood of the input polyhedron P, so we construct a neighborhood Σ of P using
tetrahedra and create a cubic polynomial for each tetrahedron used. By ensuring C^1 conditions
between adjacent tetrahedra, we obtain a global C^1 function that is a cubic polynomial in each
tetrahedron. The zero contour of this global C^1 function within the neighborhood Σ is the free-form
surface to be generated.
The following three aspects are crucial to the success of the algorithm.
1. The construction of a neighborhood Σ of the input polyhedron using tetrahedra. The neigh-
borhood must locally contain the tangent plane determined by the prescribed normal at each
vertex of the input polyhedron P, and the neighborhood must have the same topology as the
polyhedron P.

2. A scheme for defining a globally C^1 function which is a cubic polynomial over each tetrahedron
within the neighborhood Σ. The scheme must leave free control points in the definition of
each cubic polynomial so that the zero contour of the cubic polynomial can be controlled by
these free control points.

3. A mechanism to control the cubic polynomial defined for each tetrahedron so that the zero
contour of the cubic polynomial inside the tetrahedron is a single-sheeted cubic patch without
holes, extraneous sheets, self intersections, or other topological anomalies.
These three aspects will be stressed throughout the development of the algorithm.

3 Algorithm details
Now we address the three aspects of the algorithm in detail. In this paper, we use [x_1 ··· x_n] to
denote the convex hull of the point set {x_1, ..., x_n}.

3.1 The construction of the neighborhood Σ


The basic spatial elements used to build the neighborhood Σ of the polyhedron P are tetrahedra.
Tetrahedra are chosen for two reasons: one, tetrahedra are simple and flexible three-dimensional

Figure 1: Filling gaps between two double tetrahedra

space units; two, tetrahedra facilitate the use of the Bernstein-Bezier representation, which is the basis
for this work.
The neighborhood Σ is constructed as follows. For each facet F = [x_1 x_2 x_3] of the polyhedron
P, two points x_4 and y_4 off each side of the facet are chosen, and they determine two tetrahe-
dra, [x_1 x_2 x_3 x_4] and [x_1 x_2 x_3 y_4]. These two tetrahedra form a double tetrahedron denoted by
([x_1 x_2 x_3 x_4], [x_1 x_2 x_3 y_4]). Consider an adjacent facet F' = [x'_1 x_2 x_3] and its double tetrahedron
([x'_1 x_2 x_3 x'_4], [x'_1 x_2 x_3 y'_4]). Between the double tetrahedra of facets F and F', there are two gaps.
One gap is between the tetrahedra [x'_1 x_2 x_3 x'_4] and [x_1 x_2 x_3 x_4]; the other is between [x_1 x_2 x_3 y_4]
and [x'_1 x_2 x_3 y'_4]. The first gap is filled with a pair of tetrahedra [x''_4 x_2 x_3 x_4] and [x''_4 x_2 x_3 x'_4], and
the second gap is filled with another pair of tetrahedra, [y''_4 x_2 x_3 y_4] and [y''_4 x_2 x_3 y'_4]. Here x''_4 and
y''_4 are points on the line segments [x_4 x'_4] and [y_4 y'_4] respectively. All these are shown in Figure 1.
As an auxiliary geometric structure for the free-form surface construction, the neighborhood Σ
must satisfy the following condition. At each vertex x_i of the polyhedron P, the neighborhood Σ
should locally contain the tangent plane defined by n_i. In other words, there is a disk D around
the vertex x_i in the tangent plane at x_i such that

D ⊂ Σ.
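As a concrete illustration of this construction, the following small Python sketch (an assumed helper, not the paper's code) builds the double tetrahedron of one facet, placing the two apexes on the line through the centroid of the facet and perpendicular to it, the choice made for symmetry and robustness in Section 3.2:

```python
# Minimal sketch: the double tetrahedron over a facet [x1 x2 x3] of P.
# "height" (how far the apexes sit from the facet) is an assumed parameter.
import numpy as np

def double_tetrahedron(x1, x2, x3, height=1.0):
    w = (x1 + x2 + x3) / 3.0               # centroid of the facet
    n = np.cross(x2 - x1, x3 - x1)         # facet normal
    n = n / np.linalg.norm(n)
    x4 = w + height * n                    # apex off one side
    y4 = w - height * n                    # apex off the other side
    return [x1, x2, x3, x4], [x1, x2, x3, y4]

tet_x, tet_y = double_tetrahedron(np.zeros(3), np.array([1.0, 0.0, 0.0]),
                                  np.array([0.0, 1.0, 0.0]))
```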

3.2 A scheme for enforcing C^1 conditions over Σ


Having built a neighborhood Σ of the polyhedron P, we construct a C^1 function f over the neigh-
borhood Σ so that

f(x_i) = 0,  ∇f(x_i) = n_i,  i = 1, ..., k.    (1)

The zero contour of f within Σ is the free-form surface to be generated.
The construction of f can be outlined as follows. First, we split the tetrahedra that have
facets of P as faces: the neighborhood Σ is kept the same except that some of its tetrahedra are split.
Then, the function f is defined by constructing a cubic polynomial for each tetrahedron within the
neighborhood Σ.
To show the splitting scheme, we take a facet [x_1 x_2 x_3] and its double tetrahedron ([x_1 x_2 x_3 x_4],
[x_1 x_2 x_3 y_4]) as an example. Let w be a point in the facet [x_1 x_2 x_3]. We split the double tetrahedron

Figure 2: The C^1 conditions between two adjacent double tetrahedra. (Legend: control points to
be decided; control points that are free or are decided by free control points; control points from
the input data; control points from C^1 conditions.)

into six tetrahedra: [x_i x_j w x_4] and [x_i x_j w y_4] for 1 ≤ i < j ≤ 3. For symmetry and robustness
reasons, w is often chosen to be the centroid of triangle [x_1 x_2 x_3], while y_4 and x_4 are chosen to be
on the line that passes through w and is perpendicular to [x_1 x_2 x_3].
The construction of cubic polynomials over the tetrahedra within Σ takes two steps. Consider a
facet F' = [x'_1 x_2 x_3] adjacent to facet F = [x_1 x_2 x_3] and the double tetrahedron of F', ([x'_1 x_2 x_3 x'_4],
[x'_1 x_2 x_3 y'_4]). The facet F' and its double tetrahedron are split at the centroid w' of F' in the
same way that ([x_1 x_2 x_3 x_4], [x_1 x_2 x_3 y_4]) is split at w. For the facets F and F', the first step takes
place over the tetrahedra V_1 = [x_2 x_3 x_4 w], V_2 = [x_2 x_3 x'_4 w'], W_1 = [x_2 x_3 x''_4 x_4], W_2 = [x_2 x_3 x''_4 x'_4],
V'_1 = [x_2 x_3 y_4 w], V'_2 = [x_2 x_3 y'_4 w'], W'_1 = [x_2 x_3 y''_4 y_4], and W'_2 = [x_2 x_3 y''_4 y'_4], as in Figure 2. We
construct the cubic polynomials over the tetrahedra W_1, W_2, W'_1, and W'_2. At the same time, the cubic
polynomials over the tetrahedra V_1, V_2, V'_1, and V'_2 are partially determined through C^1 conditions.
The same process is carried out between every pair of adjacent facets of P, so at the end of the first
step, the cubic polynomials over the tetrahedra [x_i x_j x_4 w] and [x_i x_j y_4 w] are partially constructed
for all i < j ≤ 3. Then, the second step completes the construction of these cubic polynomials

according to C^1 conditions.
Now we describe the first step in detail. Throughout this description, we assume i = 1,2
whenever i appears. By doing so we are taking advantage of the symmetry of the problem in
consideration.
Let the cubic polynomials f_i over V_i, f'_i over V'_i, g_i over W_i, and g'_i over W'_i be expressed in
Bernstein-Bezier form as follows:

f_i(x) = Σ_{|λ|=3} a^i_λ B^3_λ(τ_i),    (2)

g_i(x) = Σ_{|λ|=3} b^i_λ B^3_λ(ρ_i),    (3)

f'_i(x) = Σ_{|λ|=3} c^i_λ B^3_λ(τ'_i),    (4)

and

g'_i(x) = Σ_{|λ|=3} d^i_λ B^3_λ(ρ'_i),    (5)

where τ_i, τ'_i, ρ'_i, and ρ_i are the barycentric coordinates on V_i, V'_i, W'_i, and W_i respectively. We call
the a_λ's, b_λ's, c_λ's, and d_λ's the control points of the cubic polynomials f_i, f'_i, g_i, and g'_i respectively.
Our task is to determine these control points.
For notational convenience, if two tetrahedra share a common face, we equate the control
points of the associated cubic polynomials on the common face to ensure C^0 continuity. Hence
such control points will be defined only once.
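To make the Bernstein-Bezier form concrete, here is a small runnable sketch (assuming NumPy; storing control points in a dictionary keyed by the multi-index λ is an illustrative choice, not the paper's data structure) that evaluates a trivariate cubic such as (2) by converting a point to barycentric coordinates and summing over all |λ| = 3:

```python
# Evaluate f(x) = sum_{|lambda|=3} a_lambda B3_lambda(tau) on a tetrahedron.
import itertools
import numpy as np
from math import factorial

def barycentric(tet, x):
    # Solve sum_i tau_i v_i = x together with sum_i tau_i = 1.
    A = np.vstack([np.array(tet).T, np.ones(4)])
    return np.linalg.solve(A, np.append(x, 1.0))

def bb_cubic(ctrl, tet, x):
    tau = barycentric(tet, x)
    val = 0.0
    for lam in itertools.product(range(4), repeat=4):
        if sum(lam) != 3:
            continue
        multinom = factorial(3) / np.prod([factorial(l) for l in lam])
        val += ctrl.get(lam, 0.0) * multinom * np.prod(tau ** np.array(lam))
    return val
```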
All the control points over the tetrahedra V_i, V'_i, W_i, and W'_i that can be determined from the
input data are as follows. The fact that the zero contours of f_i, f'_i, g_i, and g'_i pass through x_2 and
x_3 implies

a^i_{0300} = a^i_{0030} = 0,
c^i_{0300} = c^i_{0030} = 0,
b^i_{0300} = b^i_{0030} = 0,
and
d^i_{0300} = d^i_{0030} = 0.

More control points are determined by the normals at the vertices x_2 and x_3; in this way the
control points a^i_{2e_j+e_k} for j = 2,3 and k = 1,4, c^i_{2e_j+e_k} for j = 2,3, b^i_{2e_j+e_1}
for j = 2,3, and d^i_{2e_j+e_1} for j = 2,3 are fixed.
Before determining the rest of the control points according to C^1 conditions, we have to choose
some control points to be free control points whose values will be left unspecified at this point. This
is because creating a piecewise cubic C^1 function f over the neighborhood Σ is only an intermediate
step in the free-form surface construction. Having defined a cubic polynomial whose zero contour
passes through some vertices of a tetrahedron does not guarantee the existence of a taut cubic

Figure 3: A handle on a cubic patch

patch inside the tetrahedron. There may not be a cubic patch inside the tetrahedron at all, or even
if there is, the cubic patch can have self intersections, holes, and extra sheets. Figure 3 is a more
dramatic example: a handle appears on an otherwise nice cubic patch. If this cubic patch is in a
free-form surface constructed from the input polyhedron P, the topology of the free-form surface
is bound to be different from that of P.
We choose the following free control points: a^i_{2e_4+e_j} (j = 1,2,3,4), c^i_{2e_4+e_j} (j = 1,2,3,4), b^i_{2001},
and d^i_{2001} for f_i, f'_i, g_i, and g'_i respectively. The intuition behind these control points is as follows.
The control points a^i_{2e_4+e_j} (j = 1,2,3) are equivalent to the function values and gradients at x_4 and x'_4,
and the control point b^i_{2001} enables us to have complete control of the function values of g_i along the
line segment [x_4 x'_4]. The same statement can be made about the control points c^i_{2e_4+e_j} (j = 1,2,3,4)
and d^i_{2001}. In 3.3, we will explain how these free control points affect the associated cubic patches.
Now we determine the rest of the control points to ensure C^1 conditions. Consider the C^1
conditions across the faces [x_2 x_3 x_4] and [x_2 x_3 x'_4]. Suppose

x''_4 = β_1 w + β_2 x_2 + β_3 x_3 + β_4 x_4  (for i = 1)

and

x''_4 = β_1 w' + β_2 x_2 + β_3 x_3 + β_4 x'_4  (for i = 2).

Then, the C^1 conditions are the following:

b^i_{1002} = β_1 a^i_{1002} + β_2 a^i_{0102} + β_3 a^i_{0012} + β_4 a^i_{0003},

b^i_{1101} = β_1 a^i_{1101} + β_2 a^i_{0201} + β_3 a^i_{0111} + β_4 a^i_{0102},    (6)

b^i_{1011} = β_1 a^i_{1011} + β_2 a^i_{0111} + β_3 a^i_{0021} + β_4 a^i_{0012},    (7)

and

b^i_{1110} = β_1 a^i_{1110} + β_2 a^i_{0210} + β_3 a^i_{0120} + β_4 a^i_{0111}.    (8)

The first three equations can be viewed as definitions for the control points b^i_{1002}, b^i_{1101}, and
b^i_{1011}, leaving a^i_{1011} and a^i_{1101} to be determined. Equation (8) will be treated later.
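Equations (6)-(8) are instances of the standard cross-face C^1 rule for Bernstein-Bezier polynomials on adjacent tetrahedra. The sketch below states that rule generically (a hedged illustration: it assumes both tetrahedra list the shared face in the same three slots, with the off-face vertex in slot opp):

```python
# Standard C1 rule: for each degree-2 face index mu, the control point of the
# second tetrahedron one step off the shared face is the beta-weighted
# combination of four control points of the first tetrahedron.
import itertools

def c1_across_face(a, beta, opp=0):
    # a: degree-3 control points of the first tetrahedron (dict by multi-index)
    # beta: barycentric coords of the second tetrahedron's off-face vertex
    #       with respect to the first tetrahedron
    b = {}
    for mu in itertools.product(range(3), repeat=4):
        if sum(mu) != 2 or mu[opp] != 0:
            continue
        idx = list(mu); idx[opp] += 1
        b[tuple(idx)] = sum(
            beta[j] * a[tuple(m + (1 if k == j else 0) for k, m in enumerate(mu))]
            for j in range(4))
    return b
```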
Moving on to the C^1 conditions across [x_2 x_3 x''_4], we see that if

x''_4 = μ_1 x_4 + μ_2 x'_4,

then the C^1 conditions are the following:

b_{3000} = μ_1 b^1_{2001} + μ_2 b^2_{2001},    (9)

b_{2100} = μ_1 b^1_{1101} + μ_2 b^2_{1101},    (10)

b_{2010} = μ_1 b^1_{1011} + μ_2 b^2_{1011},    (11)

and

b_{1110} = μ_1 b^1_{0111} + μ_2 b^2_{0111}.    (12)

Again, the first three equations can be viewed as definitions for the control points b_{3000}, b_{2100}, and
b_{2010}; and the last equation will be treated later. Notice that the b^i_{1011} and b^i_{1101} in the above equations
are defined earlier by the equations (6) and (7).
Finally, we consider the C^1 conditions across the faces [x_2 x_3 y_4], [x_2 x_3 y'_4], and [x_2 x_3 y''_4]. All
the control points of g'_i and some of the control points of f'_i can be fixed in the same way as the control
points of f_i and g_i. In doing so, we also have two equations left untreated:

d_{1110} = γ_1 c^i_{1110} + γ_2 c^i_{0210} + γ_3 c^i_{0120} + γ_4 c^i_{0111}    (13)

and

d_{1110} = η_1 c^1_{0111} + η_2 c^2_{0111},    (14)

where the coefficients η's and γ's come from the following relations:

y''_4 = γ_1 w + γ_2 x_2 + γ_3 x_3 + γ_4 y_4  (for i = 1),

y''_4 = γ_1 w' + γ_2 x_2 + γ_3 x_3 + γ_4 y'_4  (for i = 2),

and

y''_4 = η_1 y_4 + η_2 y'_4.
Now we collectively treat the equations (8), (12), (13), and (14) as promised. These equations
can be rewritten as

η_1 c^1_{0111} + η_2 c^2_{0111} = γ_1 c^i_{1110} + γ_2 c^i_{0210} + γ_3 c^i_{0120} + γ_4 c^i_{0111}    (15)

and

μ_1 a^1_{0111} + μ_2 a^2_{0111} = β_1 a^i_{1110} + β_2 a^i_{0210} + β_3 a^i_{0120} + β_4 a^i_{0111}.    (16)

Here c^i_{0111} can be determined from the a^i's through the C^1 conditions across [x_1 x_2 x_3] and [x'_1 x_2 x_3]:

c^i_{0111} = α_1 a^i_{1110} + α_2 a^i_{0210} + α_3 a^i_{0120} + α_4 a^i_{0111},    (17)

where the α's come from

y_4 = α_1 x_1 + α_2 x_2 + α_3 x_3 + α_4 x_4

and

y'_4 = α_1 x'_1 + α_2 x_2 + α_3 x_3 + α_4 x'_4.
The equations (15), (16), and (17) form a system of six linear equations in the six unknowns
a^i_{0111}, a^i_{1110}, and c^i_{0111}. When the points x_4, x'_4, y_4, and y'_4 are in general position, the system
always has a solution. It may happen that there is a family of solutions, in which case we choose a
solution as follows. Using the degree-elevation property of the Bernstein-Bezier representation, we can
compute default values for a^i_{1110} and c^i_{0111} from the prescribed normals. A solution to the system
can be selected from the family of solutions according to these default values.
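The selection of a solution near the default values can be sketched numerically as follows (A, r, and x0 are assumed given; this is one reasonable reading of the selection rule, not necessarily the author's implementation):

```python
# Pick the solution of A x = r closest to the default values x0: when the
# system is consistent but rank deficient, x0 plus the minimum-norm
# correction is the member of the solution family nearest to x0.
import numpy as np

def solve_with_defaults(A, r, x0):
    dx, *_ = np.linalg.lstsq(A, r - A @ x0, rcond=None)
    return x0 + dx
```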
This finishes the first step in the construction of the cubic polynomials over the tetrahedra
within the neighborhood Σ. Now g_i and g'_i are completely constructed, while f_i and f'_i are partially
constructed.
The second step completes the construction of f_i and f'_i. For this purpose, we shift our focus to
the double tetrahedron ([x_1 x_2 x_3 x_4], [x_1 x_2 x_3 y_4]), which has been split into the tetrahedra [x_1 x_2 x_4 w],
[x_1 x_3 x_4 w], [x_3 x_2 x_4 w], [x_1 x_2 y_4 w], [x_1 x_3 y_4 w], and [x_3 x_2 y_4 w].
Consider the problem of completing the construction of the partially constructed cubic poly-
nomials for the tetrahedra U_1 = [x_2 x_3 x_4 w], U_2 = [x_1 x_3 x_4 w], and U_3 = [x_1 x_2 x_4 w]. Denote the cubic
polynomial for U_i by

h_i(ν_i) = Σ_{|λ|=3} a^i_λ B^3_λ(ν_i),

where ν_i is the barycentric coordinate of U_i. It is easy to recognize that the polynomial h_1 is the same
as f_1 in the first step. More generally, the partially constructed function h_i is the result of carrying
out the first step for [x_1 x_2 x_3] and the facet sharing the edge [x_m x_n] (m ≠ i, n ≠ i, 1 ≤ m, n ≤ 3) with
[x_1 x_2 x_3]. We denote f_1, V_1, and τ_1 by h_1, U_1, and ν_1 in the second step to reflect the new symmetry.
The task of ensuring C^1 conditions between the U_i's is greatly simplified by taking advantage of
the fact that w ∈ [x_1 x_2 x_3]. The control points over U_i can be divided into four groups. The i-th
group, called the i-th layer, is the set of a^i_{λ_1 λ_2 λ_3 λ_4} such that λ_4 = i. Because w ∈ [x_1 x_2 x_3], the C^1
conditions between U_i and U_j only involve control points from the same layer. So we can satisfy
the C^1 conditions by examining each layer as if we were working on bivariate polynomials.
For the 0-th layer, the control points a^i_{λ_1 λ_2 λ_3 0} are defined previously for all λ_1 ≤ 1. Determining
the rest of the control points in this layer is exactly the famous Clough-Tocher interpolation in
finite element analysis. Figure 4 illustrates a standard solution (Farin 1986).
For the 1-st layer, the control points a^i_{λ_1 λ_2 λ_3 1} are defined earlier for all λ_1 = 0. Since this layer
can be viewed as a bivariate quadratic function, the known control points uniquely determine the
rest of the control points within the layer through the C^1 conditions (Farin 1986).
The control points in the 2-nd and 3-rd layers are trivially determined by the function value
and gradient at x_4.

Figure 4: The Clough-Tocher bivariate splines. (Legend: centroid of surrounding boxes; con-
structed from previous step; centroid of surrounding circles.)

To complete the second step, we carry out the same argument for the tetrahedra [x_1 x_2 y_4 w],
[x_2 x_3 y_4 w], and [x_1 x_3 y_4 w]. As for the C^1 conditions across [x_1 x_2 x_3], notice that these condi-
tions only involve control points from the 0-th and 1-st layers. From equation (17) and
the way the control points in the 1-st layer are determined, it is easy to see that the C^1 conditions
across [x_1 x_2 x_3] are indeed satisfied.
Therefore, we have constructed the global C^1 function f satisfying (1). If the free control points
are chosen so that a "nice" cubic patch is obtained inside each tetrahedron within the neighborhood
Σ, then the zero contour of f inside the neighborhood Σ is the free-form surface to be constructed.

3.3 Obtaining and controlling the cubic patches


As we mentioned earlier, creating a C^1 function f over the neighborhood Σ according to (1) is only
an intermediate step. In general, such a function rarely yields the free-form surface we expect. The
problem is that some of the control points of a cubic polynomial strongly affect the zero contour
of the cubic polynomial inside the associated tetrahedron. If we let these control points be decided
by the C^1 conditions, then the zero contour of the cubic polynomial inside the tetrahedron exhibits
various behaviors undesirable for free-form surface constructions.
The following situations may occur for the zero contour of a cubic polynomial inside a tetrahe-
dron.
1. There is no zero contour in the interior of the tetrahedron even though the zero contour is
known to pass through several vertices of the tetrahedron.

2. There are self-intersection points, or singular points on a cubic patch.


3. There are holes on a cubic patch caused by the zero contour of the cubic polynomial leaving
and coming back to the tetrahedron. See the left figure in Figure 5.
4. There are multiple sheets of the zero contour inside the tetrahedron. See the left figure in
Figure 6.

Figure 5: Avoiding holes in a cubic patch



Figure 6: Avoiding extra sheets in a cubic patch

5. More dramatically, there may even be handles, etc., on a cubic patch. See Figure 3.
Notice that we listed singular points together with self intersections because for implicit patches,
singular points appear where self intersections occur.
We use the tetrahedron [x_1 x_2 w x_4] in Figure 2 as an example to explain how the situations listed
above can be avoided by controlling the free control points we have chosen. In this example, the
free control points are the function value h_3(x_4) and the gradient ∇h_3(x_4). The same argument
with minor modifications applies to the cubic polynomials defined for other tetrahedra.
Situation one can be avoided by properly choosing the function value at x_4. Consider the line
segment from p, the centroid of [x_1 x_2 w], to x_4. If the function value h_3(x_4) is chosen to have
a sign opposite to that of the function value at p, then there must be a point on the line segment
[x_4 p] where the cubic polynomial is zero. In other words, the zero contour passes through the
interior of the tetrahedron [x_1 x_2 w x_4].
Situations two through five can be avoided by enforcing monotonicity conditions on the cubic
polynomial along the direction from w to x_4. A function is monotone in direction a if the directional
derivative along a is positive. Let the cubic polynomial in [x_1 x_2 x_4 w] be

h_3(ν_3) = Σ_{|λ|=3} a_λ B^3_λ(ν_3).

A sufficient condition for the cubic polynomial h_3 to be monotone along the direction from w to
x_4 within the tetrahedron is that

a_{λ−e_1+e_4} − a_λ ≥ 0,  for all λ with λ_1 ≥ 1.    (18)

Figure 7: The shape control scheme (• location of shape parameter).

When λ_1 > 1, the condition (18) can be enforced by the function value and gradient at x_4. As for
λ_1 = 1, the control points involved in (18) are completely determined from the prescribed normals
in the input data, so the monotonicity conditions may not be satisfied for certain input data no
matter how the free control points h_3(x_4) and ∇h_3(x_4) are chosen. But remember that prescribing
normals in the input data is only a measure to control the behavior of each cubic patch. If we
choose these normals within proper ranges, the condition (18) can be enforced.
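A direct check of the sufficient condition (18) over the control net can be sketched as follows (same dictionary storage as in the earlier sketches; the index slots follow the reconstruction above):

```python
# Check a_{lambda - e1 + e4} - a_lambda >= 0 for all lambda with lambda_1 >= 1.
import itertools

def is_monotone(a, tol=0.0):
    for lam in itertools.product(range(4), repeat=4):
        if sum(lam) != 3 or lam[0] < 1:
            continue
        shifted = (lam[0] - 1, lam[1], lam[2], lam[3] + 1)
        if a[shifted] - a[lam] < -tol:
            return False
    return True
```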
In practice, the free control points are computed using the degree-elevation property of the Bernstein-
Bezier representation. The idea is to extend the effects of the prescribed normals to the free control
points. A quadric polynomial q over the tetrahedron [x_1 x_2 x_3 x_4] can be determined from the fact that

q(x_i) = 0,  ∇q(x_i) = n_i,  i = 1, 2, 3,

and the value q(x_4), which is referred to as a shape parameter. If this is done for all facets of
the input polyhedron P, then quadric polynomials over tetrahedra such as [x_2 x_3 x''_4 x_4] can be
determined also. These quadric polynomials are then degree elevated to cubic polynomials, whose
control points corresponding to the free control points are given to the free control points. This
method of choosing free control points works very well in practice. From our experience, the ranges
of free control points within which the cubic patches behave well are fairly large. As long as the
free control points are not in the relatively small "bad" ranges, the cubic patches are in good shape.
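The degree-elevation step relies on the standard Bernstein-Bezier identity that a quadric with coefficients q_μ (|μ| = 2) equals the cubic with coefficients b_λ = (1/3) Σ_j λ_j q_{λ−e_j}; the entries of b corresponding to the free control points then supply their default values. A minimal sketch (not the paper's code):

```python
# Degree-elevate a quadric B-form (coefficients q, |mu| = 2) to cubic B-form.
import itertools

def degree_elevate(q):
    b = {}
    for lam in itertools.product(range(4), repeat=4):
        if sum(lam) != 3:
            continue
        s = 0.0
        for j in range(4):
            if lam[j] >= 1:
                mu = tuple(l - (1 if k == j else 0) for k, l in enumerate(lam))
                s += lam[j] * q[mu]
        b[lam] = s / 3.0
    return b
```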
Figure 5 and Figure 6 are two examples of how the above method works in the setting of Figure
2. In Figure 6, the left figure has an extra sheet due to badly chosen free control points. In the
right figure, the badly chosen free control points are corrected using the above method. Figure 5 is
similar, except that the problem is the hole in the left figure.

3.4 Features of the algorithm


The above free-form surface algorithm has several features. From the description of the algorithm, it is
easy to see that the free-form surface construction in the algorithm is local. In the following, we discuss the

Figure 8: Figure 7 with a shape parameter decreased (contour levels f = 2 and f < 0 indicated;
• location of shape parameter).

quadratic precision of the algorithm, and how to control the shape of the free-form surface without
knowing the details of the algorithm.
Quadratic precision is a measure of the accuracy of free-form surface algorithms in terms of how
well the algorithms can reproduce a known surface if the input data is taken from that surface. A
question that users often ask about a free-form surface algorithm is whether, if the input data is taken
from a sphere, the algorithm reproduces the sphere. For the algorithm we derive, the answer
is yes. In fact, the algorithm reproduces all quadrics.
Notice that the input polyhedron P, prescribed normals at the vertices of P, and the shape
parameters completely determine the free-form surface. If the input data is taken from a quadric
surface and the shape parameters are from the quadric surface, then the algorithm will produce
the same quadric surface. To ensure that the shape parameters are properly chosen so that all quadric
surfaces can be reproduced, we must give certain default values to the shape parameters. For
example, an easy way to do so is as follows. Randomly choose enough vertices of P so that these
vertices determine a quadratic polynomial q whose zero contour passes through the
chosen vertices, then compute the shape parameters by evaluating q.
An important feature of the algorithm we derive is that it allows the users to control the shapes
of the free-form surfaces produced by the algorithm without knowing the details of the algorithm.
This feature is very important for applications like CAD/CAM, where the designers manipulate
the shape of the free-form surfaces to achieve functional or aesthetic design objectives.
Recall that for each facet, the cubic polynomials over the double tetrahedron containing the
facet is not completely fixed. A double tetrahedron has a vertex outside P, and we call the vertex
the apex of the double tetrahedron. At the apex of the double tetrahedron of each facet, the value
of the cubic polynomials is left as a shape parameter, as was shown in 3.3.
If we think of the algorithm as producing the global function f over the constructed neighbor-
hood Σ of P, then the value of f at each apex is a shape parameter. Since the interior of the
free-form surface is exactly the region where f < 0, decreasing a shape parameter at an apex pulls
the free-form surface towards the apex. Moreover, only nearby cubic patches are affected by

Figure 9: An example of shape control

this shape parameter because the free-form surface construction in the algorithm is local. So, the
apexes form a net which controls the shape of the free-form surface through the shape parameters
at the apexes.
Figure 7 and Figure 8 illustrate a two-dimensional analogy of this shape control scheme. The
situation in three dimensions is the same but harder to draw. Figure 9 is an example of two
free-form surfaces having everything identical except the shape parameters.

4 Conclusions
We have presented an algorithm for generating free-form surfaces from the input data of an arbitrary
polyhedron with a normal prescribed at each vertex of the polyhedron. The algorithm constructs a
smooth piecewise cubic surface interpolating the vertices of the input polyhedron and the prescribed
normal at each vertex. The free-form surface construction is local and quadratically precise. In
addition, the free-form surface produced can be manipulated through a set of intuitive shape
parameters without knowing the details of the algorithm.

Figure 10: A skewed dodecahedron

Figure 11: A tea pot



Figure 10 and Figure 11 illustrate some implemented results. Figure 10 is a skewed dodecahe-
dron with 12 points, 20 facets, and 80 patches; Figure 11 is a tea pot with 45 points, 72 facets, and
266 patches. These two pictures, as well as the pictures shown earlier, are generated by polygonizing
the cubic patches and rendering the resultant polygons using Gouraud shading.
We hope to incorporate the free-form surface algorithm into a geometric modeling system and
to experiment designing, manipulating, and reasoning about complex smooth objects.

Acknowledgement
I am grateful to John Hopcroft for the support, guidance, and encouragement for this work. I
also would like to thank Professors Chris Hoffmann, Joe Warren, and Doug Moore for providing
polygonizers which were used for implementing our results. This work is supported by DARPA
under ONR contract N00014-86K-0591, NSF Grant DMC-86-17355, and ONR Grant N00014-89J-
1946.

References
[Bajaj C (1990)] Surface fitting using implicit algebraic surface patches. Technical Report CSD-
TR-1001, Department of Computer Science, Purdue University, 1990.

[Bajaj C (1989)] Geometric modeling with algebraic surfaces. Technical Report CSD-TR-825,
Department of Computer Science, Purdue University, 1989.

[Bajaj C, Ihm I (1989)] Hermite interpolation using real algebraic surfaces. In Proceedings of the
ACM Symposium on Computational Geometry, West Germany, pages 94-103.

[Bajaj C, Kim M (1987)] Compliant motion planning with geometric models. In Proceedings of
the ACM Symposium on Computational Geometry, Waterloo, Canada, pages 171-180.

[Blinn J (1982)] A generalization of algebraic surface drawing. ACM Transactions on Graphics,


1:235-256, 1982.

[Bloomenthal J, Wyvill B (1990)] Interactive techniques for implicit modeling. Computer Graph-
ics, 2:109-116.

[Dahmen W (1989)] Smooth piecewise quadric surfaces. In T. Lyche and L. Schumaker, editors,
Mathematical Methods in Computer-aided Geometric Design, pages 181-193, Academic Press.

[Farin G (1986)] Triangular Bernstein-Bezier patches. Computer-aided Geometric Design, 3:83-


127.

[Hoffmann C (1989)] Solid and Geometric Modeling. Morgan Kaufmann Publishers, Los Altos,
California.

[Hoffmann C, Hopcroft J (1987)] The potential method for blending surfaces and corners. In
Geometric Modeling: Algorithms and New Trends, pages 347-366.

[Hopcroft J, Krafft D (1986)] The challenge of robotics for computer science. In C. Yap and
J. Schwartz, editors, Advances in Robotics, Vol. 1: Algorithmic and Geometric Aspects of
Robotics.

[Mann S, Loop C, Lounsbery M, Meyers D, Painter J, DeRose T, Sloan K (1990)] A survey of


parametric scattered data fitting. Department of Computer Science, University of Washing-
ton. Preprint.

[Middleditch A, Sears K (1985)] Blend surfaces for set volume modeling systems. Computer Graph-
ics, 19:161-170.

[Moore D, Warren J (1990)] Adaptive approximation of scattered contour data using piecewise im-
plicit surfaces. Department of Computer Science, Rice University. Preprint.

[Nishimura H, Hirai A, Kawai T, Kawata T, Shirakawa I, Omura K (1985)] Object modeling by


distribution function and a method of image generation. Journal of Papers Given at the
Electronics Communications Conference 1985, J68-D(4).

[Patrikalakis N, Kriezis G (1989)] Representation of piecewise continuous algebraic surfaces in


terms of B-splines. The Visual Computer, 5:360-374.

[Rockwood A, Owen J (1987)] Blending surfaces in solid geometric modeling. In G. Farin, editor,
Geometric Modeling: Algorithms and New Trends.

[Sederberg TW (1985)] Piecewise algebraic surface patches. Computer-aided Geometric Design,


2:53-59.

[Sederberg TW (1990)] Techniques for cubic algebraic surfaces. IEEE Computer Graphics and
Applications, 4:14-25.

Baining Guo is currently a doctoral candidate at Cornell University. He
works in the Modeling and Simulation Project in the Department of Com-
puter Science. His research interests include computer graphics, numerical
analysis, and theoretical computer science.
Guo received his BS in mathematics from Beijing University in 1982, and
he received his MS in computer science from Cornell University in 1989.
Address: Department of Computer Science, Upson Hall, Cornell Univer-
sity, Ithaca, New York 14853, USA.
Chapter 9
Visualization in Engineering
Equilibrium and Interpolation Solutions Using
Wavelet Bases
Alex P. Pentland

ABSTRACT

Efficient solutions to equilibrium and interpolation problems can be obtained by using wavelet
basis vectors for use in problem discretization, or for use as a preconditioning transform. Good
approximations to these solutions can be obtained in only O(n) operations and O(n) storage
locations, a property that can be extremely useful in visualization applications.

Keywords: Finite Elements, Regularization, Lofting, Wavelets, Preconditioning.

1 INTRODUCTION

Physical equilibrium and interpolation problems are two of the most common analysis problems
found in scientific and engineering applications. These problems are standardly solved using either
the finite difference or finite element methods, and generally require at least O(n^2) operations
where n is the number of nodes used in the problem definition. As a consequence of this scaling
behavior, problem solution generally involves large computational expense.

In this paper I will show how efficient solutions to many of these problems can be obtained
by using wavelet bases both for constructing efficient new discrete formulations, and for use
in preconditioning existing formulations. The plan of this paper is to first briefly review both
dynamic equilibrium and interpolation problems, and how a careful choice of basis can be used
to obtain more efficient solutions. I will then construct iterative solutions, discuss the solutions'
computational complexity, and show numerical examples.

The wavelet formulation presented here is based on the work of Adelson and Simoncelli (Adel-
son 1987; Simoncelli 1990), and is compatible with their notation. The FEM formulation follows
the notation of Bathe (Bathe 1982) and Segerlind (Segerlind 1984). Additional detail concerning
the computation of wavelet basis functions, and transformation to and from these bases, can be
found in Appendix A.


2 BACKGROUND

2.1 Equilibrium Problems

Simulation of physical processes, for example deformation under an applied load, is normally
accomplished by use of a differential equation, known as the governing equation, which is projected
onto a discretization S of ℝ containing n nodes, where for simplicity n = k·2^l and k, l are positive
integers. The resulting matrix equation is

M d²U/dt² + C dU/dt + K U = R    (1)
where U is a dn × 1 vector of the displacements of the n nodal points; M, C, and K are dn × dn
matrices describing the mass, damping, and material stiffness; R is a dn × 1 vector describing the
loads acting on the nodes; and d is the number of free coordinates of each node. For simplicity of
presentation, this paper will assume d = 1. At equilibrium, Equation 1 reduces to the following:

K U = R    (2)
The solution of this static equilibrium equation is the most common objective of such physical
simulations.

Because the linear operators M, C, and K are normally dense, techniques such as finite
elements have been developed to construct sparse, banded approximations to the underlying
linear operators. In this paper K* will be used to indicate the linear operator within the governing
equation, and K to indicate the stiffness matrix which is its (approximate) projection onto S.

2.2 Interpolation Problems

Interpolation problems, often described as regularization problems in signal processing or as lofting
in mechanical engineering, are important in a variety of applications. Such problems are normally
posed as a variational problem, which when discretized reduces to the following:

λ K_δ U − P_δ(U) = 0    (3)

where λ is a scalar constant, K_δ is an n × n matrix known as the regularizing or smoothing term,
and P_δ(U) is an n × 1 vector known as the data or penalty term, whose elements p_i are typically

p_i = { d_i − u_i,  if boundary conditions exist for node i
      { 0,          if no boundary conditions exist,    (4)

where the d_i are the desired coordinates of certain nodes, and the u_i are the elements of U. The
smoothing term K_δ is typically a finite difference approximation of a thin plate (Poggio 1985).

Equation 3 is quite similar to the FEM equilibrium equation, as can be seen by a change of
notation. Rewrite the smoothing term as K = K_δ and the data term as

P_δ(U) = R − S U,    (5)

where S is a diagonal "selection matrix" with ones for nodes with boundary conditions, and zeros
elsewhere. Equation (3) becomes

λ K U + S U = R.    (6)

Thus this type of interpolation/regularization/lofting problem may be viewed as fitting a thin


plate through certain measurement points, e.g., as the solution to an equilibrium equation that
is somewhat more general than Equation 2. In the following sections I will therefore first discuss
the solution to Equation 2 and then generalize that result to solving Equation 6.
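A small one-dimensional numerical sketch of Equation (6) follows (toy sizes and data; the squared second-difference smoother is one common finite difference choice for the thin-plate term K_δ, not necessarily the author's):

```python
# Fit a smooth 1-D curve through three data points by solving
# (lambda*K + S) U = R, with K a discrete bending-energy smoother.
import numpy as np

n, lam = 64, 1.0
D2 = (np.diag(np.full(n, -2.0)) + np.diag(np.ones(n - 1), 1)
      + np.diag(np.ones(n - 1), -1))            # second-difference operator
K = D2.T @ D2                                   # thin-plate (bending) energy
S = np.zeros((n, n)); R = np.zeros(n)
for i, d in [(5, 0.0), (30, 1.0), (58, -0.5)]:  # nodes with boundary conditions
    S[i, i] = 1.0
    R[i] = d
U = np.linalg.solve(lam * K + S, R)             # interpolating solution
```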

2.3 Choice of Basis

To obtain an equilibrium solution U, one integrates Equation (1) using an iterative numerical
procedure at a cost proportional to n·m_k operations per time step, where n is the order of the
stiffness matrix and m_k is its half-bandwidth (Bathe 1982). Thus there is a need for a method
which transforms Equation (1) into a form which leads to a less costly solution. Since the
number of operations is proportional to the half-bandwidth m_k of the stiffness matrix, a reduction
in m_k will greatly reduce the cost of step-by-step solution.

To accomplish this goal we can transform the problem from the original nodal coordinate
system to a new coordinate system whose basis vectors are the columns of an n × n matrix P. In
this new coordinate system the nodal displacements U become generalized displacements Ũ:

U = P Ũ.    (7)

Substituting Equation (7) into Equation (1) and premultiplying by P^T transforms the governing
equation into the coordinate system defined by the basis P:

M̃ d²Ũ/dt² + C̃ dŨ/dt + K̃ Ũ = R̃,    (8)

where

M̃ = P^T M P,  C̃ = P^T C P,  K̃ = P^T K P,  R̃ = P^T R.    (9)
With this transformation of basis, a new system of stiffness, mass, and damping matrices can be
obtained which has a smaller bandwidth than the original system.

The optimal basis Φ has columns that are the eigenvectors of M^{-1}K (Bathe 1982). These
eigenvectors are also known as the system's free vibration modes. Using this transformation
matrix we have

Φ^T M Φ = I,  Φ^T K Φ = Ω²,    (10)

where the diagonal elements of Ω² are the eigenvalues of M^{-1}K and the remaining elements are zero.
When the damping matrix C is restricted to be Rayleigh damping, then it is also diagonalized by
this transformation.
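The modal transformation can be checked numerically; the sketch below (M = I for brevity, toy stiffness matrix; not the paper's code) verifies that the eigenvector basis diagonalizes K as in Equation (10):

```python
# Eigenvectors Phi of M^{-1}K (here M = I) give Phi^T K Phi = Omega^2.
import numpy as np

K = (np.diag(np.full(8, 2.0)) - np.diag(np.ones(7), 1)
     - np.diag(np.ones(7), -1))     # toy symmetric stiffness matrix
evals, Phi = np.linalg.eigh(K)      # free-vibration modes
Omega2 = Phi.T @ K @ Phi            # diagonal; entries are the eigenvalues
assert np.allclose(Omega2, np.diag(evals))
```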

3 WAVELETS AS BASIS FUNCTIONS

Although equilibrium problems may be solved in closed form by transforming them to a basis
constructed from the eigenvectors of M^{-1}K, this approach is generally not practical. This is because
the basis vectors vary from problem to problem, and because calculation of the basis vectors
requires up to O(n^3) operations and O(n^2) storage locations. It is
desirable, therefore, to find a better set of basis vectors for discretizing and solving equilibrium
problems.

Figure 1: Five elements of a wavelet basis set and their Fourier power spectra. These elements
were constructed by use of a 9-tap five-level wavelet transform as discussed in Appendix A. Power
spectra magnitudes are plotted on a linear scale (From (Simoncelli 1990)).

3.1 General Principles

There are two general principles that guide the choice of basis. The first idea is to efficiently
approximate the linear operator K*. If a good approximation to K* can be obtained with only
a few basis vectors, then projecting K* onto that basis will result in a sparse, banded stiffness
matrix K. In a good basis, therefore, the projection of K* will be nearly diagonal.

The second idea is to improve the condition number of the stiffness matrix, as the convergence
rate of most matrix inversion algorithms depends either on the condition number or its square.
In a good basis, therefore, the projection of K* will be close to the identity matrix.

For physical equilibrium problems, the ideal basis would be both spatially and spectrally
localized, and very fast to compute. The desire for spectral localization stems from the fact that,
in the absence of boundary conditions, fractures, etc., many dynamics problems can be solved in
closed form in the frequency domain. In similar fashion, the projection of K* onto a spectrally-
localized basis will tend to produce a banded stiffness matrix K. The requirement for spatial
localization stems from the need to account for local variations in K's band structure due to, for
instance, boundary conditions, fracture, or material inhomogeneity.

3.2 Wavelet Bases

A class of bases that provide the desired properties are generated by functions known as wavelets
(Grossman, Morlet 1984; Meyer 1986; Mallat 1987; Daubechies 1988). A family of wavelets h_{a,b}
is constructed from a single function h by dilation by a and translation by b:

h_{a,b}(x) = |a|^{-1/2} h((x − b)/a).    (11)

Typically a = 2^j and b = 2^j for j = 1, 2, 3, .... The critical properties of wavelet families that make
them well suited to this application are that:
them well suited to this application are that:

• For appropriate choice of h they can provide an orthonormal basis of L²(ℝ), i.e., all members
of the family are orthogonal to one another.

• They can be simultaneously localized in both space and frequency.


• Digital projections and transformations using wavelet bases can be computed in only O(n)
operations.

Such families of wavelets may be used to define a set of orthonormal basis vectors over a
discretization S of ℝ containing n nodes, where again n = k·2^l with k and l positive integers.
The process of constructing a basis from the wavelet functions is a recursive one. The first n/2
basis vectors are taken to be the projections φ_m of h_{1,2m}, m = 1, 2, 3, ..., k·2^{l-1}, onto S. The remaining
basis vectors ψ_m are determined by orthogonalization with respect to the φ_m. The vectors ψ_m
define a new "lower resolution" or "larger scale" discretization S_1 of ℝ with n/2 nodes.

This process of defining a basis is then repeated recursively. Thus the first n/4 basis vectors
φ_m for S_1 are again taken to be the projections of h_{1,2m} onto S_1, and the remaining basis vectors
ψ_m are determined by orthogonalization. However, because a was chosen equal to b when defining S_1,
if we transform the φ_m basis vectors for S_1 back to the original discretization S we obtain exactly
the projections of h_{2,4m} onto S. Thus by continuing this recursion, basis vectors for all wavelets
up to h_{2^{l-1},2^{l-1}m} can be defined, with the remaining k basis vectors determined by orthogonalization. I
will call such a basis Φ_w, where the columns of Φ_w are the basis vectors.

The left-hand column of Figure 1 shows a subset of such a basis. This example shows (from
bottom to top) the basis vectors corresponding to a = 1, 2,4, 8 and b = n /2. The basis vector
shown at top is the remainder basis that was determined by orthogonalization with respect to the
other bases. The right-hand column shows the Fourier power spectrum of each of these bases; it
can be seen that they display good spectral localization. For higher dimensional problems a basis
set is normally taken to be the tensor product of one-dimensional bases. For additional detail, see
references (Mallat 1987, Albert 1990, Daubechies 1988).

The examples presented in this paper will all be based on the wavelet basis illustrated in this
figure. The coefficients that produced this basis are given in Appendix A. For simplicity of
presentation I will occasionally refer to this particular example of a wavelet basis as "the wavelet
basis," and to vectors defined relative to this basis as being in "the wavelet coordinate system."

3.3 Projection onto a Wavelet Basis

Having defined a wavelet basis Φ_w, we must now be able to project a vector X defined on
S (the original nodal basis) onto the wavelet basis, thus producing a vector Y defined relative
to the wavelet coordinate system. Because wavelet bases are defined recursively, this projection
transform may also be computed recursively.

One particularly illuminating way of looking at this transformation is as a recursive splitting
of the "signal" described by the vector X into independent frequency bands using linear shift-
invariant filters. That is, to project a vector X onto a wavelet basis one convolves the elements
of X with both

• A filter whose impulse response f_1 is the projection of h_{1,2m}, and is therefore a wavelet basis
vector,

• The quadrature mirror pair to f_1, whose impulse response is f_0.

The outputs of these two convolutions are then critically subsampled by taking every other element
to produce the vectors Y_0 and Y_1, both of which have length n/2.

The quadrature mirror pair filter f_0 is defined such that the even-displacement translates of
the pair f_0 and f_1 form an orthonormal basis for S. The filter f_1, whose impulse response is
illustrated at the bottom left of Figure 1, is a high-pass filter, as is illustrated at the bottom right
of Figure 1. The filter f_0 is thus a low-pass filter, with an impulse response similar to the impulse
response shown at the top of Figure 1, but with many fewer taps and correspondingly higher
cutoff frequency. Thus convolution by f_0 and f_1 splits the vector X into a half-length vector Y_0
whose elements are a low-pass version of X, and into a half-length vector Y_1 whose elements are
a high-pass version of X.

The low-pass vector Y_0 is then recursively split by again convolving with f_0 and f_1 and again
subsampling to produce Y_00 and Y_01, which are both of length n/4, and so forth. The resulting
vector

Y = [Y_1, Y_01, Y_001, ..., Y_000...1, Y_000...0]^T    (12)

is the projection of the vector X onto the wavelet basis. In higher dimensional problems, this
transform is normally accomplished by the tensor product of these one-dimensional transforms.
For additional details and discussion see Appendix A.
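The recursion can be written compactly. The sketch below uses the two-tap Haar pair as a stand-in quadrature mirror pair, since the 9-tap filters actually used in the paper are defined in Appendix A (not reproduced here); each stage convolves with f_0 and f_1 and critically subsamples, and the total cost is O(n):

```python
# Recursive projection onto a (Haar) wavelet basis: split into low- and
# high-pass halves, then recurse on the low-pass vector Y_0.
import numpy as np

f0 = np.array([1.0, 1.0]) / np.sqrt(2.0)    # low-pass filter
f1 = np.array([1.0, -1.0]) / np.sqrt(2.0)   # high-pass (wavelet) filter

def wavelet_project(x, levels):
    bands = []
    for _ in range(levels):
        y1 = np.convolve(x, f1)[1::2]       # high-pass, every other sample
        bands.append(y1)
        x = np.convolve(x, f0)[1::2]        # low-pass, every other sample
    bands.append(x)                         # coarsest remainder Y_000...0
    return np.concatenate(bands)            # Y ordered as in equation (12)

Y = wavelet_project(np.sin(np.linspace(0.0, 3.0, 64)), 3)
```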

3.4 Equilibrium Solutions using Wavelet Bases

Because of the recursive nature of the computation, only O(n) operations are required to perform
the above transformation. In particular, projecting load and displacement vectors onto a wavelet
basis, e.g., R̃ = Φ_w^T R and Ũ = Φ_w^T U, requires only O(n) operations and O(n) storage locations
for a problem of n nodes. This contrasts markedly with the O(n^2) operations and O(n^2) storage
locations needed to perform a basis transformation using the standard method of matrix multipli-
cation. Capitalizing on this large computational savings is one of the central goals of the method
presented in this paper.

Given the ability to efficiently perform such a basis transformation, the critical question is:
what is the cost to solve the equilibrium equation in the wavelet coordinate system? As mentioned
above, the computational complexity of inverting a stiffness matrix K is proportional to its half-
bandwidth, and the convergence rate is (typically) proportional to the square of its condition
number. The questions to answer, therefore, are the following: If a stiffness matrix K is defined
by projecting the linear operator K* onto a wavelet basis, then how tightly banded will it be, and
what is its condition number?

3.4.1 An Example of Defining K

This section will show how the linear operator K* can be projected onto a wavelet basis in order
to construct the stiffness matrix K. I will then discuss the properties of K relative to stiffness
matrices defined using other methods.

The stiffness matrix K can be constructed using standard finite element techniques, by inte-
gration of interpolants over the domain of interest:

K = ∫_S B^T E B dS    (13)

for some choice of displacement interpolant functions H and strain-displacement interpolant func-
tions B. Let us choose these interpolant functions to be

H = [h_{1,0}, h_{1,2}, ..., h_{l,k}] Φ_w^T = h Φ_w^T    (14)

B = [h'_{1,0}, h'_{1,2}, ..., h'_{l,k}] Φ_w^T = h' Φ_w^T    (15)

where h_{a,b} are the wavelet functions, h'_{a,b} are their derivatives, and in one dimension E = E,
Young's modulus. An example of one of these interpolants (in the original coordinate system)
is shown in Figure 2(a). It is qualitatively similar to sin(x)/x, but decays much faster and is
compactly supported.

Figure 2: (a) Wavelet-derived interpolant used to define the stiffness matrix, (b) transform of a finite
difference stiffness matrix using Φ_w, and (c) transform of a finite element stiffness matrix using
Φ_w.
As the wavelets form an orthonormal basis for L²(ℝ), these interpolant functions trivially
satisfy the normal finite element compatibility conditions. To see this we let P_S be the projection
operator of the previous section, e.g., P_S(h_{1,1}) = φ_1. Then P_S(h) = Φ_w, and

P_S(H) = P_S(h) Φ_w^T = Φ_w Φ_w^T = I.    (16)

In the wavelet coordinate system, the stiffness matrix constructed from these interpolants is

K̃ = Φ_w^T K Φ_w
  = Φ_w^T [∫_S B^T E B dS] Φ_w    (17)
  = ∫_S h'^T E h' dS
  = E Ω²,

where Ω² = ∫_S h'^T h' dS. If the derivatives h'_{a,b} formed an orthogonal basis, then K̃ = E Ω²


would be diagonal. While I have developed wavelet approximations with orthogonal derivatives,
the derivatives of most wavelets are not orthogonal. However, if the extremely compact wavelets
developed for signal processing are used, then the half-bandwidth of K̃ is much less than in a comparable
finite element formulation. For instance, using the wavelets presented in Appendix A, K̃ has a
half-bandwidth of seven. In the original nodal coordinate system, however, this same stiffness
matrix has a half-bandwidth of w·2^{l-1}, where w is the width of the kernel of φ_1. Thus reduction to
a half-bandwidth of seven is a considerable improvement.

3.4.2 A More General Answer

A more general answer to the cost of solving equilibrium equations using the basis Φ_w is provided
by the work of Albert, Beylkin, Coifman, and Rokhlin, in an unpublished but contemporaneous
paper (Albert 1990). They present an analysis of smooth linear operators in wavelet bases, and
prove that such linear operators can be approximated to arbitrary accuracy using only O(n log n)
coefficients in bases defined by wavelets with high-order vanishing moments.

Thus the linear operator K* can be accurately approximated using only the basis vectors
centered within K's bands, and consequently, the equilibrium solution can be obtained (using, for
instance, Schulz's method) in only O(n log² n) operations. The extremely small half-bandwidth
of K as constructed using the wavelets of Appendix A is due to the extremely fast decay of that
wavelet, and to the truncation of elements smaller than 10⁻³.

4 APPLICATION TO PRECONDITIONING

The fact that transforming K to the wavelet coordinate system results in a significant reduction in
its half-bandwidth suggests that Φ_w will be an effective preconditioning transform. As an example,
therefore, I generated two stiffness matrices for a thin beam, one using a five-node finite difference
formulation, and a second using a three-node finite element formulation. The two stiffness matrices
K_fd (the finite difference matrix) and K_fe (the finite element matrix) were then transformed to
a wavelet coordinate system by the transforms:

K̃_fd = Φ_w^T K_fd Φ_w    (18)

and

K̃_fe = Φ_w^T K_fe Φ_w    (19)

Note that these transformations require only O(n) operations as K_fd and K_fe are banded. Figure
2(b) shows a typical row from K̃_fd, and Figure 2(c) shows a typical row from K̃_fe.

A measure of how well Φ_w diagonalizes these stiffness matrices is the magnitude of the off-
diagonal terms divided by that of the diagonal terms, e.g.,

(20)

For the finite difference stiffness matrix d was equal to 0.0625, and for the finite element stiffness
matrix d was equal to 0.0382. Thus Φ_w was effective at diagonalizing both of these stiffness
matrices. Further, the condition numbers of K̃_fd and K̃_fe were both near one, so that the
computation of K̃_fd⁻¹ and K̃_fe⁻¹ should converge rapidly.
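One plausible concrete form of this measure (an assumption; the exact normalization in Equation 20 may differ) is sketched below in Python:

import numpy as np

def diagonalization_measure(K_w):
    # Assumed reading of Eq. (20): total off-diagonal magnitude of the
    # transformed stiffness matrix divided by its total diagonal magnitude.
    diag_mag = np.abs(np.diag(K_w)).sum()
    return (np.abs(K_w).sum() - diag_mag) / diag_mag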

4.1 Approximate Solutions for Visualization

Preconditioning transforms may also be used to obtain very fast approximations. The simplest
method of accomplishing this is to first transform a given stiffness matrix K (defined using finite
element or other methods) to the new basis, simplify the problem by discarding off-diagonal
elements, and then solve. Such a first-order approximation can be quite useful for visualization
purposes, or as the starting point of an iterative solution technique, as will be shown below.

First, I will compute a tightly-banded approximation to the stiffness matrix K by projecting
the elements of K onto the wavelet bases centered within K's bands:

Ω²_b = diag_b(Φ_w^T K Φ_w)    (21)

where the "diag_b" operation sets all elements to zero except those within ±b of the central diagonal.
Only O(n) operations are required even if K is dense, as Ω²_b is banded. Given Ω²_b, a first-
order approximation to the equilibrium solution is simply

U ≈ Φ_w (Ω²_b)⁻¹ Φ_w^T R    (22)

The major advantage of this approximation is that its calculation requires only O(n) operations.
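To make Equations 21 and 22 concrete, here is a minimal Python sketch (not the implementation measured below): it substitutes an orthonormal Haar basis for the 9-tap wavelets of Appendix A, uses dense matrices for clarity (the O(n) claims rely on the recursive filter-bank transform), and takes a simple 1-D second-difference stiffness matrix and point load as stand-ins for the thin-plate problem.

import numpy as np

def haar_matrix(n):
    # Orthonormal Haar synthesis matrix Phi_w (n must be a power of 2);
    # columns are the basis vectors, so Phi_w.T @ Phi_w == I.
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    scal = np.kron(h, [1.0, 1.0])                # coarser scaling functions
    wave = np.kron(np.eye(n // 2), [1.0, -1.0])  # finest-level wavelets
    return (np.vstack([scal, wave]) / np.sqrt(2.0)).T

n = 128
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1-D stiffness (SPD)
R = np.zeros(n)
R[n // 2] = 1.0                                  # point load at the center

Phi = haar_matrix(n)
omega2 = np.diag(Phi.T @ K @ Phi)                # Eq. (21) with b = 0
U = Phi @ (Phi.T @ R / omega2)                   # Eq. (22): diagonal solve

U_exact = np.linalg.solve(K, R)
print("residual variance / solution variance:",
      np.var(U - U_exact) / np.var(U_exact))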


Figure 3: (a) Vertical axis shows the nodal loading R to a 128 x 128 node thin elastic plate, (b)
Vertical axis shows the nodal displacements calculated using approximate equilibrium solution.
Total execution time: 3.96 seconds on a Sun 4/330.

4.1.1 An Example

The accuracy of Equations 21 and 22 was tested numerically, as illustrated by Figure 3. Figure
3(a) shows the input loading R to a 128 x 128 node thin elastic plate. The vertical axis is the
amplitude of R, and the horizontal axes are the node positions. In this example the mean load
was set to zero by adding a small uniform load in the opposite direction to the cross-shaped
vertical loading. The stiffness matrix was defined using a standard thirteen-node finite difference
computation.

The approximate equilibrium solution was computed using Equations 21 and 22 with b = 0.
The vertical axis of Figure 3(b) shows the resulting displacements U; the horizontal axes are the
node positions. This approximate solution has an error of less than one part in one hundred
(i.e., the variance of the residual error was 1/130th of the variance of the solution surface). Total
execution time for this 128 x 128 node example was 3.96 seconds on a Sun 4/330 computer.
With b = 1 the solution has an error of less than one part in one thousand. This combination
of reasonable accuracy and extremely low computation time makes this sort of approximation
attractive for visualization purposes.

4.2 The Interpolation Problem

Interpolation problems are important in a variety of applications. In mechanical engineering, the
interpolation problem is referred to as lofting, and is posed as the problem of fitting a smooth
surface through a sparse set of possibly-noisy control points. This is also a good example of one
type of problem where regularization techniques are often applied.

In the case that every node is a control point, so that the only problem is to remove measure-
ment noise, the above equilibrium solution may be trivially extended to the interpolation problem.


Figure 4: A typical interpolation problem. (a) Vertical axis shows the height constraints placed on
a 64 x 64 node plate; these constraints were generated by a 10% density random sampling of the
function z = 100[sin(kx) + sin(ky)]. (b) Vertical axis shows the height of the lofted or regularized
surface. After three iterations (approximately 4 seconds on a Sun 4/330) the algorithm converged
to within 1% of the true equilibrium state.

In this case the interpolation problem is

λKV + SV = R    (23)

where S = I, the identity matrix.

Substituting V = Φ_w U and premultiplying by Φ_w^T converts Equation (23) to

(λ Φ_w^T K Φ_w + I) U = Φ_w^T R    (24)

By employing Equation 21, we then obtain

(λ Ω²_b + I) U = Φ_w^T R    (25)

where Ω²_b is a diagonal matrix. The first-order approximation U to the interpolation solution is
therefore

U = (λ Ω²_b + I)⁻¹ Φ_w^T R    (26)

In the more usual case where not all nodes are control points, the interpolation solution can
be obtained iteratively. In this case the sampling matrix S is diagonal with ones for nodes that
are control points, and zeros elsewhere. Again substituting V = Φ_w U and premultiplying by Φ_w^T
converts Equation (23) to

(λ Φ_w^T K Φ_w + Φ_w^T S Φ_w) U = Φ_w^T R    (27)

The matrix Φ_w^T S Φ_w is not diagonal unless S = I. It is, however, strongly diagonally dominant,
so that the interpolation solution V may be obtained by iterating

V^{t+1} = Φ_w (λ Ω²_b + S̃)⁻¹ Φ_w^T R^t + V^t    (28)

where S̃ = diag_b(Φ_w^T S Φ_w) and R^t = R - KV^t is the residual loading at iteration t.
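The iteration of Equation 28 can be sketched in the same style (again an illustrative Haar basis and b = 0, with λ, the 10% sampling density, and the test signal chosen arbitrarily for this sketch; the residual below is taken with respect to the full operator λK + S):

import numpy as np

def haar_matrix(n):
    # Same orthonormal Haar synthesis matrix as in the earlier sketch.
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    return (np.vstack([np.kron(h, [1.0, 1.0]),
                       np.kron(np.eye(n // 2), [1.0, -1.0])]) / np.sqrt(2.0)).T

n, lam = 64, 0.5
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # smoothness operator
rng = np.random.default_rng(0)
S = np.diag((rng.random(n) < 0.1).astype(float))        # 10% of nodes sampled
R = S @ np.sin(0.3 * np.arange(n))                      # constraint loading

Phi = haar_matrix(n)
M = np.diag(lam * np.diag(Phi.T @ K @ Phi)              # lam * Omega_b^2 ...
            + np.diag(Phi.T @ S @ Phi))                 # ... + S-tilde, b = 0
V = np.zeros(n)
for t in range(5):                                      # 3-5 iterations typical
    resid = R - (lam * K + S) @ V                       # residual loading R^t
    V = V + Phi @ np.linalg.solve(M, Phi.T @ resid)     # Eq. (28)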



4.2.1 An Interpolation Example

Figure 4 shows an interpolation problem solved using Equation 28 and Equation 21 with b = 0 on
a 64 x 64 node grid. Figure 4(a) shows the height constraints for this example; the vertical axis
is the height of the constrained nodes (zero-valued nodes are not constrained). These constraints
were generated using a sparse (10%) random sampling of the function z = 100[sin(kx) + sin(ky)].

Normally three to five iterations of Equation 28 are required to obtain an accurate estimate of
the interpolated surface. Figure 4(b) shows the result of this iterative process starting with the
constraints shown in Figure 4(a). The vertical axis is the height of the solution surface. In this
example Equation 28 converged to within 1% of its true equilibrium state by the third iteration.
Approximately 4 seconds of computer time on a Sun 4/330 was required to obtain this 64 x 64
node interpolated surface.

5 SUMMARY

I have shown that efficient solutions to dynamic equilibrium and interpolation problems can be
obtained by using wavelets as a basis. The general complexity of solutions in such a basis has
been shown by Albert, Beylkin, Coifman, and Rokhlin (Albert 1990) to have a computational
complexity of O(n log² n). Further, good approximations to these solutions can be obtained in
only O( n) operations and O( n) storage locations, and are therefore potentially quite useful in
visualization applications.

A APPENDIX: DISCRETE WAVELET TRANSFORMS

The transform of a signal into a wavelet basis - and back again - may be described in many ways.
One of the most illuminating ways of viewing this transform is as recursive bandsplitting using
linear shift-invariant filters and their quadrature mirror pairs. That is, the elements of a vector X
are viewed as a sequence x[n], and decomposed into a hierarchy of smaller bandpass sequences y_i[n]
which are the representation of X in the wavelet coordinate system. The formulation presented
here summarizes the presentation by Adelson and Simoncelli (Adelson 1987, Simoncelli 1990). The
notation used is compatible, to the extent possible, with that presentation.

A.1 The One-Dimensional Case

The first stage of the wavelet transform is formulated as a two-band critically sampled decomposi-
tion/reconstruction (D /R) filter bank problem, as illustrated in the schematic diagram in Figure 5.
The purpose of the decomposition section of the filter bank is to split the input sequence x[n] into
two half-density sequences y_0[n] and y_1[n]. This decomposition corresponds to the first stage of
the recursive transformation from the original nodal coordinate system to the wavelet coordinate
system.

The reconstruction section then combines these sequences to form an approximation x̂[n] to
the original sequence. This reconstruction step corresponds to the first stage of the recursive


Figure 5: A two-band decomposition/reconstruction filter bank in one dimension.

transformation from the wavelet coordinate system to the nodal coordinate system.

The notation used in this diagram is standard for digital signal processing. The boxes F_i(ω)
indicate convolution of an input sequence with a filter with impulse response f_i[n] and discrete-
time Fourier transform (DTFT) F_i(ω) = Σ_n f_i[n] e^{-jωn}.

The boxes ↓2 indicate that the sequence is subsampled by a factor of 2, and the boxes ↑2
indicate that the sequence should be upsampled by inserting a zero between each sample.

Using the definition of the DTFT and some well-known facts about the effects of upsam-
pling and downsampling in the frequency domain, one can derive equations for the DTFT of the
representation sequences y_i[n]:

Y_i(ω) = (1/2)[F_i(ω/2) X(ω/2) + F_i(ω/2 + π) X(ω/2 + π)]    (29)

and the D/R system output is

X̂(ω) = Y_0(2ω) G_0(ω) + Y_1(2ω) G_1(ω).


Combining these equations gives the overall system response of the filter bank:

X̂(ω) = (1/2)[F_0(ω) G_0(ω) + F_1(ω) G_1(ω)] X(ω)
      + (1/2)[F_0(ω + π) G_0(ω) + F_1(ω + π) G_1(ω)] X(ω + π).    (30)

The first term is a linear shift-invariant (LSI) system response, and the second is the system
aliasing.

The filter f_0[n] and its quadrature mirror pair f_1[n] are related by spatial shifting and frequency
modulation. We define

G_0(-ω) = H(ω)
G_1(-ω) = e^{jω} H(-ω + π)    (31)

for H(ω) an arbitrary function of ω. This definition, which was proposed in (Simoncelli 1990),
corresponds to the linear algebraic notion of an orthogonal transform.

Figure 6: A non-uniformly cascaded decomposition/reconstruction filter bank.


Figure 7: Octave band splitting produced by a four-level pyramid cascade of a two-band D/R
system. The top picture represents the splitting of the two-band D/R system. Each successive
picture shows the effect of reapplying the system to the lowpass sequence (indicated in grey) of the
previous picture. The bottom picture gives the final four-level partition of the frequency domain.
All frequency axes cover the range from 0 to π.

With the choice of filters given in (31), equation (30) becomes

X̂(ω) = (1/2)[H(ω) H(-ω) + H(-ω + π) H(ω + π)] X(ω)
      + (1/2)[H(ω + π) H(-ω) + e^{jπ} H(-ω) H(ω + π)] X(ω + π).    (32)

The second (aliasing) term cancels, and the remaining LSI system response is

X̂(ω) = (1/2)[H(ω) H(-ω) + H(-ω + π) H(ω + π)] X(ω).    (33)

Note that the aliasing cancellation is exact, independent of the choice of the function H(ω).

The design problem is now reduced to finding a filter with DTFT H(ω) that satisfies the
constraint

(1/2)[H(ω) H(-ω) + H(-ω + π) H(ω + π)] = 1

or

|H(ω)|² + |H(ω + π)|² = 2.    (34)
An example of such a filter is the 9-tap filter illustrated in the main body of the text, whose
coefficients are

            0.02807382                 0.02807382
           -0.060944743                0.060944743
           -0.073386624               -0.073386624
            0.41472545                -0.41472545
f_0[n] =    0.7973934       f_1[n] =   0.7973934        (35)
            0.41472545                -0.41472545
           -0.073386624               -0.073386624
           -0.060944743                0.060944743
            0.02807382                 0.02807382

where f_0[n] is the low-pass filter and f_1[n] is the high-pass filter.
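As a quick numerical check (a sketch, not part of the original text), one can evaluate constraint (34) for this 9-tap filter on a frequency grid; since it is a near-perfect-reconstruction QMF, |H(ω)|² + |H(ω + π)|² should be close to, but not exactly, 2:

import numpy as np

f0 = np.array([0.02807382, -0.060944743, -0.073386624, 0.41472545,
               0.7973934, 0.41472545, -0.073386624, -0.060944743,
               0.02807382])

w = np.linspace(0.0, np.pi, 512)
taps = np.arange(f0.size) - 4                    # center the 9-tap filter
H = np.exp(-1j * np.outer(w, taps)) @ f0         # DTFT H(w) on the grid
H_pi = np.exp(-1j * np.outer(w + np.pi, taps)) @ f0
print("max deviation of |H(w)|^2 + |H(w+pi)|^2 from 2:",
      np.abs(np.abs(H) ** 2 + np.abs(H_pi) ** 2 - 2.0).max())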

Once filters have been designed so that the overall system response is unity, the filter bank
may be cascaded to form multiple-band systems. An example of non-uniform or "pyramid"
cascading is illustrated in Figure 6. Such a pyramid cascade produces an octave-width sub-
band decomposition, as illustrated in the idealized frequency diagram in Figure 7. This cascaded
decomposition section performs the transformation from the original coordinate system to the wavelet
coordinate system. The associated cascaded reconstruction section performs the transformation
from the wavelet coordinate system to the original coordinate system.
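A minimal sketch of such a pyramid cascade (assuming periodic boundary handling and an input length that is a power of two, neither of which is specified above): the two-band split is applied recursively to the lowpass branch, yielding one highpass band per octave plus a final lowpass residue.

import numpy as np

def pyramid_analyze(x, f0, f1, levels):
    # Recursive two-band split: filter by circular convolution (computed
    # via the FFT), subsample by 2, and recurse on the lowpass branch only.
    bands = []
    for _ in range(levels):
        X = np.fft.fft(x)
        low = np.real(np.fft.ifft(X * np.fft.fft(f0, x.size)))
        high = np.real(np.fft.ifft(X * np.fft.fft(f1, x.size)))
        bands.append(high[::2])          # decimated octave band
        x = low[::2]                     # recurse on decimated lowpass
    bands.append(x)                      # final lowpass sequence
    return bands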

Most formulations of the wavelet transform in two or more dimensions have involved separable
filters. A two-dimensional example is illustrated in Figure 8: the frequency spectrum is split
into low-pass, horizontal high-pass, vertical high-pass, and diagonal high-pass sub-bands. The
diagonal band contains the sum of the ±45 degree diagonal orientations. In two dimensions the
filters are constructed by Cartesian product of the one-dimensional high- and low-pass filters f_0
and f_1. The four two-dimensional filters are therefore f_0[x] ⊗ f_0[y] (the low-pass filter), f_0[x] ⊗ f_1[y]
(the vertical high-pass filter), f_1[x] ⊗ f_0[y] (the horizontal high-pass filter), and f_1[x] ⊗ f_1[y] (the
diagonal high-pass filter).
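In code, the four separable filters are simple outer products of the one-dimensional pair (a sketch; rows index y and columns index x, and f_1 is obtained from f_0 by the sign alternation visible in Eq. (35)):

import numpy as np

f0 = np.array([0.02807382, -0.060944743, -0.073386624, 0.41472545,
               0.7973934, 0.41472545, -0.073386624, -0.060944743,
               0.02807382])
f1 = f0 * (-1.0) ** np.arange(f0.size)    # sign-alternated f0 (highpass)

lowpass       = np.outer(f0, f0)          # f0[x] (x) f0[y]
vertical_hp   = np.outer(f1, f0)          # f0[x] (x) f1[y]
horizontal_hp = np.outer(f0, f1)          # f1[x] (x) f0[y]
diagonal_hp   = np.outer(f1, f1)          # f1[x] (x) f1[y]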

Acknowledgement. This research was made possible in part by the Rome Air Development
Center (RADC) of the Air Force Systems Command and the Defense Advanced Research Projects
Agency (DARPA) under contract No. F30602-89-C-0022.


Figure 8: Idealized diagram of the partition of the frequency plane resulting from a four-level
pyramid cascade of separable wavelet filters. The top plot represents the frequency spectrum of
the original image. This is divided into four sub-bands at the next level. On each subsequent
level, the lowpass sub-band (outlined in bold) is sub-divided further.

References
[1] Adelson, E. H., Simoncelli, E., and Hingorani, R. (1987) Orthogonal pyramid transforms for image coding. In Proceedings of SPIE, October 1987.

[2] Albert, B., Beylkin, G., Coifman, R., and Rokhlin, V. (1990) Wavelets for the Fast Solution of Second-Kind Integral Equations. Yale Research Report DCS-RR-837, December 1990.

[3] Bathe, K.-J. (1982) Finite Element Procedures in Engineering Analysis. Prentice-Hall, 1982.

[4] Daubechies, I. (1988) Orthonormal Bases of Compactly Supported Wavelets. Communications on Pure and Applied Mathematics, XLI:909-996, 1988.

[5] Grossmann, A. and Morlet, J. (1984) Decomposition of Hardy functions into square integrable wavelets of constant shape. SIAM J. Math. Anal., 15:723-736, 1984.

[6] Mallat, S. G. (1989) A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans. PAMI, 11(7):674-693, 1989.

[7] Meyer, Y. (1986) Principe d'incertitude, bases hilbertiennes et algèbres d'opérateurs. Bourbaki Seminar, No. 662, 1985-1986.

[8] Pentland, A. and Williams, J. (1989) Good Vibrations: Modal Dynamics for Graphics and Animation. Computer Graphics, 23(4):215-222, 1989.

[9] Pentland, A. (1990) Automatic Extraction of Deformable Part Models. International Journal of Computer Vision, 4:107-126, 1990.

[10] Poggio, T., Torre, V., and Koch, C. (1985) Computational vision and regularization theory. Nature, 317:314-319, Sept. 26, 1985.

[11] Segerlind, L. J. (1984) Applied Finite Element Analysis. John Wiley and Sons, 1984.

[12] Simoncelli, E. and Adelson, E. (1990) Non-Separable Extensions of Quadrature Mirror Filters to Multiple Dimensions. Proceedings of the IEEE, 78(4):652-664, April 1990.

Alex Paul Pentland received his Ph.D. from the Massachusetts Institute of Technology in 1982, and began work at SRI International's Artificial Intelligence Center. He was appointed Industrial Lecturer in Stanford University's Computer Science department in 1983, winning the Distinguished Lecturer award in 1986. In 1987 he was appointed Associate Professor of Computer, Information, and Design Technology at M.I.T.'s Media Laboratory, and was appointed Associate Professor in the M.I.T. Civil Engineering department in 1988. In 1988 he was awarded the NEC Computer and Communications Career Development Chair. He has published over 120 scientific articles in the fields of artificial intelligence, machine vision, design, and computer graphics. In 1984 he won the Best Paper prize from the American Association for Artificial Intelligence for his research on problems of texture and shape description. His last book was entitled "From Pixels to Predicates," published by Ablex (Norwood, NJ), and he is currently working on a new book entitled "Dynamic Models for Vision," to be published by Bradford Books, M.I.T. Press.

Address: Room E15-387, The Media Laboratory, M.I.T., 20 Ames St., Cambridge, MA 02139
Dynamic 3D Illustrations with Visibility Constraints
Steven K. Feiner and Doree Duncan Seligmann

Abstract

Illustrations are pictures that are intended to convey specific information to their viewers; dynamic
illustrations adjust their design in response to user interaction. We have developed a set of techniques
that make possible dynamic illustrations that maintain a set of visibility constraints as a user modifies
the viewing specification. The visibility constraints that we support allow objects to be specified as
unoccludable; these constraints are maintained by having the system automatically identify and render
obscuring objects using transparency and cutaway effects modeled after those exploited by technical
illustrators. As the user navigates through an illustration, the system updates visibility changes
smoothly to avoid visual discontinuities. We discuss several approaches that exploit modern
z-buffer-based 3D graphics hardware to make possible near real-time performance. These techniques
have been implemented as part of the IBIS Intent-Based Illustration System, a research testbed for the
automated design and rendering of technical illustrations.

Keywords: 3D graphics, knowledge-based graphics, automated picture generation, image synthesis,
technical illustration

1 Introduction
Computer graphics images are typically created by explicitly specifying all visual properties of the
objects to be depicted. Then, information about lighting and viewing specifications is provided to a
graphics system that renders one or more pictures, using scan-conversion and visible-surface
algorithms that model the objects' geometry, and shading algorithms that approximate the interactions
of objects with light. In contrast, technical illustrations are designed to communicate specific
information about the objects being depicted. Although technical illustrators usually have access to
these objects (or to photographs of them), they rarely strive for photorealism. Instead, illustrators
selectively use a variety of techniques to highlight, subdue, suppress, and otherwise modify the
appearance of objects, to best convey the information that they are attempting to communicate
[Thomas 68, Martin 89].

While some modern computer graphics drawing and paint systems make it possible for an illustrator to
produce many effects more easily than with conventional media, the illustrator must still design the
illustration, and specify each effect manually. The dependence of this process on a human technical
illustrator means that interactive 3D graphics capabilities cannot be fully exploited: the human time
needed to design an image is often far greater than the computer time required to render it, and the
illustrator's techniques rarely correspond directly to the simple rendering models provided. As well,
the illustrator may not be available when the illustration is to be designed. Furthermore, even if an
illustrator carefully prepares an illustration or an entire animation, the result shares one important
disadvantage with conventional material: the viewer cannot interact with it in any way that the


illustrator did not intend. For example, suppose that the user would like to examine an object depicted
in the illustration from another viewpoint in order to better understand it. If the illustration has been
designed to communicate information about certain objects and their properties, then even the most
basic graphical interaction, such as changing the viewpoint, may cause important objects to be
obscured or may make them difficult to recognize.

We have developed a set of techniques for supporting user-controlled navigation of 3D dynamic


illustrations. Our dynamic illustrations are incrementally redesigned to maintain a set of visibility
constraints automatically as the view changes, ensuring that selected objects remain visible. We use
simplified versions of the cutaway and ghosting effects employed by technical illustrators. These
visibility techniques are implemented in the IBIS Intent-Based Illustration System [Seligmann and
Feiner 89], which is used as part of the COMET knowledge-based multimedia explanation generation
system [Feiner and McKeown 90a, Feiner and McKeown 90b, Elhadad et al. 89]. IBIS designs
illustrations based on an input set of communicative goals that specify what its illustrations are
intended to accomplish. It uses a generate-and-test approach: given an initial set of input goals, IBIS's
rule-based control component builds and evaluates a representation of the illustration. Rules specify
methods for accomplishing and evaluating each kind of goal. IBIS can detect goal conflicts and can
backtrack to select alternative methods. Although IBIS automatically determines all the constraints
that must be maintained in a dynamic illustration, the methods described here would also be useful in
applications in which a user exercises more direct design control.

2 Related Work
In previous work, we addressed the fully automated design of static illustrations and animated
presentations of 3D scenes, including the choice of objects to depict, lighting and viewing
specifications, rendering style, and screen layout [Feiner 85, Seligmann and Feiner 89, Karp and
Feiner 90]. We use rule-based systems to determine the relative importance of actions, objects, and
properties to be depicted, and to select effective methods for expressing and evaluating them
graphically. An earlier version of IBIS used a simple ray-casting technique for evaluating visibility
constraints [Feiner and McKeown 90a], but the tradeoff between accuracy and speed made it
inappropriate for dynamic illustration.

The earliest image-synthesis research that modeled techniques used by technical illustrators beyond
those of geometric projections and basic visible line and surface algorithms includes Appel, Rohlf and
Stein's haloed line rendering approach [Appel, Rohlf, and Stein 79], and Kamada and Kawai's work
on GRIP [Kamada and Kawai 87, Kamada and Kawai 88]. The GRIP graphics system is particularly
noteworthy because it allows users to define attributes for parts of each point, line, or surface, that
depend both on the object of which it is a part, and on the surfaces by which it is hidden. Recent work
by Saito and Takahashi [Saito and Takahashi 90] uses image-processing techniques to create pictures
that have enhanced outlines and curved cross-hatching effects. Dooley and Cohen [Dooley and Cohen
90a, Dooley and Cohen 90b] create pictures by combining images produced using modified visible-
curve and ray-tracing algorithms. Their system determines the visual characteristics of parts of curves
and surfaces based on the surfaces that hide these individual parts, and on user-assigned surface
importance values and drawing rules.

In contrast to these other approaches, the techniques we have developed were designed to use the
firmware and hardware rendering support provided by modern 3D z-buffer-based graphics systems.
For simplicity, we use polyhedral objects. By trading image quality for speed when appropriate, we
have made it possible for users to navigate in our illustrations while the system maintains visibility
(and other) constraints automatically.
527

3 Maintaining Visibility Constraints


Our techniques rely on the designation of selected objects as unoccludable, indicating that they should
not be occluded by other objects in an illustration. (While the determination of which objects should
be unoccludable could be done by a human illustrator, in our system it is accomplished by a rule-based
illustration system [Seligmann and Feiner 89].) Each unoccludability designation defines a visibility
constraint that our illustration generation system must attempt to maintain. One approach to
maintaining visibility constraints is to select a viewing specification that ensures that designated
objects are fully visible. If many objects are being depicted, or if there are other constraints on the
illustration, such as a viewing constraint imposed by a user-selected viewing specification, one or
more additional objects may obscure the unoccludable objects from each possible view. In these
situations, technical illustrators use a number of techniques that our system can emulate to render the
parts of those objects obscuring the unoccludable objects. For example, obscuring objects can be
removed from the picture entirely, parts of them can be rendered as semi-transparent (ghosting), or
obscuring objects can be partially cut away to reveal objects behind them (cutaways) [Thomas
68, Martin 89].

Once the set of unoccludable objects has been identified, two steps are necessary to accomplish any of
the abovementioned visibility techniques:
• Classification. Objects must be classified relative to the unoccludable objects,
determining which objects obscure the unoccludable objects. (Since unoccludable objects
may obscure each other, they must be included in the classification process.)
• Rendering. Objects must be rendered so that unoccludable objects, obscuring objects, and
unobscuring objects are each treated appropriately.

Since these visibility techniques are used in an interactive system, each of the two steps must be
accomplished quickly. As well, our system supports smooth, incremental changes between successive
views that have differing visibility relationships, to avoid discontinuous transitions, a well-known
continuity problem in cinematic editing [Karp and Feiner 90].
3.1 Classifying Objects
We have investigated three approaches to the classification problem. The first two, which we briefly
summarize, rely on the fact that determining which objects are obscured by another can be
accomplished by using an analytic algorithm for determining shadows [Chin and Feiner 89].
3.1.1 Viewpoint-Centered Approach
The viewpoint-centered approach involves creating a shadow volume [Crow 77] for each object in the
environment, relative to a light source at the viewpoint. Each object's shadow volume defines the
volume of space within which that object obscures any other object from the light (viewpoint). Each
unoccludable object must be compared with each shadow volume to determine whether the
unoccludable object is blocked by the shadow volume's object, and if so, how much blocking occurs.
(The fraction of an unoccludable object polygon that is blocked by an obscuring polygon can be
computed as the ratio of the area of the pieces of unoccludable polygon that fall within the obscuring
object's shadow volume to the area of the entire unoccludable polygon.) Figure 1 shows in 2D the
relationships of several objects to a shadow volume. We accomplish this task using an algorithm
based on the BSP-tree Boolean set operation methods developed by Thibault and Naylor [Thibault and
Naylor 87]. (For greater efficiency, but decreased accuracy, simplified bounding volumes can be used
to represent each object during classification.)

The viewpoint-centered algorithm must be executed whenever the viewpoint changes. To avoid
processing objects that do not appear in the view, and to eliminate finding visibility relationships


Figure 1: Viewpoint-centered approach (in 2D).


Objects are shown as lines in 2D. Object A, whose shadow volume is shown in gray, is
being tested for whether it obscures unoccludable objects B, C, and D. Object B is not
obscured, object C is fully obscured, and object D is partially obscured, because they are,
respectively, fully outside, fully inside, and partially inside object A's shadow volume.

between clipped parts of viewable objects, shadow volumes can be clipped to the view volume.
3.1.2 Object-Centered Approach
Our second method derives from the observation that the point light source shadow-volume algorithm
described by Chin and Feiner [Chin and Feiner 89] constructs a BSP tree that partitions space into
volumes that are either obscured or not obscured from a point light source. We can position the light
source at the center of an unoccludable object, and modify the algorithm to determine for each
partition the names of all objects that block that partition from the light source. Since the original
algorithm is optimized to avoid subdividing further any partition that is already in shadow, the
algorithm must be changed so that partitions are fully subdivided, with each representing the complete
set of objects shadowing it. If the objects remain stationary relative to each other, then the tree that is
calculated for a light source position located in a given unoccludable object is valid for all viewpoints.
For any given viewpoint, we can find the set of objects obscuring the point light source from the
viewpoint by performing a quick descent of the tree to determine the partition in which the viewpoint
resides. Figure 2 shows the partitions created by this object-centered approach.

Although this algorithm is efficient, it is unfortunately incorrect: visibility from the viewpoint is
computed to only a single point on an unoccludable object (the point light source). A correct version
of the algorithm requires a polyhedral light source shadow-volume algorithm. Considering the
polyhedral unoccludable object as the light source, this algorithm would partition space into volumes.
Each partition is associated with a set of objects that either fully or partially obscure it from the light
source. A partition is in the umbra of an object that fully obscures it and in the penumbra of an object
that partially obscures it. Figure 3 shows a 2D version of the partitions created by a polyhedral object-
centered approach. By classifying the viewpoint relative to the tree, we can determine the viewpoint's
partition efficiently, and hence those objects that fully and partially obscure the unoccludable object.
As in the viewpoint-centered approach, we must also clip the unoccludable objects, and obscuring
objects to the view volume. This is necessary because an otherwise partially obscuring object might
wholly obscure the visible part of the unoccludable object, whereas an otherwise wholly obscuring
object might not obscure any visible part of the unoccludable object. Multiple unoccludable objects
may be handled by building one tree for each object and classifying the viewpoint relative to each tree.
We can perform this algorithm using a polygonal area light source algorithm [Chin 90, Chin and

Figure 2: Object-centered approach with point light source shadow volume (in 2D).
Objects are shown as lines in 2D. Each partition in the shadow volume is marked with the
lower-case letter names of the objects that obscure it from the center of the unoccludable
object.

Feiner 91] that is executed once for each desired face of the unoccludable object. This also allows
selected (combinations of) faces of an object to be designated unoccludable, rather than treating the
entire object uniformly.


Figure 3: Object-centered approach with polyhedral light source shadow volume (in 2D).
Objects are shown as lines in 2D. Each partition in the shadow volume is marked with the
lower-case letter names of the objects that obscure it from the unoccludable object (shown
as a line). The subscripts p and u next to a name indicate that the partition is in the
penumbra or umbra of that named object.

The object-centered approaches trade off fast execution speed for the preprocessing time spent
building the tree(s). Therefore, these algorithms are well suited to situations in which the set of
unoccludable objects remains unchanged and the objects in the environment do not move relative to
each other.

3.1.3 Z-buffer-Based Approach


To provide an alternative to these approaches that can address dynamic environments, we have
developed a third approach, based on the use of z-buffer hardware. The z-buffer-based approach
requires no preprocessing and determines all objects that obscure an unoccludable object to the same
accuracy with which the illustration will be rendered in the z-buffer. If desired, once each obscuring
object is determined, the viewpoint-centered approach can then be invoked to determine how much of
the unoccludable object the obscuring object blocks. Thus, the z-buffer-based approach can be
thought of as an efficient culling preprocess for the viewpoint-centered approach.

pickVisibleObjects
{
    initialize z-buffer to farthest possible z value

    /* set z-buffer to contain z of closest object at each pixel */
    for each object O in environment {
        render O
    }

    set pick rectangle to desired size

    enable pick mode

    /* determine picked objects */
    for each object O in environment {
        render O    /* only z-buffer compares are done */
        if pickflag set {
            add O to list of picked objects
            reset pickflag
        }
    }

    disable pick mode
}

Figure 4: Conventional z-buffer picking algorithm.

Our technique builds on a 3D picking method that is standard with most z-buffer graphics systems,
and which is shown in Fig. 4. 3D z-buffer-based picking is typically implemented by first rendering
the environment to ensure that all z values are those of the closest object at each pixel. The user then
specifies an arbitrary upright pick rectangle in virtual device coordinates and enables "pick mode,"
which write-disables both the frame buffer and z-buffer, so that they remain unchanged during
subsequent operations. Next, the user redescribes the environment to the graphics system with the
same set of modeling and viewing transformations originally used to render it. In pick mode, each
object is first clipped to the pick rectangle and scan converted, and a flag is set if the object is found to
have any z value that is no farther than the z value at the corresponding pixel in the pick rectangle (i.e.,
if the object would be visible anywhere in the pick rectangle when the picture is complete). If the user
inspects and resets the flag after each object is processed, the set of all objects that are visible within
the pick rectangle may be determined.

As shown in the pseudocode of Fig. 5, we take advantage of this picking support to classify our
objects by first rendering only the unoccludable object into the z-buffer. The pick rectangle is then set
to the unoccludable object's rectangular extent. Next, we modify the contents of the z-buffer by
identifying every z-buffer pixel in the pick rectangle that still has the z value with which it was

classifyObjects
{
    initialize z-buffer to farthest possible z value

    render unoccludable object u

    set pick rectangle to rectangular extent of u

    /* modify z-buffer so only objects that obscure u are picked */
    for each z-buffer pixel p in pick rectangle {
        if p == farthest possible z value
            p = closest possible z value
    }

    enable pick mode

    for each object O in environment (excluding u) {
        render O
        if pickflag set {
            add O to list of obscuring objects
            reset pickflag
        } else
            add O to list of nonobscuring objects
    }

    disable pick mode
}

Figure 5: Z-buffer-based object classification algorithm.

initialized (the farthest possible z value), and setting these pixels to the closest possible z value (see
Fig. 6). Assuming that all object z values are farther than the closest possible z value, any object that


Figure 6: Modifying the z-buffer.


Z-buffer pixels are shown with distance mapped to gray scale. Darker values are closer than
lighter ones. (a) The z-buffer after unoccludable object u has been scan converted. (b) The
z-buffer after each farthest possible z value has been changed to the closest possible z value.

generates a pick thereafter is guaranteed to be an obscuring object, since it must lie within the
unoccludable object's silhouette.

Pick mode is then enabled and all other objects are described to the graphics system. Each object that
intersects the pick rectangle and which has at least one pixel that is no farther than the corresponding
pixel in the pick rectangle is added to the list of obscuring objects; all other objects are added to the

list of nonobscuring objects. For each obscuring object, we can call the viewpoint-centered algorithm
to determine how much of the unoccludable object is obscured. As described here, this visibility
algorithm operates at the exact pixel accuracy with which the illustration is rendered. To increase its
speed at the expense of accuracy, the visibility algorithm can instead be performed with a smaller
viewport than is used to render the final illustration.
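To make the data flow concrete, the following Python sketch models this classification in software (a simplification, not IBIS's implementation): render(obj) is a hypothetical callback returning a per-pixel z image with np.inf wherever the object covers nothing, and the pick rectangle is approximated by the whole viewport, which is equivalent here because every pixel outside u's silhouette is set to the closest possible z and so can never generate a pick.

import numpy as np

def classify_objects(u, others, render):
    # render(obj) -> (H, W) array of z values, np.inf where the object
    # covers no pixel; smaller z is closer. 'render' stands in for the
    # hardware renderer and is an assumption of this sketch.
    zbuf = render(u).copy()              # z-buffer after rendering only u
    zbuf[np.isinf(zbuf)] = -np.inf       # outside u's silhouette: closest z
    obscuring, nonobscuring = [], []
    for obj in others:
        z = render(obj)                  # z-buffer compares only
        if np.any(z <= zbuf):            # reaches a pixel no farther than u
            obscuring.append(obj)
        else:
            nonobscuring.append(obj)
    return obscuring, nonobscuring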
3.2 Rendering Objects
Once obscuring and nonobscuring objects have been identified, we have to determine how to render
them to maintain the visibility constraints. We have implemented several approaches, which we
describe first for the case in which there is only one unoccludable object. In each approach, the
frame buffer and z-buffer are first cleared and we render in its usual style any object that has not been
identified as obscuring an unoccludable object.

Figure 7: Radio with no visibility constraints.

Figure 7 shows one of the models that we are using, a radio receiver-transmitter, with no visibility
constraints being obeyed. The remaining figures show a variety of ways to satisfy a visibility
constraint to reveal the "holding battery," which maintains the radio's volatile memory.
3.2.1 Obscuring-Object Removal
The first, and simplest, approach is not to render any obscuring object. Although this approach makes
sense in situations in which the obscuring object is of no importance to what is being illustrated, it has
a number of potential problems otherwise: the context that the obscuring object would provide is lost,
any object attached to or supported by a missing object may not appear correct [Feiner 85], and it
makes little sense to remove an obscuring object if it is also an unoccludable object.

3.2.2 Obscuring-Object Transparency


The second approach is to render obscuring objects as transparent, as shown in Fig. 8. We currently
accomplish this using screen-door transparency, provided in firmware on most z-buffer-based
graphics systems. Screen-door transparency renders a subset of the pixels in an object's projection,
corresponding to the 1 bits in a rectangular transparency bitmask that is replicated over the display.
(An object that is already partially transparent may be made more transparent by careful composition
of bitmasks.) Pseudocode is shown in Fig. 9. Because objects rendered using screen-door
transparency are rendered correctly by a z-buffer algorithm even if they are interleaved with other
opaque and transparent objects, all objects can be rendered in their original order. Note that if no
unoccludable object obscures another, and we decide to render all unoccludable objects in their regular
style, we can avoid clearing the buffers and re-rendering the unoccludable objects.
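A sketch of one such replicated bitmask (assuming a 4x4 ordered-dither pattern; the actual firmware masks are not specified here): level selects how many of the 16 cells per tile are 1 bits, so a series of increasing levels also yields the gradual fade described in Section 4.

import numpy as np

BAYER4 = np.array([[ 0,  8,  2, 10],
                   [12,  4, 14,  6],
                   [ 3, 11,  1,  9],
                   [15,  7, 13,  5]])

def screen_door(shape, level):
    # Boolean write-enable mask: 'level' of the 16 cells in each 4x4 tile
    # are 1 bits (level 0 = fully transparent, level 16 = fully opaque).
    tile = BAYER4 < level
    reps = (-(-shape[0] // 4), -(-shape[1] // 4))   # ceiling division
    return np.tile(tile, reps)[:shape[0], :shape[1]]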

Figure 8: Radio with objects occluding battery rendered as transparent.

One problem with the obscuring-object transparency approach, apparent in Fig. 8, is that parts of the
obscuring object that do not block the unoccludable object are also rendered transparent. This can
cause otherwise hidden, but unimportant objects to be displayed, or, if these objects have already been
suppressed, may misleadingly imply that they do not exist [Feiner 85]. The next approach helps
correct this problem by isolating the effect to the area surrounding an unoccludable object.
3.2.3 Obscuring-Object Cutaway
The third approach that we have implemented creates a cutaway view by rendering obscuring objects
so that pieces are removed through which the entirety of the unoccludable object is seen. An example
is shown in Fig. 10. We do this by rendering the nonobscuring objects first, as shown in the
pseudocode of Fig. 11. Then, we disable the frame buffer and render into the z-buffer alone an
arbitrary cutaway shape whose z values are closer than those of any object. We call this the cutaway
mask. (Optionally, we can also render the mask's outline into the frame buffer, as was done in Fig.

renderWithTransparency
{
    classifyObjects()

    initialize frame buffer and z-buffer

    render u

    for each object O in list of nonobscuring objects {
        render O
    }

    set transparency mask

    for each object O in list of obscuring objects {
        render O
    }
}

Figure 9: Algorithm for rendering obscuring objects with transparency.

Figure 10: Radio with objects occluding battery rendered with a clear cutaway.

10.) The cutaway mask should enclose the unoccludable object's projection, as does the jagged
polygon used in Fig. 10.

Next, we enable the frame buffer and render all obscuring objects. Only the projected parts of these
objects that lie outside the cutaway mask will be drawn, allowing the unoccludable object and
nonobscuring objects to show through. (Note that no analytic clipping to the cutaway is required.)

One useful variation is to render the cutaway mask using screen-door transparency, as demonstrated in
Fig. 12. Only pixels of the cutaway mask that are rendered will block the pixels of the obscuring
objects that fall within the cutaway mask. Thus, parts of the obscuring objects that project within the

renderWithMask
{
    classifyObjects()

    clear frame buffer and z-buffer

    render u

    for each object O in list of nonobscuring objects {
        render O
    }

    if line drawings of obscuring objects desired {
        for each object O in list of obscuring objects {
            render O in wireframe mode
        }
    }

    write-disable frame buffer

    create geometry for cutaway mask that bounds u

    render cutaway mask using closest possible z value

    write-enable frame buffer

    if cutaway mask outline desired {
        render outline of cutaway mask
    }

    for each object O in list of obscuring objects {
        render O
    }
}

Figure 11: Algorithm for rendering obscuring objects with a cutaway mask.

cutaway mask will appear as if they were rendered with the complement of the screen-door
transparency bitmask used for the cutaway. (There is an interesting problem with this approach:
because all obscuring objects use the mask's screen, only the closest obscuring object will be visible
when multiple obscuring objects overlap.)

Instead of creating a "hard-edged" cutaway mask, we can also create one that fades from totally
transparent at the center to totally opaque at its edges, producing an airbrushed effect. For example,
we can accomplish this by rendering into the z-buffer a single mask consisting of a series of
successively smaller concentric filled ellipses, each with a screen-door bitmask with more 1 bits than
the previous one. Figure 13 shows an example of this effect.
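One way to build such a feathered mask in software (a sketch; the elliptical falloff and dither pattern are illustrative assumptions): True marks pixels where the mask would be written into the z-buffer, i.e., where obscuring objects are suppressed, with density falling from 1 at the center to 0 at the rim.

import numpy as np

def feathered_mask(shape, center, radii):
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    # Normalized elliptical distance: 0 at the center, 1 at the rim.
    r = np.hypot((yy - center[0]) / radii[0], (xx - center[1]) / radii[1])
    bayer = np.array([[ 0,  8,  2, 10], [12,  4, 14,  6],
                      [ 3, 11,  1,  9], [15,  7, 13,  5]])
    dither = np.tile(bayer, (-(-h // 4), -(-w // 4)))[:h, :w]
    # Blocked fraction grows toward the center, approximating the
    # airbrushed fade of Figure 13 with concentric dithered rings.
    density = np.clip(1.0 - r, 0.0, 1.0) * 16.0
    return dither < density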

As this figure illustrates, a smooth-edged mask can be confusing: it is not clear whether the
unoccludable object is in front of or behind the objects that surround it. A simple variation on this
rendering method addresses this problem by allowing obscuring objects to be shown as line drawings
in those places at which they intersect the mask. Figures 14-18 recapitulate the steps in rendering the
entire obscuring-object cutaway effect with line drawing.¹ First, the unoccludable and nonobscuring
objects are rendered (Fig. 14). Next, all obscuring objects are rendered in wireframe (Fig. 15).
(Ideally, a line-drawing algorithm should be used that includes only those lines needed to
communicate the shape of the objects [Dooley and Cohen 90a].) The cutaway mask is then rendered
into the z-buffer alone (Fig. 16). (The soft-edged elliptical mask is shown here in red for tutorial
purposes, although it is not visible in the frame buffer.) Next, the obscuring objects are rendered, as

¹None of these intermediate images actually appears, since we are using a double-buffered graphics system.

Figure 12: Radio with objects occluding battery rendered with a semi-transparent cutaway.

Figure 13: Radio with objects occluding battery rendered through a feathered cutaway.

shown in Fig. 17, which still shows the mask in red. The obscuring objects appear in their regular
style outside the mask, and smoothly change to wireframe inside the mask, which also reveals the

Figure 14: Cutaway effect: Unoccludable and nonobscuring objects.

Figure 15: Cutaway effect: Obscuring objects as line drawings.

unoccludable objects. (The wireframe objects are not visible outside the mask because they are
rendered with the same material definitions and lighting models.) Figure 18 shows the completed

Figure 16: Cutaway effect: Elliptical mask (shown in red).

Figure 17: Cutaway effect: Elliptical mask (shown in red) with obscuring objects.

illustration. For comparison, Fig. 19 shows a hard-edged cutaway view combined with the wireframe
effect.

Figure 18: Cutaway effect: Completed illustration.

Figure 19: Hard-edged cutaway with line drawings of obscuring objects.

Multiple unoccludable objects can be handled during rendering by drawing all objects that obscure no
unoccludable objects first; next drawing obscuring object outlines (if needed) and cutaway masks (if

needed), and then rendering all obscuring objects. An example is shown in Fig. 20, in which a blue
cube and a yellow cube in a regular grid of cubes have been designated unoccludable. (An IBIS
illustration that was designed to show the location of these two cubes would typically also use
highlighting to provide emphasis.)

Figure 20: Multiple unoccludable objects.

Unoccludable objects that themselves obscure other unoccludable objects pose a problem. If rendered
with the other unoccludable objects, they can obscure them; if rendered with the obscuring objects,
they may be difficult to see. IBIS can detect some of these cases and automatically design and render
a composite illustration [Seligmann and Feiner 89] that includes inset subpictures that use different
viewing specifications to ensure that all the visibility constraints of all unoccludable objects are
satisfied.
4 Support for Interaction
Since each illustration is part of an interactive sequence, we would like to ensure that there are smooth
transitions between each frame to avoid jarring visual discontinuities. When the obscuring object
transparency method is selected, we use a series of screen-door bitmasks that cause the obscuring
objects to fade in and out gradually. A timeout value ensures that even if the user pauses while
changing the view, objects continue to fade in and out to their correct values.

A series of screen-door bitmasks is also used to increase and decrease a cutaway's mask transparency
when the cutaway appears or disappears. As well, the cutaway mask can be opened and closed by
using a succession of star-shaped polygonal masks that increase and decrease in size, scaling about a
point in the mask's kernel.

5 Implementation
IBIS is written in C++ and the CLIPS production system language [Culbert 88]. It runs under HP-UX
on an HP 9000 375 TurboSRX graphics workstation, which provides hardware support for real-time
3D shaded graphics. The illustrations in this paper take approximately three seconds each to render in
our current testbed (and about one second each when rendered without visibility constraints).
Visibility was determined using the z-buffer-based approach.
6 Future Work
Each of IBIS's constraints, including the visibility constraints, is associated with a success threshold
that IBIS uses to determine whether a goal has been satisfactorily achieved; if this is not the case, IBIS
will attempt to further modify the illustration. The dynamic illustration facility discussed here
currently does not take success thresholds into account and thus treats unoccludable objects uniformly.
We are modifying IBIS's illustration strategy so that if an unoccludable object is partially blocked, and
its threshold is low, it will be able to leave the obscuring object unchanged.

Although we feel that the techniques described here can produce effective illustrations, they are
careful compromises between utility and speed. Therefore, we have begun to investigate the use of an
efficient solid modeling system [Naylor 90] to do true 3D cutaways, which excavate arbitrarily shaped
pieces from obscuring objects. As graphics hardware increases rapidly in both speed and capability,
we will also be incorporating functionality such as alpha blending, modeling clipping planes, and
capping into IBIS's repertoire.

Acknowledgments
This work is supported in part by the Defense Advanced Research Projects Agency under Contract
N00039-84-C-0165 and by a grant from the Hewlett-Packard Company. Norman Chin developed the
efficient procedures that we used to manipulate shadow volumes. Esther Woo implemented major
portions of IBIS's display-list navigation facility. John Edmark, Garry Johnson, and Alan Waxman
implemented portions of the original IBIS.

References
[Appel, Rohlf, and Stein 79]
Appel, A., Rohlf, F., and Stein, A. The Haloed Line Effect for Hidden Line Elimination. In Proc. ACM SIGGRAPH 79 (Computer Graphics, 13:2, August 1979), pages 99-106. Chicago, IL, August 8-10, 1979.
[Chin 90]
Chin, N. Near Real-Time Object-Precision Shadow Generation Using BSP Trees. M.S. thesis, Columbia University, Department of Computer Science, 1990.
[Chin and Feiner 89]
Chin, N. and Feiner, S. Near Real-Time Shadow Generation Using BSP Trees. In Proc. ACM SIGGRAPH 89 (Computer Graphics, 23:3, July 1989), pages 99-106. Boston, MA, July 31-August 4, 1989.
[Chin and Feiner 91]
Chin, N. and Feiner, S. Object-Precision Shadow Generation for Area Light Sources Using BSP Trees. 1991. Submitted.
[Crow 77]
Crow, F. Shadow Algorithms for Computer Graphics. In Proc. ACM SIGGRAPH 77 (Computer Graphics, 11:3, July 1977), pages 242-248. San Jose, CA, July 20-22, 1977.
[Culbert 88]
Culbert, C. CLIPS Reference Manual. NASA/Johnson Space Center, TX, 1988.
[Dooley and Cohen 90a]
Dooley, D. and Cohen, M. Automatic Illustration of 3D Geometric Models: Lines. In Proc. 1990 Symp. on Interactive 3D Graphics (Computer Graphics, 24:2, March 1990), pages 77-82. Snowbird, UT, March 25-28, 1990.
[Dooley and Cohen 90b]
Dooley, D. and Cohen, M. Automatic Illustration of 3D Geometric Models: Surfaces. In Proc. Visualization '90, pages 307-314. San Francisco, CA, October 23-26, 1990.
[Elhadad et al. 89]
Elhadad, M., Seligmann, D., Feiner, S., and McKeown, K. A Common Intention Description Language for Interactive Multi-media Systems. In A New Generation of Intelligent Interfaces: Proceedings of IJCAI-89 Workshop on Intelligent Interfaces, pages 46-52. Detroit, MI, August 22, 1989.
[Feiner 85]
Feiner, S. APEX: An Experiment in the Automated Creation of Pictorial Explanations. IEEE Computer Graphics and Applications, 5:11:29-38, November 1985.
[Feiner and McKeown 90a]
Feiner, S. and McKeown, K. Generating Coordinated Multimedia Explanations. In Proc. CAIA-90 (6th IEEE Conf. on Artificial Intelligence Applications), pages 290-296. Santa Barbara, CA, March 5-9, 1990.
[Feiner and McKeown 90b]
Feiner, S. and McKeown, K. Coordinating Text and Graphics in Explanation Generation. In Proc. AAAI-90, pages 442-449. Boston, MA, July 29-August 3, 1990.
[Kamada and Kawai 87]
Kamada, T. and Kawai, S. An Enhanced Treatment of Hidden Lines. ACM Trans. on Graphics, 6(4):308-323, October 1987.
[Kamada and Kawai 88]
Kamada, T. and Kawai, S. Advanced Graphics for Visualization of Shielding Relations. Computer Vision, Graphics, and Image Processing, 43(3):294-312, September 1988.
[Karp and Feiner 90]
Karp, P. and Feiner, S. Issues in the Automated Generation of Animated Presentations. In Proc. Graphics Interface '90, pages 39-48. Halifax, Nova Scotia, May 14-18, 1990.
[Martin 89]
Martin, J. High Tech Illustration. North Light Books, Cincinnati, OH, 1989.
[Naylor 90]
Naylor, B. SCULPT: An Interactive Solid Modeling Tool. In Proc. Graphics Interface '90, pages 138-148. Halifax, Nova Scotia, May 14-18, 1990.
[Saito and Takahashi 90]
Saito, T. and Takahashi, T. Comprehensible Rendering of 3-D Shapes. In Proc. ACM SIGGRAPH 90 (Computer Graphics, 24:4, August 1990), pages 197-206. Dallas, TX, August 6-10, 1990.
[Seligmann and Feiner 89]
Seligmann, D. and Feiner, S. Specifying Composite Illustrations with Communicative Goals. In Proc. UIST 89 (ACM SIGGRAPH Symp. on User Interface Software and Technology), pages 1-9. Williamsburg, VA, November 13-15, 1989.
[Thibault and Naylor 87]
Thibault, W. and Naylor, B. Set Operations on Polyhedra Using Binary Space Partitioning Trees. Computer Graphics, 21:4:153-162, July 1987.
[Thomas 68]
Thomas, T. A. Technical Illustration, 2nd Ed. McGraw-Hill, New York, NY, 1968.

Steven K. Feiner is an Associate Professor of Computer Science at Columbia University. He has a Ph.D. in Computer Science from Brown University. Dr. Feiner's research interests include image synthesis, applications of artificial intelligence to computer graphics, user interfaces, animation, hypermedia, and visualization. Much of his current work is concerned with the development of knowledge-based multimedia user interfaces. Dr. Feiner is on the editorial boards of Electronic Publishing and ACM Transactions on Information Systems. Along with Drs. James Foley, Andries van Dam, and John Hughes, he is coauthor of Computer Graphics: Principles and Practice, 2nd Ed.

Doree Duncan Seligmann is currently a Ph.D. student in Computer Science at Columbia University. Her research interests involve computer graphics illustration, visual languages, modeling artistic techniques, user interfaces, and multimedia communication. Her thesis work concentrates on issues of visual communication, and on the development of the IBIS intent-based illustration system. At AT&T Bell Laboratories, Holmdel, she is part of the Rapport project, a multimedia conferencing system. After completing a degree in Anthropology at Harvard University and a thesis on Irish pubs, Ms. Seligmann spent several years in Paris directing and designing theatrical productions. She has had 14 one-person shows of her paintings.

Address: Department of Computer Science, Columbia University, New York, NY 10027.
Piecewise Linear Approximations of Digitized
Space Curves with Applications
Insung Ihm and Bruce Naylor

Abstract

Generating piecewise linear approximations of digitized or "densely sampled" curves
is an important problem in many areas. Here, we consider how to approximate an arbi-
trary digitized 3-D space curve, made of n + 1 points, with m line segments. We present
an O(n³ log m) time, O(n² log m) space, dynamic programming algorithm which finds
an optimal approximation. We then introduce an iterative heuristic algorithm, based
upon the notions of curve length and spherical image, which quickly computes a good
approximation of a space curve in O(N_iter · n) time and O(n) space. We apply this fast
heuristic algorithm to display space curve segments and implicit surface patches, and
to linearly approximate curved 3D objects, made by rotational sweeping, by binary
space partitioning trees that are well-balanced.

Keywords: Piecewise linear approximation, Digitized space curves, Computational
geometry, Algebraic curves and surfaces, Binary space partitioning tree

1 INTRODUCTION

The piecewise linear approximation of a digitized or densely sampled curve is an important problem in image processing, pattern recognition, geometric modeling, and computer graphics. Digitized curves occur as boundaries of regions or objects. Such curves, usually represented as sequences of points, may be measured by devices such as scanning digitizers, may be generated by evaluating parametric equations of space curves or by tracing intersection curves given by implicit surface equations, or may be obtained from an experiment. For efficient manipulation of digitized curves, they are typically represented in the form of sequences of line segments. While the original curves are made of large sequences of points, their approximations are represented by a small number of line segments that are visually acceptable.

The piecewise linear approximation problem has received much attention, and there exist many approximation algorithms for this problem. Standard line fitting methods such as least squares approximation, Chebycheff approximation, and other nonlinear approximations [Can71, Cd80, Ric64, Mon70] are not well suited for use in situations where fast and interactive response time is required, since these approximation algorithms perform relatively poorly in terms of computational time and space. On the other hand, the literature in related areas contains many heuristic methods that are more direct and efficient even though, in general, they do not find an optimal approximation [Ram72, PH74, RW74, Wil78, SG80, Pav82, WD84, Rob85, Dun86, FWL89]. This problem was also treated more theoretically in the area of computational geometry. Imai & Iri [II86] present an $O(n^3)$ time algorithm for finding the best approximation. The time complexity is reduced to $O(n^2 \log n)$ in [MO88, Tou85]. However, most of these works consider only planar curves as their input data, and little work has addressed space curve approximation. In many applications, a three dimensional (3D) object is designed with a set of boundary curves in 3D space which are represented as a set of equations or as a sequence of points in 3D space. Hence, having a good approximation method for digitized space curves is essential. In [Kd88], one of the few works on 3D space curve approximation, a quintic B-spline is constructed for noisy data, and the length of the Darboux vector, also known as the total curvature, is used as the criterion for segmentation of 3D curves. This method requires construction of quintic B-splines, explicit computation of curvature and torsion, and root solving of polynomials.

In this paper, we consider how to quickly produce a good piecewise linear approximation of a digitized space curve with a small number of line segments. Our algorithm is based upon the notions of curve length and spherical image, which are fundamental concepts in differential geometry [Kre59]. In Section 2, we define some notation and give a mathematical formulation of the specific problem we are dealing with. This approximation problem is naturally reduced to a combinatorial minimax problem which can be restated as "Given some number of points, choose a smaller number of points such that the maximum error of approximation is minimized". In Section 3, an optimal approximation is found in $O(n^3 \log m)$ time and $O(n^2 \log m)$ space. We describe, in Section 4, a fast heuristic iterative algorithm which requires $O(N_{iter} \cdot n)$ time and $O(n)$ space, where $N_{iter}$ is the number of iterations carried out. The performance of the heuristic algorithm for some test cases is also analyzed. In Section 5, we illustrate applications of this fast heuristic algorithm in which space curves and implicit surfaces are adaptively linearized. In Section 6, we also apply the heuristic approximation algorithm to construct adaptive binary space partitioning trees for a class of objects made by revolution. It is shown that the linear approximation of a curve can be naturally extended to linearly approximate some class of curved 3D objects by bsp trees that are well-balanced.

2 PRELIMINARIES

We first define a digitized space curve.

Definition 2.1 Let $C$ be a space curve in three dimensional space. A space curve segment $C(a, b)$ is a connected portion of a curve $C$ with end points $a, b \in \mathbb{R}^3$.

In order to define a curve segment without ambiguity, a tangent vector at $a$ might be needed. But we assume this vector is implicitly given.

Definition 2.2 A digitized space curve segment $C(a, b, n)$ of order $n$ is an ordered sequence $\{a = p_0, p_1, p_2, \ldots, p_n = b\}$ of points $p_i \in \mathbb{R}^3$, $i = 0, 1, \ldots, n$, which approximates $C(a, b)$.

Approximation of a digitized space curve with a small number of line segments results in an approximation error. The quality of approximation is measured in terms of a given error norm that can be defined in many ways. Some of the most commonly used ones are

1. infinity norm: $L_\infty$ = maximum distance between the curve segment and the approximating line segment;

2. 2-norm: $L_2$ = square root of the sum of squared distances between the curve segment and the approximating line segment;

3. area norm: $L_{area}$ = absolute area between the curve segment and the approximating line segment.

In this paper, we use $L_\infty$ as an error norm to measure the goodness of an approximation. Note that our algorithms in the later sections are also compatible with $L_2$.

Definition 2.3 A piecewise linear approximation $LA(C, a, b, m)$ of order $m$ to $C(a, b, n)$ is an increasing sequence $\{0 = q_0, q_1, q_2, \ldots, q_m = n\}$ of indices to points in $C$. An error $E(LA(C, a, b, m))$ of a piecewise linear approximation $LA$ is defined as $\max_{0 \le i \le m-1} Eseg(i)$, where the $i$-th segment error $Eseg(i)$ is $\max_{q_i \le j \le q_{i+1}} dist(p_j, line(p_{q_i}, p_{q_{i+1}}))$, and $dist(x, line(y, z))$ is the Euclidean distance from a point $x$ to a line determined by two points $y$ and $z$.

(Note that, for any point $x \in \mathbb{R}^3$ and two other points $y, z \in \mathbb{R}^3$ ($y \ne z$), $dist(x, line(y, z))$ can be compactly expressed as $\left\| y - x + \frac{\langle x-y,\, z-y \rangle}{\langle z-y,\, z-y \rangle}(z - y) \right\|_2$, where $\langle \cdot, \cdot \rangle$ is the dot product of two vectors and $\| \cdot \|$ is the length of a vector.)
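This distance computation is easy to get wrong in code, so a minimal Python sketch may help; the function name and the use of NumPy are our own illustrative choices, not part of the paper:

import numpy as np

def dist_point_to_line(x, y, z):
    # Euclidean distance from point x to the line through y and z,
    # following the closed-form expression above: project x - y onto
    # z - y and measure the length of the residual.
    x, y, z = map(np.asarray, (x, y, z))
    d = z - y
    t = np.dot(x - y, d) / np.dot(d, d)    # <x-y, z-y> / <z-y, z-y>
    return float(np.linalg.norm(y - x + t * d))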

As pointed out in Pavlidis and Horowitz [PH74], the problem of finding a piecewise linear approximation $LA$ can be expressed in two ways:

1. find an $LA(C, a, b, m)$ such that $E(LA) < \varepsilon$ for a given bound $\varepsilon$, and $m$ is minimized;

2. find an $LA(C, a, b, m)$ that minimizes $E(LA)$ for a given $m$.

In this paper, we focus mainly on the second type of problem. However, we will also briefly discuss the first type of problem in Section 4.3.5.

Definition 2.4 Given $C(a, b, n)$ and an integer $m$ ($n \ge m$), the optimal piecewise linear approximation $LA^*(C, a, b, m)$ of order $m$ is a piecewise linear approximation such that $E(LA^*) \le E(LA)$ for any piecewise linear approximation $LA$ of order $m$. (Note that $LA^*$ is not unique.)

Given these definitions, the problem can be stated as:

Problem 1 Given $C(a, b, n)$ and $m$, find $LA^*(C, a, b, m)$.

3 AN OPTIMAL SOLUTION

3.1 An Algorithm

A naive algorithm would be as follows:

Algorithm 3.1 (NAIVE)

temp = $\infty$;
for all the possible $\binom{n-1}{m-1}$ LA(C, a, b, m) do
    compute E(LA);
    if E(LA) < temp then LA* = LA; temp = E(LA);
endfor

Note that the problem has a recursive nature, that is, it can be naturally divided into two
subproblems of the same type. Dynamic programming, which is a general problem-solving
technique widely used in many disciplines [AHU74], can be applied in this case to produce
a rather straightforward algorithm. We first give an algorithm which works in case m is a
power of 2. Then the algorithm is slightly modified for an arbitrary m.

Define $E_{ij}^{l}$ to be the error of $LA^*(C, p_i, p_j, l)$, that is, the smallest error of all piecewise linear approximations with $l$ segments to the portion of $C$ from $p_i$ to $p_j$. Then $E_{ij}^{2l}$ can be expressed in terms of $E_{ik}^{l}$ and $E_{kj}^{l}$ as follows:

$E_{ij}^{2l} = \min_{i < k < j} \max\{E_{ik}^{l},\, E_{kj}^{l}\}$    (1)

(Note that $E_{ij}^{l} = 0$ if $j - i \le l$.)

The recursive relation renders the following dynamic programming algorithm, which computes the minimum error $E_{0n}^{m}$ and its corresponding $LA^*$:

Algorithm 3.2 (DYNAMIC)

/* basis step */
for i = 0 to n - 1 do
    for j = i + 1 to n do
        compute $E_{ij}^{1}$;
    endfor
endfor
/* inductive step */
for d = 1 to $\log m$ do
    for i = 0 to $n - 2^d - 1$ do
        for j = $i + 2^d + 1$ to n do
            $E_{ij}^{2^d} = \max\{E_{ik^*}^{2^{d-1}},\, E_{k^*j}^{2^{d-1}}\} = \min_{i<k<j} \max\{E_{ik}^{2^{d-1}},\, E_{kj}^{2^{d-1}}\}$;
            $K_{ij}^{d} = k^*$;
        endfor
    endfor
endfor
construct $LA^*$ from $K_{ij}^{d}$;
In the basis step, $E_{ij}^{1}$ is computed by calculating the distances from the points $p_k$, $i < k < j$, to the line passing through $p_i$ and $p_j$, and taking their maximum. $K_{ij}^{d}$ is needed to recursively construct the optimal piecewise linear approximation once $E_{0n}^{m}$ is computed. Note the recursive relation $LA^*(C, p_i, p_j, 2^d) = LA^*(C, p_i, p_{K_{ij}^d}, 2^{d-1}) \cup LA^*(C, p_{K_{ij}^d}, p_j, 2^{d-1})$.
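One way to realize Algorithm DYNAMIC in Python, for $m$ a power of 2, is sketched below; seg_error and optimal_approximation are our own names, dist_point_to_line is reused from the earlier sketch, and this is an unoptimized $O(n^3 \log m)$ reference that assumes $n \ge m$:

import math

def seg_error(points, i, j):
    # E^1_ij: maximum distance of the points strictly between p_i and p_j
    # to the line through p_i and p_j; zero when j - i <= 1.
    return max((dist_point_to_line(points[k], points[i], points[j])
                for k in range(i + 1, j)), default=0.0)

def optimal_approximation(points, m):
    n = len(points) - 1
    # basis step: full table of one-segment errors E^1
    E = [[seg_error(points, i, j) for j in range(n + 1)] for i in range(n + 1)]
    K = {}                                     # K[(d, i, j)] = best split k*
    for d in range(1, int(math.log2(m)) + 1):
        newE = [[0.0] * (n + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            for j in range(i + 2 ** d + 1, n + 1):
                k, err = min(((k, max(E[i][k], E[k][j]))
                              for k in range(i + 1, j)), key=lambda t: t[1])
                newE[i][j], K[(d, i, j)] = err, k
        E = newE
    def unfold(i, j, d):                       # construct LA* from the K table
        if d == 0 or j - i <= 1:
            return [i, j]
        k = K.get((d, i, j), (i + j) // 2)     # any split works when error is 0
        return unfold(i, k, d - 1)[:-1] + unfold(k, j, d - 1)
    return unfold(0, n, int(math.log2(m)))     # indices q_0, ..., q_m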

3.2 The Time and Space Complexities

Since $E_{ij}^{1}$ is computed in $O(j - i)$ time, the basis step requires $O(\sum_{i=0}^{n-1} \sum_{j=i+1}^{n} (j - i)) = O(n^3)$ time. Similarly, $E_{ij}^{2^d}$ can be computed in $O(j - i)$ time, so the inductive step needs $O(n^3 \log m)$ time. Also, construction of $LA^*$ can be done in $O(m)$ time. These three time bounds are combined into $O(n^3 \log m)$.

With regard to space, the algorithm needs $O(n^2)$ space for storing a table for $E_{ij}$. Also, $O(n^2 \log m)$ space is required to save $K_{ij}^{d}$, $d = 1, 2, \ldots, \log m$. Hence, the space complexity is $O(n^2 \log m)$.

3.3 An Algorithm for an Arbitrary m

When $m$ is not a power of 2, we can break $m$ into $m'$ and $m - m'$, where $m'$ is the largest power of 2 less than $m$. $m - m'$ is then broken further if it is not a power of 2. Applying this process repeatedly produces two sequences of numbers, one made of powers of 2 and the other made of non-powers of 2. By maintaining two tables and synchronizing the order of merging operations, $E_{0n}^{m}$ can be computed. It is not difficult to see that this modification only increases both time and space complexities by constant factors.

4 A HEURISTIC SOLUTION
Even though the algorithm DYNAMIC finds an optimal approximation, its time and space requirements are excessive. As stated in Section 4.3.4, the algorithm is extremely slow even for modest $n$, for example, $n = 400$. In practice, it is more desirable to generate a good approximation quickly. In this section, we describe a heuristic algorithm which consists of two parts: computation of an initial approximation, and iterative refinement of the approximation. Our heuristic algorithm is based upon the observation that the error of a segment is a function of the length of the curve segment and the total absolute change of the angles of tangent vectors along the curve segment. Longer curve segments tend to have larger segment errors, and the total angle change is a measure of how much a curve segment is bent. However, as illustrated in the next two subsections, neither measure alone is a good heuristic. Our heuristic in Section 4.3 is a weighted sum of the two measures, and this simple combined measure yields a good initial guess.

4.1 Curve Length Subdivision

Assume we have a parametric representation $C(t)$ of a curve $C$. The first heuristic is to divide a curve segment into subsegments of the same curve length, where the curve length is defined to be $\int_0^1 \left\| \frac{dC(t)}{dt} \right\| dt$. This quantity is usually approximated by the chord length as follows.

Given a digitized curve $C(a, b, n) = \{a = p_0, p_1, \ldots, p_n = b\}$, consider a parametric curve $C(t)$ of a parameter $t$ where $C(0) = p_0$ and $C(1) = p_n$. Then,

$\int_0^1 \left\| \frac{dC(t)}{dt} \right\| dt \;\approx\; \sum_{i=0}^{n-1} \left\| \frac{p_{i+1} - p_i}{d(p_i, p_{i+1})} \right\| d(p_i, p_{i+1}) \;=\; \sum_{i=0}^{n-1} \| p_{i+1} - p_i \| \;=\; \sum_{i=0}^{n-1} d(p_i, p_{i+1})$

where $d(p, q)$ is the Euclidean distance between two points $p$ and $q$ in $\mathbb{R}^3$.

Algorithm 4.1 (LENGTH)

/* let $Lseg(i, j)$ be $\sum_{k=i}^{j-1} d(p_k, p_{k+1})$ */

compute total = $\sum_{k=0}^{n-1} d(p_k, p_{k+1})$;
seglength = ceil(total/m);
$q_0$ = 0; i = 0;
while i < m - 1 do
    find the largest j such that $Lseg(q_i, j)$ < seglength;
    $q_{i+1}$ = j; i = i + 1;
endwhile

Figure 1 (upper left) and Figure 2 (leftmost) indicate that this algorithm produces an LA which approximates C quite well in flat regions of a curve, and poorly in highly curved regions.
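The greedy scan in LENGTH (and in IMAGE and INIT below) can be factored into one routine over prefix sums of a per-edge measure. The following Python sketch uses our own naming and is only one reasonable reading of the pseudocode:

import math
import numpy as np

def greedy_split(edge_measure, m):
    # edge_measure: n nonnegative per-edge weights (chord lengths for
    # LENGTH, tangent turning for IMAGE). Returns {0 = q_0, ..., q_m = n}.
    n = len(edge_measure)
    prefix = np.concatenate(([0.0], np.cumsum(edge_measure)))
    target = math.ceil(prefix[-1] / m)
    q = [0]
    for i in range(m - 1):
        # largest j with accumulated measure since q_i still below target
        j = int(np.searchsorted(prefix, prefix[q[-1]] + target)) - 1
        j = min(max(j, q[-1] + 1), n - (m - 1 - i))   # stay feasible
        q.append(j)
    q.append(n)
    return q

def length_subdivision(points, m):
    # Algorithm LENGTH: the per-edge measure is the chord length.
    chords = np.linalg.norm(np.diff(np.asarray(points, float), axis=0), axis=1)
    return greedy_split(chords, m)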

4.2 Spherical Image Subdivision

Consider a curve $C(s)$ with an arc length parameter $s$ [Kre59, O'N66]. When all unit tangent vectors $T(s)$ of $C(s)$ are moved to the origin, their end points will describe a curve on the unit sphere. This curve is called the spherical image or spherical indicatrix of $C(s)$. Given a curve segment, the length of the corresponding spherical image indicates how much the unit tangent vector changes its direction along the curve segment. Hence, it gives us a measure of the degree to which a curve segment is curved. It is easily shown that the curvature $\kappa(s)$ is equal to the ratio of the arc length of the spherical image and the arc length of $C(s)$. So, the length of the spherical image corresponding to a curve segment $C(s) : [0, l]$ is $\int_0^l \kappa(s)\, ds$. ($\int_0^l \kappa(s)\, ds$ is sometimes called the total curvature [O'N66], while it can also mean the length of the Darboux vector [Kre59].) In practice, the quantity must be approximated.

Figure 1: Folium of Descartes

Figure 2: A cubic curve

Given a digitized curve $C(a, b, n) = \{a = p_0, p_1, \ldots, p_n = b\}$, consider an imaginary parametric curve $C(s)$ of an arc length parameter $s$ where $C(0) = p_0$ and $C(l) = p_n$. At a point $p_i$, $s \approx cl(p_0, p_i)$ such that $C(s) = p_i$, where $cl(p_0, p_i) = \sum_{j=0}^{i-1} d(p_j, p_{j+1})$. Then, the curvature is approximated as follows:

$\kappa(s) = \left\| \lim_{\delta s \to 0} \frac{T(s + \delta s) - T(s)}{\delta s} \right\| \approx \frac{\| t_{i+1} - t_i \|}{d(p_i, p_{i+1})}$    (1)

where $t_i$ is an approximated unit tangent vector. (We will discuss how to get $t_i$ shortly.) Then,

$\int_0^l \kappa(s)\, ds \approx \sum_{i=0}^{n-1} \| t_{i+1} - t_i \| = \sum_{i=0}^{n-1} d(t_i, t_{i+1}).$

The simple forward-difference approximation (1) to $\kappa(s)$ can be replaced by the popular central-difference approximation $\frac{d(t_{i-1}, t_{i+1})}{d(p_{i-1}, p_i) + d(p_i, p_{i+1})}$, which is a much better approximation when the points are close together. The integration can also be replaced by a better approximation formula. See [Cd80] for more numerical techniques.
In this second heuristic method, $C(a, b, n)$ is subdivided into $LA(C, a, b, m) = \{0 = q_0, q_1, q_2, \ldots, q_m = n\}$ such that each subsegment has the same length of the spherical image.

Algorithm 4.2 (IMAGE)

/* let $Iseg(i, j)$ be $\sum_{k=i}^{j-1} d(t_k, t_{k+1})$ */

compute total = $\sum_{k=0}^{n-1} d(t_k, t_{k+1})$;
segind = ceil(total/m);
$q_0$ = 0; i = 0;
while i < m - 1 do
    find the largest j such that $Iseg(q_i, j)$ < segind;
    $q_{i+1}$ = j; i = i + 1;
endwhile

The quantity $Iseg(q_i, q_{i+1})$ is an approximating measure of the length of the spherical image of the segment from $p_{q_i}$ to $p_{q_{i+1}}$; that is, $Iseg(q_i, q_{i+1})$ is the total absolute change of the angles of tangent vectors. Hence, this algorithm is sensitive to high curvature. Figure 1 (upper right) and Figure 2 (the second from left) show that IMAGE returns an LA which approximates C poorly in flat portions of a curve, and very well in highly curved portions.
In the above algorithm, tangent vector information is used to subdivide a curve. If the digitized space curve has been generated from equations, say a parametric equation or two implicit equations, the tangent vector at each sample point can be computed directly from them. When instead a digitized curve has been given in terms of a sequence of points, or direct computation of tangent vectors from given equations is expensive, the tangent vector $t_k$ to a curve $C$ at $p_k$ can still be approximated by averaging the directions of the neighboring lines of $p_k$ in $C$. In our implementation, the tangent vector is approximated from 5 successive points as follows [Pie87]:

$t_i = \frac{\beta v_i + \alpha v_{i+1}}{\| \beta v_i + \alpha v_{i+1} \|}$

where $\alpha = \| v_{i-1} \times v_i \|$, $\beta = \| v_{i+1} \times v_{i+2} \|$, $v_i = p_i - p_{i-1}$, and $\times$ denotes the cross product of two vectors. In case the digitized curve is open, the Bessel conditions are applied for the tangents at the end points as follows [deB78]:

$v_0 = 2v_1 - v_2, \quad v_{-1} = 2v_0 - v_1,$
$v_{n+1} = 2v_n - v_{n-1}, \quad v_{n+2} = 2v_{n+1} - v_n.$
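A Python sketch of this tangent estimation, under the assumption that the five-point weighting shown above is the intended [Pie87] scheme; it needs at least three points and continues the earlier sketches:

import numpy as np

def estimate_tangents(points):
    # Unit tangents t_0..t_n of an open digitized curve from 5 successive
    # points, with Bessel extrapolation for the missing end differences.
    p = np.asarray(points, dtype=float)
    n = len(p) - 1
    v = {i: p[i] - p[i - 1] for i in range(1, n + 1)}    # v_i = p_i - p_{i-1}
    v[0] = 2 * v[1] - v[2]
    v[-1] = 2 * v[0] - v[1]
    v[n + 1] = 2 * v[n] - v[n - 1]
    v[n + 2] = 2 * v[n + 1] - v[n]
    tangents = []
    for i in range(n + 1):
        a = np.linalg.norm(np.cross(v[i - 1], v[i]))      # alpha
        b = np.linalg.norm(np.cross(v[i + 1], v[i + 2]))  # beta
        t = b * v[i] + a * v[i + 1]
        if a + b == 0.0:                # locally straight: average the chords
            t = v[i] + v[i + 1]
        tangents.append(t / np.linalg.norm(t))
    return tangents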

4.3 Heuristic Subdivision

Now, we give a heuristic algorithm which combines the two techniques. It consists of two steps: generation of an initial piecewise linear approximation $LA_0$, and iterative refinement of the piecewise linear approximation $LA_k$ to produce $LA_{k+1}$.

4.3.1 Computation of an Initial Approximation: $LA_0$

An initial $LA_0$ is computed by an algorithm which is a combination of LENGTH and IMAGE. The weight $\alpha$ is a parameter which controls the relative emphasis between curve length and spherical image, and is chosen empirically.

Algorithm 4.3 (INIT)

select some value of $\alpha$ ($0 \le \alpha \le 1$);
compute total = $\sum_{k=0}^{n-1} (\alpha \cdot d(p_k, p_{k+1}) + (1 - \alpha) \cdot d(t_k, t_{k+1}))$;
segsum = ceil(total/m);
$q_0$ = 0; i = 0;
while i < m - 1 do
    find the largest j such that $\alpha \cdot Lseg(q_i, j) + (1 - \alpha) \cdot Iseg(q_i, j)$ < segsum;
    $q_{i+1}$ = j; i = i + 1;
endwhile

See Figure 1 (bottom left) and Figure 2 (the third from left).
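In terms of the greedy_split and estimate_tangents sketches above, INIT is a one-line change of measure (again our own naming, not the authors' code):

import numpy as np

def init_subdivision(points, m, alpha=0.5):
    # Algorithm INIT: blend chord length (LENGTH) with spherical-image
    # length (IMAGE); alpha = 1 recovers LENGTH and alpha = 0 recovers IMAGE.
    p = np.asarray(points, dtype=float)
    t = np.asarray(estimate_tangents(points))
    chords = np.linalg.norm(np.diff(p, axis=0), axis=1)   # d(p_k, p_{k+1})
    turns = np.linalg.norm(np.diff(t, axis=0), axis=1)    # d(t_k, t_{k+1})
    return greedy_split(alpha * chords + (1 - alpha) * turns, m)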

4.3.2 Iterative Refinement of Approximations: $LA_k$

The "hybrid" algorithm INIT generally produces a good piecewise linear approximation. The next step is to diffuse errors iteratively in order to refine the initial approximation. Note that each segment is made of a sequence of consecutive points of a digitized curve, and it is approximated by a line connecting its end points. Usually, the error of a segment decreases as either of its end points is assigned to its neighboring segment. Hence, the basic idea in the following iterative algorithm is to move one of the end points of a segment with larger error to its neighboring segment with less error, expecting a decrease of the total error of the new LA. In the $k$th step of the following algorithm ITER, each segment of $LA_k$ is examined, diffusing, if possible, its error to one of its neighbors. $LA_k$ tends to converge quickly to a minimal LA which is a local minimum. See Figure 1 (bottom right), Figure 2 (rightmost), and Figure 3.

Algorithm 4.4 (ITER)

compute $LA_0$ from INIT;
k = 0;
do until (satisfied)
    compute errors of segments in $LA_k$;
    curmax = $E(LA_k(C, a, b, m))$;
    for i = 0 to m - 1 do
        if the error of the i-th segment is larger than
           that of either of its neighboring segments
        then move the i-th segment's end point to the neighbor
             only if this change does not result in segment errors
             larger than curmax;
        endif
    endfor
    $LA_{k+1} = LA_k$; k = k + 1;
enddo
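A Python sketch of the refinement loop, reusing seg_error from the dynamic-programming sketch; the acceptance rule is our reading of the pseudocode above (a move is kept only when no segment error exceeds the current maximum):

def iter_refine(points, q, max_passes=100):
    # Shift breakpoints from larger-error segments toward their
    # smaller-error neighbors until a local minimum is reached.
    q, m = list(q), len(q) - 1
    for _ in range(max_passes):
        err = [seg_error(points, q[i], q[i + 1]) for i in range(m)]
        curmax, changed = max(err), False
        for i in range(m):
            trial = q[:]
            if i + 1 < m and err[i] > err[i + 1] and q[i + 1] - q[i] > 1:
                trial[i + 1] -= 1          # give a point to segment i+1
            elif i > 0 and err[i] > err[i - 1] and q[i + 1] - q[i] > 1:
                trial[i] += 1              # give a point to segment i-1
            else:
                continue
            new_err = [seg_error(points, trial[j], trial[j + 1])
                       for j in range(m)]
            if max(new_err) <= curmax:     # accept only non-worsening moves
                q, err, changed = trial, new_err, True
        if not changed:                    # local minimum: stop
            break
    return q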

4.3.3 The Time and Space Complexities

First, $O(n)$ time is needed in order to approximate the tangent vector at each point. The algorithm INIT needs to scan the points and tangent vectors to compute Lseg and Iseg first, and then Lseg and Iseg are scanned to divide the digitized curve. Hence, it takes $O(n)$ time. Now, consider the algorithm ITER. First, the segment errors of $LA_k$ are computed in $O(n)$ time. In the for loop, each segment and its two neighbors are examined; hence, each segment is examined twice. Since for each segment the segment error must be computed, $O(n)$ computation is needed by the for loop. So, ITER takes $O(N_{iter} \cdot n)$ time, where $N_{iter}$ is the number of iterations. Thus the time complexity of the heuristic algorithm is $O(N_{iter} \cdot n)$, and it is easy to see that $O(n)$ space is sufficient for storing input data and intermediate computations.

Figure 3: A human profile and a goblet

4.3.4 Performance

We have implemented both the optimal and heuristic algorithms on a Sun 4 workstation and a Personal Iris workstation. Tables 1-5 in Appendix A show their performances for test data. The integer in parentheses is the number of iterations needed to arrive at the local minimum. The bottom row ($LA_k/LA^*$) of each table indicates the performance of our heuristic algorithm, and it is observed that it approximates the optimal solution reasonably well. The program for the heuristic algorithm computes the approximate solution quickly (immediately or in a few seconds, depending on how many iterations are needed). On the other hand, it takes about 45 minutes to compute the optimal solution for the $(n = 404, m = 64)$ example of Table 4.

4.3.5 The Center of Mass

We now briefly consider the following type of piecewise linear approximation problem: "find an $LA(C, a, b, m)$ such that $E(LA) < \varepsilon$ for a given bound $\varepsilon$, and $m$ is minimized." Even though our heuristic algorithm was invented for an arbitrary number of subsegments, we can use it for dividing a segment into 2 subsegments. One simple algorithm would be to recursively divide a curve segment until the error of each subsegment is less than $\varepsilon$.

If a curve segment is to be divided into only two subsegments, the notion of the center of mass can be applied. As before, assume we have a parametric representation $C(s)$ of a curve $C$, where $s$ is an arc length parameter, and $\kappa(s)$ is its curvature. Consider a curve segment defined by an interval $[0, l]$. Then the center of curvature, defined by

$c_\kappa = \int_0^l s\,\kappa(s)\, ds \Big/ \int_0^l \kappa(s)\, ds,$

can be used as a heuristic that divides a curve segment $C(s) : [0, l]$ into two subsegments $C(s) : [0, c_\kappa]$ and $C(s) : [c_\kappa, l]$.

Again, $c_\kappa$ needs to be approximated. For a digitized curve $C(a, b, n) = \{a = p_0, p_1, \ldots, p_n = b\}$, consider an imaginary parametric curve $C(s)$ of an arc length parameter $s$ where $C(0) = p_0$ and $C(l) = p_n$. Then, at a point $p_i$, $s \approx cl(p_0, p_i)$ such that $C(s) = p_i$, where $cl(p_0, p_i) = \sum_{j=0}^{i-1} d(p_j, p_{j+1})$. Together with the approximation of the denominator given before, the following expression results in an approximation of $c_\kappa$:

$\int_0^l s\,\kappa(s)\, ds \approx \sum_{i=0}^{n-1} cl(p_0, p_i) \frac{\| t_{i+1} - t_i \|}{d(p_i, p_{i+1})}\, d(p_i, p_{i+1}) = \sum_{i=0}^{n-1} cl(p_0, p_i) \| t_{i+1} - t_i \| = \sum_{i=0}^{n-1} cl(p_0, p_i)\, d(t_i, t_{i+1}).$
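A short Python sketch of this split-point selection (our own helper name; it returns the index of the digitized point nearest to $c_\kappa$):

import numpy as np

def center_of_curvature_index(points, tangents):
    # Approximate c_kappa = (int s kappa ds) / (int kappa ds) with the
    # discrete sums above and locate the nearest sample point.
    p = np.asarray(points, dtype=float)
    t = np.asarray(tangents, dtype=float)
    cl = np.concatenate(([0.0],
                         np.cumsum(np.linalg.norm(np.diff(p, axis=0), axis=1))))
    turns = np.linalg.norm(np.diff(t, axis=0), axis=1)   # d(t_i, t_{i+1})
    if turns.sum() == 0.0:                               # straight segment
        return len(p) // 2
    c_kappa = float(np.dot(cl[:-1], turns)) / float(turns.sum())
    return int(np.argmin(np.abs(cl - c_kappa)))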

5 APPLICATION I: DISPLAY OF SPACE CURVES AND SURFACES

Curves and surfaces in geometric modeling are represented in either parametric or implicit form. Each form has its own advantages and disadvantages. For instance, implicit curves and surfaces naturally define half spaces, and ray-surface intersections are easily computed, while the parametric form is well suited for generating points along a curve or surface. In order for complex objects, made of curves and surfaces, to be effectively manipulated, it is very desirable to have a fast and adaptive display method. In this section, we consider how a good approximation to a curve or surface can be generated quickly.

5.1 Adaptive Display of Space Curve Segments

Our heuristic algorithm is well suited to producing a piecewise linear approximation of a space curve segment in parametric or implicit form. First, the curve segment is densely sampled, and then the linear approximation algorithm filters the sampled points, producing a good approximation to the curve segment. Points on a parametric curve are easily generated. A curve, represented by two implicit surfaces or by an implicit surface and a parametric surface, can be traced using a surface intersection algorithm (for example, [BHHL88]). The space curve tracing algorithm is very fast when the degrees of the curves are in a reasonable range and there are no singular points along the curve segment. As seen in the examples, only a small number of line segments, adaptively filtered, can approximate a curve segment well, resulting in fast display. Figures 1, 2 and 4 are examples of planar curves, and Figures 5 and 6 are those of nonplanar curves.

5.2 Adaptive Display of Implicit Surface Patches

Algebraic surfaces have become increasingly important as they lend themselves well to some applications in geometric modeling, like the creation of blends and offsets [MS85, Sed85, HH86, War86, RO87, BI88, PK89], and algorithms for displaying algebraic surfaces have emerged.

Hanrahan [Han83] showed that algebraic surfaces lend themselves well to ray tracing. Sederberg and Zundel [SZ89] use a scan line display method which offers improvement in speed and correctly displays singularities. Even though both approaches produce very good images, the computational cost is expensive and the process is static, in the sense that operations on objects, like rotation and translation, cannot be done dynamically. On the other hand, the polygonization-and-shading technique [AG89, Blo88] uses the capability of graphics hardware, which provides very fast rendering. We are currently working on the problem of constructing a complex object made of triangular implicit surface patches, and one difficult problem is how to isolate, numerically, only the necessary part of the triangular patch from the whole surface. The space curve tracing and our heuristic algorithm together can be used to produce adaptive polygonization of the necessary portions of smooth algebraic surface patches. For example, the following simple procedure produces an adaptive polygonization of a triangular algebraic surface patch.

Figure 4: A four-leaved rose

Figure 5: A nonplanar quartic curve

Figure 6: A nonplanar sextic curve

Figure 7: Recursive refinement of a triangle

Let $f(x, y, z) = 0$ be a primary surface whose triangular portion, clipped by three planes $h_i(x, y, z) = 0$, $i = 1, 2, 3$, is to be polygonized. (See Figure 7.) Initially, the triangle $T_0 = (P_0, P_1, P_2)$ is a rough approximation of the surface patch. Each boundary curve determined by $f$ and $h_i$ is traced to produce a digitized space curve, and then its LA of order $2^d$, for some given $d$, is computed. Then $T_0$ is refined into four triangles by introducing the 3 points $Q_0$, $Q_1$, and $Q_2$, where $Q_i$, $i = 0, 1, 2$, is the center point of each LA of order $2^d$. The clipping planes of the subdivided triangles can be computed by averaging the normals of the two triangles incident to the edge. Then each new edge is traced, and its LA of order $2^{d-1}$ is produced. In this way, this new approximation is further refined by recursively subdividing each triangle until some criterion is met.

Figure 8: A quartic surface patch

Figure 8 shows an example of the resulting adaptive polygonizations when d = 3. A goal here is to put more triangles on the highly curved portion.

While the above method produces a regular (but adaptive) network of polygons, it could be modified to generate more adaptive polygonization. Rather than subdividing all the triangles up to the same level, each triangle is examined to see if it is already a good approximation to the surface portion it is approximating. It is refined only when the answer is no. Some criteria for such local refinement are suggested in [Blo88, AG89]. However, designing an irregular adaptive polygonization algorithm with robust local refinement criteria is an open problem.

6 APPLICATION II: CONSTRUCTION OF BINARY SPACE PARTITIONING TREES

The binary space partitioning tree (bsp tree) has been shown to provide an effective representation of polyhedra through the use of spatial subdivision, and is an alternative to the topologically based b-reps. It represents a recursive, hierarchical partitioning, or subdivision, of d-dimensional space. It is most easily understood as a process which takes a subspace and partitions it by any hyperplane that intersects the subspace's interior. This produces two new subspaces that can be partitioned further.

An example of a bsp tree in 2D can be formed by using lines to recursively partition the plane. Figure 9(a) shows a bsp tree induced partitioning of the plane and (b) shows the corresponding binary tree. The root node represents the entire plane. A binary partitioning of the plane is formed by the line labeled u, resulting in a negative halfspace and a positive halfspace. These two halfspaces are represented respectively by the left and right children of the root. A binary partitioning of each of these two halfspaces may then be performed, as in the figure, and so on recursively. When, along any path of the tree, subdivision is terminated, the leaf node corresponds to an unpartitioned region, called a cell.

Figure 9: Partitioning of a 2D bsp tree (a), and its binary tree (b)

The primary use of bsp trees to date has been to represent polytopes. This is accomplished by simply associating with each cell of the tree a single boolean attribute, classification ::= {in, out}. If, in Figure 9, we choose cells 1 and 5 to be in cells, and the rest to be out cells, we will have determined a concave polygon of six sides. This method, while conceptually very simple, is capable of representing the entire domain of polytopes, including unbounded and non-manifold varieties. Moreover, the algorithms that use the bsp tree representation of space are simple and uniform over the entire domain. This is because the algorithms operate on the tree only one node at a time and so are insensitive to the complexity of the tree. A number of bsp tree algorithms are known, including affine transformations, set operations, and rendering (see e.g. [NAT90]). The computational complexity of these algorithms depends upon the shape and size of each tree. Consider point classification, for example. The point is inserted into the tree, and at each node the location of the point with respect to the node's hyperplane determines whether to take the left or right branch; this continues until a leaf is reached. The cost of this is the length of the path taken. Now if this point is chosen from a uniform distribution of points over some sample space of volume $v$, then for any cell $c$ with volume $v_c$ at tree depth $d_c$, the probability $p_c$ of reaching $c$ is simply $v_c/v$ and the cost is $d_c$. So an optimal expected case bsp tree for point classification would be a tree for which the sum of $p_c d_c$ over all $c$ is minimized. If the embedding space is one dimensional, then this is the classic problem of constructing an optimal binary search tree; a problem solved by dynamic programming.
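For concreteness, a minimal Python sketch of point classification against a bsp tree; the node layout is our own illustrative choice, not one prescribed by [NAT90]:

from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class BspNode:
    # Interior node: hyperplane <normal, x> = offset with two children.
    # Leaf node: normal is None and classification is "in" or "out".
    normal: Optional[Tuple[float, ...]] = None
    offset: float = 0.0
    neg: Optional["BspNode"] = None     # subtree where <normal, x> < offset
    pos: Optional["BspNode"] = None     # subtree where <normal, x> >= offset
    classification: str = "out"

def classify_point(node, x):
    # Walk a single root-to-leaf path; the cost is the depth of the cell.
    while node.normal is not None:
        s = sum(n * xi for n, xi in zip(node.normal, x))
        node = node.neg if s < node.offset else node.pos
    return node.classification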

The essential idea here is that the largest cells should have the shortest paths and the smallest cells the longest. For example, satisfying this objective function globally generates bounding volumes as a by-product: if a polytope's volume is somewhat smaller than the sample space's volume, constructing a bounding volume with the first hyperplanes of the tree results in large "out" cells with very small depths. Now, in the general case in which the "query" object q has extent, i.e., is not a point, then q will lie in more than one cell, and a subgraph of the tree will be visited. Thus the cost of the query is the number of nodes in this subgraph. This leads to a more complicated objective function, which we do not intend to examine here, but the intuition taken from point classification remains valid.
We use these ideas in conjunction with the linear approximation methods described before to build "good" expected case trees for solids defined as surfaces of revolution (or, we should say, trees that we expect to be good). First, we orthogonally project the curve to be revolved onto the axis of rotation, which we take to be a vertical z-axis. We then partition space with horizontal planes, where each plane contains one of the linearly approximated curve points. The bsp tree representing this is a nearly balanced tree, and each cell will contain the surface resulting from the revolution of a single curve segment.

Now the revolution of the curve need not be along a circle, but can be any convex path for which we have constructed a linear approximation. Thus each face of the solid will be a quadrilateral in which the "upper" and "lower" edges lie in consecutive horizontal partitioning planes, are parallel, and are instances of a single path edge at some distance from the axis of revolution that is determined by the revolved curve. The bsp subtree for the surface between horizontal planes is then obtained by recursively partitioning the path of revolution to form a nearly balanced tree.

The method we use is one that in 2D generates, for any n-sided convex polygon, a corresponding nearly balanced bsp tree of size $O(n)$ and height $O(\log n)$. The path curve is first divided into four sub-curves, one for each quadrant, and a hyperplane containing the first and last points is constructed. By convexity, a sub-curve lies entirely in one halfspace of its corresponding hyperplane; we call that halfspace the "outside" halfspace and the opposite halfspace the "inside" halfspace. The intersection of the four inside halfspaces is entirely inside the polygon, and so forms an in-cell of the bsp tree. We then construct a tree for each sub-curve independently and recursively.

We first choose the median segment of the sub-curve and partition by the plane of the corresponding face. Since the path curve is convex, all of the faces will be in the inside halfspace of this plane and an out-cell can be created in its outside halfspace. Now each non-horizontal edge of the median face is used to define a partitioning plane which also contains the first/last point of the sub-curve. All of the faces corresponding to this sub-curve's edges are in the outside halfspace, and so an in-cell can be created in its inside halfspace. We have now bisected the sub-curve by these planes, which contain no faces, and can recurse with them. The recursion continues until only a small number of faces/segments remain, say 6, at which point only face planes are used for partitioning, since the cost of the non-face partitioning planes outweighs their contribution to balancing the tree. The result for a path curve of n edges is a nearly balanced tree of size < 3n and height $O(\log n)$.

In some sense, we have constructed a tree that is the cross product of the path curve and the revolved curve; we build a tree of horizontal planes that partitions the revolved curve, and then we form "slices" of the object by constructing a tree for each segment of the path curve. If the revolved curve has m segments, then the number of faces is nm and the bsp tree is of size $O(nm)$ and height $O(\log nm) = O(\log n + \log m)$.

The object in Figure 10 was made by rotating the curve in Figure 3 around an ellipse. Its bsp tree is quite well balanced. The goblet in Figure 11 was made by constructing two objects using the curve in Figure 3, and then applying a difference operation to carve a hole in the goblet. The bsp tree in Figure 11 was obtained after applying the difference operation, and then a union operation for the red ball. It is observed that set operations on well-balanced bsp trees result in well-balanced trees. The set operations and display were done in Sculpt [Nay90], which is an interactive modeling system based on bsp trees.

Figure 10: A human profile rotated

Figure 11: A goblet

7 CONCLUSION

In this paper, we have discussed the problem of piecewise linear approximation of an arbitrary digitized 3-D curve. Two algorithms have been presented. One finds an optimal linear approximation at a high expense. The other computes a heuristic linear approximation based on the fundamental notions of curve length and spherical image of a space curve. This heuristic algorithm finds a good linear approximation quickly. We have also shown that our heuristic algorithm can be applied to the display of space curves and implicit surfaces, and to adaptively constructing well-balanced binary space partitioning trees of objects defined by revolution.

Acknowledgment The authors would like to thank the anonymous referees for their helpful comments. Insung Ihm wishes to thank AT&T Bell Laboratories in Murray Hill, New Jersey, for providing the nice environment where most of this work was done during the summer of 1990.

A The Performances

n = 109
m          4           8           16          32          64
LA_0       5.25619e-1  2.21325e-1  8.09768e-2  2.66073e-2  8.98677e-3
LA_k       4.02838e-1  1.12507e-1  3.08774e-2  1.23695e-2  3.76437e-3
(k)        (5)         (9)         (17)        (16)        (14)
LA*        4.02838e-1  1.12507e-1  3.04525e-2  8.94592e-3  2.76188e-3
LA_k/LA*   1.000       1.000       1.014       1.393       1.363

Table 1: The curve in Figure 1

n = 408
m          2           4           8           16          32          64
LA_0       3.41045e-1  2.24775e-1  5.47018e-2  1.42360e-2  3.80354e-3  1.19389e-3
LA_k       3.41045e-1  9.53494e-2  2.75688e-2  8.92481e-3  2.95337e-3  8.66971e-4
(k)        (0)         (86)        (50)        (80)        (66)        (381)
LA*        2.73563e-1  8.95572e-2  2.63278e-2  6.62991e-3  2.02041e-3  5.45924e-4
LA_k/LA*   1.247       1.065       1.047       1.346       1.462       1.588

Table 2: The curve in Figure 2

n = 237
m          8           16          32          64
LA_0       1.03375e-1  6.11455e-2  2.95784e-2  8.24535e-3
LA_k       6.07749e-2  2.90067e-2  7.76577e-3  5.63440e-3
(k)        (12)        (17)        (20)        (5)
LA*        5.87190e-2  2.06813e-2  5.12219e-3  1.75973e-3
LA_k/LA*   1.035       1.403       1.516       3.202

Table 3: The curve in Figure 3 (goblet)

n = 404
m          4           8           16          32          64
LA_0       2.01399e-0  3.63921e-1  9.66444e-2  2.67485e-2  8.39998e-3
LA_k       1.86220e-0  2.26435e-1  7.78874e-2  2.24510e-2  6.98709e-3
(k)        (4)         (32)        (16)        (10)        (8)
LA*        1.85530e-0  2.26435e-1  7.60669e-2  2.05613e-2  5.69636e-3
LA_k/LA*   1.004       1.000       1.024       1.092       1.227

Table 4: The curve in Figure 5

n = 234
m          4           8           16          32          64
LA_0       7.24214e-1  1.72242e-1  5.32029e-2  1.64083e-2  1.60168e-2
LA_k       4.81844e-1  1.34728e-1  3.69400e-2  1.53000e-2  3.80315e-3
(k)        (15)        (15)        (13)        (2)         (11)
LA*        4.81844e-1  1.34728e-1  3.65433e-2  1.05332e-2  3.16273e-3
LA_k/LA*   1.000       1.000       1.011       1.453       1.202

Table 5: The curve in Figure 6

B List of Figures

1. Figure 1: Folium of Descartes
(a) equation: $C(t) = \left(\frac{3t}{1+t^3}, \frac{3t^2}{1+t^3}, 0\right)$ or $(f(x, y, z) = x^3 - 3xy + y^3,\ g(x, y, z) = z)$
(b) n = 109, m = 20

2. Figure 2: A cubic curve
(a) equation: $C(t) = (t, t^3, 0)$ or $(f(x, y, z) = x^3 - y + z,\ g(x, y, z) = z)$
(b) n = 408, m = 11

3. Figure 3: A human profile and a goblet
(a) Points were generated from 12 rational Bezier curves in [Pie87], and then slightly disturbed.
(b) (profile) n = 169, m = 20
(c) (goblet) n = 237, m = 20

4. Figure 4: A four-leaved rose
(a) equation: $(f(x, y, z) = x^6 + 3x^4y^2 - 4x^2y^2 + 3x^2y^4 + y^6,\ g(x, y, z) = z)$
(b) n = 400, m = 61

5. Figure 5: A nonplanar quartic curve
(a) equation: $(f(x, y, z) = 36x^2 + 81y^2 + 9z^2 - 324,\ g(x, y, z) = x^2 + y^2 - 3.94)$
(b) n = 404, m = 32

6. Figure 6: A nonplanar sextic curve
(a) equation: $(f(x, y, z) = y^2 - x^2 - x^3,\ g(x, y, z) = z - x^2 + x - 2)$
(b) n = 234, m = 20

7. Figure 8: A quartic surface patch
(a) equation: $f(x, y, z) = 0.01853292z^4 - 1.14809166y^2z^2 - 1.14809166x^2z^2 + 0.99982830z^2 - 1.16662158y^4 - 1.14809166x^2y^2 + 2.1849858y^2 + 0.01853292x^4 + 0.99982830x^2 - 0.72183150$
(b) d = 3

8. Figure 10: A human profile rotated
(a) The primary curve: the curve in Figure 3 with 32 segments.
(b) The auxiliary curve: an ellipse with 32 segments.

9. Figure 11: A goblet
(a) The primary curve: the curve in Figure 3 with 20 segments.
(b) The auxiliary curve: a circle with 32 segments.

References

[AG89] E.L. Allgower and S. Gnutzmann. Polygonal meshes for implicitly defined surfaces. Manuscript, Nov. 1989.

[AHU74] A.V. Aho, J.E. Hopcroft, and J.D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, Mass., 1974.

[BHHL88] C. Bajaj, C. Hoffmann, J. Hopcroft, and R. Lynch. Tracing surface intersections. Computer Aided Geometric Design, 5:285-307, 1988.

[BI88] C. Bajaj and I. Ihm. Hermite interpolation using real algebraic surfaces. In Proceedings of the Fifth Annual ACM Symposium on Computational Geometry, pages 94-103, Germany, 1988.

[Blo88] J. Bloomenthal. Polygonization of implicit surfaces. Computer Aided Geometric Design, 5:341-355, 1988.

[Can71] A. Cantoni. Optimal curve-fitting with piecewise linear functions. IEEE Transactions on Computers, C-20:59-67, 1971.

[Cd80] S.D. Conte and C. deBoor. Elementary Numerical Analysis: An Algorithmic Approach. McGraw-Hill, New York, third edition, 1980.

[deB78] C. deBoor. A Practical Guide to Splines. Springer-Verlag, New York, 1978.

[Dun86] J.G. Dunham. Optimum uniform piecewise linear approximation of planar curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8:67-75, Jan. 1986.

[FWL89] C. Fahn, J. Wang, and J. Lee. An adaptive reduction procedure for the piecewise linear approximation of digitized curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(9):967-973, Sep. 1989.

[Han83] P. Hanrahan. Ray tracing algebraic surfaces. Computer Graphics, 17(3):83-90, 1983.

[HH86] C. Hoffmann and J. Hopcroft. Quadratic blending surfaces. Computer Aided Design, 18(6):301-306, 1986.

[II86] H. Imai and M. Iri. Computational geometric methods for polygonal approximations of a curve. Computer Vision, Graphics and Image Processing, 36:31-41, 1986.

[Kd88] N. Kehtarnavaz and R.J.P. deFigueiredo. A 3-D contour segmentation scheme based on curvature and torsion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(5):707-713, Sep. 1988.

[Kre59] E. Kreyszig. Differential Geometry. University of Toronto Press, 1959.

[MO88] A.A. Melkman and J. O'Rourke. On polygonal chain approximation. In G.T. Toussaint, editor, Computational Morphology, pages 87-95. North-Holland, 1988.

[Mon70] U. Montanari. A note on minimal length polygonal approximation to a digitized contour. Comm. ACM, 13:41-47, 1970.

[MS85] A. Middleditch and K. Sears. Blend surfaces for set theoretic volume modeling system. Computer Graphics, 19(3):161-170, 1985.

[NAT90] B. Naylor, J. Amanatides, and W. Thibault. Merging bsp trees yields polyhedral set operations. Computer Graphics, 24(4), Aug. 1990.

[Nay90] B. Naylor. Sculpt: An interactive solid modeling tool. In Proc. Graphics Interface '90, Halifax, Nova Scotia, May 1990.

[O'N66] B. O'Neill. Elementary Differential Geometry. Academic Press, 1966.

[Pav82] T. Pavlidis. Algorithms for Graphics and Image Processing, pages 281-297. Computer Science Press, New York, 1982.

[PH74] T. Pavlidis and S.L. Horowitz. Segmentation of plane curves. IEEE Transactions on Computers, C-23(8):860-870, Aug. 1974.

[Pie87] L. Piegl. Interactive data interpolation by rational Bezier curves. IEEE Computer Graphics and Applications, pages 45-58, July 1987.

[PK89] N.M. Patrikalakis and G.A. Kriezis. Representation of piecewise continuous algebraic surfaces in terms of B-splines. The Visual Computer, 5(6):360-374, Dec. 1989.

[Ram72] U. Ramer. An iterative procedure for the polygonal approximation of plane curves. Computer Graphics and Image Processing, 1:244-256, 1972.

[Ric64] J.R. Rice. The Approximation of Functions, volume 1. Addison-Wesley, Reading, Mass., 1964.

[RO87] A.P. Rockwood and J.C. Owen. Blending surfaces in solid modeling. In G. Farin, editor, Geometric Modeling: Algorithms and New Trends, pages 367-383. SIAM, Philadelphia, 1987.

[Rob85] J. Roberge. A data reduction algorithm for planar curves. Computer Vision, Graphics and Image Processing, 29:168-195, 1985.

[RW74] K. Reumann and A.P.M. Witkam. Optimizing curve segmentation in computer graphics. In A. Gunther, B. Levrat, and H. Lipps, editors, International Computer Symposium, pages 467-472. American Elsevier, New York, 1974.

[Sed85] T.W. Sederberg. Piecewise algebraic surface patches. Computer Aided Geometric Design, 2:53-59, 1985.

[SG80] J. Sklansky and V. Gonzalez. Fast polygonal approximation of digitized curves. Pattern Recognition, 12:327-331, 1980.

[SZ89] T.W. Sederberg and A.K. Zundel. Scan line display of algebraic surfaces. Computer Graphics, 23(3):147-156, 1989.

[Tou85] G.T. Toussaint. On the complexity of approximating polygonal curves in the plane. In Proceedings of the IASTED International Symposium on Robotics and Automation, pages 59-62, Lugano, Switzerland, 1985.

[War86] J. Warren. On algebraic surfaces meeting with geometric continuity. PhD thesis, Cornell University, 1986.

[WD84] K. Wall and P.E. Danielsson. A fast sequential method for polygonal approximation of digitized curves. Computer Vision, Graphics and Image Processing, 28:220-227, 1984.

[Wil78] C.M. Williams. An efficient algorithm for the piecewise linear approximation of planar curves. Computer Graphics and Image Processing, 8:286-293, 1978.

Insung Ihm is currently a Ph.D. student and member of the Computing about Physical Objects Lab in the Department of Computer Sciences at Purdue University. His research interests are in Computer Aided Geometric Design and Computer Graphics. He received his B.S. and M.S. degrees in Computer Science from Seoul National University in 1985 and Rutgers University in 1987, respectively, and worked at AT&T Bell Labs in Murray Hill, NJ during the summer of 1990.
Address: Department of Computer Sciences, Purdue University, West Lafayette, IN 47907, USA.

Bruce Naylor has been a Member of Technical Staff at AT&T Bell Labs in Murray Hill, NJ since 1986. Previously, he was on the faculty of Information and Computer Science at Georgia Institute of Technology in Atlanta, GA from 1981-1986. He did his graduate studies at the University of Texas at Dallas, receiving a Master's and Ph.D. in Computer Science in 1979 and 1981, respectively. His B.A., received in 1976, is from the University of Texas at Austin, where he majored in Philosophy.
Address: AT&T Bell Laboratories, 600 Mountain Ave., Murray Hill, NJ 07974, USA.
Chapter 10
Fluid Flow Visualization
Ellipsoidal Quantification of Evolving
Phenomena
D. Silver, N. Zabusky, V. Fernandez, M. Gao, and R. Samtaney

Abstract

Studying the evolution of coherent objects over time-varying data sets is essential to understanding large scale simulation and experimentation. Identifying regions of interest and tracking them over time is an important part of the process. In this work, we present a method for systematic object identification by the fitting of generic shapes, like ellipsoids, to isolated regions. These concepts are applied to two- and three-dimensional data sets from fluid dynamical problems, and we show the advantages of using reduced representations of data to elucidate the kinematics and topology of evolving structures.

Keywords: Ellipsoid, Ellipse, Visualization, Fluid Dynamics, Feature Tracking.

1 Introduction

The goal of large-scale simulations and experimentations in computational science is a quantitative and mathematical understanding of the model being investigated. A key component in this process is visualization. Once important structures are identified, they can be tracked and quantified (measured) in time. Studying the evolution and interaction of coherent amorphous objects over time-varying data sets is the essence of scientific discovery. Coherent objects are easily recognizable when the data sets are viewed, because they are localized in space and have a finite lifetime. They can therefore be traced throughout their existence, even though their shape may be constantly evolving and changing. Coherent features are similar across scientific domains. Examples include enhancements (positive perturbations) or depletions (negative perturbations) such as: clouds, jet streams, ozone holes (meteorology); electron clouds and dislocations (condensed matter); or bubbles and vortices (fluid dynamics). The aim in the different disciplines is to study the evolution and essential dynamics of these objects and describe them for a modified time period, thus obtaining a partial solution to the original problem. For example, one tracks the formation and progression of a storm front in a sequence of data images, or uses satellite data to study ozone hole growth. The scientist must identify the feature, determine how it has changed, where it has moved, and how it interacts with other secondary structures. We call this process visiometrics: visualizing, recognizing, identifying, tracking, quantifying, and ultimately mathematizing (constructing a simplified mathematical model of) evolving amorphous objects in one-, two-, and three-dimensional time-dependent data sets [BZ90]. Visualization is only an intermediate step in this process. The ultimate goal is to help fully realize the science underlying the model, leading to an enhanced capability for prediction and discovery.
In this paper, we describe a visiometric approach to the study of coherent amorphous structures. In the next section, a short overview of some related work is presented. Our main focus is on data from large scale two and three dimensional fluid dynamic simulations. However, the methods discussed here are general and apply to other domains as well.

1.1 Related Work

There are many methods for visualizing one, two, and three dimensional time-varying data sets. Unfortunately, determining the essential observed features and quantifying them is not a trivial task. The emphasis on detecting and characterizing large-scale structures in flows has been one of the most important motivations in applying digital image processing to flow visualization [Hes88]. There are two approaches to extracting graphical information in flow visualization [BS85]. In one approach, the engineer works interactively at the workstation, observing the solution, characterizing and then identifying important features. In the second approach, a computer is programmed to analyze the datasets, searching for special features automatically. The first approach has the advantage that one can discover information that might not be anticipated, while the second approach is needed because of the sheer amount of data generated.

In studies of turbulence, it is usual to base diagnostics on averaging over fluctuating quantities or transforming the data (e.g. power spectral densities). This approach is very important but may hide local characteristics of coherent structures [BPZ90]. An alternative is the characterization of coherent localized regions. One method is to decompose the flow field into regions defined by streamlines connected through nodes or critical points, which captures the global flow topology [PC87] [HH89] [CPC90]. Another technique is to search for important local features using maxima tracking and thresholding to separate the flow field into substructures [BZ90] [SZ91] [BL91]. In the next section, we describe a method for searching for and isolating local features.

1.2 Model Building

There are three basic phases when characterizing coherent objects:

• Visualization,
• Quantification and Abstraction,
• Mathematization and Reduced Model Formulation.

Although these steps demonstrate a natural progression, the process of model building is one of refinement and involves many iterations, over physical and numerical parameters and diagnostic projections, until a mathematically valid formulation is achieved.

1.2.1 Visualization

The first step in experimental science is generating the data to be analyzed. The data is the result of observation or experimentation (laboratory or numerical). Before viewing, the data must be preprocessed to extract or compute the variable(s) of interest. The visualization phase encompasses all of the different steps involved in converting the data set into a picture (the functional layers in [BM87]). These layers include accessing the data; classifying it (also, choosing a good color map, choosing opacity parameters, or determining an appropriate contour level); creating a geometric model (for volume rendering or isosurface contours [LC87] [UK88]); rendering it (by assigning lighting parameters and performing hidden surface elimination); and finally displaying it (on a device) and managing input.

1.2.2 Quantification and Abstraction

To understand the interactions and dynamics of complex phenomena, large processes are broken up into sets of smaller subprocesses. The first step is to isolate relevant features of the data (large magnitude and/or scale) from ones that are not as essential (small magnitude). The remaining data should still have the requisite information to describe the underlying physics.

The next step is to identify coherent features. Since these are space filling objects, they can be "extracted". Line and surface characterizations can then be computed: volume, surface area, tangent, normal, curvature, torsion, circulation, and distances to other objects. In many instances, the objects are very complex. To simplify, these areas of interest can be abstracted by using low degree of freedom quantifications of geometric and topological features of space filling objects. After the extent of a feature is determined (its location in the data set), a procedure can be used to identify it as one of a set of simpler shapes. These shapes are generic, and can be used for different physical phenomena. For example, some of the basic shapes are line-like objects found in images of jet streams, storm fronts, and galaxy formations; tube-like objects (in 3D) such as a tornado or a typhoon; sheet-like objects; and ellipsoid-like objects such as distorted bubbles or vortex regions (see next section). (The simplest shape is a point, e.g. the maximum value in the field.) This represents the first step towards understanding the topology of coherent structures and can enhance the modeling of the observed phenomena.

1.2.3 Mathematization

In the model building phase, the scientist attempts to formulate what is happening on the simplified objects for the observed time period, and to develop a low degree-of-freedom mathematical representation to explain the evolution of these structures. The simple model can then be solved and its results juxtaposed with the results of the actual simulation. The differences should lead to insights into higher order processes.

2 Ellipsoidal Abstraction
Thresholding is a simple way to determine regions in a scalar field. Thresholding defines a subset of the field constituted by areas in which the associated scalar quantity is equal to or larger than the threshold value. Varying the thresholds separates the field into substructures [BL91]. In general, the field can be broken up into "primary" and "secondary" objects. Primary objects correspond to low threshold values and are usually large. High threshold values cause these objects to become smaller and disconnected (secondary objects). They are distinct from (although obviously related to) the primary objects and have their own lifecycles.

During the process of thresholding, generic shapes begin to appear. As mentioned previously, one of the shapes most commonly observed is an ellipse (2D) or ellipsoid (3D). This suggests that physical space moments about extrema can be used for abstraction. By isolating regions (extrema) of the field and then computing the second moments, essential information at a first level of quantification is obtained. The second moments define a tensor, which can be associated with an oriented ellipsoid. Furthermore, ellipses show up in analytical solutions to some problems in incompressible fluid dynamics (2D). Thus the use of ellipsoids to represent coherent objects can be a step towards the mathematization process.

The ellipsoids are described by the centroid of the object, and the eigenvalues and eigenvectors of the tensor of moments. Even at low thresholds, when the topology is too complex to be represented accurately by ellipsoids, the fitted ellipsoid still provides a sense of position, orientation, and relative weight of the region. Fitting ellipsoids at different thresholds also serves to "quantify" the entire data set and provides a new set of topological statistics (see Section 5). Furthermore, the ellipsoids can be easily tracked over time, resulting in a simplified model capturing the dynamics of these regions.

The use of ellipses and ellipsoids obtained from the tensor of second moments for visualization and diagnostics is diverse. The method of moments has been applied in pattern recognition applications to characterize images through low-order moments (e.g. [Tea80]). Other applications in image processing include finding the orientation of projections of 3D objects [Sal90]. Ellipsoids have also been used in the context of iconic representation of 3D tensor fields [SIG89] [Dic89] and in the analyses of vortex mergers and cancellation [BP89]. In the next section, the process of elliptical quantification in both two and three dimensions is discussed.

3 Elliptical Quantification

Let w(x) be the distribution of the scalar in the domain O. The zeroth order
moment of this domain is

w = 1f! w(x) dO (1)

the content of the region, where dO = IIi=l dXi. The centroid is given by

Xi = W- l in w(x) Xi dO (2)

and the normalized symmetric matrix of second moments around the centroid
IS

(3)

The process of ellipsoidal quantification for 2D data sets reduces to that of ellipse
fitting. Given a 2D data set, the extrema in the set are identified. Around each
of these extrema a region is selected. This region is comprised of points whose
data value exceed a certain threshold. If only moments up to second order
are considered, the 2D coherent structure, which has already been identified, is
equivalent to an ellipse of constant value having definite size, orientation and
aspect ratio (eccentricity). The major and minor axes, a and b respectively, the
578

aspect ratio (a/b), and the orientation, \phi, of the ellipse are obtained from the
second order moments by

a = \left( 2\left( I_{20} + I_{02} + \left[ (I_{20} - I_{02})^2 + 4 I_{11}^2 \right]^{1/2} \right) \right)^{1/2},    (4)

b = \left( 2\left( I_{20} + I_{02} - \left[ (I_{20} - I_{02})^2 + 4 I_{11}^2 \right]^{1/2} \right) \right)^{1/2},  and    (5)

\tan 2\phi = \frac{2 I_{11}}{I_{20} - I_{02}}.    (6)

The angle \phi between the x-axis and the semi-major axis of the ellipse determines
the orientation.
It is convenient to color the ellipse in proportion to the zeroth order moment
divided by the area of the ellipse. An example of ellipse fitting is presented in
Section 4.
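As a concrete illustration (a minimal Python sketch, not the authors' DAVID implementation; the function name is ours), Eqs. (4)-(6) can be evaluated directly from the normalized second moments:

    import numpy as np

    def ellipse_from_moments(I20, I02, I11):
        # Closed-form ellipse parameters from normalized second moments, Eqs. (4)-(6).
        root = np.sqrt((I20 - I02) ** 2 + 4.0 * I11 ** 2)
        a = np.sqrt(2.0 * (I20 + I02 + root))         # major axis, Eq. (4)
        b = np.sqrt(2.0 * (I20 + I02 - root))         # minor axis, Eq. (5)
        phi = 0.5 * np.arctan2(2.0 * I11, I20 - I02)  # orientation, Eq. (6)
        return a, b, phi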

3.1 Ellipsoidal Quantification

Ellipsoid fitting is very similar to ellipse fitting. Thresholding is initially done to
identify the isolated regions. (Care must be taken to ensure that distinct objects
or regions are not merged. The objects are determined by the connectivity of
their isosurfaces, the numerical resolution of the data sets, and the distance to the areas
of high magnitude.)
The centroid \bar{x}_i of each object and the tensor of moments are calculated using
equations (1)-(3). The eigenvalues and eigenvectors of this matrix can then be
computed (the square roots of the eigenvalues give the semi-axis lengths).
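A minimal sketch of this computation, assuming the thresholded region is available as discrete sample points x (an N x 3 array) with scalar weights w (the helper name is hypothetical):

    import numpy as np

    def fit_ellipsoid(x, w):
        W = w.sum()                                   # zeroth moment, Eq. (1)
        xbar = (w[:, None] * x).sum(axis=0) / W       # centroid, Eq. (2)
        d = x - xbar
        # Normalized second-moment tensor, Eq. (3)
        I = (w[:, None, None] * d[:, :, None] * d[:, None, :]).sum(axis=0) / W
        lam, vec = np.linalg.eigh(I)                  # symmetric eigendecomposition
        return xbar, np.sqrt(lam), vec                # square roots give semi-axis lengths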

3.2 Rendering

Once the elliptical regions are identified, and the ellipse (or ellipsoid) computed,
they can be displayed with the data set. It is important to note here that
interactive display of these regions and their abstractions is crucial to gaining an
understanding of the process being investigated. The interactivity is based on
three variables - time, threshold value (isosurface value in 3D), and ellipsoidal
value (the threshold value at which the ellipsoid was fit). Stepping through
different ellipsoidal values and/ or different threshold values and/or time provides
insight into the evolution of the coherent structures. In addition to the ellipsoids,
other statistical quantities, related to the data set and ellipsoids, are computed;

for example, the number of ellipsoids per threshold value, size, area, strength,
eccentricity, maxima information, average value, etc.
Ellipse fitting and tracking are written as an extension to the DAVID 2 environ-
ment [BZ90]. The ellipses and the associated statistics are rendered interactively
(one can step between times, thresholds, and ellipses). The ellipses can be drawn
alone or superimposed on the original data set or the thresholded data set.
The ellipsoid computation and rendering is written in Dore on the Stardent
ST1500. The statistics associated with the ellipsoids are displayed using DAVID.
The ellipsoids are rendered in a number of different modes. By looking at the
ellipsoids with the isosurfaces they represent, one can see to what extent the
object is captured by the ellipsoid. In 3D, because the data sets are so complex,
the connectivity of the ellipsoids with respect to the data sets is sometimes
difficult to comprehend. In these cases, the ellipsoids (at high threshold) are
rendered together with an isosurface at low threshold (transparent) or with
corresponding vortex lines.

4 Application to Physical Examples

The concepts discussed in the previous sections are now applied to concrete
physical examples.

4.1 Ellipse Fitting in 2D Shock-Bubble Interactions

Figure 1 is a time dependent 2D data set of axisymmetric shock-bubble interac-
tions [BZ90]. The threshold values chosen are: 7% of the absolute maximum for
the positive vorticity and 30% of the absolute minimum for the negative vortic-
ity. Figure 1a is the last frame of the simulation. Figure 1b shows the regions
which lie below and above the negative and the positive thresholds respectively.
The upward pointing arrows on the color scale indicate the threshold values. In
Fig. 1c, ellipses are fit onto the thresholded regions. Of the coherent structures
identified, four are dominant. Ellipse 1 contains approximately 30% of the total
positive vorticity associated with the strong supersonic vortex in the flow. El-
lipses 2 and 3, with large aspect ratios, approximate layer-like structures in the
flow.
Ellipses 1-3 contain more than 50% of the positive circulation in the vorticity
field. The other major coherent structure is the subsonic negative vortex ap-
proximated by Ellipse 6. Ellipses 4 and 5 contain a relatively small amount of
positive circulation (Fig. 1d). There are other ellipses in the field which have
negative circulation. These ellipses can be seen in the wake of the strong neg-
ative subsonic vortex and they are all weak, i.e. either their areas are small or

the total content is small. Figure 1e shows the relative areas of the coherent
structures, and Fig. 1f contains the percentage of the total positive or negative
circulation in the ellipses. This process can be repeated for different thresh-
old/extrema values.
By isolating coherent structures, the vorticity field is reduced in complexity.
With these appropriately smoothed elliptical regions of vorticity as initial con-
ditions, new simulations can be investigated.

4.2 Ellipsoid Fitting of 3D Vortex Tube Interactions

The physical-space interactions which lead to collapse and reconnection of vortex
tubes, and the accompanying cascade to small-scale vortex debris, are a fundamen-
tal problem in 3D vortex dynamics and turbulence. To understand this process,
numerical simulations of the evolution of two vortex tubes initially perpendic-
ular to each other have been carried out [BPZ90]. The ratio of the strengths
of the vortex tubes is 11/10. By examining the magnitudes of vorticity (scalar
quantity) at a low threshold (27% of magnitude) the topology of the reconnec-
tion process is evident (see Fig. 2 and [PIX90]). To understand what is occurring
inside the central portion of the tubes, regions of high vorticity are fitted with
ellipsoids as shown in Fig. 3-6. (In these figures, we focus on the central area of
the original data sets.)
For the time t = 5.2 at the threshold value of 27%, two perturbed but well
defined tubes or "vortex cores" are observed (Fig. 2). In fact, after being initially
orthogonal, they now have the same orientation. This orientation is clearly seen
by the orientation of the two fitted ellipsoids (Fig. 3) at a threshold value of
59%.
At t = 5.5, the topology of the field, determined by the lowest threshold,
has changed considerably because of reconnection (threshold value same as for
t = 5.2). The right ellipsoid segments into two parts because of the strong
nonuniformities induced by the reconnection process. This is clearly seen from
the vortex lines in Fig. 4.
At the last available time, t = 6.0, the changes in the pictures are dramatic.
Four objects are evident (at this threshold) due to segmentation of both tubes.
In Fig. 5, the four ellipsoids are displayed with a lower transparent threshold to
show the different parts of the tubes. Although the reconnection is in process,
the two tubes still maintain their identity as witnessed by the two distinct
ellipsoids in the center. Eventually, the horizontal sections will become dominant
after reconnection. The statistics for this time step are very different from those
for the previous times. At certain threshold values, nonuniformities result in
segmentation or cascade into small scale objects.

In addition to displaying the ellipsoids, statistics pertaining to these regions are


also computed. When observing the statistics for the entire run, some interesting
points emerge. In Fig. 6, the number of objects per threshold value is plotted.
At early times, before reconnection, many small-scale objects are generated by
a process that we believe is analogous to "stripping" in two dimensions [Dri89].
At about t=5, the small-scale structure is shed, and only two objects are present,
suggesting a quasi-equilibrium state right before reconnection. We are currently
investigating this phenomenon in more detail.

5 Conclusions

Further work in this area involves extending the scalar ellipsoids to correspond-
ing forms for vector fields. This includes characterizing the three-component
vorticity field (as a vector instead of as a scalar); computing the circulation (or
integrated vorticity) by integrating the field over appropriate planar regions,
e.g. the minor elliptical cross-section of the ellipsoid (minor ellipse); and gen-
erating color coded bundles of vectors that originate from the minor ellipse.
This will help us understand the topology of the vector fields connected to the
ellipsoids.
Elliptical quantification is limited in that it is a low-order approximation. In
the shock bubble problem (Section 4), only about half of the positive and neg-
ative circulation is captured. Alternate strategies are required to account for
a larger fraction of the total integrated fields. For example: use of third- and
fourth-order moments, thresholding as a percentage of the localized extremum
(rather than of the largest maximum or smallest minimum), and fitting layer-
like and tube-like representations. In each case, additional complexity is added
to the objects. Furthermore, the low-lying global "sea" of the functions may be
characterized by statistical and spectral representations (e.g. a mixed, possibly
wavelet, representation of the functions).
We have demonstrated how extrema tracking, thresholding and ellipsoid (ellipse)
fitting have helped us quantify the magnitude of scalar objects in evolving flows.
These low-order representations correspond to a reduction in the information
used to describe the process. Even though the ellipsoids are simple shapes, they
can characterize complicated topologies through their size and orientations. The
statistical information pertaining to the different threshold values presents a
global picture of the topologies and provides a new signature for the time-data
sequence. We hope that this visiometric approach will lead to new ways for
characterizing intermittent phenomena in turbulent and chaotic fluid motions.
[Figure 1 panels (plots not reproduced): (a) the vorticity field; (b) the thresholded
regions, with upward arrows on the color scale marking the threshold values;
(c) the fitted, color-coded ellipses; (d)-(f) circulation and relative-area statistics
of the ellipses.]
Figure 1: Ellipse Fitting in 2D Shock Bubble Interactions

Figure 2: Vortex Tubes Time Evolution

Figure 3: Ellipsoid Fitting at Threshold=58% and Time: t=5.2, t=5.5, t=6.0

Figure 4: Ellipsoids with Vortex Lines, t=5.5

Figure 5: Ellipsoid Fitting at t=6.0: isosurface at 27%, ellipsoids at 58%



[Figure 6 panels (plots not reproduced): the number of objects (n_objects) vs.
threshold value at several times between t = 2.0 and t = 6.0, grouped into panels
before and after reconnection.]

Figure 6: Number of Objects vs. Threshold Value for t=2.0-6.0

6 Acknowledgements

This work was supported in part by NSF Grant DMS-89-01-900 (N. Zabusky,
R. Samtaney), ONR Grant N00014-90-J-1095 (N. Zabusky), IBM Fellowship
(V. Fernandez), NSF CCR-89-09197 (D. Silver), and the CAIP Center. Thanks
also to Simon Cooper for his help.

References
[BL91] J. G. Brasseur and Wen-Quei Lin. Structure and Statistics of Intermittency in Homo-
geneous Turbulent Shear Flow. To appear in Advances in Turbulence, 3, 1991.
[BM87] B. McCormick, T. DeFanti, and M. Brown. Visualization in Scientific Computing. Computer
Graphics, 21(6), November 1987.

[BP89] J. D. Buntine and D. I. Pullin. Merger and Cancellation of Strained Vortices. J. Fluid
Mech., 205:263-295, 1989.

[BPZ90] O. Boratov, R. Pelz, and N. Zabusky. Winding and Reconnection Mechanisms of Closely
Interacting Vortex Tubes in Three Dimensions. AMS-SIAM Seminar on Vortex Dy-
namics and Vortex Methods, Seattle, June 1990. To appear in Lectures in Applied
Mathematics.
[BS85] P. Buning and J. Steger. Graphics and Flow Visualization in Computational Fluid
Dynamics. In AIAA-85-1507-CP, AIAA 7th Computational Fluid Dynamics Conference,
Cincinnati, Ohio, 1985.

[BZ90] F. Bitz and N. Zabusky. DAVID and Visiometrics: Visualizing, Diagnosing and Quan-
tifying Evolving Amorphous Objects. Computers in Physics, pages 603-613, Novem-
ber/December 1990.
[CPC90] M. S. Chong, A. E. Perry, and B. J. Cantwell. A General Classification of Three-
Dimensional Flow Fields. Phys. Fluids A, 2(5):765-777, May 1990.
[Dic89] R. Dickinson. Unified Approach to the Design of Visualization Software for the Analysis
of Field Problems. SPIE Proceedings, 1083, January 1989.
[Dri89] D. Dritschel. Strain-Induced Vortex Stripping. In R.E. Caflisch, editor, Mathematical
Aspects of Vortex Dynamics, pages 107-113. SIAM, NY, 1989.
[Hes88] L. Hesselink. Digital Image Processing in Flow Visualization. Ann. Rev. Fluid Mech.,
20:421-485, 1988.
[HH89] J. Helman and L. Hesselink. Representation and Display of Vector Field Topology in
Fluid Flow Data Sets. IEEE Computer, August 1989.
[LC87] W. Lorensen and H. Cline. Marching Cubes: A High Resolution 3D Surface Construction
Algorithm. Computer Graphics, 21(4):163-170, August 1987.
[PC87] A. E. Perry and M. S. Chong. A Description of Eddying Motions and Flow Patterns
Using Critical-Point Concepts. Ann. Rev. Fluid Mech., 19:125-155, 1987.

[PIX90] What Happens When Two Vortexes Meet. Pixel, page 7, July/August 1990.

[SaI90] D. B. Salzman. A Method of General Moments for Orienting 2D Projections of Unknown
3D Objects. Computer Vision, Graphics and Image Processing, 50:129-156, 1990.
[SIG89] Two and Three Dimensional Visualization Workshop, August 1989. SIGGRAPH course
notes 13.

[SZ91] D. Silver and N. Zabusky. 3D Visualization and Quantification of Evolving Amorphous
Objects. SPIE/SPSE Conference Proceedings, February 1991.

[Tea80] M. R. Teague. Image Analysis Via the General Theory of Moments. J. Opt. Soc. Am.,
70(8), August 1980.

[UK88] C. Upson and M. Keeler. V-BUFFER: Visible Volume Rendering. Computer Graphics,
22(4):59-64, August 1988.

Authors' Biographies

Deborah Silver is an Assistant Professor in the Department of Elec-
trical and Computer Engineering at Rutgers University. She received
a B.S. (1984) in Computer Science from Columbia University, and
an M.S. (1986) and Ph.D. (1988) in Computer Science from Prince-
ton University. Her interests include scientific visualization, computer
graphics, computational geometry and numerical analysis. She has
been involved in the Visualization Laboratory at the CAIP Center (an
interdisciplinary research organization at Rutgers University) since its
inception. Address: c/o CAIP Center, Rutgers University, P.O. Box
1390, Piscataway, NJ 08855.

Norman Zabusky is the State of New Jersey Professor of Computa-
tional Fluid Dynamics in the Department of Mechanical and Aerospace
Engineering at Rutgers University. He was educated at the College
of the City of New York (B.E.E. 1951), at the Massachusetts Institute of
Technology (M.S. 1953) and the California Institute of Technology (Ph.D.
1959). He has worked at Raytheon, Inc., the Max-Planck Institute for
Physics and Astrophysics in Munich, the Princeton Plasma Physics
Laboratory, Bell Laboratories (as Head of the Computational Physics
Research Department), and as Professor of Mathematics at the Univer-
sity of Pittsburgh, before joining the Rutgers faculty in 1988. He has
also been a consultant or visiting scientist to Exxon Research and
Engineering, the National Center for Atmospheric Research, Los Alamos
National Laboratory, the Naval Research Laboratory, and the Institute of
Theoretical Physics (Santa Barbara). At Rutgers, in conjunction with
the CAIP Center, he has developed a laboratory for data visualization
and quantification. Address: Dept. of Mech. and Aero. Engineering,
Rutgers University, P.O. Box 909, Piscataway, NJ 08855.

V. Fernandez is currently a Ph.D. student in the Department of
Mechanical and Aerospace Engineering at Rutgers University. He is
working in the CFD area and is part of the Visualization Laboratory
at the CAIP Center. Address: c/o CAIP Center, Rutgers University,
P.O. Box 1390, Piscataway, NJ 08855.

R. Samtaney is currently a Ph.D. student in the Department of Me-
chanical and Aerospace Engineering at Rutgers University. He is
working in the compressible CFD area and is part of the Visualization
Laboratory at the CAIP Center. Address: c/o CAIP Center, Rutgers
University, P.O. Box 1390, Piscataway, NJ 08855.

Ming Gao is a Ph.D. student in the Department of Electrical and
Computer Engineering at Rutgers University. He is currently working
on visualization at the CAIP Center. Address: c/o CAIP Center,
Rutgers University, P.O. Box 1390, Piscataway, NJ 08855.
Smoothed Particle Rendering for Fluid
Visualization in Astrophysics
Mikio Nagasawa and Kunio Kuwahara

Astrophysical smoothed particle hydrodynamics is applied to the radiative transfer problem
for the direct visualization of 3-D scalar fields. The Smoothed Particle Rendering (SPR) inte-
grates the ray equation through the opaque medium and calculates the global contribution of
scattered light. The opacity represents the density scalar. The emissivity and the flux direction
are derived from the temperature field and its gradient. This method has some common features
with Voxel Volume Rendering and the Radiosity Method. The SPR is applicable both to grid data
and to particle configurations. The validity of SPR indicates the possibility of simulating
radiation hydrodynamics.
Key words: Computer animation; Radiosity; Smoothed particle method; Volume rendering.

1. INTRODUCTION
Early computer graphics used 2-D contour maps or 3-D surface models to display volume
data. For a raster graphics device, it is more suitable to define pixel images of color intensity
than to draw geometric primitives. The visualization of an isovalue surface in 3-D space
uses a set of polygons; this does not visualize the interior structure of the object. The binary
classification into surface position and hidden surface often leads to the loss of fine structures.
Volume rendering is the new technique to visualize the whole 3-D scalar field. The ideal visualization
is a simulation of optical physics, including the interaction of photons and matter. The Ray
Tracing method obeys the laws of optics in vacuum, while Volume Rendering solves the
radiative transfer in a continuum.
The volume data are sampled by cubic "voxels". Each voxel is assigned an opacity and a color
intensity. This is a good algorithm for displaying fuzzy surfaces. The apparent surface is judged
by the recognition of an observer who watches the output image. Fine structures are shown
without classification into geometric primitives. This means the direct conversion of volume
data to eye-recognition by human observers. The difficulty of Voxel Volume Rendering is
the jaggy feature of aliasing due to the voxel sampling. We present a model which overcomes
this difficulty by using particle configurations. It results in a realistic image of the
object. Especially, the rendering of gas or clouds is suitable and reliable using this method.
In hydrodynamical studies of astrophysics, the visualization of a 3-D continuum is indispensable.
When we analyze 3-D simulation data, various types of figures are drawn. Unfortunately, we
have to know the place of interest in the volume data before drawing the 2-D figures. The lack
of 3-D information often keeps us away from new discoveries. Therefore, a method to display
the whole volume data at once has been required for a long time. Our effort to develop the
Smoothed Particle Rendering has this motivation.
In §2, we present the idea of representing the volume data by the smoothed particle config-
uration. In §3, the method for simulating astrophysical hydrodynamics based on this idea is
briefly reviewed. The 3-D volume data are the output of Smoothed Particle Hydrodynamics.
The formulation of SPR is described as an approximation of radiative transfer theory in
§4. The numerical results of our simulations are given in §5. We show three typical
applications in detail. In §6, the applicability and advanced techniques to implement this
code are discussed.

2. SMOOTHED PARTICLE METHOD


We treat the three dimensional problem in the context of the Smoothed Particle Method (SPM)
introduced by Gingold and Monaghan (1977). The SPM treats the expectation value averaged
by the smoothing kernel W(\mathbf{r}, h) with a smoothing length h. The mass density \rho at the position
\mathbf{r} = (x, y, z) can be sampled statistically by the value at \mathbf{r}':

\langle \rho(\mathbf{r}) \rangle = \int W(\mathbf{r} - \mathbf{r}')\, \rho(\mathbf{r}')\, d^3 r'    (2.1)

The smoothing kernel should be normalized as

\int W(\mathbf{r}, h)\, d^3 r = 1    (2.2)

The integral is discretized by the Monte Carlo method using N particles of mass m_0:

\rho_N(\mathbf{r}) = \sum_{j=1}^{N} m_0\, W(\mathbf{r} - \mathbf{r}_j, h)    (2.3)

If h \to 0 as N \to \infty, then \rho_N(\mathbf{r}) \to \rho(\mathbf{r}). The validity of this SPM is represented by the
condition that the discrete kernel should converge to the Delta function:

\lim_{N \to \infty} W(\mathbf{r}, h) = \delta(\mathbf{r})    (2.4)

There is a theorem of Parzen (1962) which suggests a condition allowing selection of the smooth-
ing kernel. Suppose K(u) is a Borel function such that

\int_0^\infty K(u)\, du = 1,   \int_0^\infty |K(u)|\, du < \infty,   \lim_{u \to \infty} |u^2 K(u)| = 0    (2.5)

The kernel should have the form:

W(\mathbf{r}, h) = \frac{1}{h^3} K\!\left(\frac{r}{h}\right)    (2.6)

According to this theorem, we can use the Gaussian type kernel,

W(r, h) = \frac{1}{\pi \sqrt{\pi}\, h^3} \exp(-r^2 / h^2)    (2.7)

The smoothing length should tend to be small as the particle number N increases. In three
dimensions, the scaling is

h = \eta \left( \frac{m_0}{\rho(\mathbf{r})} \right)^{1/3} \propto N^{-1/3}    (2.8)

This is determined in accordance with the spatial variation of the local density. The parameter \eta is
a coefficient of order unity which determines the degree of overlapping between particles.
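For concreteness, a minimal sketch of the density estimate of Eqs. (2.3) and (2.7), assuming equal particle masses and, for simplicity, a single global smoothing length h (the function name is ours):

    import numpy as np

    def sph_density(r, particles, m0, h):
        # particles: N x 3 array of positions; r: 3-vector evaluation point.
        d2 = np.sum((particles - r) ** 2, axis=1)           # squared distances
        W = np.exp(-d2 / h**2) / (np.pi ** 1.5 * h ** 3)    # Gaussian kernel, Eq. (2.7)
        return m0 * W.sum()                                 # Monte Carlo sum, Eq. (2.3)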

3. HYDRODYNAMICS
BASIC EQUATIONS

Hydrodynamics is described by the Euler equations. The mass and momentum conservation are
written as the equation of continuity and the equation of motion, which describe the time evolution
of a fluid element:

\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0    (3.1)

\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v} \cdot \nabla)\mathbf{v} = -\frac{1}{\rho} \nabla P - \nabla \psi    (3.2)

where \mathbf{v}, P and \psi are the fluid velocity, gas pressure and gravitational potential, respectively.
The fluid element moves under the forces of pressure gradient and gravity. In a self-gravitating
system, the Poisson equation determines the gravitational potential \psi:

\nabla^2 \psi = 4 \pi G \rho    (3.3)

The equation of state for the ideal gas is given by

P = (\gamma - 1)\, \rho\, U    (3.4)

with the ratio of specific heats \gamma. The internal energy U changes according to the energy equation

\frac{dU}{dt} = -\frac{P}{\rho}\, \nabla \cdot \mathbf{v}    (3.5)

The above set of equations determines the evolution of hydrodynamical systems in astrophysics.

SMOOTHED PARTICLE HYDRODYNAMICS

The volume data which we want to visualize are not grid data, but the field expectation
values calculated from the particle distributions. Of course, it is possible to visualize an
arbitrary scalar field by the SPR, independently of hydrodynamics. However, for
understanding the smoothed particle algorithm and learning how the volume data of density and
temperature are determined by the simulation, we review the Smoothed Particle Hydrodynamics
(SPH) here. Neither color nor opacity is a free parameter in principle. They should be
the outcome of the hydrodynamics rather than of the visualization itself.
The SPH is a kind of Monte Carlo method, and the fluid system is treated as an ensemble of N
fluid elements. The motion of each element is described in Lagrange coordinates. The idea
is to replace the fluid elements by finite-size particles. This method has the advantage that it
can handle 3-D space with smaller computer memory compared with the Finite Difference
Method.

Fig. 1. Idea of Smoothed Particle Method

Each element is assumed to have the same mass m_0 and its own internal density distribution, for
which we chose the Gaussian-type smoothing kernel as indicated in Eq. (2.7). The local density
of the fluid is given by the superposition of the density distributions of all particles,

\rho(\mathbf{r}_i) = \frac{m_0}{\pi \sqrt{\pi}} \sum_{j=1}^{N} \frac{1}{h_j^3} \exp(-|\mathbf{r}_i - \mathbf{r}_j|^2 / h_j^2)    (3.6)

The increase of the particle number N is necessary to simulate local fine structure with a smaller
particle size h. The size of each particle in our SPH is determined by the local density as in
Eq. (2.8), so that we can represent a steep density gradient by adjusting the spatial resolution.
The pressure gradient and the gravity force act on the i-th fluid element. The components of
gravity can be calculated directly by integrating the mass distribution instead of solving the
Poisson equation,

(3.7)

since

(3.8)

with r_{ij} \equiv |\mathbf{r}_i - \mathbf{r}_j|.


In numerical calculations, the equation of motion should be solved with a certain artificial
viscosity \Pi_{ij}, controlled by the coefficient \varepsilon:

-\left\langle \frac{\nabla P}{\rho} \right\rangle
  = -m_0 \sum_{j=1}^{N} \left[ \frac{P_i + \Pi_{ij}}{\rho_i^2}\, \nabla W(r_{ij}, h_i)
  + \frac{P_j + \Pi_{ij}}{\rho_j^2}\, \nabla W(r_{ij}, h_j) \right]    (3.9)

where

\Pi_{ij} = \varepsilon\, \frac{(\mathbf{v}_{ij} \cdot \mathbf{r}_{ij})^2}{r_{ij}^2}\,
  \Theta\!\left[ -(\mathbf{v}_i - \mathbf{v}_j) \cdot (\mathbf{r}_i - \mathbf{r}_j) \right]    (3.10)
For the SPH formulation, we extend the energy equation developed for the Particle-and-Force
method (Daly et al. 1965). The kinetic energy per unit mass of the i-th element is K_i = \frac{1}{2} |\mathbf{v}_i|^2.
The energy change rate of each particle should be determined by the work that the other particles
do on it. The power results from the interparticle forces and the corresponding particle's velocity
as follows:

\frac{d}{dt}(K_i + U_i) = \sum_{j \neq i}^{N} \mathbf{f}_{ij} \cdot \frac{\mathbf{v}_i + \mathbf{v}_j}{2}    (3.11)

where \mathbf{f}_{ij} is the sum of the pressure gradient and the viscosity force exerted by the j-th particle
on the i-th particle.

Therefore, we get the energy equation which determines the change of the fluid temperature,

(3.12)

In order to guarantee exact energy-momentum conservation, the force summation should
be written in the anti-symmetric form \mathbf{f}_{ij} = -\mathbf{f}_{ji}. There are two ways of summation: one
is the passive "gathering" scheme (Wood 1981) and the other is the active "scattering" scheme
(Miyama et al. 1984). The passive scheme uses h_i instead of h_j in Eq. (3.6). Although there is
another symmetrization using h = (h_i + h_j)/2, we prefer the kernel-symmetrized SPH such as

\rho(\mathbf{r}_i) = \frac{m_0}{2} \sum_{j=1}^{N} \left[ W(r_{ij}, h_i) + W(r_{ij}, h_j) \right]    (3.13)

This has a better resolution than the size-averaged method with the same number of particles.

RADIATION HYDRODYNAMICS

Radiation hydrodynamics is an important subject in astrophysics. We have succeeded in
simulating the adiabatic evolution of an astrophysical fluid. However, the photon energy flux
plays an important role in the gas dynamics around a high energy object such as a neutron star
or a black hole. In the case that the radiation pressure is not negligible compared with the
ordinary force \mathbf{f}, the momentum equation is expressed as

\rho \frac{D\mathbf{v}}{Dt} + \nabla P + \nabla \cdot \mathsf{P}
  + \frac{1}{c^2} \frac{\partial \mathbf{F}}{\partial t}
  - \frac{\mathbf{v}}{c^2} \left( \frac{4\pi}{c} \frac{\partial J}{\partial t} + \nabla \cdot \mathbf{F} \right)
  = \mathbf{f} - \frac{\mathbf{v}}{c^2} \left( \frac{\partial P}{\partial t} + \mathbf{v} \cdot \mathbf{f} \right)    (3.14)

where c is the velocity of light. The energy equation should also include the radiation flux,

\rho \frac{De}{Dt} - \frac{P}{\rho} \frac{D\rho}{Dt}
  + \left[ \frac{4\pi}{c} \frac{\partial J}{\partial t} + \nabla \cdot \mathbf{F}
  - \mathbf{v} \cdot (\nabla \cdot \mathsf{P}) \right] = 0    (3.15)

The radiation intensity is not only a function of space and time, but also of the directional
vector \mathbf{n} and the spectrum frequency \nu. The mean flux J is the angular-averaged radiation
intensity. The flux vector represents the momentum of the radiation intensity,

\mathbf{F} \equiv \int_0^\infty d\nu \oint d\omega\, I(\mathbf{r}, \mathbf{n}, \nu, t)\, \mathbf{n}    (3.16)

and the tensor of radiation pressure is

\mathsf{P} \equiv c^{-1} \int_0^\infty d\nu \oint d\omega\, I(\mathbf{r}, \mathbf{n}, \nu, t)\, \mathbf{n}\mathbf{n}    (3.17)

The radiation model which gives the estimation of this flux will be the first step to solving the
radiation hydrodynamic problem. We formulate the transfer equation in the next section along
the lines of the Smoothed Particle Method.

4. SMOOTHED PARTICLE RENDERING


RADIATIVE TRANSFER

The propagation of radiation is described by the radiative transfer equation along the ray
trajectory ds:

\frac{dI_\nu}{ds} = -\kappa_\nu \rho\, I_\nu + j_\nu \rho    (4.1)

where \kappa_\nu is the absorption coefficient and j_\nu is the emissivity. In order to calculate the actual
coefficients, significant use of atomic physics is required.
We develop the volume rendering method in smoothed particle configurations. At first, we
sample the physical quantities in Eq. (4.1) at random particle positions, and the transparency
formula is therefore discretized as

(4.2)

We note that z_k is not a grid point but the depth coordinate of a distributed particle.
This reduces to the asymptotic expression

(4.3)

The overlapping procedure is performed in the sequence of particle distance. Therefore, the
observer coordinates \mathbf{x} are defined for the line-of-sight vector to be \mathbf{V} = (0, 0, 1).
The image will appear on the 2-D pixel space X = (x, y, -\infty) by this definition. The pixel color
intensity is an overlap integral along the line of sight,

I_\nu(x, y) = \sum_{k=0}^{N} \left[ S_\nu(x, y, z_k)\, \alpha_\nu(x, y, z_k)
  \prod_{m=k+1}^{N} \left( 1 - \alpha_\nu(x, y, z_m) \right) \right]    (4.4)

where S_\nu(x, y, z_0) = S_{bkg,\nu} and \alpha_\nu(x, y, z_0) = 1. The particle positions \{\mathbf{x}_k;\, k = 1, \ldots, N\} must be
sorted in advance so that z_{k+1} \le z_k.
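A minimal sketch of this overlap integral for one pixel, assuming the per-particle emissivities S and opacities alpha have already been sorted back to front (index 0 being the background term, as above; the function name is ours):

    def composite_pixel(S, alpha):
        # Back-to-front compositing equivalent to the sum-product of Eq. (4.4).
        I = 0.0
        for s, a in zip(S, alpha):      # k = 0 (background) ... N (nearest particle)
            I = I * (1.0 - a) + s * a   # attenuate what lies behind, add this layer
        return I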


Fig.2. Smoothed Particle Rendering (top) and Voxel Volume Rendering (bot).

This overlapping integral can be done once we know the particle emissivity S_\nu(z_j) and the particle
opacity \alpha_\nu(z_j). The photon frequency is simplified into three broad color bands, \nu = R, G, B, for
the restricted purpose of visualization.
The optical depth is calculated using a subclass of integrals. In the smoothed particle
approximation, the opacity is integrated for each individual particle:

\alpha_\nu(x, y, z_j) = \exp(-\tau_\nu)
  = \exp\left( -\int_{z_j - h}^{z_j + h} \kappa_\nu\, \rho_j(z)\, dz \right)    (4.5)

where h is the particle cut-off size, beyond which the contribution of the kernel becomes
negligible.
The realistic absorption coefficient is a certain function of the physical variables, \kappa_\nu = \kappa_\nu(\rho, T, X, \ldots).
The simulation of the microphysics that determines \kappa_\nu is beyond our interest in visualizing the fluid
motion. For simplicity, we use the electron-scattering type opacity \kappa_\nu = const as a parameter in
SPR.
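With this constant opacity, the integral in Eq. (4.5) can be evaluated analytically for one Gaussian particle; the following sketch uses the closed-form column density of the kernel (2.7) at impact parameter b (this closed form is our derivation for illustration, not a formula given in the paper):

    import numpy as np

    def particle_transparency(kappa, m0, h, b):
        # Optical depth of one Gaussian particle along a ray at impact parameter b:
        # tau = kappa * m0 * exp(-b^2/h^2) / (pi h^2), from integrating Eq. (2.7) in z.
        tau = kappa * m0 * np.exp(-b**2 / h**2) / (np.pi * h**2)
        return np.exp(-tau)             # the factor defined in Eq. (4.5)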
There are many other artificial opacities. One of them is the resonance opacity at the isovalue
level f_\nu, introduced by Levoy (1988):

\alpha_\nu(z_j) =
  \begin{cases}
    1, & \text{if } |\nabla f(z_j)| = 0 \text{ and } f(z_j) = f_\nu \\
    1 - \frac{1}{r_\nu} \left| \frac{f_\nu - f(z_j)}{|\nabla f|} \right|,
      & \text{if } |\nabla f(z_j)| > 0 \text{ and } f(z_j) - r_\nu |\nabla f| \le f_\nu \le f(z_j) + r_\nu |\nabla f| \\
    0, & \text{otherwise}
  \end{cases}    (4.6)

The extension to multi-layer opacity is possible in a similar way as in Voxel Volume Rendering,

\alpha_{total}(z_j) = 1 - \prod_{n=1}^{N} \left( 1 - \alpha_n(z_j) \right)    (4.7)

SCATTERING PHASE FUNCTION

The emissivity could be the scattered light of an incident external light, as assumed in the shading
models of computer graphics. The scattered light is described by the phase function p(\theta, \varphi; \theta', \varphi'),
which specifies the angular dependence of the scattered amplitude:

j_\nu(s) = \frac{\kappa_\nu}{4\pi} \int_0^\pi \int_0^{2\pi}
  p(\theta, \varphi; \theta', \varphi')\, I_\nu(\theta', \varphi')\, d\theta'\, d\varphi'    (4.8)

In the plane geometry, there are analytic expressions of the scattering integral (Chandrasekhar
1960). For microscopically isotropic scattering, the intensity of light incident from (\mu_0 =
\cos\theta_0, \varphi_0) and scattered to (\mu, \varphi) is

I(z = 0, \mu, \varphi; \mu_0, \varphi_0) = \frac{1}{4} \frac{\varpi_0}{\mu + \mu_0} H(\mu) H(\mu_0)\, \mu_0 F    (4.9)

If the medium obeys the Rayleigh scattering law p(\Theta) = \frac{3}{4}(1 + \cos^2\Theta), or a more general phase
function p(\cos\Theta) = \sum_{l=0}^{L} \varpi_l P_l(\cos\Theta), one can calculate the scattered light by use of the H-function.
The H-function is defined so as to satisfy

H(\mu) = 1 + \mu H(\mu) \int_0^1 \frac{\psi(\mu')\, H(\mu')}{\mu + \mu'}\, d\mu'    (4.10)

where \psi(\mu) is an even polynomial such that \int_0^1 \psi(\mu)\, d\mu \le 1/2.
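As a numerical illustration (our sketch, not part of the paper), Eq. (4.10) can be solved by fixed-point iteration on a quadrature grid; here for isotropic scattering, where the characteristic function is assumed to be psi(mu) = albedo/2:

    import numpy as np

    def h_function(albedo=0.9, n=64, iters=200):
        mu = (np.arange(n) + 0.5) / n            # midpoint quadrature nodes on (0, 1)
        psi = 0.5 * albedo * np.ones(n)          # isotropic-scattering characteristic
        H = np.ones(n)
        for _ in range(iters):
            integral = ((psi * H)[None, :] /
                        (mu[:, None] + mu[None, :])).sum(axis=1) / n
            H = 1.0 / (1.0 - mu * integral)      # Eq. (4.10) rearranged for H(mu)
        return mu, H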



The same procedure is applicable in the spherical geometry of interest. We need the solution
for one spherical particle. The normal mode expansion of the radiation equation

\mu_i \frac{dI_i}{dr} + \frac{1 - \mu_i^2}{r} \left( \frac{\partial I}{\partial \mu} \right)_{\mu = \mu_i}
  = -\kappa \rho\, I_i + \frac{\kappa \rho}{2} \sum_j a_j I_j    (4.11)

gives us the solution. The first mode corresponds to ambient light, the second mode is diffuse light,
and the higher contributions can be interpreted as a highlight term. For example, the ambient
mean intensity is given as

J = \frac{1}{2} \sum_j a_j I_j = \frac{3}{4} F_0 \int_0^r \frac{dr'}{r'^2}    (4.12)

In order to define the directional cosine, there is a way of choosing the normal vector. The normal
vector is determined by the gradient field of the density scalar:

\mathbf{n} = \frac{\nabla \rho}{|\nabla \rho|}    (4.13)

Therefore, the rendering image contains the information of the density gradient field.
DIFFUSION APPROXIMATION

The other possible source of emission is self-emissivity. The idea of Local Thermal Equilibrium
(LTE) assumes that the emissivity is the blackbody radiation of temperature T:

S_\nu \equiv B_\nu(T) = \frac{2 h \nu^3}{c^2} \frac{1}{e^{h\nu/kT} - 1}    (4.14)

where h is the Planck constant and k is the Boltzmann constant. In opaque regions where the
optical depth defined in Eq. (4.5) is large, \tau_\nu \gg 1, LTE becomes a good approximation.
In a semi-infinite atmosphere without convection, the source function can be expanded in a
series of the opacity as

S_\nu(t_\nu) = \sum_{n=0}^{\infty} \left[ \frac{d^n B_\nu}{d\tau_\nu^n} \right] (t_\nu - \tau_\nu)^n / n!    (4.15)

The radiation intensity at a certain optical depth is calculated with its angular dependence,

I_\nu(\tau_\nu, \mu) = \sum_{n=0}^{\infty} \mu^n \frac{d^n B_\nu}{d\tau_\nu^n}    (4.16)

The Eddington flux is the first moment

H_\nu(\tau_\nu) \equiv \frac{1}{2} \int_{-1}^{1} I_\nu(\tau_\nu, \mu)\, \mu\, d\mu
  = \sum_{n=0}^{\infty} (2n + 3)^{-1} \frac{d^{2n+1} B_\nu}{d\tau_\nu^{2n+1}}    (4.17)

Further successive terms in the expansion reduce as O(1/\tau_\nu^2). As a first order approximation,
we can compile the angular dependence of the radiation as

I_\nu(\tau_\nu, \mu) \approx B_\nu(\tau_\nu) + \mu \frac{dB_\nu}{d\tau_\nu}    (4.18)

Therefore, the flux normal vector is parallel to the temperature gradient (Mihalas 1978):

H_\nu(\tau_\nu) = \frac{1}{3} \frac{dB_\nu}{d\tau_\nu}
  = -\frac{1}{3} \left( \frac{1}{\kappa_\nu \rho} \frac{\partial B_\nu}{\partial T} \right) \frac{dT}{dz}    (4.19)

This diffusion approximation is plausible as an approximation for SPR. The main radiation
flux of self-emitting gas is chosen to be parallel to the gradient vector of the temperature field.
Thus we can make the images with only the density and temperature scalar fields. A casting
light from outside of the volume data is not necessary in the SPR algorithm.

RADIOSITY METHOD

The preceding formalism of SPR does not consider the contribution of the intermediate opacity
region explicitly. In optically thick regions, the temperature and its gradient determine the main
radiation flux, while in optically thin regions, the scattered light represents the density structure
and its gradient direction. The radiosity equation includes the back reaction of the scattered light
on the global emissivity (Kajiya 1986; Immel et al. 1986). In the discretized form of N patches, it
is written as

B_i = E_i + \varrho_i \sum_{j=1}^{N} B_j F_{ij}    (4.20)

where the intensity B_i is named the Radiosity and the reflection coefficient is \varrho_i. Geometrical effects
are represented by the form factor

F_{ij} = \frac{1}{A_i} \int_{A_i} \int_{A_j}
  \frac{\cos\phi_i \cos\phi_j}{\pi r_{ij}^2}\, dA_i\, dA_j    (4.21)

where \phi is the angle between the normal vector of the patch and the relative position \mathbf{r}_j - \mathbf{r}_i.
If we identify the N patches in the Radiosity Method with the N particles in SPR, the above
expression can be translated. The reflection factor should be a function of the particle's optical
depth \alpha_\nu(\mathbf{x}_i).
The inclusion of the Radiosity Method in SPR means that the ray equation has to be integrated
along the ray of every particle pair, not only along the observer's line of sight. This requires a
matrix inversion which uses O(N^2) computer memory. To avoid this difficulty of the Radiosity
Method, an asymptotic algorithm was presented by Cohen et al. (1988). It iteratively calculates
the contribution from the brightest patches until the remission becomes negligible. Suppose a
certain i-th particle with B_i \gg B_j; then the radiosity of the j-th particle is calculated explicitly:

B_j = E_j + \varrho_j B_i F_{ji} = E_j + \varrho_j B_i \frac{A_i}{A_j} F_{ij}    (4.22)

It uses O(N) memory over N steps. The optimization is done by fast sorting to select the
emitting patches and the brighter patches first.
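The shooting scheme can be sketched as follows (the array names and the row-wise form-factor callback are our assumptions; computing F one row at a time is what keeps the memory at O(N)):

    import numpy as np

    def progressive_radiosity(E, refl, A, form_factor_row, steps=100):
        # E: emission, refl: reflection factors, A: patch areas (length-N arrays).
        B = E.copy()                  # current radiosity estimate
        unshot = E.copy()             # radiosity not yet distributed
        for _ in range(steps):
            i = np.argmax(unshot * A)              # brightest unshot patch
            F = form_factor_row(i)                 # F_ij for all j, Eq. (4.21)
            dB = refl * unshot[i] * F * A[i] / A   # transfer of Eq. (4.22)
            dB[i] = 0.0
            B += dB
            unshot += dB
            unshot[i] = 0.0                        # patch i has now been shot
        return B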

5. ASTROPHYSICAL APPLICATIONS
We present the hydrodynamical simulation and its visualization in this section. The supernova
explosion is the study of spherical shock waves. The collision of interstellar gas is the
investigation of the fission and coagulation processes. Tidal disruption is the initiation of
accretion flows around a compact star. These simulations are presented as video movies.

SUPERNOVA EXPLOSION

Three dimensional simulations of a supernova explosion were performed (Nagasawa et al. 1988).
The matter dominated by radiation pressure is treated as an ideal gas with \gamma = 4/3. The
spherical shock condition decides the topology of the shocked gas. For a steep gradient envelope
such as |d \log \rho / d \log r| \ge (\gamma - 1)/(\gamma + 1), the "bubble" solution shows the shell structure and
has a singularity at the inside discontinuity. They are unstable against convection and result in
fragmentation of the ejected gas. The ejected gas appears very clumpy and shows a porous structure.
The clumpiness becomes manifest by choosing a suitable opacity.

Fig.3. The volume rendering of 3-D supernova explosion with resonance opacity and the
Phong's shading. The equidensity surface shows the void structure although the general
motion is a spherical expansion.

Fig.4. Same as Fig. 3, but a fine resolution image with N = 10^6 particles.

When the shock wave arrives at the photosphere, the optical outburst starts. After that, the
ejected gas and the radiation field expand so rapidly that the interior temperature decreases
adiabatically. Since the electron scattering is dominant, the absorption coefficient is constant
and the opacity sharply decreases outward. The convection between shock front and contact
discontinuity causes effective turbulent mixing in the shell. The SPR shows clearly the mixing
and fragmentation of supernova ejected gas.
The shape of density fluctuation is complicated. The temperature radiation model shows the
filamentary ejection of hot gas.
CLOUD-CLOUD COLLISIONS

As an example of the interaction between two gravitationally bound states, there is a simulation
of interstellar cloud-cloud collisions (Nagasawa and Miyama 1987). In supersonic head-on
collisions between two stable clouds, the shock compression increases the density. The self-
gravity can trigger the instability and forms a protostar.
In the case of off-center collisions, the outcomes depend on the angular momentum. If the
angular momentum is large, the system starts fission to form the binary system after the
collisional merging. If the angular momentum is small, the shock compression triggers the
gravitational collapse and the rapidly rotating core forms near the collisional center. For the
intermediate case, they make a merged disk with a bar-spiral structure. The shock region
becomes a rigidly rotating protostar, while the outer region extends as halos. These subtle halo
structures are represented in the SPR pictures. In this way, the various cloud morphologies are
explained by cloud-cloud interactions.
TIDAL DISRUPTION AND ACCRETION DISK

We also simulate the close encounter of an ordinary star passing by a compact object
(Nagasawa et al. 1991). The tidal torque is strong enough to deform or disrupt the bulky
star. In an encounter on a parabolic orbit, the disrupted gas flows onto the compact star, ejecting
the angular momentum by elongating the spiral structure outward.
Tidal disruptions initiate the accretion flows around the compact star like a neutron star or
a black hole. The overflowed gas makes the accretion disk around the compact star. The
non-axisymmetric instability of hot torus helps the angular momentum transport. It continues
the accretion flow and triggers the astrophysical jet.

6. DISCUSSION
RESOLUTION AND DYNAMIC RANGE

Smoothed Particle Hydrodynamics with N = 10^6 particles is available now. However,
Smoothed Particle Rendering of up to N = 10^5 particles can be done with reasonable CPU cost.
Using the supercomputer Hitachi S820/80 (about 3 GFlops) at the Institute of Computational Fluid
Dynamics, it takes a few seconds of CPU time to overlay N = 10^4 particle volume data onto
640x480 pixels.
The resolution limit of the smoothed particle method is almost equal to the particle size.
However, a simulation with only N = 400 particles can sufficiently reproduce the 3-D spherical
density profile within a few percent error. The reason is that the SPH involves averaging in many
directions and the particle size is adjusted to the given density structure. If the particles fill the
space according to the local density, the SPH represents the continuum as a Lagrange scheme in
the Finite Element Method. When we use the staggered particles of Eq. (2.8), we can calculate
the diffuse envelope with much improved accuracy.

Fig.5. The Smoothed Particle Rendering of supernova. The opacity is calculated with density
data and the emissivity is the function of temperature. This image is the output of these
two 3-D scalar field .

Fig.6. The simulation of star formation induced by a cloud-cloud collision. Four figures show
the time evolution. The central collapsing region becomes a protostar. The SPR can
display the condensation process in the diffuse gas .

Fig.7. Various cloud morphologies due to dynamical cloud-cloud collisions. Complex structures
in 3-D space are easy to recognize by the SPR.

On the other hand, the SPM has the same feature as the Monte Carlo method when the
smoothing length is constant for every particle. Suppose a system with mass M and mean
density \rho_0; the resolution is then limited to scale as

h \approx \left( \frac{M}{\rho_0 N} \right)^{1/3} \sim \frac{1}{2} \left( \frac{M}{\rho N} \right)^{1/3}    (6.1)

The dynamic range of density contrast is also limited by the particle number. The diffuse limit
of particle sampling is the case where the mean interval is near the cut-off, \langle r_{ij} \rangle \sim 2h. The
dense limit is the perfect overlap of all particles. Thus, the dynamic range of density contrast is

1/8 \lesssim \rho/\rho_0 \lesssim N    (6.2)

Smoothed Particle Rendering is a natural extension of SPH. It does not need an intermediate
dataset such as 3-D mesh data. Therefore, it reflects the resolution of SPH directly, as the
quality of the output SPR images indicates. A fair judgement of a hydrodynamical simulation is
possible by SPR: if better simulations are performed, finer images will appear.
However, the intensity range of one color pixel on a graphic device is usually less than 1 byte,
which is not sufficient to monitor the output of realistic rendering of astrophysical problems.
The temperature and emissivity range of the simulation is often out of this dynamic range.

Fig.8. Tidal disruption in parabolic orbit. The fluid flow motion is represented by the SPR
with constant opacity.

Fig.9. Tidal overflow in the system of close binary stars. The flow makes a compact disk which
is well resolved by the SPH simulation.

FAST SORTING ALGORITHM

For accelerating the gradient field calculation in SPH and SPR, we need fast sorting
methods for the neighborhood search:

\mathbf{f}_i^{(short)} = \sum_{r_{ij} < h}^{N} \mathbf{f}_{ij}    (6.3)

It is obvious that the list vector method is preferred to the masking method in calculations with
many particles.
The chaining block algorithm was developed for this purpose. Every particle is assigned the
label of its corresponding block in Cartesian coordinates. The neighbor blocks are searched
first. Then, the member particles of the neighbor blocks, with block size b \ge h_{max}, are selected as
candidates. There are {1, ..., 27} candidate blocks for one particle. The hierarchy block is
another strategy to accelerate the sorting. By choosing the optimal block size as b = h_{max}/n_b,
there are {1, ..., (2 n_b + 1)^3 - n_{rim}} candidates. This sorting is effective if h_{max}/h_{min} \gg 1.
We also found a recipe to run these codes effectively on vector pipeline supercomputers: the
strategy is to pick out the solitary blocks which contain a single particle, and to sort only the
particles which change their position label during the time integration.
Both the gravity calculation in SPH and the Radiosity Method in SPR require O(N^2) floating
point operations if we use direct summation. A grouping method which bundles the small
contributions from distant particles saves CPU time. Of course, the faster algorithm
increases the loss of data accuracy. The block sorting helps us achieve a performance
of O(N^{1.2-1.5}).
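A plain sketch of the chaining-block search (data layout and names are ours): particles are binned into Cartesian blocks of size b >= h_max, so all neighbours with r_ij < h of a particle lie in at most 27 blocks:

    import numpy as np
    from collections import defaultdict

    def neighbour_pairs(pos, h, b):
        blocks = defaultdict(list)
        for k, p in enumerate(pos):                       # label each particle
            blocks[tuple((p // b).astype(int))].append(k)
        pairs = []
        for k, p in enumerate(pos):
            cx, cy, cz = (p // b).astype(int)
            for dx in (-1, 0, 1):                         # scan the 27 candidate blocks
                for dy in (-1, 0, 1):
                    for dz in (-1, 0, 1):
                        for j in blocks.get((cx + dx, cy + dy, cz + dz), ()):
                            if j > k and np.sum((pos[j] - p) ** 2) < h * h:
                                pairs.append((k, j))
        return pairs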

FLUID MOTION IN GENERAL COORDINATES

We usually use Cartesian coordinates (x, y, z) in the hydrodynamics and renderings, which are
free from any coordinate singularity. In Voxel Volume Rendering, the interpolation scheme
causes inevitable jaggy noise. Such noise makes it difficult to distinguish the hydrodynamical
fluctuations in the output images. Both hydrodynamical perturbations and rendering aliasing
depend on the coordinate system, be it spherical, cylindrical or Cartesian. The SPM does
not have any special coordinate system; it is free to make the configuration in 3-D space. As a
great advantage, the SPR is free from aliasing because of its statistical randomness and the
absence of definite geometrical voxels.
In addition, the motion of the Lagrange particles, which are not test particles, shows the actual
multi-fluid evolution. We can also simulate a mixture of N-species gas.

7. CONCLUSION
A new radiation model named Smoothed Particle Rendering is presented for the visualization
of 3-D scalar fields, with the possibility to simulate radiation hydrodynamics. This is an
application of the Smoothed Particle Method of astrophysics to the radiative transfer problem.
The formalism of SPR aims to integrate the ray equation through an opaque medium and to
calculate the global contribution of scattered light. The SPR is applicable both to grid scalar
data and to particle configurations in 3-D space. The volume data are sampled and represented
by the "smoothed particles".

The advantage of SPR is the reduction of selective parameters. Each particle is assigned a calculated
opacity and color intensity. The opacity is derived from the physical absorption coefficient
and represents the density scalar data. The color intensity and the flux direction represent the
temperature scalar field and its gradient. The emissivity is determined by
the diffusion approximation of Local Thermal Equilibrium. There is no necessity for external
casting lights.
The SPR is good for displaying the fuzzy surfaces of gas. We do not need any digital determination
of the surface; the apparent surface is recognized by the observers who watch the output images.
The display of faint objects such as astrophysical gas has been tested. No jaggy feature is detected
in the SPR because of its smooth overlapping integral. The volume data of Smoothed Particle
Hydrodynamics are found to be congenial to SPR visualization.

ACKNOWLEDGEMENTS
The numerical computations were performed on a Hitachi S820/80 supercomputer at the Institute of
Computational Fluid Dynamics, Tokyo. The associated video tape was recorded on the Hitachi
real-time animation system (KGRAF /MOVIE). This work was partially supported by NHK
scientific projects.

REFERENCES
Chandrasekhar S (1960) Radiative Transfer : New York, Dover
Cohen MF, Cohen SE, Wallace JR, Greenberg DP (1988) A Progressive Refinement Approach
to Fast Radiosity Image Generation. Computer Graphics 22 (4) : 75-84
Daly BJ, Harlow FH, Welch JE (1965) Generalized Particle In Cell Method. Los Alamos
Scientific Laboratory Report (3144) : 1-31
Gingold RA, Monaghan JJ (1977) Smoothed particle hydrodynamics, theory and application
to non-spherical stars. Mon. Not. R. astron. Soc. 181 : 375-389
Immel DS, Cohen MF, Greenberg DP (1986) A Radiosity Method for Non-Diffuse Environments.
Computer Graphics 20 (4) : 133-142
Kajiya JT (1986) The Rendering Equation. Computer Graphics 20 (4) : 143-149
Levoy M (1988) Display of Surfaces from Volume Data. IEEE Computer Graphics and
Applications 8 (3) : 29-37
Mihalas D (1978) Stellar Atmospheres : 2nd ed. New York, Freeman
Miyama SM, Hayashi C, Narita S (1984) Criteria for collapse and fragmentation of rotating
isothermal cloud. Astrophys. J. 279 (2) : 621-632
Nagasawa M, Miyama SM (1987) Three-Dimensional Numerical Simulation of Interstellar
Cloud-Cloud Collisions and Triggered Star Formation. Prog. Theor. Phys. 78 (6) : 1250-1272
Nagasawa M, Matsuda T, Kuwahara K (1991) Roche overflow and formation of astrophysical
jet. : in preparation
Nagasawa M, Nakamura T, Miyama SM (1988) Three-Dimensional Hydrodynamical Simulation
of Type II Supernova. Publ. Astron. Soc. Japan 40 (6) : 691-708
Parzen E (1962) On estimation of a probability density function and mode. Ann. Math. Stat.
33 (3) : 1065-1076
Wood D (1981) Collapse and fragmentation of isothermal gas clouds. Mon. Not. R. astron. Soc.
194 : 201-218

BIOGRAPHIES
Mikio Nagasawa is a researcher at the Institute of Computational
Fluid Dynamics, Tokyo. His research interests include the hydrodynam-
ics of self-gravitating systems, radiation transfer and pattern recognition
of volume data. Nagasawa received a Bachelor of Science in 1982, a
Master of Science in Physics in 1984 and a Ph.D. in Astrophysics in 1987
from Kyoto University. He is a member of the Astronomical Society of
Japan.
Address: Institute of Computational Fluid Dynamics, 1-22-3,
Haramachi, Meguro, Tokyo 152 Japan: nagasawa@icfd.co.jp

Kunio Kuwahara is a professor at the Institute of Space and
Astronautical Science. Kuwahara graduated from the University of
Tokyo in 1966 and received a Ph.D. in physics in 1975. He was an NRC
Research Professor at NASA Ames Research Center in 1980-1981. His
wife organizes the Institute of Computational Fluid Dynamics, which is
the most outstanding supercomputing center in Japan. He supervises
young researchers in fluid mechanics and visualization at the Institute of
Computational Fluid Dynamics.
Address: Institute of Space and Astronautical Science, 3-1-1,
Yoshinodai, Sagamihara, Kanagawa 229 Japan: kuwahara@icfd.co.jp
Chapter 11
Applications
Pan-Focused Stereoscopic Display Using a Series
of Optical Microscope Images
Kazufumi Kaneda, Shohei Ishida, and Eihachiro Nakamae

ABSTRACT

Even though an electron microscope is useful for obtaining images with deep depth of focus and
large magnification, it is useless for observation of wet samples because of evaporation of water
in a vacuum. A scanning laser microscope can be used for wet samples, but observable samples
are relatively limited compared to the optical microscope. An optical microscope is most useful
for observing various kinds of wet samples. However, precise observation of an extended region
with an optical microscope is quite difficult when particles dispersed in depth are observed at large
magnification, because the larger the magnification, the smaller the focused area.
To overcome this problem, the paper proposes a method of displaying a pan-focused stereoscopic
image: by using image processing techniques, in-focus areas are extracted from a series of images
focused at slightly different depths, and a stereoscopic image is composed from these in-focus
areas. The proposed method is applied to the observation of three dimensional distributions of
particles in a slurry, a common problem in the field of civil engineering, and the usefulness of the
proposed method is demonstrated.
Keywords and Phrases: Image Processing, Image Synthesis, Optical Microscope Image, Pan-
Focused Image, Multiple Images

1 INTRODUCTION

Electron microscopic images are usually in-focus over an extended region because of the intrinsically
wide range of depth of focus of the electron microscopes. However, in general electron microscopes
have the serious defect of being unable to observe wet samples, because water evaporates from
the samples in a vacuum. Therefore, cohesion of particles in water and interactions of particles
and water over time cannot be observed. A scanning laser microscope can be used to observe
wet samples, but the types of sample which can be observed are limited compared with optical
microscopes. Optical microscopes have no such limitation, but in large magnification the in-focus
range becomes very narrow (at a magnification of 400, the depth of field is approximately 1 micro-
meter (see appendix). For this reason, only a small cross section of the sample is in-focus; when
particles are dispersed in depth, it is very difficult to closely observe an extended region of sample
at high magnification using an optical microscope.
However, exactly such an observation is often required in various fields such as engineering, biology,
medicine, etc. In this paper we address the problem of observing the distribution and cohesion of
particles in a slurry medium using an optical microscope; this issue is very important for the study
of materials in civil engineering to investigate the slurry system such as combination of cement
powder and water [Tazawa 89].


Several methods for increasing depth of field have been proposed; they can be classified as optical
methods and image processing methods.
Several optical methods using annular apertures [Welford 60] and Fresnel zone pupil masks
[Indebetouw 84] have been proposed. However these methods are not useful for observation with
optical microscopes, because the depth of field is increased only ten times at the highest estimate.
Image processing methods which extract in-focus areas from a series of images taken at different
focal positions and compose these areas into an image have been developed.
Pieper, et al. [Pieper 83] presented three algorithms for locating in-focus areas. The most effective
of these algorithms is based on the fact that in-focus areas of photographs contain high spatial
frequencies. However, the method suffers from the effect that a differential operator for measuring
spatial frequencies increases noise. Resultant images are not clear, and three dimensional shapes
cannot be observed because images obtained contain no depth information.
Sugimoto, et al. [Sugimoto 85] calculated the degree of focus using local variation of image intensity
and composed images based on proportion of the degree of focus into pseudo-stereoscopic images.
This pseudo-stereoscopic display method composes each image by shifting it laterally in proportion
to the calculated depth of the image. Therefore, the obtained images differ from the real views,
and the exact height of the samples cannot be observed.
Darrell, et al. [Darrell 88] used Laplacian and Gaussian pyramids for calculating depth of three
dimensional objects from multiple images with different focus position. However, in their method
zoom distortion must be compensated, because multiple images are taken by changing the focal
distance of the lens system. Furthermore, their purpose is different from ours, because their purpose
is only to calculate depth from the camera to each object in a scene.
Shio [Shio 89] extracted in-focus areas using normalized standard deviation of gray level images
and composited a pan-focused image. However, his method is not useful for half-tone images
such as microscope images, because the method was developed to deal with binary images such as
character images.
Okabayashi, et al. [Okabayashi 89] proposed an image composition method with little susceptibility
to noise, because it transforms images into Fourier space and composes the image in a spatial
frequency plane. However, the composite images have no three dimensional information, and the
Fourier transformation makes the method computationally expensive.
By combining image processing and computer graphics, in this paper we present a method for
displaying pan-focused stereoscopic images from multiple optical microscopic images at large mag-
nification; a succession of optical microscope images with focus position at slightly differing depths
is obtained by moving the stage of the microscope, and the in-focus areas are extracted from these
images. Finally, a pan-focused stereoscopic image is composited from these in-focus areas. This
method makes the most of the blur in de-focused areas, because the larger the magnification is, the shal-
lower the depth of focus is. The proposed method makes it possible to closely observe wet samples
at large magnification, and promises to be a welcome new tool in many fields of research. That is,
this method is useful not only for civil engineering but also biology, medicine, etc.
Our method consists of the following four steps:

1. Taking a succession of optical microscope images with different focal position in depth.
2. Extracting in-focus areas from these images.

3. Reconstructing three dimensional objects from these in-focus areas using a voxel structure.

4. Displaying the reconstructed objects stereoscopically.



[Figure 1 diagram: the optical microscope image input unit (optical microscope,
video camera, A/D converter, monitor TV) is connected via Ethernet to the image
processing and graphic display unit (micro-computer, parallel mini-computer,
graphics workstation, hard disk).]
Figure 1: System configuration.

In the following section, we describe the system configuration, extraction of in-focus areas, and
reconstruction and display techniques. Finally, the usefulness of the method is demonstrated by
application to observation of particles such as cement paste and fly ash.

2 SYSTEM CONFIGURATION

The configuration of the proposed system is shown in Fig. 1. The system consists of three units:
an image input unit, and image processing and display units with a hard disk.
Images taken by a color video camera attached to an optical microscope are stored in a frame
buffer after conversion into digital images. These digital images are transferred to the hard disk of
a micro-computer in order to transmit the images to the processing unit. In order to continuously
change the focus positions at a constant pitch, the microscope sample stage is moved by a stepping
motor controlled by the micro-computer.
Image processing is performed on a mini-computer which has multiple processors, because parallel
processing can be introduced for each image with a different focus position and for each pixel in an
image. The final images are displayed on the screen of a powerful graphics workstation with an
excellent user interface. In order to observe stereoscopic images interactively, a stereoscopic display
system and stereo-glasses are employed. For image transmission, the computers are connected via a
high speed network.

Figure 2: Outline of extracting in-focus areas (Remove Noise from Original Images, using a median
filter and an arithmetic mean; Generate Map Images; Smooth Map Images; Extract In-focus Areas).

Figure 3: Example of the relationship between noise and the number of images used for the
arithmetic mean (noise [%], with and without noise removal, versus number of images).

3 EXTRACTING IN-FOCUS AREAS

Optical microscope images exhibit the following characteristics [Rosenfeld 76]: in general, in-focus
areas contain high spatial frequencies, because uneven surfaces, patterns, and contours of objects
are clearly resolved, while out-of-focus areas are blurred, and the more out-of-focus the image is,
the lower its spatial frequency content (see Fig. 9).
We use this difference in spatial frequencies to extract in-focus areas; an outline of the process is
shown in Fig. 2. First, in order to suppress noise affecting the extraction of in-focus areas, noise is
removed from the original images taken from the optical microscope. Then, map images, which express
the degree of focus at each pixel of the image, are generated. Further, in order to remove noise
in these map images, each map image pixel is smoothed in the vertical (z-axis) direction, based
on the pixel values in the preceding and succeeding map images. Finally, in-focus areas are extracted
based on the smoothed map images. Below, we discuss the methods for removing noise, generating
map images, and extracting in-focus areas.

3.1 Removing Noise

Random noise in the original images can cause erroneous detection of in-focus areas if detection
is based only on the magnitude of high spatial frequencies. In order to remove random noise, a
method combining a median filter and an arithmetic mean is employed. Several images with the
same focus position are taken by fixing the stage of the microscope. After filtering these images
with a median filter, the intensities of each pixel at the same x and y coordinates in the images
are averaged. This process yields a high quality image.
Fig. 3 shows the result of noise removal as a function of the number of images used for the arithmetic
mean, where a median filter with a 3 x 3 window size was used. The larger the number of images
used, the smaller the noise becomes. When three or four images are used for the arithmetic mean,
approximately 70% of the noise can be removed.
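To make the step concrete, the following minimal sketch (in Python with numpy and scipy; the
function name and array conventions are ours, not part of the original system) combines the median
filter and the arithmetic mean as described above:

    import numpy as np
    from scipy.ndimage import median_filter

    def denoise(frames, window=3):
        # frames: several grayscale images taken at one fixed focus position.
        # Median-filter each frame (3 x 3 window, as in the paper's experiment),
        # then average the filtered frames pixel by pixel.
        filtered = [median_filter(f.astype(np.float64), size=window) for f in frames]
        return np.mean(filtered, axis=0)

With three or four frames, this combination should remove most of the random noise, consistent
with the roughly 70% reduction reported above.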

Figure 4: Procedure of generating map images. (a) Intensity distribution of an original image after
removing random noise; (b) smoothing; (c) ABS(image (a) - image (b)); (d) smoothing.

3.2 Generating Map Images

Each pixel in a map image has intensity proportional to the magnitude of high spatial frequencies.
This means that the map image can be generated by subtracting a low pass filtered image from
the original image.
Consider a scanline with intensity distributed as shown in Fig. 4 (a). First, in order to remove
high spatial frequencies, an averaging filter is applied to the original image (see Fig. 4 (b)), and
the absolute difference between the original image and the averaged image is calculated for each
pixel (see Fig. 4 (c)). Finally, the map image is obtained by applying a smoothing filter to the
image whose pixel values are proportional to the absolute difference (see Fig. 4 (d)).
Based on the idea described above, the map image is calculated by the following two steps.

Step 1: A (2n + 1) x (2n + 1) filter with the following weights is applied to the images after the
random noise is removed, and the absolute values produced by the high-pass filter are
stored for each image, where n is a parameter of the filter size and a_{ij} (1 \le i, j \le 2n + 1)
is the weight determined by the following equation:

a_{ij} = \begin{cases} 1 - \dfrac{1}{(2n+1)^2} & \text{if } i = j = n + 1, \\[6pt] -\dfrac{1}{(2n+1)^2} & \text{otherwise.} \end{cases}

Figure 5: Extracting in-focus areas using a succession of images.

Figure 6: Reconstructed objects consisting of voxels.

Step 2: A smoothing filter with an m x m window size is applied to the image produced
by the high-pass filter.
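Both steps amount to standard filtering; a minimal sketch (ours, assuming grayscale numpy arrays)
is:

    import numpy as np
    from scipy.ndimage import uniform_filter

    def map_image(img, n=2, m=5):
        # Step 1: the (2n+1) x (2n+1) weights above are exactly "identity minus
        # box average", so the high-pass response is the image minus its local
        # mean; the absolute values are kept.
        img = img.astype(np.float64)
        highpass = np.abs(img - uniform_filter(img, size=2 * n + 1))
        # Step 2: smooth the absolute response with an m x m mean filter.
        return uniform_filter(highpass, size=m)

The parameters n and m trade off sensitivity to fine surface detail against robustness to residual
noise; the values shown are illustrative only.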

3.3 Extracting In-Focus Areas

In extracting in-focus areas using the map images generated above, a thresholding process should
not be employed, because the magnitude of the high spatial frequencies varies across the image; it
depends on the attributes of the objects, such as their colors, surface conditions, etc. In order to
solve this problem, we make use of the fact that the most in-focus view of an area is included in only
one image among the multiple images with different focus positions. That is, comparing the values
of every map image at the same x and y coordinates (see Fig. 5), the most in-focus pixel is expected
to have the largest value. Therefore, in-focus pixels are detected by locating the pixel with the
largest value among all pixels at the same x and y coordinates. A pan-focused image is obtained
by applying the process mentioned above to all x and y coordinates.
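A minimal sketch of this selection step (ours; maps and images are lists of arrays ordered by focus
depth):

    import numpy as np

    def pan_focus(images, maps):
        # For every (x, y), pick the section whose map value is largest; return
        # the composited pan-focused image and the per-pixel focus-depth index.
        maps = np.stack(maps)              # (n_sections, H, W)
        images = np.stack(images)          # (n_sections, H, W) or (n_sections, H, W, 3)
        depth = np.argmax(maps, axis=0)    # index of the most in-focus section
        rows, cols = np.indices(depth.shape)
        return images[depth, rows, cols], depth

The depth index array is exactly the height information needed to build the voxel model described
in the next section.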

4 STEREOSCOPIC DISPLAY OF RECONSTRUCTED OBJECTS FROM VOXELS

In the proposed method reconstructed objects are represented by a set of voxels defined as follows:
a voxel is a rectangular parallelepiped whose top surface is the same size as a pixel of the sampled
image, and whose height and color are those of the corresponding in-focus area. That is, the height
of the top surface of a voxel corresponds to the focus depth of the sampled image it was extracted
from, and the side surfaces are perpendicular to the sampled image (see Fig. 6). Reconstructed
objects consisting of these voxels are perspective-transformed for the viewpoints corresponding to
the right and left eyes, and are stereoscopically displayed after hidden surface removal.

If a viewpoint is set, the visible priority and hidden surfaces of each voxel can be easily determined;
thus, reconstructed objects can be quickly displayed. That is, voxels are classified into the nine
groups shown in Fig. 7 (a) by the location of the foot of the perpendicular, P, from the viewpoint
to the sampled image. The visible priority of each group is classified into three levels: groups 1, 2,
3, and 4 have the lowest visible priority, groups 5, 6, 7, and 8 the second, and group 9 the highest
(see Fig. 7 (b)). The visible priority of each voxel within a group can also be easily determined: in
groups 1, 2, 3, and 4, the closer to group 6 or 8 the scanline is, the higher the visible priority, and
on the same scanline, the closer to group 5 or 7 the voxel is, the higher the visible priority (see
Fig. 7 (b)); for example, in group 1 the order from the lowest to the highest visible priority of
scanlines is y = n, n - 1, ..., j + 1, and the order of visible priority of voxels on a scanline is
x = 1, 2, ..., i - 1. In groups 5, 6, 7, and 8, the closer to group 9 the voxel is, the higher the
visible priority (see Fig. 7 (b)).
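One way to realize this ordering in code (a sketch of ours that collapses the nine-group classification
into a single back-to-front traversal respecting the same priorities; (px, py) is the grid cell containing
P):

    def back_to_front(m, n, px, py):
        # Enumerate voxel columns back to front: scanlines farther from P's row
        # are drawn first, and within a scanline, voxels farther from P's column
        # are drawn first, so nearer voxels overdraw farther ones.
        xs = sorted(range(m), key=lambda x: -abs(x - px))
        ys = sorted(range(n), key=lambda y: -abs(y - py))
        for y in ys:
            for x in xs:
                yield x, y

Drawing the voxels in this order implements the painter's algorithm without any explicit depth
comparison.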

Figure 7: Visible priority for hidden surface removal. (a) Classification of the sampled image into
nine groups by the foot of the perpendicular, P, from the viewpoint; (b) visible priority of each group.



Table 1: Invisible voxel surfaces for each group.

group   invisible surfaces
1       D, E, F
2       C, D, F
3       B, C, F
4       B, E, F
5       C, D, E, F
6       B, C, D, F
7       B, C, E, F
8       B, D, E, F
9       B, C, D, E, F

Figure 8: Classification of voxel surfaces (the six faces of a voxel are labeled A through F).

Furthermore, the invisible surfaces of every voxel in each group can be classified as shown in Table 1
(see Fig. 8). Making use of this information, all voxel surfaces except the invisible ones are overdrawn
in order of the visible priorities of each group, each scanline, and each voxel. This method quickly
generates an image with hidden surface removal, because the visible priority is easily determined
and invisible surfaces are easily removed.
This system makes it possible to interactively observe the image, using the high speed graphics
engine of a graphics workstation. Furthermore, the system has the following functions for precise
observation:

1. The images for the right and left eyes are displayed alternately 60 times per second, and stereo-
glasses with liquid crystal shutters synchronized to the display provide a three dimensional
view of the reconstructed objects.

2. Numerical information such as sizes and positions of objects can be easily grasped by com-
positing a three dimensional scale with the reconstructed objects (see Figs. 12 and 13).

5 EXAMPLES

In order to verify the accuracy of the proposed method, it is applied to the observation of dry
copper powder whose attributes, such as shape and color, are already known. An objective lens
with magnification 40 was used, and 52 steps of optical microscope images were taken at a 1
micro-meter pitch. Fig. 9 shows the images at every fifth step; only a small part of the sample is
in focus in each image. By using the proposed method the pan-focused stereoscopic images shown in
Fig. 10 are generated from these original images. The three dimensional copper surfaces can be
closely observed over the whole range of vision.
Fig. 11 shows an example of a pan-focused stereoscopic display of fly ash, the ash produced by ther-
moelectric power plants. Forty-one steps of original images were taken under the same conditions as


Figure 9: Optical microscope images of dry copper powder.

Figure 10: Pan-focused stereoscopic display of dry copper powder.



Figure 11: Pan-focused stereoscopic display of fly ash.


Figure 12: Pan-focused stereoscopic display of dry cement powder.

Figure 13: Pan-focused stereoscopic display of cement in a slurry.



Fig. 9. Some of the semitransparent fly ash particles were not accurately reconstructed, because
the transparency of the particles is too high.
Figs. 12 and 13 show examples of dry cement powder and cement in a slurry, respectively. The
proposed method makes it possible to observe wet samples as they are, although some noise is ob-
served. In Fig. 12, 77 steps of optical microscope images with an objective lens of magnification 40
were taken at a 1 micro-meter pitch, and in Fig. 13 the location in depth is compensated considering
the refraction of water after taking 50 steps of images under the same conditions. Appreciation of
the position and the size of each particle is made possible by compositing three dimensional scales,
whose lengths in the vertical, horizontal, and depth directions are 30, 30, and 20 micro-meters,
respectively.

6 CONCLUSIONS

This paper proposes a method for displaying pan-focused stereoscopic images from a succession
of optical microscope images which have very shallow depth of focus. That is, in-focus areas are
extracted from multiple images with slightly different focus positions in depth by applying
a high-pass filter to these images, and a pan-focused stereoscopic image is composited from these
in-focus areas. The proposed method makes it possible to closely observe wet samples at large
magnification, even when it is difficult to observe the whole range of vision using an optical
microscope because of the shallow depth of focus. The usefulness of the proposed method is
demonstrated by applying it to the observation of particles encountered in civil engineering research.
In order to generate clear images under various conditions, the following problems remain for
further study:

• Background noise (extremely slender voxels in Figs. 12 and 13) should be removed.

• When a cover glass is used to prevent water from evaporating, the sampled images are severely
degraded. In this case it is very difficult to generate clear pan-focused images.

• When samples change with time, fast image processing is required.

In order to address these problems, methods for image restoration from degraded images and high
speed image processing should be developed.

ACKNOWLEDGMENT

The authors wish to thank Professor Ei-ichi Tazawa and Professor Asuo Yonekura for motivating
this research and for their discussion about the application of the method. We would like to thank
the reviewers for their helpful comments.

REFERENCES

[Darrell 88] Darrell T and Wohn K (1988) "Pyramid Based Depth from Focus," Proc. IEEE
Computer Society Conference on Computer Vision and Pattern Recognition
:504-509

[Indebetouw 84] Indebetouw G and Bai H (1984) "Imaging with Fresnel Zone Pupil Masks: Ex-
tended Depth of Field," Applied Optics 23(23):4299-4302

[Inoue 80] Inoue T, Yokoyama J, and Hayashi T (1980) "Fundamentals of Observation
Using Microscopes," Chijin-syokan, Tokyo (in Japanese)
[Okabayashi 89] Okabayashi M, Kikuchi S, Ohyama N, and Honda T (1989) "Increasing Focal
Depth of Reflecting Microscope Images by Synthetic Method," The Institute of
Television Engineers of Japan, Technical Report 13(46):19-23 (in Japanese)
[Pieper 83] Pieper RJ and Korpel A (1983) "Image Processing for Extended Depth of Field,"
Applied Optics 22(10):1449-1453
[Rosenfeld 76] Rosenfeld A and Kak AC (1976) "Digital Picture Processing," Academic Press,
Inc., New York
[Shio 89] Shio A (1989) "Pan-Focused Image Synthesis Using Multiple Images with Differ-
ent Focal Conditions," Information Processing Society of Japan, Special Interest
Group Reports 89(16):105-110 (in Japanese)
[Sugimoto 85] Sugimoto SA and Ichioka Y (1985) "Digital Composition of Images with
Increased Depth of Focus Considering Depth Information," Applied Optics
24(14):2076-2080
[Tazawa 89] Tazawa E (1989) "Grain and Concrete," Cement · Concrete (514):1-8 (in
Japanese)
[Welford 60] Welford WT (1960) "Use of Annular Apertures to Increase Focal Depth," Jour-
nal of the Optical Society of America 50(8):749-753

APPENDIX
DEPTH OF FOCUS OF OPTICAL MICROSCOPE IMAGES

When focus is exactly set at a point on a sample, the range of depth over which the neighboring
areas are also in-focus is called depth of focus. For observation with the naked eye using an optical
microscope, depth of focus T is expressed by the following equation [Inoue 80]:

T = \frac{n\lambda}{2a^2} + \frac{250\,nK}{aM},   (1)

where a is the numerical aperture of the objective lens, M the total magnification of the microscope,
n the index of refraction of the sample, \lambda the wavelength of light, and K a coefficient related to
the human eye. The first term on the right side of Eq. (1) expresses depth of focus for cameras,
which is called objective depth of focus. The second term expresses increase of depth of focus due
to the function of adjustment of the human eye, which is called subjective depth of focus. The
larger the total magnification M is, the larger the numerical aperture a is. Therefore, the larger
the total magnification is, the shallower both objective and subjective depths of focus become.
For example, observing particles in water leads to the depths of focus shown in Table 2. In gen-
eral, it is often necessary to observe samples whose thickness is several tens of micro-meters at
magnifications of more than 400x; in this case, it is quite difficult to observe the whole range visually.

Table 2: Depth of focus for optical microscopes.

total            eyepiece  objective  numerical   objective    subjective   depth of
magnification M  lens      lens       aperture a  depth of     depth of     focus
                                                  focus [µm]   focus [µm]   T [µm]
100x             10x       10x        0.25        5.85         7.58         13.43
200x             10x       20x        0.4         2.29         2.37         4.66
400x             10x       40x        0.65        0.87         0.73         1.60
600x             10x       60x        0.8         0.57         0.39         0.96
1000x            10x       100x       0.9         0.45         0.21         0.66

(for n = 1.33, \lambda = 0.55 [µm], and K = 0.57)
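Equation (1) can be checked directly against Table 2; for instance (a sketch of ours in Python,
using the constants from the table's footnote):

    def depth_of_focus(a, M, n=1.33, lam=0.55, K=0.57):
        # Eq. (1): objective term n*lam/(2 a^2) plus subjective term 250 n K/(a M),
        # with lam in micro-meters, so T is returned in micro-meters.
        return n * lam / (2 * a**2) + 250.0 * n * K / (a * M)

    print(depth_of_focus(a=0.25, M=100))   # 5.85 + 7.58 = 13.43, the 100x row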

Kazufumi Kaneda is a research associate in the Faculty of Engineering at
Hiroshima University. He worked at the Chugoku Electric Power Company
Ltd., Japan from 1984 to 1986. He joined Hiroshima University in 1986. His
research interests include computer graphics and image processing.
Kaneda received the BE, ME, and DE in 1982, 1984, and 1991, respectively,
from Hiroshima University. He is a member of IEE of Japan, IPS of Japan,
and IEICE of Japan.
Address: Faculty of Engineering, Hiroshima University, 4-1, Kagamiyama
1 chome, Higashi-hiroshima, 724 Japan.
E-mail: kin@eml.hiroshima-u.ac.jp

Shohei Ishida is a graduate student in system engineering at Hiroshima
University. His research interests include computer graphics and image pro-
cessing.
Ishida received the BE degree in electronics engineering in 1989 from Hi-
roshima University. He is a member of IEICE of Japan.
Address: Faculty of Engineering, Hiroshima University, 4-1, Kagamiyama
1 chome, Higashi-hiroshima, 724 Japan.

Eihachiro Nakamae is a professor at Hiroshima University where he was
appointed as research associate in 1956 and as professor in 1968. He was an
associate researcher at Clarkson College of Technology, Potsdam, N.Y., from
1973 to 1974. His research interests include computer graphics and electric
machinery.
Nakamae received the BE, ME, and DE degrees in 1954, 1956, and 1967 from
Waseda University. He is a member of IEEE, ACM, IEE of Japan, IPS of
Japan, and IEICE of Japan.
Address: Faculty of Engineering, Hiroshima University, 4-1, Kagamiyama
1 chome, Higashi-hiroshima, 724 Japan.
E-mail: naka@eml.hiroshima-u.ac.jp
Reconstructing and Visualizing Models
of Neuronal Dendrites
Ingrid Carlbom, Demetri Terzopoulos, and Kristen M. Harris

Abstract: Neuroscientists have studied the relationship between nerve cell morphology and func-
tion for over a century. To pursue these studies, they need accurate three-dimensional models
of nerve cells that facilitate detailed anatomical measurement and the identification of internal
structures. Although serial transmission electron microscopy has been a source of such models
since the mid 1960s, model reconstruction and analysis remain very time consuming. We have
developed a new approach to reconstructing and visualizing 3D nerve cell models from serial
microscopy. An interactive system exploits recent computer graphics and computer vision tech-
niques to significantly reduce the time required to build such models. The key ingredients of the
system are a digital "blink comparator" for section registration, "snakes," or active deformable
contours, for semi-automated cell segmentation, and voxel-based techniques for 3D reconstruction
and visualization of complex cell volumes with internal structures.

Keywords: scientific visualization, 3D reconstruction, image registration, image segmentation,


contour tracking, snakes, volume rendering, dendrites, dendritic spines

AVS is a trademark of Stardent, Inc.


ICAR is a trademark of ISG Technologies, Inc.
VoxelView is a trademark of VitalImages, Inc.

1 Introduction
Neuroscientists search for links between dendritic morphology and behavior, and between mor-
phology and disease, by studying the relationship between morphology and function. De-
tailed morphological studies require accurate three-dimensional models of nerve cells that facil-
itate anatomical measurement and identification of internal structures. Neuronal dendrites and
their protruding dendritic spines can be seen with a light microscope (Fig. 1), but the resolution
is insufficient for detailed anatomical measurement and the internal structures are not visible.
To date, the only method available for detailed measurement and study of internal cell structure
is through 3D reconstructions from serial electron microscopy, or serial EM (Fig. 2) (Harris and
Stevens 1988; Harris and Stevens 1989; Stevens and Trogadis 1984; Wilson et al. 1987).
Reconstructions from serial EM have been produced almost since the invention of the electron
microscope. Initially, the reconstructions were purely manual, but over the years they have relied
increasingly on computers. Even with current computer-assisted techniques, the reconstruction
of a 5µm dendritic segment with all of its spines and synapses, along with the quantitative
analysis of the reconstructed model, can take up to three months of work. By contrast, the
tissue preparation and EM photography take only about two days! It is remarkable that so many
neuronal reconstructions have been made because " ... the incredible investment in time and energy


Fig. 1: A hippocampal pyramidal cell with the soma and the dendrites (den) visible. The axons
are from other hippocampal cells that form synapses with these dendrites. At higher magnification
the dendritic spines are seen protruding from the dendrite. Bars = 10µm.
(Reproduced from (Harris et al. 1980) with permission from the publisher.)

necessary to reconstruct cells is nothing short of heroic" (Stevens and Trogadis 1984) .
We present a new interactive approach to reconstructing and visualizing 3D nerve cell models from
serial microscopy, and describe an interactive system which exploits recent computer graphics and
computer vision techniques to reduce significantly the time required to build such models. After
presenting a more detailed review of the relevant neurophysiological motivation for our work and
its relationship to prior efforts, we describe the key components of our approach to reconstruction
and analysis of neuronal dendrites. Our prototype system currently features a digital "blink
comparator" for section registration, "snakes," or active deformable contours, for semi-automated
cell segmentation, and voxel-based techniques for 3D reconstruction and visualization of the 3D
morphology of dendrites along with their internal structures.

1.1 Background
A nerve cell, or neuron, has four constituent parts: the cell body (soma), the dendrites, the
axon, and the presynaptic terminal of the axon (Kandel and Schwartz 1985). The soma is the
metabolic center of the cell, the dendrites are the receiving units, the axon is the conducting
unit, and the presynaptic terminals are the transmitting units. The areas of contact between the
presynaptic axonal terminals of one cell and the dendrites of another cell are called the synapses.
Most synapses are located at the end of protrusions on the dendrites, called the dendritic spines
(see Fig. 1).
In humans, the dendritic spines are lost or change shape both with aging (Feldman and Dowd
1975) and with diseases that affect the nervous system, such as dementia (Catala et al. 1988),
brain tumors (Spacek 1987), Down's syndrome (Marin-Padilla 1976), epilepsy (Scheibel et al.
1940), Huntington's disease (Graveland et al. 1985), and alcoholism (Ferrer et al. 1986). Detailed
anatomical descriptions of the synapses and dendritic spines will provide new understanding about
their function, thus improving opportunities for understanding the underlying causes and effects
of these diseases.

Fig. 2: An EM photomicrograph from a section of a rat hippocampus. The dendrite (den) is


located in the center, and a large spine is protruding from its right side. The mitochondrion
(mc), microtubules (mt), some smooth endoplasmic reticulum (ser), and the synapse (syn) are
indicated. Bar = 1µm.

The dendritic spine is positioned so that changes in its morphology could modulate the transfer
of information from the synapse to the dendrite (Brown et al. 1988b; Wickens 1988). Direct
physiological study of the relationship between dendritic spine morphology and function has been
impossible because of their small size. Several simulations with theoretical models have shown,
however, that small changes in morphology could change the biophysical properties of the spines
(Rall 1974; Crick 1982). Several laboratory studies have shown that dendritic spines change shape
during maturation, following experience, and in response to direct physiological stimulation of the
presynaptic axon. Repeated, or "tetanic," stimulation causes an enhanced synaptic efficacy, which
is referred to as long-term potentiation (LTP), a leading candidate for a synaptic explanation
of behavioral learning (Brown et al. 1988a). Anatomical analyses of stimulated dendrites have
revealed swollen spines and changes in the size of synapses (for a review see (Harris et al. 1989)).
Thus, a change in morphology might contribute to, or represent, the increase in synaptic efficacy
which is observed after stimulation.
Research has also shown that the cytoskeleton and the organelles internal to the neuron, such
as neurofilaments, microtubules, mitochondria, and smooth endoplasmic reticulum (see Fig. 2)
may control the shape of the nerve cell (Harris and Stevens 1988; Harris and Stevens 1989). The
challenge is to find links between neuronal morphology and physiological function in the normal
and diseased brain (Brown et al. 1988b).

1.2 Previous work

Registration and Reconstruction: Early EM reconstruction techniques were entirely manual


(Stevens et al. 1980). A photograph from an electron microscope was illuminated from below and
the outlines of the structures of interest were traced on a sheet of acetate positioned on top of the
print. Next a photograph of an adjoining section was illuminated and aligned with the trace from
the previous section, and its structures were then traced on a new sheet of acetate. This process
was repeated for all sections. The thickness of the acetate and the magnification were chosen so
that a fairly accurate model would result by inserting a number of blank sheets of acetate between
each trace. Finally, an artist would produce a 2D illustration of the 3D model.
Over the years, reconstruction from serial EM has become increasingly computer assisted. In a
system developed by Stevens et al. (1980; 1984), the EM negatives are first rephotographed onto
a 35mm filmstrip. Next, the filmstrip is mounted on a film transport, which in turn is mounted
on a stage driven by two computer-controlled stepping motors. The first image is digitized and
stored in image memory. The second image is continually digitized while the user moves the film
transport, and a video switcher alternately displays the stored and "live" images on a graphics
screen at a frequency of about 4 Hz. There is an illusion of movement when the images are
misaligned, and the movement is reduced as they are brought into alignment. When the motion is
minimized between the two images or features of interest, the second image is stored. Next, serial
sections two and three are aligned, followed by pairwise alignment of the remaining sections. Once
the images are aligned, the features or boundaries of interest are traced manually using a bitpad.
The traces can be displayed as a set of contours, as contours with the hidden lines removed, or
tessellated to form a surface which can be displayed as a solid object (Harris and Stevens 1989).
Using motion to compare photographs of the same or similar objects is not a new idea. As-
tronomers have used blink comparators since early this century to study astronomical plates
(Croswell 1990). A blink comparator holds two plates of the same part of the sky, and alter-
nately displays them to the user. Stationary objects such as stars remain fixed, but objects such
as comets or planets appear to move.
The problem of image alignment appears in many disciplines. Cartographers need to align aerial
photographs, and in robot vision much effort is devoted to registration of stereo pairs and temporal
image sequences (Horn 1986). In both these disciplines, the images are different views of the same
object, and in robot vision one can often assume that, at least locally, images are only misaligned
translationally. The alignment of neural sections, however, is a considerably more difficult problem
because the EM images are misaligned both translationally and rotationally and there is generally
a large discrepancy between consecutive images. The latter is due both to the integration over
the thickness of the section in the EM photographic process and to distortions of the tissue by its
preparation and by the EM process.
Giertsen et al. (1990) propose a method to automatically align electron micrographs. They
manually trace the membrane and internal substructures of a pancreatic cell in serial sections and
specify connectivity relations among successive contours. Using contour shape features (centroids
and mean radii), they determine linear transformations between successive contours and refine
the alignment using residuals between the original and smoothed data. Aligned contours are
tessellated and rendered as polygonal surfaces.
Segmentation: Many techniques have been developed for 3D image segmentation. One of the
simplest techniques, intensity thresholding, has been used by (Drebin et al. 1988) and (Hohne et
al. 1989) and works relatively well for separating soft and hard tissue in MR and CT scans. Surface
normals are needed for high-quality rendering, and they may be computed using 3D generalizations
of 2D edge detectors; e.g., the Zucker-Hummel operator (Hohne et al. 1989). When the 3D volume
cannot be segmented by simply looking at single intensity values, a larger voxel neighborhood can

be used, by applying a 3D generalization of the Marr-Hildreth operator (Hohne et al. 1988; Hohne
et al. 1989). Other 3D generalizations of edge detectors have also been used for segmentation (Liu
1977; Zucker and Hummel 1981; Morgenthaler and Rosenfeld 1981).
Another technique for constructing polygonal representations of constant density surfaces from
volumetric data is marching cubes (Lorensen and Cline 1987). The data is assumed to be defined
on a 3D lattice, and an iso-surface is approximated by finding all intersections between the iso-
surface and the edges of the lattice. The technique dictates how to fill in the surface between
the edges of a cube for each of the 14 different types of intersections that can occur between the
surface and a cube in the lattice. The gradients are determined at each corner of the cube, and
the surface is rendered using the gradients to estimate the surface normal.
Unfortunately, because of the complexity of EM images of neuronal tissue (see Fig. 2), straightfor-
ward segmentation using any of the aforementioned techniques does not appear promising for 3D
reconstruction of dendritic models. Therefore, we aim at introducing a higher level of automation
into the section-by-section manual tracing methodology currently practiced by neuroscientists. A
recently proposed model-based image feature localization and tracking technique known as snakes
(Kass et al. 1987) is well suited to this goal. This interactive technique is consistent with the
manual tracing methods, but is considerably faster and more powerful.
The ability of snakes to conform to complex biological shapes such as cells and to track their
nonrigid deformations across image sequences makes them attractive tools for biomedical image
analysis. In (Leymarie 1990), snakes are used to track living cells moving on a planar surface.
Assuming modest interframe motion, snakes can exploit frame-to-frame coherence to track the
moving cell and also follow the deformations which occur as the cell moves. Ayache et al. (1989)
use snakes to find edges in cross sections of MR data. The user draws an approximate contour
around the region of interest and the snake deforms to fit the regions more accurately. Once the
snake has reached an equilibrium, it is used as a starting point for the next cross section. A 3D
model is built from the resulting set of contours using Delaunay triangulation (Boissonnat
1988). Variations on snakes based on B-splines have been applied to the segmentation of 3D CT
and MR data (Leitner et al. 1990).
Visualization: Volume visualization is an area in computer graphics that has received a great
deal of attention in recent years (Upson 1989). Much work has been devoted to high-quality
rendering of CT, PET, and MR data (Drebin et al. 1988; Levoy 1990) and seismic data (Sabella
1988; Wolphe, Jr. and Liu 1988). Specialized software and hardware has been developed to
facilitate real-time manipulation of both medical data (Meagher 1982; Meagher 1984) and seismic
data (Chakravarty et al. 1986). More recently, general-purpose software for volume rendering has
become available. Examples are ISG Technologies' ICAR System (Stevens and Trogadis 1990),
Stardent's AVS system (Upson et al. 1989) and VitalImages' VoxelView system (VitalImages, Inc.
1990). In our work, we use the VoxelView system on a Silicon Graphics workstation to visualize
the 3D reconstructed dendrites.

2 Reconstruction of Neuronal Dendrites

2.1 Data Acquisition


Using routine tissue processing, a slice (400µm thick) of well-preserved hippocampus was obtained
from the brain of a male rat, and the slice was embedded in epoxy resin. The slice was further
sectioned with an ultramicrotome, at an average section thickness of about 0.06µm. (The dendritic
segment used in this paper is dendrite number 24 of reference (Harris and Stevens 1989)). Each
section was photographed at 10,000 times magnification in a JEOL 100B transmission electron

microscope and printed on 8x10 inch photographic paper (see Fig. 2).
The EM photomicrographs were digitized on an ECRM Autokon flat-bed laser scanner capable of
digitizing reflection copy images at a wide range of resolutions (Ulichney 1982). The scanner has
a fixed spot size and the intensity can be quantized at one or eight bits/pixel. We digitized the
images at a resolution of 2560x1983 pixels, approximately twice the sampling rate of the smallest
features of interest, and with eight bits of intensity per pixel. The images were low-pass filtered
and subsampled to a size of 640x496 pixels.

2.2 Image Registration


We chose a manual approach to image registration for two reasons. First, the automatic align-
ment of successive EM images is a very difficult problem, because the images are displaced both
translationally and rotationally and usually there is a large disparity between consecutive images.
Second, even if an automated solution can be found, user intervention will be necessary when the
tissue has been distorted during preparation; for example, when a section has a fold.
We have implemented an interactive digital blink comparator. One image is held stationary
and the user translates and rotates the other image while the stationary and moving images
are alternately shown on a graphics screen. The user can translate the image in the x- and y-
directions and can rotate the image about its center by using a three-button mouse, each button
controlling one type of motion. The user moves the image until the motion between the two
images is minimized and the images are aligned. By using double buffering and by pre-computing
x- and y-components of the composite transformation, we obtain comparisons at a frequency of
about 1 Hz for a 640x496 pixel image on a Silicon Graphics 4D/220GTX using one of the R-3000
processors. We found this frequency adequate to obtain good alignment, although a higher speed
would be desirable.
When all images are aligned by pairs, we find the composite transformation for each image relative
to the first image in the EM series and resample the images using a spatially varying digital shift
filter. This finite impulse response (FIR) filter is designed with a conventional windowing tech-
nique (Oppenheim and Schafer 1975). The resulting aligned images are input to the segmentation
processes described in the next section.
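Composing the pairwise alignments is a simple accumulation of rigid transformations; a minimal
sketch (ours, assuming each pairwise alignment is recorded as a rotation angle and a translation,
and that the composition convention matches the registration order):

    import numpy as np

    def rigid(theta, tx, ty):
        # 3x3 homogeneous matrix: rotation by theta about the coordinate
        # origin, followed by a translation (tx, ty).
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]])

    def composites(pairwise):
        # pairwise[k] aligns section k+1 to section k; accumulate to obtain the
        # transformation of every section relative to section 0.
        T = [np.eye(3)]
        for theta, tx, ty in pairwise:
            T.append(T[-1] @ rigid(theta, tx, ty))
        return T

Each composite matrix is then used once to resample its section, so interpolation error is not
compounded across the series.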

2.3 Extracting Models from Electron Micrographs


The extraction of neuronal dendrites from a set of aligned EM images reduces to three subprob-
lems: (i) the localization of dendritic profiles in digital micrographs, (ii) the segmentation of the
interiors of dendrites bounded by profiles, and (iii) the identification of profiles of the same den-
drite across serial micrographs. The density and geometric complexity of neuronal features in
micrographs make the first and third subproblems especially difficult to automate fully.
We take a semi-automatic approach which exploits recently developed physically-based vision
techniques for interactively localizing and tracking extended features in images. We employ a
variant of snakes, the interactive deformable contour models introduced in (Kass et al. 1987).
Snakes provide significant assistance to the user in accurately locating the membranes that bound
the dendrites in EM images. Using a mouse, the user quickly traces a contour which approximates
the dendrite boundary, then starts a dynamic simulation that enables the contour to locate and
conform to the true membrane boundary. Where necessary, the user may guide the contour
by applying to it simulated forces using the mouse. Through minimal user intervention, snakes
quickly produce accurate dendritic profiles in the form of complete, closed contours that facilitate
the segmentation of dendritic interiors. Finally, with some guidance, snakes are able to exploit

the coherence between serial micrographs to quickly extract a sequence of profiles of the same
dendrite.

2.3.1 Deformable Contour Models

A snake can be thought of as a dynamic deformable contour in the x-y image plane. We define
a discrete deformable contour as a set of n nodes indexed by i = 1, ..., n, with time-varying
positions x_i(t) = [x_i(t), y_i(t)]'. The behavior of an interactive deformable contour is governed by
the first-order dynamic system of equations

\gamma \frac{dx_i}{dt} + \alpha_i + \beta_i = f_i; \qquad i = 1, \ldots, n,   (1)

where \gamma is a velocity-dependent damping constant, \alpha_i(t) are "compression" forces which make
the snake act like a series of unilateral springs that resist compression, \beta_i(t) are "rigidity" forces
which make the snake act like a thin wire that resists bending, and f_i(t) are forces in the image
plane applied to the contour.
Let l_i be the given reference length of the spring connecting node i to node i + 1 and let r_i(t) =
x_{i+1} - x_i be the separation of the nodes. Given the deformation e_i(t) = \|r_i\| - l_i, we define

\alpha_i = a_{i-1} e_{i-1} \frac{r_{i-1}}{\|r_{i-1}\|} - a_i e_i \frac{r_i}{\|r_i\|}.   (2)

To obtain contours that can stretch arbitrarily but resist shrinking past a prespecified amount,
we set

a_i(t) = \begin{cases} a & \text{if } e_i < 0, \\ 0 & \text{otherwise,} \end{cases}   (3)

so that each spring resists compression with constant a only when its actual length \|r_i\| is less
than l_i. To give the contours some rigidity, we introduce the variables b_i and define rigidity forces

\beta_i = b_{i+1}(x_{i+2} - 2x_{i+1} + x_i) - 2b_i(x_{i+1} - 2x_i + x_{i-1}) + b_{i-1}(x_i - 2x_{i-1} + x_{i-2}).   (4)
Note that in the absence of external forces, if the nodes are separated more than l_i, are equally
spaced, and lie on a straight line, \alpha_i and \beta_i vanish and the contour will be at equilibrium.
Compression and rigidity are locally adjustable through the a_i and b_i variables. In particular, by
setting a_i = b_i = 0, we are able to break a long deformable contour into several shorter contours
on an image.
To simulate the deformable contour we integrate the system of ordinary differential equations
(1) forward through time using a semi-implicit Euler procedure (Press et al. 1986). Applying
the forward finite difference approximation dx_i/dt \approx (x_i^{t+\Delta t} - x_i^t)/\Delta t to (1) and collecting linear
terms in the x_i on the left yields the pentadiagonal system of algebraic equations

\frac{\gamma}{\Delta t} x_i^{t+\Delta t} + \beta_i^{t+\Delta t} = \frac{\gamma}{\Delta t} x_i^t - \alpha_i^t + f_i^t   (5)

for the subsequent node positions x_i^{t+\Delta t} in terms of the current positions x_i^t. Since the system has
a constant coefficient matrix, we factorize it only once at the beginning of the deformable contour
simulation using a direct LDU factorization method and then efficiently resolve with different
right-hand sides at each time step (see (Terzopoulos 1987) for details).
After each simulation time step we draw lines between the new node positions x_i^{t+\Delta t} to display
the deformable contour as a dynamic curve in the image plane.
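A minimal numerical sketch of one such time step (ours, not the authors' code; it assumes uniform
rigidity b, an open contour, and the sign conventions reconstructed above, and uses a dense LU
factorization in place of a specialized pentadiagonal LDU solver):

    import numpy as np
    from scipy.linalg import lu_factor, lu_solve

    def rigidity_matrix(n, b):
        # Pentadiagonal matrix realizing Eq. (4) with uniform b
        # (stencil b * (1, -4, 6, -4, 1)) on an open chain of n nodes.
        B = np.zeros((n, n))
        for i in range(n):
            for k, w in enumerate((1.0, -4.0, 6.0, -4.0, 1.0)):
                j = i + k - 2
                if 0 <= j < n:
                    B[i, j] += b * w
        return B

    def compression(x, l0, a):
        # Unilateral spring forces of Eqs. (2)-(3): springs resist only
        # compression below their reference lengths l0.
        r = x[1:] - x[:-1]
        length = np.linalg.norm(r, axis=1, keepdims=True)
        e = length - l0[:, None]
        stiff = np.where(e < 0.0, a, 0.0)
        s = stiff * e * r / np.maximum(length, 1e-9)
        alpha = np.zeros_like(x)
        alpha[:-1] -= s    # spring i acting on node i
        alpha[1:] += s     # spring i acting on node i+1
        return alpha

    def make_stepper(n, gamma, dt, b):
        # Factor the constant matrix of Eq. (5) once; each step only builds a
        # new right-hand side and resolves.
        A = (gamma / dt) * np.eye(n) + rigidity_matrix(n, b)
        lu = lu_factor(A)
        def step(x, l0, a, f):
            rhs = (gamma / dt) * x - compression(x, l0, a) + f
            return lu_solve(lu, rhs)   # x- and y-columns solved together
        return step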

2.3.2 Image Segmentation using Deformable Contours

The deformable contour is responsive to an image force field which influences the contour's shape
and motion. It is convenient to express the force field as the gradient of a potential function
P_{I_s}(x, y) computed from the image I_s(x, y) of EM section s:

f(x, y) = -\nabla P_{I_s}(x, y),   (6)

where \nabla = [\partial/\partial x, \partial/\partial y]'. By simulating (1) with (6), the ravines (extended local minima) of
P_{I_s} act as attractors to deformable contours. The contours "slide downhill" and stabilize at the
bottoms of the nearest ravines.
In the present application, we are interested in the localization of cell membranes, which appear
dark in positive micrographs. We therefore convert I_s(x, y) into a 2D potential function whose
ravines coincide with dark cell membranes:

P_{I_s}(x, y) = G_\sigma * I_s(x, y),   (7)

where G_\sigma * denotes convolution with a 2D Gaussian smoothing filter of width \sigma. The filter
broadens the ravines of P_{I_s} so that they attract the contours from some distance away.
In practice, I_s is not a continuous function, but a digital image. Therefore, we first convolve
the image with a discrete smoothing kernel, then compute (6) by bilinearly interpolating the
smoothed image gradients evaluated at the four pixels surrounding x_i. The user can set the
degree of smoothing \sigma and select the display of the image I_s or the potential function P_{I_s} through a
menu-driven interface.
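The following sketch (ours) shows the potential of Eq. (7) and the bilinear interpolation of the
resulting force of Eq. (6); node coordinates are assumed to lie at least one pixel inside the image:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def potential(image, sigma):
        # Eq. (7): Gaussian-smoothed image; ravines sit on the dark membranes.
        return gaussian_filter(image.astype(np.float64), sigma)

    def image_force(P, nodes):
        # Eq. (6): minus the potential gradient, bilinearly interpolated at each
        # node position; nodes is an (n, 2) array of (x, y) coordinates.
        gy, gx = np.gradient(P)                  # derivatives along rows, columns
        f = np.empty((len(nodes), 2))
        for k, (x, y) in enumerate(nodes):
            x0, y0 = int(x), int(y)
            dx, dy = x - x0, y - y0
            for comp, g in enumerate((gx, gy)):
                v = (g[y0, x0] * (1 - dx) * (1 - dy) + g[y0, x0 + 1] * dx * (1 - dy)
                     + g[y0 + 1, x0] * (1 - dx) * dy + g[y0 + 1, x0 + 1] * dx * dy)
                f[k, comp] = -v                  # downhill: ravines attract
        return f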
The user initializes a closed deformable contour by quickly sketching with a mouse an approximate
trace around the dark membrane of a dendrite of interest. Figure 3(a) shows an initial deformable
contour sketched near a cell membrane (nodes are created automatically so that they are spaced
about one pixel apart, and an additional spring is inserted between the first and last nodes to
close the contour). The user then initiates the snake simulation (5). In a few simulation time
steps the deformable contour equilibrates at the bottom of the nearest ravine in P_{I_s} (Fig. 3(b)).
By interacting with the contour (see below), the user can help it quickly localize the membrane
ravine and conform to its shape to produce an accurate profile of the dendrite (Fig. 3(e)).
Because the dendritic profile is a closed continuous contour, it is easy to segment the interior of the
cell from the rest of the image. We accomplish the segmentation by applying a standard region-fill
algorithm (Foley et al. 1990) which starts from a seed point inside the profile and sequentially
accesses the pixels of I_s that are bounded by the profile. Figure 3(f) shows the cell segmented
from the dendritic profile of Fig. 3(e).
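A region fill of this kind reduces to a flood fill bounded by the rasterized profile; a minimal sketch
(ours, with the profile given as a boolean boundary mask):

    from collections import deque
    import numpy as np

    def region_fill(boundary, seed):
        # 4-connected flood fill from a seed inside the profile, stopping at
        # True pixels of the boundary mask; returns the interior mask.
        h, w = boundary.shape
        inside = np.zeros_like(boundary, dtype=bool)
        queue = deque([seed])
        while queue:
            y, x = queue.popleft()
            if 0 <= y < h and 0 <= x < w and not inside[y, x] and not boundary[y, x]:
                inside[y, x] = True
                queue.extend([(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)])
        return inside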

2.3.3 User and Constraint Forces

Often the user will sketch an initial trace which deviates too much from the membrane ravine to
descend into the ravine properly. This is the case for the initial contour in Fig. 3(a) which reaches
the equilibrium configuration shown in Fig. 3(b). As can be seen, the deformable contour may
fall prey to nearby dark features inside the cell (or to nearby features of neighboring cells) which
act as false attractors. In such a case, the user may apply interactive simulated forces f_i^m(t) by
using the mouse to guide the deformable contour towards the ravine of interest as it is stabilizing
(see (Kass et al. 1987) for details about user forces). A useful force is the interactive spring

f_i^m = \begin{cases} m(t) - x_i & \text{if } \|x_i - m(t)\| \text{ is minimal for node } i, \\ 0 & \text{otherwise.} \end{cases}   (8)

Fig. 3: Cell segmentation using a deformable contour (see text). (a) Initial sketched contour.
(b) Initial equilibrium position. (c)-(d) Manipulating the contour with interactive springs (green
lines) and constraints (blue lines). (e) Final profile. (f) Segmented cell.

Fig. 4: To the left, the current EM image with a deformable contour (snake); to the right, a stack
of dendritic profiles and their interiors. The current snake is displayed in red.

which pulls the nearest node towards the time-varying mouse position m(t) in the image plane.
Figure 3(c) shows the effect of a user stretching the contour towards the right with an attached
interactive spring (green line) from the mouse position (blue circle).
To localize a profile accurately, the user may want to constrain points on a deformable contour
by attaching them with springs to selected anchor points on the image. Such constraints prevent
the deformable contour from straying far from these points, regardless of the image forces and
the user's other mouse manipulations. The mechanism for adding constraints is simply to fix
m(t) = a_k in the spring force (8) to create an anchor point a_k in the image. The constraining
spring then applies a force f_i^{a_k}. Figure 3(d) illustrates a constraint spring (blue line) which pulls
the contour back towards an anchor point on the cell membrane as the user tugs on the contour
with an interactive spring (green line). Note the two constraints (blue dots) in the final profile
contour in Fig. 3(e).
Combining the three types of forces, we have

f_i = -c_I \nabla P_{I_s}(x_i) + c_m f_i^m + c_a \sum_k f_i^{a_k},   (9)

where \sum_k is a summation over all the anchor constraints in force and where c_I, c_m, and c_a are
the strength factors of the image forces, user spring forces, and anchor spring forces.

2.3.4 Exploiting Coherence Across Serial Sections

Our interactive technique for extracting cell profiles from EM images benefits from the fact that
snakes can exploit the coherence of profile positions and shapes across adjacent images. Often,
the user need not reinitialize the deformable contours when progressing from image to image to
extract adjacent profiles of a dendrite.
Once the deformable contours equilibrate into the membrane ravines in P_{I_s}, we replace this po-
tential function with the potential function P_{I_{s\pm 1}} of an adjacent member of the image sequence.
Continuing from their previous equilibrium positions, the contours automatically slide downhill
to regain their equilibria in the new ravines, quickly localizing the new positions of the cell mem-
branes in the adjacent image and conforming to their shapes.

Fig. 5: Oblique slice through model of a dendrite reconstructed from 41 serial sections. The mito-
chondrion can be seen extending through the length of the dendrite and some smooth endoplasmic
reticulum is visible in the large spine.

This simple mechanism for exploiting image-to-image coherence works so long as the perturbation
is small enough to maintain the deformable contours within the membrane ravines as we switch
to adjacent images. Should part of a contour escape the ravine, however, the rest of the contour
will usually pull it back into place due to the rigidity of the model. Nonetheless, deformable
contour reinitializations are normally required when portions of the dendritic spines are no longer
connected to the parent dendrite.

3 Visualizing Neuronal Dendrites


During image segmentation, the user can visualize what takes place in two graphics windows
(Fig. 4). The left window shows the current EM image overlayed with the deformable contour
simulation. The right window displays a stack of dendritic profiles and their interiors. The user
can also see the current contour with the (partial) model, thereby monitoring progress. The
stacked set of contours and interiors can be rotated and viewed from different vantage points.
When all the dendritic profiles have been found, we use the segmented dendritic interior to build
a volumetric voxel model. We apply volume rendering techniques to visualize the volumetric
model.
Volume rendering refers to the direct rendering of scalar data sampled in three dimensions. These
techniques differ from traditional computer graphics techniques in that explicit surfaces need not
be extracted from the data before display. Rather, the entire 3D volume of data is used for display.
Yet, by displaying only the portions of the volume that have a given density or a high gradient,
features and surfaces may be elicited without explicit representation (Foley et al. 1990).
We use the VoxelView system (VitalImages, Inc. 1990) to reconstruct a 3D volumetric model
of a dendrite from the dendritic interiors that have been segmented from the serial sections.
The segmented dendrite in each image is enclosed by a rectangular array of black pixels. The
resulting arrays, or images, are stacked in order. The sizes of the rectangular arrays are chosen so
that the stack yields a rectangular parallelepiped. Since the sampling rate is less in the stacking

Fig. 6: Shaded model of the reconstructed dendrite.

direction (z-direction) than in the x- and y-directions of the images, the VoxelView system linearly
interpolates an additional number of sections. In the examples shown below, we have reconstructed
41 sections of a dendrite, and interpolated three sections between each pair of original sections to
get approximately the correct proportions along the z-axis as we view all the sections.
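In outline, the stacking and z-interpolation amount to the following sketch (ours; VoxelView's own
resampling is not published, so linear interpolation with scipy stands in for it):

    import numpy as np
    from scipy.ndimage import zoom

    def build_volume(sections, z_factor=4):
        # Stack the segmented sections into a (z, y, x) voxel volume, then
        # resample linearly along z; a factor of 4 approximates the three
        # interpolated sections between each original pair used in the example.
        vol = np.stack([s.astype(np.float64) for s in sections])
        return zoom(vol, (z_factor, 1, 1), order=1)   # order=1: linear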
By rapidly moving from one image to the next, we generate an interactive "movie" that enables
us to follow certain features of interest through the dendrite. We can also slice the stack of images
along planes perpendicular to the stacking direction, and along any arbitrary plane. Because of
the tissue cutting direction, the dendrite is positioned obliquely in the image volume; therefore,
we must cut the image volume obliquely to slice the dendrite lengthwise. One such oblique slice
is shown in Fig. 5, where we can see the shape and extent of the mitochondrion through the
dendrite, and also some smooth endoplasmic reticulum.
By default, the volume is rendered without any shading; however, the user can specify the position
of a light source and obtain a shaded view of the model. In Fig. 6, we have rendered the dendrite
model with shading. The shading helps accentuate the 3D shape of the dendrite. Other features of
the VoxelView system that we find useful are the ability to tumble the model interactively and to
make portions of the model transparent by adjusting the opacity values of the pixels (VitalImages,
Inc. 1990).
By using a volumetric representation of the dendritic model, we can represent the 3D shape of
the model with accuracy limited only by the original sampling of the EM images. Furthermore,
we can visualize the cytoskeleton and the organelles interior to the dendrite.

4 Summary and Future Research


We have described a prototype system for the reconstruction and analysis of neuronal dendrites.
Our goal is to reduce the effort required to reconstruct and analyze a complete dendrite from a
few months to a few days. We are approaching this goal by exploiting three recently developed
techniques for volume reconstruction: a digital blink comparator for EM section registration,
snakes, or active energy-minimizing contours, for dendrite segmentation, and volume rendering
to visualize both the overall morphology of 3D dendrites and their cytoskeleton and internal
organelles.
Much work remains to be done to improve the reconstruction process. The digital blink compara-

tor opens a way to use direct digitization from electron microscopes for serial microscopy. This
would eliminate the need for rephotographing and digitizing EM photomicrographs, thus reducing
the reconstruction time and eliminating distortions and quantization errors introduced by these
processes. Direct digitization from an electron microscope has been used for single section studies,
but has until now been impossible to use for serial microscopy, which requires section alignment
(Stevens and Trogadis 1984).
We need to provide (at least) a semi-automatic approach to image registration, with the user
intervening only for optional fine-tuning. Currently, during manual alignment, we resample the
image using nearest pixel sampling. It would be desirable, however, to allow subpixel alignment
and resampling by using a spatially varying shift filter, but this is computationally prohibitive with
our current equipment. Despite the sampling limitations, we have found the resulting alignments
to be quite satisfactory.
We need to improve upon the snakes' behavior at spine branching points, to reduce the amount
of user assistance required when dendritic spines are no longer connected to the parent dendrite.
We need to improve upon interslice interpolation to allow fractional section interpolation; that
is, proper resampling of the volume to get accurate proportions in the x-, y-, and z-directions.
Similarly, we need to provide better inter-pixel interpolation at volume slicing and rendering in
order to reduce aliasing without compromising image accuracy.
We have not yet begun to tackle anatomical analysis of the dendrites. We expect, however, that
through (semi-)automated approaches for dendrite decomposition, anatomical measurements, and
statistical analysis of these measurements, we can achieve reductions in analysis times similar to
those that we are beginning to realize in the reconstruction phase.
When we have reached our goal, the time required to reconstruct and analyze neurons or parts
of neurons will be reduced from a few months to a few days. It will then be possible to obtain a
sufficiently large number of reconstructions to evaluate quantitatively the functional consequences
that alterations in neuronal morphology have for both the normal and diseased brain.

5 Acknowledgments
The authors would like to thank Victor Vyssotsky, Director of Digital Equipment Corporation's
Cambridge Research Lab, for his support of this research. This work has benefited from many
discussions with Gudrun Klinker and Richard Szeliski, and from the use of image conversion
software written by Richard Szeliski. We would like to thank Robert Ulichney and Victor Bahl
for their help with the Autokon scanner. We also thank Dick Beane, Gudrun Klinker, and Richard
Szeliski for reading and commenting on the manuscript. Ingrid Carlbom and Kristen Harris thank
David Margulies at The Children's Hospital for having introduced us.

References
Ayache, N., Boissonnat, J., Brunet, E., Cohen, L., Chieze, J., Geiger, B., Monga, 0., Rocchisani,
J., and Sander, P. (1989). Building highly structured volume representations in 3D medical
images. In Proc. 3rd International Symposium on Computer Assisted Radiology, CAR '89,
pages 765-772, Springer-Verlag, New York.
Boissonnat, J. (1988). Shape reconstruction from planar cross-sections. Computer Vision,
Graphics, and Image Processing, 44, 1-29.
Brown, T., Chapman, P., Kairiss, E., and Keenan, C. (1988a). Long-term synaptic potentiation.
Science, 242, 724-728.

Brown, T., Chang, V., Ganong, A., Keenan, C., and Kelso, S. (1988b). Biophysical properties
of dendrites and spines that may control the induction and expression of long-term synaptic
potentiation. Neurology and Neurobiology, 35,201-264.
Catala, I., Ferrer, I., Galofre, E., and Fabregues, I. (1988). Decreased numbers of dendritic
spines on cortical pyramidal neurons in dementia: A quantitative Golgi study on biopsy
samples. Human Neurobiology, 6, 255-259.
Chakravarty, 1., Nichol, B., and Ono, T. (1986). The integration of computer graphics and
image processing techniques for the display and manipulation of geophysical data. In Kunii,
T., editor, Advanced Computer Graphics, pages 318-333, Springer-Verlag, Tokyo, Japan.
Crick, F. (1982). Do dendritic spines twitch? Trends in Neuroscience, 5, 44-46.
Croswell, K. (1990). The pursuit of Pluto. American Heritage of Invention and Technology,
5(3), 50-57.
Drebin, R., Carpenter, L., and Hanrahan, P. (1988). Volume rendering. Computer Graphics,
22(4),65-74.
Feldman, M. and Dowd, C. (1975). Loss of dendritic spines in aging cerebral cortex. Anatomy
and Embryology, 148,279-301.
Ferrer, I., Fabregues, I., Rairiz, J., and Galofre, E. (1986). Decreased numbers of dendritic
spines on cortical pyramidal neurons in human chronic alcoholism. Neuroscience Letters, 69,
115-119.
Foley, J., van Dam, A., Feiner, S., and Hughes, J. (1990). Computer Graphics: Principles and
Practice. Addison-Wesley Publishing Company, Reading, MA.
Giertsen, C., Halvorsen, A., and Flood, P. (1990). Graph-directed modelling from serial
sections. The Visual Computer, 6(5), 284-290.
Graveland, G., Williams, R., and DiFiglia, M. (1985). Evidence for degenerative and regener-
ative changes in neostriatal spiny neurons in Huntington's disease. Science, 227,770-773.
Harris, K. and Stevens, J. (1988). Dendritic spines of rat cerebellar Purkinje cells: Serial electron
microscopy with reference to their biophysical characteristics. The Journal of Neuroscience,
8(12),4455-4469.
Harris, K. and Stevens, J. (1989). Dendritic spines of CA1 pyramidal cells in the rat hip-
pocampus: Serial electron microscopy with reference to their biophysical characteristics.
The Journal of Neuroscience, 9(8), 2982-2997.
Harris, K., Jensen, F., and Tsao, B. (1989). Ultrastructure, development, and plasticity of
dendritic spine synapses in area CAl of the rat hippocampus: Extending our vision with
serial electron microscopy and three-dimensional analysis. Neurology and Neurobiology, 52,
33-52.
Harris, K., Cruce, W., Greenough, W., and Teyler, T. (1980). A Golgi impregnation technique
for thin brain slices maintained in vitro. Journal of Neuroscience Methods, 2, 363-371.
Hohne, K., Bomans, M., Pommert, A., Riemer, M., and Tiede, U. (1988). 3D segmentation and
display of tomographic imagery. In Proc. International Conference on Pattern Recognition,
pages 1271-1276, IEEE Computer Society Press, Rome, Italy.
Hohne, K., Bomans, M., Pommert, A., Riemer, M., Schiers, C., Tiede, U., and Wiebecke, G.
(1989). 3D visualization of tomographic volume data using the generalized voxel-model. In
Upson, C., editor, Proceedings of Volume Visualization Workshop, pages 51-57, Department
of Computer Science, University of North Carolina, Chapel Hill, NC.
Horn, B. K. P. (1986). Robot Vision. MIT Press, Cambridge, Massachusetts.
Kandel, E. and Schwartz, J. (1985). Principles of Neural Science. Elsevier Science Publishing
Co., Inc., New York, NY.
Kass, M., Witkin, A., and Terzopoulos, D. (1987). Snakes: Active contour models. Interna-
tional Journal of Computer Vision, 1 (4), 321-331.
Leitner, F., Marque, 1., LaVallee, S., and Cinquin, P. (1990). Dynamic Segmentation: Finding
the Edge with Differential Equations and 'Spline Snakes '. Technical Report TIMB - TIM 3
637

- IMAG, Faculte de Medecine, 38700 La Tronche, France.


Levoy, M. (1990). Efficient ray-tracing of volume data. ACM Transactions on Graphics, 9(3),
245-261.
Leymarie, F. (1990). Tracking and Describing Deformable Objects Using Active Contour Models.
Master's thesis, Computer Vision and Robotics Laboratory, McGill Research Centre for
Intelligent Machines, McGill University, Montreal, QC, Canada.
Liu, H. (1977). Two and three dimensional boundary detection. Computer Graphics and Image
Processing, 6, 123-134.
Lorensen, W. and Cline, H. (1987). Marching cubes: A high resolution 3D surface construction
algorithm. Computer Graphics, 21(4), 163-169.
Marin-Padilla, M. (1976). Pyramidal cell abnormalities in the motor cortex of a child with
Down's syndrome: A Golgi study. Journal of Computational Neurology, 167,63-82.
Meagher, D. (1982). Geometric modeling using octree encoding. Computer Graphics and Image
Processing, 19, 129-147.
Meagher, D. (1984). Interactive solids processing for medical analysis and planning. In Pro-
ceedings of National Computer Graphics Association, NCGA '84.
Morgenthaler, D. and Rosenfeld, A. (1981). Multidimensional edge detection by hypersurface
fitting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 3(4),482-486.
Oppenheim, A. V. and Schafer, R. W. (1975). Digital Signal Processing. Prentice Hall, Inc.,
Englewood Cliffs, New Jersey.
Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling, W. T. (1986). Numerical
Recipes: The Art of Scientific Computing. Cambridge University Press, Cambridge, England.
Rail, W. (1974). Dendritic spines, synaptic potency, and neuronal plasticity. In Woody, C.,
Brown, K., Crow, T., and Knispel, J., editors, Cellular Mechanisms Subserving Changes in
Neuronal Activity, pages 13-21, Brain Information Service, Los Angeles, CA.
Sabella, P. (1988). A rendering algorithm for visualizing 3D scalar fields. Computer Graphics,
22(4), 51-58.
Scheibel, M., Crandall, P., and Scheibel, A. (1940). The hippocampal-dentate complex in
temporal lobe epilepsy. Epilepsia, 15, 55-80.
Spacek, J. (1987). Ultrastructural pathology of dendritic spines in epitumorous human cerebral
cortex. Acta Neuropathology, 73, 77-85.
Stevens, J. and Trogadis, J. (1984). Computer-assisted reconstruction from serial electron
micrographs: A tool for the systematic study of neuronal form and function. Advances in
Cellular Neurobiology,S, 341-369.
Stevens, J. and Trogadis, J. (1990). A systematic approach to 3D confocal microscopy: Appli-
cation of volume investigation methods. Journal of Microscopy, In press.
Stevens, J. K., Davis, T. L., Friedman, N., and Sterling, P. (1980). A systematic approach
to reconstructing microcircuitry by electron microscopy of serial sections. Brain Research
Reviews, 2, 265-293. .
Terzopoulos, D. (1987). On matching deformable models to images: Direct and iterative
solutions. In Topical Meeting on Machine Vision, pages 160-167, Optical Society of America,
Washington, D.C.
Ulichney, R. (1982). Image Lab Picture Files, Internal Memo. Digital Equipment Corporation,
Maynard, MA.
Upson, C., editor. (1989). Proceedings of the Chapel Hill Workshop on Volume Visualization,
Department of Computer Science, University of North Carolina, Chapel Hill, NC.
Upson, C., Faulha.ber Jr., T., Kamins, D., Laidlaw, D., Schlegel, D., Vroom, J., Gurwitz, R., and
van Dam, A. (1989). The application visualization system: A computational environment
for scientific visualization. IEEE Computer Graphics and Applications, 9(4), 30-42.
Vitallmages, Inc. (1990). VoxelView/PLUS 1.4, The interactive volume rendering system.
Fairfield, IA.
638

Wickens, J. (1988). Electrically coupled but chemically isolated synapses: Dendritic spines and
calcium in a rule for synaptic modification. Progress in Neurobiology, 31, 507-528.
Wilson, C., Murakami, F., Katsumaru, H., and Tsukahara, N. (1987). Dendritic and somatic
appendages of identified rubrospinal neurons of the cat. Neuroscience, 22, 113-130.
Wolphe, Jr., R. and Liu, C. (1988). Interactive visualization of 3D seismic data: A volumetric
method. IEEE Computer Graphics and Applications, 8(4), 24-30.
Zucker, S. and Hummel, R. (1981). A three-dimensional edge operator. IEEE Transactions on
Pattern Analysis and Machine Intelligence, 3(3), 324-331.

Ingrid Carlbom is a member of the research staff at Digital Equipment
Corporation's Cambridge Research Lab. From 1980 to 1986 she was a
member of the professional staff at Schlumberger-Doll Research, Ridge-
field, Connecticut. Her research interests include scientific visualization,
geometric modeling, medical imaging, and computer graphics system ar-
chitecture. Carlbom received a PhD in computer science from Brown
University, an MS in computer science from Cornell University, and a
Fil.Kand. from the University of Stockholm, Sweden. She was a director
of SIGGRAPH from 1982 to 1986 and the chair of the SIGGRAPH Ad-
visory Board from 1986 to 1988. She is a member of ACM, SIGGRAPH,
and IEEE.
Address: Digital Equipment Corporation, Cambridge Research Lab,
One Kendall Square, Bldg. 700, Cambridge, MA 02139.

Demetri Terzopoulos is an associate professor of computer science
at the University of Toronto and a fellow of the Canadian Institute for
Advanced Research. For the past five years he has been affiliated with
Schlumberger, Inc., serving as a program leader at the Laboratory for
Computer Science, Austin, TX, and at the former Palo Alto Research
Laboratory. Previously he was a research scientist at the MIT Artificial
Intelligence Laboratory, Cambridge, MA. His areas of interest include
computer vision, computer animation, visualization, and massively par-
allel computation. Terzopoulos received a PhD in artificial intelligence
from MIT in 1984. He received an MEng in electrical engineering in 1980
and a BEng in honours electrical engineering in 1978, both from McGill
University. He is a member of the editorial boards of CVGIP: Graphical
Models and Image Processing and the Journal of Visualization and Com-
puter Animation and is a member of the IEEE, AAAI, and Sigma Xi.
Address: Department of Computer Science, University of Toronto, 10
King's College Road, Toronto, ON M5S 1A4.

Kristen Harris is an assistant professor of neuroscience at the Harvard
Medical School and The Children's Hospital in Boston, Massachusetts.
From 1982 to 1984 she held a postdoctoral position in neurocytology and
electron microscopy at the Massachusetts General Hospital in Boston,
Massachusetts. Her research interests include cellular neurobiology and
development. Harris received a BS in biology from Moorhead State Uni-
versity, graduating Summa Cum Laude in 1976, an MS in neurobiology
from the University of Illinois in 1979, and a PhD in neurobiology from
the Northeastern Ohio Universities College of Medicine in 1982.
Address: Neurological Research Department, Children's Hospital, 300
Longwood Avenue, Boston, MA 02115.
A Visualization and Simulation System for
Environmental Purposes
Markus H. Groß and Volker Kuhn

ABSTRACT
In this paper we describe an integrated software system applicable in the area of environmental research. It contains a modelling tool to describe terrain geometrically via a digital terrain model, as well as pollution sources and other environment objects. A simulation module based on particle systems is used to analyse the distribution behavior of the pollutants. In our model, particles represent the pollutants and the interactions between them approximate their behavior. Physical laws based on Newtonian mechanics control the interactions between neighboring particles. The interactions are microscopic; that is, they define the relations between atomic representatives in order to obtain the global or macroscopic behavior of the whole particle system. The execution of the physical laws is synchronized with an internal clock. The visualization module provides new paradigms and scientific visualization techniques adapted to environmental applications. Sophisticated particle animations treating time as a fourth dimension help the user to interact with our system. An OSF-MOTIF user interface supports the dialogue in practice.

Keywords: Computer Graphics, Terrain Modelling, Scientific Visualization, Simulation, Animation, Particle-Based Modelling, User Interface, Environmental Protection

1. INTRODUCTION

One of the most striking problems we face today, and will continue to face in the future, is the protection of our environment. The low-cost CPU power available at present allows the application of sophisticated simulation methods in the area of air pollution. Therefore, we developed a system called TERRA, which combines terrain and environmental data processing and visualization with the simulation of pollution behavior in the air. As for other applications introduced in [ZPS 89], the developed system integrates simulation and visualization techniques.
Systematic techniques for measuring such emissions are either unavailable or unsatisfactory. The emission consists of different gases and particles. Its behavior is so complex that it is difficult to produce accurate results with classical simulation methods. Different weather conditions also influence the pollutants' propagation behavior and their chemical reactions with the atmosphere. They may change the reaction behavior, rate, and components [SPR 70] [CAD 66]. Important weather factors include wind, temperature, cloudiness, precipitation, air moisture, air pressure, and airflow [PIE 84] [BLF 67]. Along with the introduction of various terrain characteristics and/or topographical properties, the complexity of the resultant meteorological flow fields can increase. To account for this environmental individuality, we use digital terrain modelling techniques.


The goal of all this research is to reconstruct terrain and to simulate and analyse the amount of emitted pollutants and their distribution behavior in that terrain. Thus, we need sophisticated visualization techniques adapted to our application, as well as user interfaces that support the specification of the environment and the pollution behavior. Our work is motivated by digital terrain modelling, physically-based modelling, particle systems, and the challenge of producing easy-to-use tools which exploit these emerging techniques.
The remainder of the paper is organized as follows. First, we discuss the principles of our new integrated software system based on a modeller, a simulator, and a visualizer. The data flow and interface structure will be introduced. The modelling techniques are presented briefly. A digital terrain model is used for reconstructing the landscape including flora, fauna, cities, buildings, and traffic. In addition to gravity, we applied a meso-scale wind model to the particles as an external force field. The simulation algorithm based on particle systems and mechanical physics will be discussed in detail. Furthermore, the visualization tool and its various rendering techniques are presented. The system is applicable to the analysis of satellite data, to the assessment of safety aspects and insurance claims arising from industrial incidents, and to authorization processes for large-scale industrial installations. We conclude our work and close the paper by drawing some perspectives for the future.

2. PRINCIPLES OF THE MODELLING, THE SIMULATION, AND THE VISUALIZATION SYSTEM

Our integrated system for environmental applications is structured according to figure 1. The three
main components are modelling, simulation and visualization.

Fig. 1: The three components of the TERRA system (modelling, simulation, and visualization)

The modelling task yields a geometric description of the scenery and a set of interactions between particles. The elevation data of interest is processed by a digital terrain model (DTM) that reconstructs the terrain surface via interpolation methods. External geometry models containing buildings, flora, and pollution sources must also be integrated into the geometrical specification of the scene. These models must be referenced in the DTM. The geometric description of the scenery serves as input for the simulation module as well as for the visualization pipeline. The visualization module represents an open toolbox containing state-of-the-art techniques for scientific visualization combined with new methods developed for air pollution.
Particles and particle systems provide a good solution for the simulation of pollution propagation. A particle system approximates a pollution cloud and its shape with a number of small particles. The simulation module calculates the behavior of the pollutants. It is based on physical laws, which are encoded in the C language. This simulation technique uses particles to approximate complex natural objects or phenomena, like gases. Simple microscopic interactions between the particles

are formulated on the atomic or molecular level to get the macroscopic or global behavior of the
objects or phenomena [BRK 89].

Particles are atomic, discrete entities modelled as point masses in space. A particle contains fixed parameters that control the interactions with other particles. Furthermore, it contains state variables that are modified while the simulation is in progress. A system of laws defines how its state variables change as a function of time and of interactions with other particles. Knowledge bases could be applied to prove the consistency of the system of laws and to evaluate it. In our model each particle is manipulated directly by its "neighbors" and, through a global information exchange, by particles farther away. Each particle contains a neighbor list. The list must be generated and modified quickly. It is important for the correctness of the simulation that this list is always up-to-date. Particles will only interact with particles in the neighbor list. The number of particles in the list can change while the simulation is in progress. Such a change must be detected, and the neighbor lists must then be updated according to events. The system of laws must be proved and corrected when the occasion arises [KUE 90].
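A minimal C sketch of such a particle record may clarify the data layout; the field names and the fixed neighbor-list capacity are our assumptions, not the system's actual definitions:

#include <stddef.h>

#define MAX_NEIGHBORS 32  /* assumed capacity; the paper does not fix one */

typedef struct Particle {
    /* fixed parameters controlling the interactions with other particles */
    double mass;
    double interaction_radius;

    /* state variables, modified while the simulation is in progress */
    double position[3];
    double velocity[3];

    /* neighbor list; must be kept up-to-date whenever particles
       enter or leave the interaction radius */
    struct Particle *neighbors[MAX_NEIGHBORS];
    size_t neighbor_count;
} Particle;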

3. DESCRIPTION OF THE SCENE

The basis for the simulation module as well as for the visualization module is the geometric descrip-
tion of the scenery. Applications located in the environmental research area provide a digital ter-
rain model (DTM) and geometric information about objects related to the terrain, like buildings
and pollution sources. Thus, the requirements of the modelling tasks are similar to those of low-cost flight simulators. The following sections describe the basic modelling techniques applied in our system.

3.1. Modelling of Terrain and Environment

The basic modelling task for the simulation and visualization of atmospheric data is to reconstruct the surface of the terrain using a digital terrain model. DTMs have been applied to several purposes with great success [GRK 88], [NIS 89], [GRO 90]. Thus, a large number of DTM algorithms have been introduced so far. The most common techniques used for digital terrain modelling are interpolation and triangulation methods for given digitized contour lines [AUS 90], derived from computational geometry. Other methods use remote sensing data and algorithms of computer vision, like shape from shading or stereo matching. The section below describes the interpolation method for contour lines that we use and a post-processing algorithm for an adaptive subdivision of the terrain mesh. Triangulation techniques, like Delaunay triangulation and others, are not considered here. Detailed descriptions of these techniques are given in [ZYD 87] and [AUS 90].

3.1.1. Interpolation techniques


Computational geometry delivers a wide range of interpolation techniques for surface reconstruction from given contour lines. In our system we use the following first-order interpolation according to equation 1:

Z_{WR} = \frac{\sum_{i=1}^{4} Z_{Wi}/d_i}{\sum_{i=1}^{4} 1/d_i}    (1)

with Z_{WR}: altitude of the mesh node to be determined,
Z_{Wi}: altitudes of the neighboring contour lines,
d_i: horizontal and vertical distances to the neighboring contour lines.

The method produces an equidistant mesh according to figure 2.

Fig. 2: Interpolation method (mesh node Z_{WR} at (X_{WR}, Y_{WR}); axes X_W, Y_W, Z_W)
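As a small illustration of equation 1, the following C sketch computes one mesh-node altitude under the assumption of exactly four neighboring contour lines with positive distances; the function name is ours:

/* Sketch of equation (1): first-order (inverse-distance) interpolation
   of a mesh-node altitude Z_WR from the altitudes Z_Wi of the four
   neighboring contour lines at distances d_i > 0. */
double interpolate_node(const double z_w[4], const double d[4])
{
    double num = 0.0, den = 0.0;
    for (int i = 0; i < 4; ++i) {
        num += z_w[i] / d[i];
        den += 1.0 / d[i];
    }
    return num / den;
}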

Figure 3 and figure 4 show some results of this method using the Grand Canyon elevation data as
digital contour lines.

Fig. 3: Mesh of the Grand Canyon

Fig. 4: Gouraud-shaded mesh



3.1.2. Adaptive subdivision


The interpolation technique discussed above produces an inefficiently large number of equidistant mesh patches, particularly in areas with a low elevation gradient. A solution to this problem is an adaptive subdivision method. Neighboring patches can be composed using the normals of the triangles. The criterion of equation 2 follows from the generalized triangle inequality. We start at the lower left patch and check equation 2:

If

\sum_{i=1}^{8} |\vec{n}_i| - \left| \sum_{i=1}^{8} \vec{n}_i \right| < \varepsilon    (2)

with \vec{n}_i: the normals of the triangles,
\varepsilon: threshold value (chosen by the user),

then the 4 patches of the terrain surface can be composed into one larger patch and the corresponding polygons are reduced from 8 to 2. Figure 5 illustrates the method, which can be used as a recursive post-processor for terrain data. The result is a mesh whose subdivision depends on the gradient of the surface normals.

Fig. 5: Post-processing of the terrain data (adaptive subdivision of the terrain mesh)
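A hedged C sketch of this merge test, using the reconstruction of equation 2 given above (eight triangle normals per group of four patches; the naming is ours):

#include <math.h>

/* Sketch of the coplanarity test of equation (2): four patches
   (8 triangles) are merged when the sum of the normal lengths and the
   length of the summed normals differ by less than the user-chosen
   threshold eps (equality holds only for parallel normals, by the
   generalized triangle inequality). */
static double length3(const double v[3])
{
    return sqrt(v[0]*v[0] + v[1]*v[1] + v[2]*v[2]);
}

int patches_mergeable(const double n[8][3], double eps)
{
    double sum[3] = {0.0, 0.0, 0.0};
    double len_sum = 0.0;
    for (int i = 0; i < 8; ++i) {
        for (int k = 0; k < 3; ++k) sum[k] += n[i][k];
        len_sum += length3(n[i]);
    }
    return (len_sum - length3(sum)) < eps;
}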

3.1.3. External geometry


Besides a digital terrain model it is also necessary to include external geometry, like buildings, flora, or pollution sources, which are derived from a geometric modelling system. The geometry files provide a polygonal description of the objects and attributes like reflectance or color. In order to reference the objects in the DTM, the external objects must be transformed from their local coordinate systems into the terrain coordinate system, which is related to the lower left mesh vertex. If the elevation data is given for each vertex of the mesh, the following bilinear interpolation method calculates the altitude of an arbitrary location on the terrain surface using the four vertices of the corresponding patch (see figure 6).
Let z_P be the altitude of an arbitrary point (x_P, y_P) on the terrain surface that must be determined. The following equations define the bilinear function:

z_G = z_1 + u \, (z_4 - z_1), \qquad z_H = z_2 + u \, (z_3 - z_2), \qquad z_P = z_G + v \, (z_H - z_G)    (3)

with z_G: linearly interpolated altitude from z_1 and z_4,
z_H: linearly interpolated altitude from z_2 and z_3,
(u, v) \in [0, 1]^2: patch-local coordinates of the point (x_P, y_P).

Thus, the reference of an object in the terrain is accomplished by the translation vector (x_P, y_P, z_P). This method is very efficient, especially when object motions or terrain-following flight simulations must be achieved in real time.

Fig. 6: Bilinear interpolation of arbitrary surface points
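The bilinear lookup of equation 3 can be sketched in C as follows; the patch-local coordinates (u, v) and the corner ordering z1..z4 are assumptions consistent with the definitions above:

/* Sketch of equation (3): bilinear altitude within one patch, with
   (u, v) in [0,1]^2 computed from (x_P, y_P) and the patch spacing;
   z1..z4 are the four patch-corner altitudes. */
double bilinear_altitude(double z1, double z2, double z3, double z4,
                         double u, double v)
{
    double z_g = z1 + u * (z4 - z1);   /* interpolated from z1 and z4 */
    double z_h = z2 + u * (z3 - z2);   /* interpolated from z2 and z3 */
    return z_g + v * (z_h - z_g);      /* z_P                          */
}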

3.2. Modelling of Wind Fields


Wind has a significant influence on the distribution behavior of pollutants. For our application we developed a meso-scale wind model that acts on the particles like an external force field. In our model the wind is divided into two components, a horizontal and a vertical one. The vertical wind motion is governed by gravity and the pressure gradient; the friction force is not considered here.

Therefore, the density of the surrounding air and the density of an air packet of interest define the vertical acceleration dv_v/dt:

\frac{dv_v}{dt} = a_v = g \cdot \frac{\varrho_u - \varrho_p}{\varrho_p}    (4)

with acceleration of gravity g,
density of air \varrho_u = 1.25 \cdot 10^{-3} g/cm^3,
density \varrho_p of the air packet of interest.

The horizontal wind motion is based on the pressure gradient G, the Coriolis force C, and the centrifugal force Z. From G + C + Z = 0 the horizontal wind velocity v_h results:

v_h = 2\Omega \sin\varphi \cdot r_s    (5)

with angular velocity of the earth \Omega = 7.29 \cdot 10^{-5} s^{-1},
radius of curvature r_s = \frac{1}{(2\Omega \sin\varphi)^2 \varrho} \left| \frac{dp}{dn} \right|,
geographic latitude \varphi (for Frankfurt, FRG: 50°),
pressure difference dp,
isobar distance dn,
density \varrho.

A simulated air mass containing pollutants rises or sinks depending on temperature differences. When the temperature of the simulated air mass is higher than the temperature of the surrounding air, the air mass will increase its height. The air temperature changes adiabatically according to the temperature gradient for dry air \Gamma:

\Gamma = -\frac{dT}{dz} = \frac{g}{c_p} \cdot \frac{T}{\overline{T}}    (6)

with gravity g,
specific heat at constant pressure c_p = 1.0078 \cdot 10^{7} erg \cdot g^{-1} \cdot grad^{-1},
temperature T,
surrounding temperature \overline{T}.

Therefore, we get the change rate of the air temperature with the height z:

T(z) = T_0 - \Gamma \cdot z    (7)
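A short C sketch of the vertical part of the wind model, equations (4), (6) and (7), in the CGS units used by the text; the function names are ours:

#define G_ACCEL  981.0      /* acceleration of gravity, cm/s^2         */
#define RHO_AIR  1.25e-3    /* density of the surrounding air, g/cm^3  */
#define CP_DRY   1.0078e7   /* specific heat of dry air, erg/(g*K)     */

/* equation (4): a_v = g * (rho_u - rho_p) / rho_p */
double vertical_acceleration(double rho_p)
{
    return G_ACCEL * (RHO_AIR - rho_p) / rho_p;
}

/* equation (6): dry-adiabatic gradient Gamma = (g / c_p) * (T / T_bar) */
double dry_adiabatic_gamma(double t, double t_bar)
{
    return (G_ACCEL / CP_DRY) * (t / t_bar);
}

/* equation (7): temperature at height z for surface temperature T0 */
double temperature_at(double t0, double gamma, double z)
{
    return t0 - gamma * z;
}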

4. SIMULATION OF POLLUTION BEHAVIOR

The simulation module is used to store, manipulate and execute the physical laws that embody the
interactions between the pollutant particles distributed by the pollution source. The interaction
laws are written in a general fashion to describe the interaction between all participating objects

considering terrain features. Those general laws are modified in order to produce executable code for each specific object involved in the simulation. They describe the microscopic changes, like the position, of each particle, from which the macroscopic or global behavior of the whole system follows. An internal clock coordinates and synchronizes the interpretation of the laws. The principle of this technique is based upon the message-passing methodology of object-oriented programming systems [BGA 87].
The simulation module simply maintains collections of classified laws represented by procedures
containing the laws formulated in C language, and an internal clock. An algorithm determines how
the laws are executed and the clock determines when they are executed. A double looping construct
controls these actions. First, we need to know how many particles are emitted during one
discrete time step. Within the double loop the interaction of each object with every other object is
determined. In this context an object can represent the pollution source, environmental concerns,
or pollutant particles. All participating objects are combined in an object list. The interaction laws
are divided into eight classified groups that are associated with each step of the double loop. The
double looping algorithm of the simulation module is illustrated in figure 7. In the double loop the
calls of the procedures containing the laws are positioned appropriately. The list containing the simulation objects is called list A. List B is a copy of list A. It guarantees that the interaction of every object with every other object may be determined.
call initial_laws();
for (number of iterations)
{
    call init_laws();
    for (each member of list A)
    {
        call init_A_laws();
        for (each member of list B && mem_A != mem_B)
        {
            call calc_laws();
        }
        call term_A_laws();
    }
    call term_laws();
    call action_laws();
    increment clock;
}
call termin_laws();

Fig. 7: Double loop construct

The laws fall into two categories. The first category contains the procedures initial_laws, action_laws, and termin_laws, which provide support functions. The simulation variables can be initialized, and intermediate results can be stored temporarily for visualization or control reasons. The second category consists of the procedures init_laws, init_A_laws, calc_laws, term_A_laws, and term_laws. These laws are used specifically to describe and define the object interaction. They manipulate the collection of interacting objects. In order to formulate interactions common to a number of objects using a single collection of laws, a variable convention was introduced. Specific pointer variables to the object structures were designated as variables for the interaction laws. The headers of the law procedures contain the pointer to the appropriate object. Therefore, the laws are executed with respect to the current object structure. The simulation results in a spatial and temporal list of particles specified with four-dimensional coordinates.
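The pointer convention can be sketched in C as follows; the type and function names are illustrative, not the system's actual identifiers, and only the innermost calc_laws call of figure 7 is shown:

#include <stddef.h>

/* Sketch of the pointer convention for the interaction laws: each law
   procedure receives pointers to the current members of lists A and B,
   so a single collection of laws serves every object pairing in the
   double loop of figure 7. */
typedef struct SimObject SimObject;   /* pollution source, particle, ... */

typedef void (*CalcLaw)(SimObject *mem_a, SimObject *mem_b, double clock);

void apply_calc_laws(SimObject **list_a, SimObject **list_b, size_t n,
                     CalcLaw calc_law, double clock)
{
    for (size_t i = 0; i < n; ++i)        /* each member of list A */
        for (size_t j = 0; j < n; ++j)    /* each member of list B */
            if (list_a[i] != list_b[j])   /* mem_A != mem_B        */
                calc_law(list_a[i], list_b[j], clock);
}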

5. VISUALIZATION

5.1. Terrain Data Visualization

The rendering of terrain and object data is difficult because a large amount of data must be handled. Thus, an accurate representation of each detail in the scenery is not possible, and also not necessary. This leads to a level-of-detail classification, which is very important in the area of real-time applications like flight simulation. Photorealistic rendering is provided with great success by texture mapping of remote sensing data onto the DTM. This method, presented in [NIS 89], is useful especially if the observer is located far away from the scene. For environmental data visualization, hardware-supported illumination models provide a sufficient method to display the scenery.

5.2. Environmental Data Visualization

The efficiency of sophisticated visualization techniques for environmental and atmospheric data is demonstrated in [PAP 88] and [HIS 89]. The algorithms applied in this research area, which represent pollutant clouds, are derived from volume rendering for medical imaging applications [LEV 88] [KAU 90]. Iso-surface reconstruction plays an important role and descends from the marching cubes algorithm of [LOC 87]. Additional pseudo-color mapping onto the terrain surface [GRO 90a] supplies further information for the user.

5.2.1. Four-dimensional visualization

Particle animation and tracing are the most important tools related to air pollution applications. The simulation process determines spatial and temporal distributions of the pollutant particles representing a cloud over time. Each particle P is specified with four-dimensional coordinates (x_P, y_P, z_P, t_P), and the simulation time steps yield an extra degree of freedom. Thus, the data sets represented by the particles are four-dimensional, discrete, and scalar.
For our real-time applications, fast and efficient methods are necessary rather than photorealistic and expensive rendering techniques for volume data. Modern hardware rendering systems allowing the definition of geometric primitives such as 3D points or globes support these requirements. The problem we have to face results from the simulation process and can be identified in the irregular spatial particle distribution, which must be evaluated to obtain a 3D regular mesh of pollutant concentrations.

5.2.2. Pollution concentration

The algorithm that simulates the pollution behavior requires concentration boundaries in the particle cloud. The following method determines a regular 3D mesh of concentration values based on the particle positions for each time step. First, a bounding box encapsulates the particle cloud. A cube represents a marching, counting volume of size V_A and is shifted through the cloud according to figure 8.
The counting volume determines the concentration C_{ijk} at the position (i, j, k) in a virtual regular grid. It adds up every enclosed particle. The mesh size is obtained from the bounding box and from the volume V_A. The concentration is obtained according to equation 8:

Fig. 8: Concentration evaluation scheme of a particle cloud (pollution source, particle cloud, and bounding box)

C_{ijk} = \frac{N_{ijk}}{V_A}    (8)

with N_{ijk}: number of enclosed particles,
V_A: counting volume.

The method provides a regular 3D distribution of the concentration values that can be used as an
input set for pseudocolored particles or for reconstructing surfaces with marching cubes [LOC 87].
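A minimal C sketch of the counting-volume method of equation 8; marching the cube cell by cell through the bounding box is equivalent to binning each particle once, as done here, and the grid layout and names are ours:

#include <stddef.h>

/* Bin particle positions into a regular grid over the bounding box;
   each cell value becomes C_ijk = N_ijk / V_A. */
void concentration_grid(const double (*pos)[3], size_t n_particles,
                        const double box_min[3], double cell_edge,
                        int nx, int ny, int nz, double *c /* nx*ny*nz */)
{
    double v_a = cell_edge * cell_edge * cell_edge;   /* counting volume */
    for (int m = 0; m < nx * ny * nz; ++m) c[m] = 0.0;
    for (size_t p = 0; p < n_particles; ++p) {
        int i = (int)((pos[p][0] - box_min[0]) / cell_edge);
        int j = (int)((pos[p][1] - box_min[1]) / cell_edge);
        int k = (int)((pos[p][2] - box_min[2]) / cell_edge);
        if (i >= 0 && i < nx && j >= 0 && j < ny && k >= 0 && k < nz)
            c[(k * ny + j) * nx + i] += 1.0;          /* N_ijk           */
    }
    for (int m = 0; m < nx * ny * nz; ++m) c[m] /= v_a;  /* C = N / V_A  */
}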

6. APPLICATIONS

The following figures demonstrate the flexibility and power of our new system. The terrain data set contains a power plant and surrounding villages, represented with simple models, as shown in figures 9 and 10. The applied wind field flows from the north-east and moves the emitted pollutants away from the power plant up into the air, where they are spread apart.
Figures 11 and 12 show the scenery from different view points, including a pseudo-colored pollution cloud. The colors represent the current concentration of the pollution emission determined with the counting-volume method. The OSF-MOTIF user interface is demonstrated as well. The artificial horizon and the pseudo-colored map widget of the terrain are important features for the navigation tasks. For example, a flight through the scenery can be animated by defining a trajectory interactively in the map. The sample points on the trajectory represent control points for a spline interpolation of the flight to achieve a smooth motion.
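The paper does not name the spline; as one plausible choice, a Catmull-Rom segment through the picked control points could be evaluated as follows (a sketch, not the system's implementation):

/* Sketch of one Catmull-Rom segment between control points p1 and p2
   (p0 and p3 are the neighboring samples on the trajectory); t runs
   from 0 to 1. */
void catmull_rom(const double p0[3], const double p1[3],
                 const double p2[3], const double p3[3],
                 double t, double out[3])
{
    double t2 = t * t, t3 = t2 * t;
    for (int k = 0; k < 3; ++k)
        out[k] = 0.5 * (2.0 * p1[k]
                 + (-p0[k] + p2[k]) * t
                 + (2.0 * p0[k] - 5.0 * p1[k] + 4.0 * p2[k] - p3[k]) * t2
                 + (-p0[k] + 3.0 * p1[k] - 3.0 * p2[k] + p3[k]) * t3);
}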

Fig. 9: Contour representation of the terrain

Fig. 10: Pseudo-Coloring of the elevation data



Fig. 11: Detailed image of a pollutant cloud with pseudo colors


Fig. 12: Modification of the viewing parameters



7. CONCLUSIONS

Our work demonstrates that the use of a digital terrain model, the simulation technique based on particle systems, and a meso-scale meteorological model allows the modeling and analysis of environmental phenomena. Particle systems containing 10^6 particles and more must be created and evaluated. The particles are assumed to be homogeneous. The propagation behavior can be specified via a system of laws dealing with predefined or terrain-specific boundary conditions. After each simulation step, the current situation can be visualized. In the real world, different weather conditions influence the chemical reactions between the pollutants. In a simulation these external influences can be transformed into additional interaction laws to supplement the system of laws and increase the degree of realism of the simulation.

8. PERSPECTIVES

The simulation algorithm is still too time-consuming. New techniques have been presented in the literature that are very fast, but they are too restrictive in their range of applications [GRE 87] [HUM 86] [GRR 88]. They are far from general-purpose tools and are designed to solve one specific problem. The application must be transformed to meet the algorithmic constraints; thus, the simulation error increases and the result will not represent realistic and correct behavior.
On the other hand, powerful hardware rendering systems will make it possible to increase realism in real-time applications. Volume rendering techniques and texture mapping of remote sensing data (Landsat) will be applied to the environmental research area discussed above. Another future research goal is the improvement of 3D interaction with the system. For example, the simulation parameters could be modified by picking the visualized pollution cloud. Thus, we will also consider applications in the virtual reality area.

9. ACKNOWLEDGEMENT

The authors thank their students Klaus Böhm and Martin Zindler for the implementation of the TERRA visualization system, and Prof. Dr.-Ing. J. L. Encarnação for his steady interest in their work.

10. REFERENCES

[AMS 74] American Meteorological Society: "Conference On Cloud Physics", Pergamon Press, 1974.
[AUS 90] Auerbach S., Schaeben H.: "Surface Representations Reproducing Given Digitized Contour Lines", Mathematical Geology, Vol. 22, No. 6, 1990, pp. 723-742.
[BLF 67] Blair T. A., Fite R. C.: "Weather Elements", Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1967.
[BGA 87] Breen D. E., Getto P. H., Apodaca A. A., Schmidt D. G., Sarachan B. D.: "The Clockworks: An Object-Oriented Computer Animation System", Eurographics '87 Proceedings, Elsevier Science Publishers B.V., Amsterdam, The Netherlands, August 1987, pp. 275-282.
[BRK 89] Breen D. E., Kuhn V.: "Message-Based Object-Oriented Interaction Modeling", Eurographics '89 Proceedings, Elsevier Science Publishers B.V., Hamburg, FRG, September 1989, pp. 489-503.
[CAD 66] Cadle R. D.: "Particles in the Atmosphere and Space", Reinhold Publishing Corporation, New York, 1966.
[GRE 87] Greengard L.: "The Rapid Evaluation of Potential Fields in Particle Systems", The MIT Press, Cambridge, MA, 1987.
[GRK 88] Groß M., Koglin H.-J.: "Representation of Planned Overhead Lines - the Optical Impression on the Landscape", IEE Conference Publication No. 297, London, 1988, pp. 151-155.
[GRO 90] Groß M.: "Computer Graphics in Overhead Line Planning - Innovative Methods Using Visibility Analyses", Elektrizitätswirtschaft, No. 6, 1990, pp. 260-268.
[GRO 90a] Groß M.: "An Integrated Simulation and Visualisation System for Environmental Protection", Proceedings 5th Symposium on Computer Science for Environmental Problems, Springer-Verlag, Vienna, Austria, September 1990, pp. 808-817.
[GRR 88] Greengard L., Rokhlin V.: "On the Efficient Implementation of the Fast Multipole Algorithm", Research Report YALEU/DCS/RR-602, February 1988.
[HEN 86] Henselder R.: "Vorschriften zur Reinhaltung der Luft (TA-Luft)", Bundesanzeiger Verlagsgesellschaft, Köln, 1986.
[HIS 89] Hibbard W., Santek D.: "Visualizing Large Data Sets in the Earth Sciences", IEEE Computer, Vol. 22, No. 8, 1989, pp. 53-57.
[HUM 86] Humphries S., Jr.: "Principles of Charged Particle Acceleration", John Wiley & Sons, Chichester, New York, 1986.
[KAU 90] Kaufman A., et al.: "Volume Visualization in Cell Biology", IEEE Visualization '90 Conference Proceedings, IEEE Computer Society Press, Los Alamitos, CA, USA, 1990, pp. 160-167.
[KUE 90] Kuhn V.: "Towards the Simulation of the Pollution Behavior of Airplanes", Proceedings 5th Symposium on Computer Science for Environmental Problems, Springer-Verlag, Vienna, Austria, September 1990, pp. 583-598.
[LEV 88] Levoy M.: "Display of Surfaces from Volume Data", IEEE Computer Graphics and Applications, Vol. 8, No. 5, 1988, pp. 29-37.
[LOC 87] Lorensen W. E., Cline H. E.: "Marching Cubes: A High Resolution 3D Surface Construction Algorithm", ACM SIGGRAPH '87 Proceedings, Vol. 21, No. 4, 1987, pp. 163-169.
[MAS 71] Mason B. J.: "The Physics of Clouds", Clarendon Press, Oxford, 1971.
[NIS 89] Nishita T., et al.: "Three Dimensional Terrain Modelling and Display for Environmental Assessment", ACM SIGGRAPH '89 Proceedings, Vol. 23, No. 3, 1989, pp. 207-213.
[PAP 88] Papathomas T. V., et al.: "Applications of Computer Graphics to the Visualisation of Meteorological Data", ACM SIGGRAPH '88 Proceedings, Vol. 22, No. 4, 1988, pp. 327-334.
[PEW 89] Pentland A., Williams J.: "Good Vibrations: Modal Dynamics for Graphics and Animation", ACM SIGGRAPH '89 Proceedings, Vol. 23, No. 3, July 1989, pp. 215-221.
[PIE 84] Pielke R. A.: "Mesoscale Meteorological Modeling", Academic Press, 1984.
[PJP 90] Page B., Jaschke A., Pillmann W.: "Applied Computer Science in Environmental Protection", Parts 1 and 2, Informatik Spektrum, Springer-Verlag, 1990.
[ROG 76] Rogers R. R.: "A Short Course in Cloud Physics", Pergamon Press, Oxford, New York, Toronto, 1976.
[SPR 70] Sproull W. T.: "Air Pollution and Its Control", Exposition Press, New York, 1970.
[ZPS 89] Zeltzer D., Pieper S., Sturman D. J.: "An Integrated Graphical Simulation Platform", Graphics Interface '89 Proceedings, 1989, pp. 266-274.
[ZYD 87] Zyda M., et al.: "Surface Construction from Planar Contours", Computers & Graphics, Vol. 11, No. 4, 1987, pp. 393-408.

Markus Groß was born in Neunkirchen/Saar, Germany, in 1963. He studied electrical engineering at the University of Saarbrücken where, in 1986, he received the Dipl.-Ing. degree. During that time he also held a scholarship from Siemens AG, Munich. In 1986, he received the VDE-Saar award. In 1989, he received the Ph.D. (referees: Koglin/Encarnação) on an application of computer graphics and image analysis. In 1990, he joined the Computer Graphics Center in Darmstadt. His research interests include scientific visualization, rendering techniques, and neural network applications for visual pattern recognition and computer vision. At the Technical University of Darmstadt he also lectures on "Human Vision and Computer Graphics". Markus Groß has published numerous refereed technical papers in application areas of computer graphics and image analysis.

M. Groß is a member of the VDE.

Mail Address: Dr.-Ing. Markus Groß, Computer Graphics Center, Wilhelminenstraße 7, D-6100 Darmstadt, Germany.
E-Mail: gross@zgdvda.uucp

Volker Kuhn is a computer scientist, and currently a Ph.D. student and researcher at the Technical University of Darmstadt, Germany. He started work there in particle-based modelling in 1988. His research interests include computer graphics, dynamic simulation and animation, physically-based modelling, and environmental problems. He received an MS in computer science from the Technical University of Darmstadt in 1988.

V. Kuhn is a member of the ACM, ACM SIGGRAPH, and ACM SIGSIM. He is also a member of the IEEE and the IEEE Computer Society.

Mail Address: Dipl.-Inform. Volker Kuhn, Technical University of Darmstadt, Computer Science Department, Graphics Interactive Systems, Wilhelminenstraße 7, D-6100 Darmstadt, Germany
Synchronized Acquisition of Three-Dimensional
Range and Color Data and its Applications
Yasuhiko Watanabe and Yasuhito Suenaga

ABSTRACT
This paper presents a method of acquiring three-dimensional human face
data using a newly developed device that acquires three-dimensional range
data and surface color data at the same time. Cylindrical range data is
measured by a laser light source and a CCD sensor with a resolution of 512
vertical scanlines, and 256 points per scanline. The color data is acquired
as a cylindrical projection image having 512 by 256 pixels, 24 bits/pixel (8
bits each for red, green, and blue). The scanner has been successfully
applied to the scanning of human faces and other three-dimensional objects.
This paper also details various applications using the scanner.

Key Words: three-dimensional range data, surface color data, scanner, facial expression, clustering, template image

1. INTRODUCTION

The authors have been conducting research on computer vision and graphics
for the recognition and synthesis of human images. These research areas are
important in realizing a better human interface and developing a model-based CODEC (Mase 1989; Akimoto 1986; Akimoto 1990; Watanabe 1989).

This paper presents a method of three-dimensional sensing using a newly developed device that acquires three-dimensional range data and surface
color data at the same time. There have already been many reports of three-
dimensional range finders and scanners (Inokuchi 1990). Almost all of them
are designed for the detection of three-dimensional distance or range data
only. Although ordinary color data is commonly regarded as easy to acquire
with ordinary color TV cameras or color scanners, synchronizing the two
sets of data is not easy as explained below.

Roughly speaking, for such applications as the recognition and synthesis of various objects, two kinds of data are used: three-dimensional shape (range) data and texture (color) data. Usually, these two data types are
acquired at different times, using different acquisition systems. Due to the
time delay between scans, especially for movable objects like human faces,
it is difficult to match these two kinds of data (Williams 1990). Moreover,
camera angles and lighting conditions may differ during the acquisition of
each data set. Though adjustment may be possible to some extent, acquiring
consistent data for three-dimensional objects having various shapes and
colors is practically impossible.

Thus, it is necessary to acquire range and color data at the same time. This
synchronized three-dimensional range and color data set allows the
accurate reproduction of various objects. The scanner is designed based on
these considerations.

This paper details three typical applications that effectively use the
scanner: (1) facial expression generation, (2) three-dimensional area
extraction and (3) face image database preparation.

2. SCANNER
The ECHO scanner (4020/PS Rapid three-dimensional Digitizer), manufactured by Cyberware Laboratory, USA, is a laser scanner for measuring cylindrical range data of objects, as illustrated in Figure 1. According to our specifications, the manufacturer installed a TV-camera based color data acquisition subsystem in the digitizer unit of the scanner, while preserving its original functions, resulting in the first cylindrical scanner in the world to measure three-dimensional range data and surface color data at the same time. The object to be measured is placed on a center table, and the digitizer unit, with a laser light source, a CCD sensor, a nonflickering light source, and a CCD color TV camera, rotates 360 degrees horizontally around it, synchronously acquiring cylindrical range and color data. It takes the scanner fifteen seconds to complete a full scan.

Fig. 1 Three-dimensional face data acquisition by the synchronized cylindrical range and color scanner.

The cylindrical range data is measured by the laser light source and the CCD
sensor with a resolution of 512 vertical scan lines, and 256 points per
scanline. Measurement resolution is within 0.7 mm when measuring a
cylinder 350 mm high and 350 mm in diameter. Actually, the cylindrical
range data is converted to 512 x 256 sets of x, y and z coordinate values.
Surface color data is measured by a CCD color TV camera. The color data is
acquired as a cylindrical projection image having 512 by 256 pixels, 24
bits/pixel (8 bits each for red, green, and blue). Since the cylindrical range
and surface color are measured at the same time, all data is acquired in a
synchronized form.
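Converting a cylindrical sample (scanline s, row r, measured radius) to x, y, z coordinates can be sketched in C as below; the scaling constant row_height is an assumption, not the scanner's actual calibration:

#include <math.h>

/* Sketch of converting one cylindrical sample to Cartesian x, y, z:
   scanline s (0..511) gives the rotation angle, row r (0..255) the
   height, and radius the measured distance from the rotation axis. */
void cyl_to_xyz(int s, int r, double radius, double row_height,
                double xyz[3])
{
    const double pi = 3.14159265358979;
    double theta = 2.0 * pi * (double)s / 512.0;
    xyz[0] = radius * cos(theta);
    xyz[1] = radius * sin(theta);
    xyz[2] = (double)r * row_height;
}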

Figures 2-1, 2-2, 2-3 and 2-4 show the measured results of a terrestrial
globe, a doll, a pot and a ski boot, respectively. In each of the figures, (a)
shows the wire-frame of the three-dimensional range data, and (b) shows
the cylindrical color data. Both range data and color data are easily
processed to rebuild the complete three-dimensional shape (with color) as
shown in (c) of each figure. Figures 3 (a) and (b) show an example of
acquired cylindrical range and color data for a human head. Figure 3 (c)
shows the human head images generated from the combination of range and
color data using various viewing points.

3. APPLICATIONS

3-1 Facial Expression

There are many Computer Graphics applications that produce facial expressions by moving only a few surface control points on a three-dimensional face model. The surface around the control points is modified according to specified functions to produce appropriate skin deformation.
The texture mapping technique is also used to reproduce a realistic
appearance.

Synthesized facial expressions were produced using the data acquired by the
scanner. Since the three-dimensional shape data includes both three-
dimensional range and surface color data, realistic facial expressions are
easily produced without texture mapping as in conventional graphics
pipelines. In Figure 4, the left-most frames are generated from the original

(a) Wire-frame (b) Color data (c) 3-D shape


Fig. 2-1 Terrestrial globe.

(a) Wire-frame (b) Color data (c) 3-D shape


Fig. 2-2 Doll.

(a) Wire-frame (b) Color data (c) 3-~ shape


Fig. 2-3 Pot.

(a) Wire-frame (b) Color data (c) 3-D shape


Fig. 2-4 Ski boot.

(a) Front and side views made from range data only

(b) Color data


(Cylindrical projection image, 512x256 pixels, 24bits)

(c) Front and side views made from range and color data
Fig. 3 Cylindrical range and color data.

Fig. 4 Three-dimensional facial animation (smile).



data. All other frames were generated to simulate a smiling expression by controlling only four points in the range data set. Both sides of the lips were moved up and both sides of the eyes were moved a little bit down to create a smiling expression. The surfaces around these points were moved in inverse proportion to their distance from the specified points.
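A hedged C sketch of this deformation rule (the falloff details, the epsilon guard, and the names are our assumptions):

#include <math.h>

/* Displace one surface vertex by the control-point offsets, weighted
   inversely by its distance to each control point; a small eps avoids
   division by zero at the control points themselves. */
void deform_vertex(double v[3], int n_ctrl,
                   const double ctrl[][3], const double offset[][3])
{
    const double eps = 1e-3;
    for (int c = 0; c < n_ctrl; ++c) {
        double dx = v[0] - ctrl[c][0];
        double dy = v[1] - ctrl[c][1];
        double dz = v[2] - ctrl[c][2];
        double w = 1.0 / (sqrt(dx*dx + dy*dy + dz*dz) + eps);
        for (int k = 0; k < 3; ++k)
            v[k] += w * offset[c][k];
    }
}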

3-2 Three-dimensional Area Extraction

As described in the previous section, the acquired three-dimensional shape data can be manipulated using the color data. Since the color data is
composed of a 2-dimensional cylindrical projection image, conventional
image processing techniques can be applied to this data. Figure 5 (a) shows
a face that was scanned. Figure 5 (b) shows the color data taken by the
scanner. Figure 5 (c) shows the image clustered in RGB color space by using
the MDL (Minimum Description Length) clustering technique (Wallace 1990).
The three-dimensional shape data could be easily extracted using this
clustered color image data. Figure 6 shows the extracted three-dimensional
area recognized as "skin area." This proves that the two-dimensional color
data is very useful in extracting the desired three-dimensional shape from
original data.

Fig. 5 Area extraction: (a) face image, (b) color data, (c) clustered in RGB space.

Fig. 6 Extracted area.

3-3 Face Image Database

In the development of face recognition systems, the most important thing is the construction of an extremely large face image database to allow
accurate matching. Since facial pattern matching must be done under
various conditions, the preparation of a sufficient number of template face
images is not easy.

When using the scanner, the three-dimensional shape data can be used for normalization of the face position. Figure 7 shows the determination of standard axes in this experiment. Both translation and rotation are done in three-dimensional space. Scanning emulations are done using the normalized three-dimensional shape, as shown in Figure 8.

Fig. 7 Standard axes.
Only one vertical scanline is rendered in each step, and the process is
repeated for 512 steps to obtain a new cylindrical projection image. This
process produces template face images for various conditions quite easily.
Figure 9 shows the synthesized color data from the normalized three-
dimensional shape. This color data can be used as a face image template for
matching.
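The scanning emulation loop can be sketched in C as follows, assuming a hypothetical render_scanline helper that renders one image column for a given viewing angle:

/* Sketch of the scanning emulation: the observation point is rotated
   around the normalized head in 512 steps, and in each step a single
   vertical scanline (one column of 256 RGB pixels) is rendered into
   the new cylindrical projection image. */
void render_scanline(double angle, unsigned char column[256][3]);

void emulate_scan(unsigned char image[512][256][3])
{
    const double pi = 3.14159265358979;
    for (int step = 0; step < 512; ++step) {
        double angle = 2.0 * pi * (double)step / 512.0;
        render_scanline(angle, image[step]);  /* one column per step */
    }
}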

Fig. 8 Emulation of scanning (cylindrical projection image generated in 512 steps from a rotating observation point).


Fig. 9 Template face image.



4. CONCLUSION

The authors have been using the scanner mainly for acquiring three-
dimensional data of human faces, hands, legs, and trunks. The proposed
method is very useful in making a complete database for three-dimensional
images of the human body. This database is needed to improve the human
interface and for the development of a model based CODEC. The method
proposed in this paper is a general purpose three-dimensional data
acquisition method. It has opened the way for the acquisition of highly
accurate three-dimensional data through the direct synchronized
measurement of shape and color of objects. The method is widely
applicable to the preparation of very practical three-dimensional databases.

ACKNOWLEDGEMENTS

The authors wish to thank Mr. David Addleman, President, Cyberware Laboratory Inc., for his cooperation in realizing the scanner. The authors
thank Dr. Takahiko Kamae, Director, NTT HI Labs, Dr. Yukio Kobayashi,
Executive Manager, Visual Perception Lab, and the members of the VP Lab,
for their encouragement and valuable discussions. The authors also thank
Mr. Pierre Poulin for assisting in the experiments and Dr. Richard Wallace
for improving the manuscript.

REFERENCES

K. Mase, Y. Watanabe, Y. Suenaga (1989) A real-time head motion detection system. SPIE Workshop on Sensing and Reconstruction of Three-Dimensional Objects and Scenes, Vol. 1290, pp. 262-269.
L. Williams (1990) Performance-Driven Facial Animation. Computer Graphics, Vol. 24, No. 4, pp. 235-242.
R. S. Wallace, Y. Suenaga (1990) Color face image segmentation using MDL clustering. 1990 Spring National Convention Record of IEICE, SD-11-3, pp. 379-380.
S. Inokuchi (1990) State of the art of 3D sensing technologies. Gazo Lab., Vol. 1, No. 4, pp. 44-47 (in Japanese).
T. Akimoto (1986) Expressive facial animation by 3D jaw model and automatic shape modification. NICOGRAPH '86, pp. 207-213 (in Japanese).
T. Akimoto, R. S. Wallace, Y. Suenaga (1990) Automatic creation of facial model for generating facial images. IAPR Workshop on Machine Vision Applications '90, pp. 291-294.
Y. Watanabe, Y. Suenaga (1989) Drawing human hair using wisp model. CG International '89, pp. 691-700.


Yasuhiko Watanabe is Senior Research Engineer in the Visual Perception Laboratory of the NTT Human Interface Laboratories. He is presently engaged in research on 3D model-based coding systems. Since joining the Electrical Communications Laboratories, NTT, in 1981, he has been working on facsimile communication systems and Videotex communication systems.

He received the Bachelor's degree from Niigata University, Niigata, Japan, in 1981. He is a member of the Information Processing Society of Japan.
Address: Visual Perception Laboratory (420C), NTT Human Interface Laboratories, 1-2356, Take, Yokosuka-Shi, Kanagawa, 238-03 Japan.
CS-Net: watanabe%nttcvg.NTT.jp@relay.cs.net

Yasuhito Suenaga is Senior Research Engineer, Supervisor, of the Visual Perception Laboratory in the NTT Human Interface Laboratories. He leads a research group on computer graphics and vision. Since joining the Electrical Communications Laboratories, NTT, in 1973, he has been engaged in research on image processing.

He received the B.S., M.S., and Ph.D. degrees in electrical engineering from Nagoya University, Nagoya, Japan, in 1968, 1970 and 1974, respectively. He is a member of the Institute of Electronics and Communication Engineers of Japan, and the Information Processing Society of Japan.
Address: Visual Perception Laboratory (420C), NTT Human Interface Laboratories, 1-2356, Take, Yokosuka-Shi, Kanagawa, 238-03 Japan.
CS-Net: suenaga%nttcvg.NTT.jp@relay.cs.net
Piecewise Planar Surface Models from Sampled
Data
David A. Southard

ABSTRACT

Interactive visualization of three dimensional data requires construction of a geometric model for
rendering by a graphics processor. We present an automated method for transforming dense,
uniformly sampled data grids to an irregular triangular mesh that represents a piecewise planar
approximation to the sampled data. The mesh vertices comprise surface-specific points, which
characterize important surface features. We obtain surface-specific points by a novel application of
linear and non-linear filters, and thresholding. We define a procedure for constructing a
triangulation, derived from a Delaunay triangulation, that conforms to the sampled data. In our
example application, modeling a terrain surface over a large area, an 80% reduction in polygons
maintains an acceptable fit. This method also extends to the tessellation of images. Applications
include scientific visualization and construction of virtual environments.

Key words: geometric modeling, irregular mesh, curvature features, Delaunay triangulation,
terrain modeling.

INTRODUCTION

The surfaces of natural objects are complex and irregular, yet most attempts to model natural
surfaces use regular polygonal meshes. To capture features in detail, one must sample the surface at
a high spatial sampling rate. Unfortunately, this procedure over-samples local regions exhibiting
low variation, leading to redundancy in the geometric model. If the number of polygons required to
model a surface could be reduced, there would be an improvement not only in visualization response
times, but also in the accuracy achievable with a given polygon budget.

Background

For interactive and real-time visualizations, ray-tracing techniques cannot be used. The geometrical
primitives offered on graphics workstations are the quickest, most convenient way to render such vi-
sualizations. If the surface model uses spline surface patches, one must convert the shape to a mesh
of polygons for rendering on a graphics processor.

The most common way of modeling surfaces is to construct a regular grid of sample values over the
surface. The grids formed are usually quadrilateral, although a few applications use triangular or


hexagonal grids. The grid points constitute the vertices of a regular polygonal mesh covering the surface.

Although regular grids are convenient, they are arbitrary with respect to the shape of the surface itself. Mark (1979) argues that data structured on the phenomenon itself, instead of on an arbitrary grid structure, improves both the accuracy and the compactness of terrain models. A grid point, for example, does not necessarily correspond to any significant feature of the surface. More likely, grid points will straddle the features, resulting in errors and visible artifacts introduced by the sampling grid. It is necessary to sample the surface using a very fine grid to capture significant features, and to reduce artifacts introduced by the sampling grid itself. The number of polygons used to model the surface increases as the square of the spatial sampling rate.

In contrast to the surface-arbitrary sampling of regular grids, the properties of the surface itself determine the selection of surface-specific points. Mark (1975) shows that terrain surface models constructed from surface-specific points are superior as a basis for database characterization of terrain surfaces. Unfortunately, much of the data acquired automatically is inherently gridded. Interpretation of these data would benefit from a way to extract surface features automatically.

Previous Work

Johnson and Rosenfeld (1975) determined surface-specific points by searching for local maxima and minima. Peucker and Douglas (1975) experimented with similar techniques. They used local elevation profiles to characterize terrain features. They also used a "climbing" technique to disqualify points that they did not consider to be surface features. Fowler and Little (1979) applied Peucker and Douglas' techniques to an automated terrain modeling system, which converted sampled digital terrain models to triangulated irregular network (TIN) terrain models. Recently, Scarlatos has explored techniques for detecting surface features (1990a), and for constructing triangulated surface models (1990b).

Approach

Geographers classify terrain features as

a. peaks
b. ridges
c. valleys, ravines
d. passes, where ridges and valleys intersect
e. pits, depressions
f. breaks, changes in slope.

These features form a necessary and sufficient set of features to characterize any terrain surface.
Consider a plan view, or map, of an area. We see that peaks, pits, and passes are point features,
whereas ridges, valleys, and breaks are linear features. We can model the areas between these fea-
tures as planar surfaces.

A common attribute of each of these six classifications is that each describes a kind of curvature in
the surface. In this context, we define curvature simply as a change in slope. Previous approaches

to the detection of surface-specific points


relied primarily on comparisons of
elevation values. These methods tend to
ignore breaks in slope, because they do
not represent local extrema. Breaks are
necessary for a complete surface
description. We avoid this problem by
concentrating on detecting curvature,
which is a characteristic of all features.

Our goal is to create a triangular


network of planar facets, constructed
from surface-specific points. We want
to tailor the size and shape of each facet
Threshold Sampling
Value Rate to fit the underlying surface. Response
time is a critical performance criterion in
interactive visualization applications.
We want to reduce the number of
polygons in the surface model as much
as possible, and still maintain an
acceptable visual appearance.

Overview

We describe, in succeeding sections, our method for selecting surface-specific points and for constructing triangulated surface models. To assess the value of this method, we present an empirical error analysis, in comparison with terrain models derived from regular grids. Figure 1 illustrates an overview of this analysis.

Fig. 1. Overview of Terrain Model Analysis. The data path on the left represents the method presented here; the data path on the right represents the conventional technique.

Throughout this exposition, we will use a terrain modeling application as our running example. Our data set is a rectangular grid of 721,801 terrain elevation samples. Each sample is measured in meters above mean sea level. The samples occur at intervals of approximately 100 m. The elevation samples are precise to ±1 m; the absolute vertical accuracy of an elevation measurement is ±30 m. These samples represent an area of about 10,000 km². This area contains a variety of land formations, including mountains, plains, and river valleys. This data set is typical of the digitized terrain elevation data that are widely available.

SELECTION OF SURFACE-SPECIFIC POINTS

We represent the surface function with a rectangular grid of sample values. We write this sample function array as F, and a particular sample element as $F_{ij}$. We use a cross-correlation (a form of convolution) mask approach, common in image processing applications (Gonzalez, Wintz 1987), to implement digital linear filtering operators. An operator is represented by an n×n array H. The discrete correlation operation $\circ$ is defined by

G_{ij} = (H \circ F)_{ij} = \sum_{k=-\lfloor n/2 \rfloor}^{\lfloor n/2 \rfloor} \; \sum_{l=-\lfloor n/2 \rfloor}^{\lfloor n/2 \rfloor} H_{kl}\, F_{i+k,\, j+l} \qquad (1)

We extend F by $\lfloor n/2 \rfloor$ rows and columns at the edges so that index wrap-around errors do not occur. We discard the additional rows and columns after the cross-correlation operation so that the output array has the same dimensions as the input array.
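As a concrete illustration, the following Python sketch implements the correlation of equation (1); the function name correlate is our own, NumPy is assumed, and edge replication is assumed as the extension rule:

    import numpy as np

    def correlate(F, H):
        """Discrete cross-correlation of sample array F with an n-by-n mask H (Eq. 1).
        F is extended by floor(n/2) rows and columns (edge replication assumed)
        so that no index wrap-around occurs; the extension is then discarded,
        so the output has the same dimensions as the input."""
        n = H.shape[0]
        r = n // 2
        Fp = np.pad(F.astype(float), r, mode='edge')
        G = np.zeros(F.shape, dtype=float)
        for k in range(n):
            for l in range(n):
                G += H[k, l] * Fp[k:k + F.shape[0], l:l + F.shape[1]]
        return G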

Low-Pass Filtering

The first task in selecting surface-specific points is to ensure that the data are band-limited. If the data are noisy or contain sampling precision artifacts, spurious features will become apparent during processing. The first step, therefore, is to process the data with a digital low-pass filter. Several selections for H are appropriate for low-pass filtering. We use

H = \frac{1}{16} \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix} \qquad (2)

which approximates a spatial low-pass filter with a sharp cut-off at one-half the spatial sampling frequency.
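With the correlate sketch above, band-limiting the elevation grid is then a single mask application; the array name F for the raw samples is assumed:

    H_lowpass = np.array([[1, 2, 1],
                          [2, 4, 2],
                          [1, 2, 1]]) / 16.0   # the mask of Eq. 2

    F_smooth = correlate(F, H_lowpass)         # band-limited elevation samples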

Curvature Analysis

We base the selection of surface-specific points on an analysis of the curvature in the sampled surface data. Curvature can be detected by use of a second-derivative operator. The Laplacian operator,

\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \qquad (3)

can detect curvature, provided that we disregard the partial-derivative cross-products $\partial^2/\partial x\,\partial y$ and $\partial^2/\partial y\,\partial x$. We apply the Laplacian to the function describing the surface, and then apply a threshold to the absolute value of the result. Values near zero reflect a nearly planar region, which will be represented by the interior of a planar facet. Absolute values that exceed the threshold represent positions with significant curvature, and thus comprise surface-specific points.

Digitally, we use second-difference operators to approximate the Laplacian. In one dimension, the Laplacian may be expressed as $\Delta^2 f_i = f_{i-1} - 2 f_i + f_{i+1}$. The Laplacian is a linear operator and, as with low-pass filtering, it can be implemented using a correlation mask. Pratt (1978) provides several forms of the digital Laplacian in two dimensions:

\begin{bmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{bmatrix} \qquad (4)

\begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix} \qquad (5)

\begin{bmatrix} 1 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 1 \end{bmatrix} \qquad (6)

The first version (4) uses a cross-shaped neighborhood, summing the second differences in the vertical and horizontal directions. The second form (5) also includes the second differences along the diagonal directions. According to Prewitt (1970), the third form (6) actually represents the cross term of the bi-Laplacian operator,

\nabla^4 = \frac{\partial^4}{\partial x^4} + 2\,\frac{\partial^4}{\partial x^2 \partial y^2} + \frac{\partial^4}{\partial y^4} \qquad (7)

Intuitively, this operator responds to "twists" and "wrinkles" in the surface formations. All forms of the Laplacian will yield a value of zero in regions of constant gradient. We chose the second form (5) for this analysis.
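In the same framework, the curvature response can be computed by correlating with form (5) and taking absolute values; a minimal sketch, assuming the band-limited array F_smooth from the low-pass step:

    H_laplacian = np.array([[-1, -1, -1],
                            [-1,  8, -1],
                            [-1, -1, -1]])     # form (5)

    # Zero in regions of constant gradient; large values mark significant curvature.
    curvature = np.abs(correlate(F_smooth, H_laplacian))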

Our aim is to let the surface-specific points (peaks, pits, passes, and bends in ridges, valleys, and breaks) form the vertices of our final polygonal mesh. The edges of the polygons will then correspond to the straight sections of the linear features (ridges, valleys, and breaks). Regions with little or no curvature will be modeled as the interiors of the polygons.

Neighborhood Ranking

The first impulse is to select an absolute "curvature" value to use as the threshold for any given area. We find that this technique produces an adequate surface model in regions exhibiting a high degree of variation, such as mountainous areas, and a completely inadequate model of regions with low curvature, such as the plains. The solution is to make the threshold adaptive to each local region. We have articulated a form of non-linear filter, similar to a median filter, that provides good results. We call it a neighborhood rank filter, because we define a neighborhood and rank the absolute value of the central point with respect to the surrounding points in the neighborhood. We define the rank as the number of points in the neighborhood that are less than the central value. We use a square-shaped neighborhood of size p×p; circular neighborhoods are possible as well. Practical values for p range from 3 to 7.
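A direct, unoptimized sketch of this rank filter follows; the square window and the strict less-than comparison mirror the definition above, while the function name is our own:

    def neighborhood_rank(A, p=5):
        """Rank each element of A against its p-by-p neighborhood: the rank is
        the number of the p*p - 1 surrounding values that are less than the
        central value, so outputs lie in [0, p*p - 1]."""
        r = p // 2
        Ap = np.pad(A, r, mode='edge')
        rank = np.zeros(A.shape, dtype=int)
        for k in range(p):
            for l in range(p):
                if k == r and l == r:
                    continue  # skip the central point itself
                rank += Ap[k:k + A.shape[0], l:l + A.shape[1]] < A
        return rank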

Adaptive Thresholding

The neighborhood rank filter results in an output array with values in the range [0, p² − 1]. We now choose a threshold in this range as a criterion for selection from the band-limited elevation sample grid. In effect, this procedure adaptively adjusts the threshold to pick points with locally significant curvature. The selected set of points comprises the surface-specific points, which are distributed irregularly, yet evenly, over the sample area.
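Point selection then reduces to a comparison against the chosen rank; a sketch building on the previous snippets, where the threshold value shown is illustrative only:

    rank = neighborhood_rank(curvature, p=5)
    rank_threshold = 20                      # illustrative; any value in [0, p*p - 1]

    rows, cols = np.nonzero(rank >= rank_threshold)
    points = np.column_stack([cols, rows])   # (x, y) surface-specific points
    heights = F_smooth[rows, cols]           # elevations at those points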

TRIANGULATION

Once the adaptive thresholding procedure has selected the surface-specific points, we must construct
a triangulated irregular network (TIN) from these points. Our sample array is a projection of three
dimensional points onto a plane. Although we are constructing a three dimensional surface model, a
two dimensional triangulation algorithm is appropriate for this class of problem. Boissonnat (1984)
showed that this practice is valid, provided that the samples constitute a proper, single valued func-
tion, and that the planar area's dimensions are less than the principal radius of curvature of the
three dimensional object. For terrain modeling, we are concerned with points that lie on the surface
of the earth. The principal radius corresponds to the radius of the earth, which is about 6378 km.
Our example problem is well within this proportion.

Delaunay Triangulation

The Delaunay triangulation in the plane can be constructed from an arbitrary distribution of points.
Lawson (1977), and Preparata and Shamos (1985) discuss the mathematical properties of the Delau-
nay triangulation. Briefly, the Delaunay algorithm triangulates solely from the positions of the
points. The Delaunay triangulation of a set of points is unique. It is optimal by the circle criterion,
which states that a circle defined by any three mutually adjacent vertices in the triangulation contains
no other vertex of the triangulation. This criterion results in the most regular and equiangular
triangulation possible. Our implementation (Southard 1991) uses an asymptotically optimal,
O(n log n) divide-and-conquer algorithm for Delaunay triangulation developed by Lee and Schachter
(1980). This implementation is suitable for large data sets, such as the one we use in our terrain modeling example.
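Our implementation follows Lee and Schachter's divide-and-conquer algorithm; for experimentation, an off-the-shelf routine such as SciPy's Delaunay (a Qhull-based substitute, not the paper's code) yields the same unique triangulation of the selected points:

    from scipy.spatial import Delaunay

    tri = Delaunay(points)   # points: the (x, y) array from the selection step
    # tri.simplices is an (m, 3) array of vertex indices, one row per triangle.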

Triangulation Conditioning

A difficulty with this use of the Delaunay triangulation is that, because it triangulates only from the
positions of the points, it disregards the interconnectedness of the underlying, surface-specific,
linear features. There is no guarantee that an edge in the triangulation will follow a linear feature.
For example, instead of following a ridge line, an edge may connect across a ravine to the opposite
ridge. This trait results in gross errors in the surface model.

The general optimal triangulation problem is difficult. Using naive programming techniques, the general triangulation problem exhibits exponential growth in running time. The simplicity of curvature analysis, on the other hand, using a Laplacian operator to select surface-specific points, is extremely attractive. Our approach is to perform a series of conditioning operations on the Delaunay triangulation to transform it into a triangulation that fits the surface more closely. The advantage to this approach is that we begin with a well-defined, well-behaved triangulation. The effect of conditioning is subsequently confined to a small region. The final triangulation, although no longer a Delaunay triangulation, retains some of its desirable characteristics.

Each conditioning step can be done in O(n) time. For each point in the triangulation, we sequentially examine each adjacent edge. Each edge forms a diagonal of the quadrilateral defined by the two triangles adjacent to it (see fig. 2). If the quadrilateral thus defined is convex, we calculate a heuristic error measure e associated with the existing edge and with the alternate diagonal of this quadrilateral. If the alternate diagonal has a smaller error value, we delete the existing edge and insert the alternate diagonal into the triangulation. We examine every edge at least once during this procedure. After several passes the number of edge exchange operations is small, and the benefit of continuing the procedure diminishes. We have found three conditioning passes to be sufficient for our terrain data. The first pass modifies approximately 50% of the edges, the second pass about 4%, and the final pass about 1%.

Fig. 2. Triangulation Conditioning. Delaunay edge AC forms a diagonal of quadrilateral ABCD. This figure shows the alternate diagonal BD as a dashed line. In the conditioning algorithm, we calculate a heuristic error measure e for both edges AC and BD. If e(BD) < e(AC), we delete the Delaunay edge AC from the triangulation, and we insert BD. Note that there is no alternate diagonal for quadrilateral ADEF, because it is non-convex.

We use as our error measure the absolute error integrated over the path length of each edge. We approximate this by calculating the error at each intersection of the diagonal edge with the grid ordinates. For a predominantly vertical line, we evaluate the error at each vertical grid ordinate; we figure the error similarly for predominantly horizontal edges. We compute the error by linear interpolation of the elevation along the diagonal, minus the elevation value linearly interpolated from the two nearest sample points along the corresponding grid ordinate. By comparing the totals of the absolute errors along each edge, we show a weak preference for the shorter edge. Other options for the error measure could be the maximum absolute error, the root-mean-square error, or the net error. We hope to investigate these alternatives in the future.
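A sketch of this error measure for one candidate edge is shown below; grid points are (x, y) index pairs into the band-limited sample array, and handling the predominantly vertical case by transposition is our own simplification:

    def edge_error(F, a, b):
        """Absolute error integrated along edge a-b, approximated at each
        crossing of the dominant grid ordinate."""
        (ax, ay), (bx, by) = a, b
        if abs(bx - ax) < abs(by - ay):                  # predominantly vertical:
            return edge_error(F.T, (ay, ax), (by, bx))   # swap the roles of x and y
        total = 0.0
        step = 1 if bx > ax else -1
        for x in range(ax + step, bx, step):             # each crossed grid ordinate
            t = (x - ax) / (bx - ax)
            y = ay + t * (by - ay)
            edge_z = F[ay, ax] + t * (F[by, bx] - F[ay, ax])      # along the edge
            y0 = int(np.floor(y))
            y1 = min(y0 + 1, F.shape[0] - 1)
            surf_z = F[y0, x] + (y - y0) * (F[y1, x] - F[y0, x])  # from the samples
            total += abs(edge_z - surf_z)
        return total

Each conditioning pass would then compare e for the existing diagonal of every convex quadrilateral with e for its alternate diagonal, flipping the edge whenever the alternate measure is smaller.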

ANALYSIS

In our running example of terrain modeling, we find that nearly 25% of the samples in our band-limited elevation sample grid have a Laplacian response of zero. From the viewpoint of piecewise planar modeling, then, about 25% of the points are redundant. We could construct our model from the remaining 75% of the samples and incur almost no error. The question is whether we can reduce the number of points even more and maintain an acceptable degree of error. We measure error as the deviation of the piecewise planar approximation from the sampled data set. We also want to compare our method to the prevalent method of subsampling the data to a coarser grid spacing (see fig. 1).

Fig. 3. Errors in Terrain Elevation Models. This illustration shows the sampled elevation data as relief maps. The concentration of red represents the relative distributions of error for three piecewise planar models: (a) subsampled grid, (b) Delaunay triangulation, (c) conditioned triangulation.

Error Distribution

We examined error distributions of piecewise planar models constructed for three cases:

a. subsampled grid
b. Delaunay triangulation
c. conditioned Delaunay triangulation.

In case (a), we selected points by choosing a subsampling rate (every second, or every third point, for example), and triangulated by arbitrary selection of a diagonal for each subsampled grid cell.
For cases (b) and (c), we selected points using the curvature analysis and adaptive thresholding
technique described here, then triangulated with the Delaunay algorithm. Case (c) received three
conditioning passes.
[Figure 4 is a line plot comparing the error histograms of the conditioned triangulation (solid line) and the Delaunay triangulation (dashed line). The vertical axis is the number of samples on a logarithmic scale; the horizontal axis is the error in meters, ranging from -70 to 50.]

Fig. 4. Error Histograms for Terrain Models.

Figure 3 illustrates the relative distribution of errors for each case. The intensity of red, superimposed on a relief map of the elevation data, indicates the degree of error. Although all three models represent a comparable level of detail, each using only about 7% of the original sampled data, the intensity of red is comparable only within one model, not between models; the same intensity of red could represent different amounts of error when compared between models. The subsampled grid model distributes the error evenly over the mountainous regions; the underlying grid pattern is also apparent. The Delaunay triangulation concentrates error in local regions of high curvature. Close inspection of the illustration reveals that, indeed, the Delaunay triangulation sometimes triangulates across ridges and valleys, introducing significant errors in those regions. The conditioned triangulation, on the other hand, alleviates this problem to a significant degree. At this level of detail, however, certain local regions with considerable variation still show a high concentration of error.

Histogram Analysis

Figure 4 shows the improvement due to the Delaunay triangulation conditioning. This graph shows the histograms from the error analysis. In this analysis we compare the elevation value from the band-limited sampled data to the elevation value interpolated from the piecewise planar model. This comparison yields 721,801 measurements of the deviation. We plot the number of measurements at each error value. An ideal plot would take the form of a narrow spike centered on zero. As both cases approach this ideal shape, we use a logarithmic scale to illustrate the difference more clearly. The trial illustrated in fig. 4 used 18% of the data samples. The histogram of the conditioned triangulation is significantly narrower and taller than that of the Delaunay triangulation.
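The deviation statistics reported here could be gathered as in the sketch below; model_elevation is a hypothetical routine returning the piecewise planar model's elevation at a grid position, since the point-location and interpolation code is beyond this sketch:

    errors = np.array([F_smooth[i, j] - model_elevation(tri, heights, j, i)
                       for i in range(F_smooth.shape[0])
                       for j in range(F_smooth.shape[1])])  # one deviation per sample

    counts, bin_edges = np.histogram(errors, bins=np.arange(-70, 51, 2))
    print("standard deviation: %.2f m, max |error|: %.1f m"
          % (errors.std(), np.abs(errors).max()))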

Figures 5 and 6 summarize the results of a series of these comparisons. We selected two statistics obtained from the histogram analysis to characterize the quality of fit: the standard deviation and the maximum absolute error. The standard deviation measures how well the piecewise planar model fits the sampled data; it is the expected error in the average case. The maximum absolute error measures the worst-case fit. Since the data are precise to ±1 m and accurate to ±30 m, we would be satisfied with a standard deviation of about 2 m and a maximum error of about 30 m. Other applications and other data sets, with different precision and accuracy characteristics, would naturally require different criteria for "satisfactory." We see from the trend lines in fig. 5 that the Delaunay triangulation does not surpass the subsampled grid until at least 25% of the samples have been incorporated into the triangulation. The conditioned triangulation, on the other hand, consistently shows a better fit than either the subsampled grid or the Delaunay triangulation. The maximum error is less consistent, but it indicates, nonetheless, that it takes a little over 25% of the samples to meet our 30 m criterion. About 20% of the total number of samples are required to meet our standard deviation fit criterion of 2 m. This percentage corresponds to an 80% reduction in the number of polygons, compared to the fully triangulated grid data.

[Figure 5 plots the fraction of samples used (down to 1/64) against the standard deviation in meters on a logarithmic scale from 0.5 to 16.0, with trend lines for the conditioned triangulation, the Delaunay triangulation, and the subsampled grid.]

Fig. 5. Terrain Models Fit Comparison: Standard Deviation.

[Figure 6 shows the corresponding plot against the maximum absolute error in meters on a logarithmic scale from 16 to 256.]

Fig. 6. Terrain Models Fit Comparison: Maximum Absolute Error.

Subjective Evaluation

There is one aspect of the TIN model, compared to the subsampled grid model, that is difficult to measure but seems significant nonetheless. The eye can easily detect the underlying grid structure in a model visualization; there is something about the regularity of the grid that attracts our attention. A TIN model seems less susceptible to this distraction, perhaps because of the varying size and shape of the planar pieces.

Fig. 7. Terrain Model Perspective Visualization.

APPLICATIONS

Terrain modeling has a direct bearing on the performance and quality of many visual simulation applications, such as computer image generation (CIG) systems used in flight simulation (Schachter 1983). Super graphics workstations (Salzman and Grimes 1989) and low-cost CIG systems have made visual simulation more affordable (Zyda et al. 1988). In military applications, interactive perspective views of terrain could be incorporated into low-cost workstations supporting mission planning, mission rehearsal, group training exercises, and command and control operations. In civilian applications, geographic information systems (GIS), which traditionally used only two dimensional map views, could include perspective views, which are useful for environmental impact planning and assessment, architecture, and civil engineering. Efficient implementations of terrain models are central to each of these applications. Figure 7 illustrates an example of a synthetic terrain model perspective view visualized on a graphics workstation.

The modeling techniques described here are not limited to terrain modeling. This method could be used for scientific visualization of the surfaces of other natural objects, or for representation of abstract surfaces derived from scientific experiments. In an interactive visualization environment, efficient surface rendering leads to productivity gains.

One exciting application is virtual environments, in which the computer user immerses himself in a
computer generated environment. The user can then interact with the objects contained in this envi-
ronment. The techniques described here could be used to construct natural looking surfaces for vir-
tual environments, derived from real-world data.

We have applied this technique to the tessellation of images. The interesting point here is that the
Gouraud shading hardware of graphics workstations can be used to reconstruct, approximately, im-
age regions of constant gradient. This idea is similar to run-length encoding, except that we extend the concept to runs of constant gradient in two dimensions. Using this method, the image
can be converted to a tiled surface that can be incorporated into a polygonal model. The image can
then be formed into non-planar shapes that match the shape of the imaged objects. Since graphics
workstations can efficiently render connected polygonal meshes with Gouraud shading, this
technique can be used to include images in real-time or interactive visualizations. This idea finds its
best application when texture mapping hardware support is not available, or if texture support is
limited to small, generic texture maps.

CONCLUSION

We have presented a method for modeling densely sampled grid data as a triangulated irregular net-
work (TIN). The technique allows a significant reduction in the number of polygons required to
represent a surface for rendering and visualization. The method is founded upon curvature analysis
of the sampled data using a Laplacian operator. We have articulated methods for adaptive
thresholding using a non-linear neighborhood ranking technique, and conditioning of a Delaunay
triangulation to achieve a good fit. Our empirical analysis of a sampled terrain elevation data set
showed that this method provides a better fit than the popular technique of subsampling to a coarser
grid.

Further Refinements

We believe that curvature analysis with Laplacian operators will facilitate ongoing improvements in
piecewise planar surface modeling. Ideas for further investigation include:

• Compare the effects of other forms of the digital Laplacian.

• Analyze alternate triangulation conditioning heuristics.

• Design a dynamic programming or greedy triangulation algorithm based on a heuristic fit-optimization measure.

• Determine linear features (ridges, valleys, breaks) directly from the Laplacian image, and
triangulate the resulting combination of surface-specific lines and points using a constrained
Delaunay triangulation (Chew 1989).

• Use curvature analysis as a basis for constructing a refined triangulation hierarchy (Scarlatos 1990b).

• Use curvature analysis as a basis for constructing a Delaunay pyramid hierarchy (DeFloriani
1989).

ACKNOWLEDGEMENT

The work described herein was supported by the Electronics Systems Division of the U.S. Air Force
Systems Command, Hanscom Air Force Base, MA, contract number F19628-89-C-0001, under the
auspices of the Rome Air Development Center, Griffiss Air Force Base, NY.

REFERENCES

Boissonnat JD (1984) Geometric Structures for Three-Dimensional Shape Representation. ACM Trans. Graph. 3(4): 266-286
Chew LP (1989) Constrained Delaunay Triangulations. Algorithmica 4: 97-108
DeFloriani L (1989) A Pyramidal Data Structure for Triangle-Based Surface Description. IEEE
Comput. Graph. Appl. 9(2): 67-78
Fowler RJ, Little JJ (1979) Automatic Extraction of Irregular Network Digital Terrain Models. Comput. Graph. 13(3): 199-207
Gonzalez RC, Wintz PA (1987) Digital Image Processing (2nd ed), Addison-Wesley, Reading MA
pp. 81-92
Johnson EG, Rosenfeld A (1975) Digital Detection of Peaks, Pits, Ridges, and Ravines. IEEE Trans. Syst. Man Cybern. SMC-5: 472-480
Lawson CL (1977) Software for C¹ Surface Interpolation. In: Rice JR (ed) Mathematical Software III, Academic Press, New York NY, pp. 161-194
Lee DT, Schachter BJ (1980) Two Algorithms for Constructing a Delaunay Triangulation. Int. J. Comput. Inf. Sci. 9(3): 219-242
Mark DM (1975) Computer Analysis of Topography: A Comparison of Terrain Storage Methods.
Geografiska Annaler 57A(3-4): 179-188
Mark DM (1979) Phenomenon-Based Data-Structuring and Digital Terrain Modeling.
Geo-Processing 1: 27-36
Peucker TK, Douglas DH (1975) Detection of Surface-Specific Points by Local Parallel Processing
of Discrete Terrain Elevation Data. Comput. Gr. Image Process. 4: 375-387
Pratt WK (1978) Digital Image Processing, John Wiley & Sons, New York NY, p. 482
Preparata FP, Shamos MI (1985) Computational Geometry, Springer-Verlag, New York NY, pp.
234 ff.
Prewitt JM (1970) Object Enhancement and Extraction. In: Lipkin BS, Rosenfeld A (eds) Picture
Processing and Psychopictorics, Academic Press, New York NY, pp. 75-149

Salzman D, Grimes J (eds) (1989) IEEE Comput. Graph. Appl. 9(4); this issue features several ar-
ticles on the theme of superworkstations and visualization.
Scarlatos LL (1990a) An Automatic Critical Line Detector For Digital Elevation Matrices. Proc. 1990 ACSM-ASPRS Annual Convention, 18-23 March 1990, Vol. 2, Denver CO, pp. 43-52
Scarlatos LL (1990b) A Refined Triangulation Hierarchy for Multiple Levels of Terrain Detail. Proc. 1990 Image V Conference, 19-22 June 1990, Phoenix AZ, pp. 114-122
Schachter BJ (1983) Computer Image Generation, John Wiley & Sons, New York, NY
Southard DA (1991) Implementation of an Optimal Algorithm for Delaunay Triangulation in the
Plane. Adv. Eng. Softw. (in press)
Zyda MJ, McGhee RB, Ross RS, Smith DB, Streyle DG (1988) Flight Simulators for Under
$100,000. IEEE Comput. Graph. Appl. 8(1): 19-27

David A. Southard is a lead engineer in The MITRE Corporation's Applied Technology department. He earned a BS in physics from Pacific Lutheran University, Tacoma, WA, and an MS in computer science from
West Coast University, Los Angeles, CA. He is continuing graduate stud-
ies in computer science at the University of Lowell, MA. His interests in-
clude three dimensional geometry, modeling, and visualization; user inter-
faces; and real-time interactive computer graphics. Prior to joining
MITRE, he worked for Logicon, developing petroleum engineering
display applications; and for Computer Sciences Corporation,
programming acoustic digital signal processing research tools at the Naval
Ocean Systems Center, San Diego, CA. He is a member of the IEEE
Computer Society.

Address: The MITRE Corporation, Mail Stop E073, Burlington Road, Bedford, MA 01730-0208, USA. E-mail: m20190@mitre.org
Conference Organization Committee

Conference Co-Chairs:
C. Chryssostomidis (MIT, USA)
B. Herzog (University of Michigan, USA)

Program Chair:
N. M. Patrikalakis (MIT, USA)

International Coordinator:
R. A. Earnshaw (University of Leeds, UK)

International Program Committee:
L. Bardis (National Technical University of Athens, Greece)
F. Baskett (Silicon Graphics, USA)
T. S. Chua (National University of Singapore, Singapore)
R. A. Earnshaw (University of Leeds, UK)
G. E. Farin (Arizona State University, USA)
D. C. Gossard (MIT, USA)
L. J. Guibas (MIT, USA)
W. Hansmann (University of Hamburg, Germany)
C. M. Hoffmann (Purdue University, USA)
A. E. Kaufman (State University of New York at Stony Brook, USA)
T. L. Kunii (University of Tokyo, Japan)
E. Nakamae (University of Hiroshima, Japan)
A. P. Pentland (MIT, USA)
B. Ravani (University of California at Davis, USA)
D. F. Rogers (US Naval Academy, USA)
J. R. Rossignac (IBM, USA)
W. K. Stewart (Woods Hole Oceanographic Institution, USA)
D. Thalmann (Swiss Federal Institute of Technology, Switzerland)
N. M. Thalmann (University of Geneva, Switzerland)
G. T. Toussaint (McGill University, Canada)
T. C. Woo (University of Michigan, USA)
B. Wyvill (University of Calgary, Canada)
G. Wyvill (University of Otago, New Zealand)
F. Yamaguchi (Waseda University, Japan)
D. Zeltzer (MIT, USA)

Conference Secretariat:
B. Dullea (MIT, USA)


Editorial Assistants:
M. Chryssostomidis (MIT, USA)
B. Dullea (MIT, USA)
K. Hartley (MIT, USA)
M. Lowry-Maloney (MIT, USA)
B. A. Moran (MIT, USA)
H. M. Quinn (MIT, USA)
List of Sponsors

Organized by:
Massachusetts Institute of Technology
Computer Graphics Society

Sponsored by:
Massachusetts Institute of Technology Sea Grant College Program

In Cooperation With:
American Society of Mechanical Engineers (USA)
Association for Computing Machinery (USA)
British Computer Society (UK)
IEEE Computer Society (USA)
Information Processing Society of Japan (Japan)
Institute of Electronics, Information and Communication Engineers (Japan)
International Society for Productivity Enhancement (USA)

Supported by:
Department of the Navy, Office of Naval Research (USA)
FUJITSU Limited (Japan)
Hitachi, Ltd. (Japan)
Intergraph Corporation (USA)
Massachusetts Institute of Technology Department of Ocean Engineering (USA)
Massachusetts Institute of Technology Media Laboratory (USA)
Massachusetts Institute of Technology Sea Grant College Program (USA)
MATSUSHITA ELECTRIC INDUSTRIAL CO., LTD. (Japan)
MITSUBISHI ELECTRIC CORPORATION (Japan)
National Science Foundation (USA)
NEC Corporation (Japan)
NIPPON STEEL Information & Communications Systems Inc. (Japan)
NIPPON TELEGRAPH AND TELEPHONE CORPORATION (Japan)
NISSAN MOTOR CO., LTD. (Japan)
NTT DATA COMMUNICATION SYSTEMS CORPORATION (Japan)
RAILWAY INFORMATION SYSTEMS CO., LTD. (Japan)
Silicon Graphics Computer Systems (USA)
TOSHIBA CORPORATION (Japan)

List of Technical Reviewers

Atluri, S.N. (MIT, USA)
Bardis, L. (National Technical University of Athens, Greece)
Barsky, B.A. (University of California at Berkeley, USA)
Batty, M. (University of Wales Institute of Science and Technology, UK)
Beier, K-P. (University of Michigan, USA)
Boehm, W. (Technical University of Braunschweig, Germany)
Calvert, T.W. (Science Council of British Columbia, Canada)
Chryssostomidis, C. (MIT, USA)
Chua, T.S. (National University of Singapore, Singapore)
Cohen, M. (University of Utah, USA)
Cole, A.J. (UK)
Crilly, T. (Middlesex Polytechnic, UK)
De Floriani, L. (University of Genoa, Italy)
DeRose, A.D. (University of Washington, USA)
Dutta, D. (University of Michigan, USA)
Earnshaw, R.A. (University of Leeds, UK)
Farin, G.E. (Arizona State University, USA)
Farrell, E.J. (IBM Thomas J. Watson Research Center, USA)
Fuchs, H. (University of North Carolina at Chapel Hill, USA)
Fujimura, K. (Oak Ridge National Laboratory, USA)
Gerschon, N. (MITRE Corp., USA)
Gursoy, H.N. (Intergraph Corporation, USA)
Gursoz, L. (Carnegie-Mellon University, USA)
Hansmann, W. (University of Hamburg, Germany)
Herman, G.T. (University of Pennsylvania, USA)
Hersch, R.D. (Swiss Federal Institute of Technology, Switzerland)
Hoffmann, C.M. (Purdue University, USA)
Hoschek, J. (Technical University of Darmstadt, Germany)
Howell, K. (Naval Research Laboratory, USA)
Hu, J. (State University of New York at Stony Brook, USA)
Kaandorp, J. (University of Amsterdam, The Netherlands)
Kamgar-Parsi, B. (Naval Research Laboratory, USA)
Kaufman, A.E. (State University of New York at Stony Brook, USA)
Kolb, C. (Yale University, USA)
Kriezis, G.A. (Parametric Technology Corp., USA)
Kunii, T.L. (University of Tokyo, Japan)
Kwok, P. (University of Calgary, Canada)
Leech, J. (University of North Carolina at Chapel Hill, USA)
Levoy, M. (Stanford University, USA)
Lorensen, W.E. (General Electric Co., USA)
Lucier, B. (Purdue University, USA)


Mallat, S. (Courant Institute of Mathematical Sciences, USA)
Max, N.L. (Lawrence Livermore National Laboratory, USA)
Mitchell, J. (Prime Computer, Inc., USA)
Mong, L.C. (National University of Singapore, Singapore)
Musgrave, F.K. (Yale University, USA)
Nakamae, E. (University of Hiroshima, Japan)
Naylor, B. (AT&T Bell Laboratories, USA)
Noma, T. (Kyushu Institute of Technology, Japan)
Papalambros, P.Y. (University of Michigan, USA)
Pavlidis, T. (State University of New York at Stony Brook, USA)
Pegna, J. (Rensselaer Polytechnic Institute, USA)
Pentland, A.P. (MIT, USA)
Picard, R.W. (MIT, USA)
Prakash, P.V. (Prime Computer Inc., USA)
Prusinkiewicz, P. (University of Regina, Canada)
Ravani, B. (University of California at Davis, USA)
Rhoades, J. (University of North Carolina at Chapel Hill, USA)
Rogers, D.F. (U.S. Naval Academy, USA)
Rosenberg, R. (Naval Research Laboratory, USA)
Rosenblum, L. (Naval Research Laboratory, USA)
Rossignac, J.R. (IBM Thomas J. Watson Research Center, USA)
Sakurai, H. (Colorado State University, USA)
Samet, H. (University of Maryland, USA)
Sapidis, N. (General Motors Corp., USA)
Satoh, T. (Ricoh Co., Ltd., Japan)
Scarlatos, L. (State University of New York at Stony Brook, USA)
Schuette, L. (Naval Research Laboratory, USA)
Shaffer, C.A. (Virginia Polytechnic Institute, USA)
Shu, R. (National University of Singapore, Singapore)
Stewart, W.K. (Woods Hole Oceanographic Institution, USA)
Suffern, K.G. (University of Technology, Australia)
Terzopoulos, D. (University of Toronto, USA)
Thalmann, D. (Swiss Federal Institute of Technology, Switzerland)
Thalmann, N.M. (University of Geneva, Switzerland)
Toussaint, G.T. (McGill University, Canada)
Warren, J. (Rice University, USA)
Webber, R.E. (University of Western Ontario, Canada)
Williams, J.R. (MIT, USA)
Wolter, F.-E. (MIT, USA)
Woo, T.C. (University of Michigan, USA)
Woodwark, J.R. (Information Geometers Ltd., UK)
Wyvill, B. (University of Calgary, Canada)
Wyvill, G. (University of Otago, New Zealand)
Yamaguchi, F. (Waseda University, Japan)
Zeltzer, D. (MIT, USA)
Zyda, M.J. (Naval Postgraduate School, USA)
List of Contributors

Arya, K. 147 Khorasani, A. 147
Klassen, R.V. 363
Bacon, B. 147 Konno, K. 135
Balaguer, F. 135 Krueger, R.C. 251
Breen, D.E. 113 Kuehn, V. 639
Kunii, T.L. 3
Campa, A. 299 Kuwahara, K. 589
Carlbom, I. 623
Chiyokura, H. 435 Lamotte, W. 189
Cohen, D. 211 Lee, C. 395

De Floriani, L. 157 Max, N.L. 333
McNaughton, C. 379
Elens, K. 189
Nagasawa, M. 589
Feiner, S.K. 525 Naka, T. 345
Fernandez, V. 573 Nakamae, E. 609
Flerackers, E. 189 Nakase, Y. 345
Naylor, B. 545
Gao, M. 573 Nishimura, K. 315
Getto, P.H. 113, 317 Norton, A. 147
Gobbetti, E. 135 Nowacki, H. 61
Gross, M. 639
Guibas, L.J. 45 Pentland, A.P. 507
Guo, B. 485 Puppo, E. 457

Hall, P.M. 235 Ravani, B. 395


Harada, T. 417 Robertson, P.K. 163
Harrington, S.J. 363
Harris, K.M. 623 Sagan, C. 37
Haumann, D. 147 Samtaney, R. 573
House, D.H. 113 Seligmann, D.D. 525
Senay, H. 269
Ignatius, E. 269 Shinagawa, Y. 3
Ihm, I. 545 Shu, R. 251
Ishida, S. 609 Silver, D. 573
Southard, D.A. 667
Jones, H. 299 Stewart, W.K. 85
Suenaga, Y. 655
Kaneda, K. 609 Suffern, K.G. 317
Kaufman, A.E. 27, 211 Sweeney, P. 147


Taguchi, F. 345
Takahashi, T. 283
Takamura, T. 435
Tanaka, T. 283
Terzopoulos, D. 623
Thalmann, D. 135
Thompson, W.R. 37
Turner, R. 135

Ueda, K. 417

Vezina, G. 163

Watanabe, Y. 655
Watt, A.H. 235
Wejchert, J. 147
Wyvill, G. 333, 379

Yagel, R. 211
Yang, A.T. 395

Zabusky, N. 573
Keyword Index

Algebraic Curves and Surfaces 545 Dot Design 363
Animation 147, 639 Dynamics 135, 147
Anti-Aliasing 283
Automated Picture Generation 525 Edge Highlighting and Shading 283
Efficiency 363
Bernstein-Bezier Representation 485 Ellipse 573
Bezier Patch 417, 435 Ellipsoid 573
Binary Space Partitioning Tree 545 Environmental Protection 639
Binormal Indicatrix 395 Error Diffusion 363
Boundary Fill 235
Brown's Square 417 Facial Expression 655
Bump Mapping 333 Feature Tracking 573
Filtering 211
Caching 189 Finite Elements 507
CAD 457 Flexible Objects 147
CAM 457 Fluid Dynamics 573
Cloth Modeling 113 Forest Growth Model 3
Clustering 655 Free-Form Surface 485
Coherence 189
Color Conversion 345 Generalized Gregory Patch 435
Composition Rule 269 Geometric Contact 395
Computational Geometry 45, 545 Geometric Continuity 395
Computer-Aided Design 61, 457 Geometric Modeling 485, 667
Computer Animation 589 Glyph 269
Computer Graphics 85, 163, 639 Gradient Index Media 317
Contour Tracking 623 Gradient Index Rod Lenses 317
Conversion Algorithms 457 Grayscale 379
Convex Combination Surface 417 Gregory Patch 417, 435
Coral 333 Greyscale 379
Cross Boundary Derivative 435 Growth Model 333
Curvature Features 667
Curved Surfaces 189 Halftoning 379
Homotopy 3
Data Structures 457 Horizontal and Vertical Scanning 283
Delaunay Triangulation 45, 667 Hybrid Solid Models 457
Dendrites 623
Dendritic Spines 623 Illumination 345
Design 485 Image Processing 37,85, 163, 609
Deterministic Fractals 299 Image Registration 623
Digitized Space Curves 545 Image Segmentation 623
Directional Halftone Cell 363 Image Synthesis 525, 609
Discrete Shading 211 Implicit Patch 485
Dithering 379 Information Locality 3

Integrated Visualization Model 3 Segmentation 211
Integration by Table Referencing 283 Self-Visualizing Visualization Model 3
Integration of Small Reflection 283 Sierpinski Tetrahedron 299
Interactive Design 61 Sierpinski Triangle 299
Interactive Visualization 163 SIMD Parallel Algorithm 163
Irregular Mesh 667 Simulation 147, 639
Singularity 3
Knowledge-Based Graphics 525 Smoothed Particle Method 589
Snakes 623
Lofting 507 Soft Object 333
Luneberg Lenses 317 Solid Modeling 251, 457
Sonar 85
Medical Imaging 251 Space Filling Curves 379
Metropolis Algorithm 113 Spacecraft 37
Motion Control 135 Surface Color Data 655
Multiple Images 609 Surface Tracking 251

Newton's Algorithm 189 Technical Illustration 525
Template Image 655
Object Representation 485 Terrain Modeling 639, 667
Oceanography 85 3D Discrete Space 27
Optical Microscope Image 609 3D Graphics 525
3D Interaction 135
Pan-Focused Image 609 3D Range Data 655
Parallelism 189 3D Raster 27
Particle-Based Modeling 113, 639 3D Reconstruction 623
Patches 189 3D Scan Conversion 27
Perspective Viewing 163 Transputers 189
Photorealistic 345 Tunnel-Free Surface 27
Physically-Based Modeling 113, 147
Piecewise Linear Approximation 545 Underwater Photography 85
Planetary Science 37 Underwater Robotics 85
Platonic Solids 299 Uniform Color Space 345
Point-Location 45 User Interface 639
Polar Curve 395
Preconditioning 507 Variable Refractive Index 317
Principal Evolute 395 Vector Fields 147
Virtual Cameras 135
Radiosity 345, 589 Visual Computer 3
Rational Bezier Patch 417, 435 Visual Perception 269
Rational Boundary Gregory Patch 417 Visualization 37, 61, 85, 113, 573
Ray Casting 235 Visualization Mapping 269
Ray Tracing 45, 189, 317 Visualization Tree 269
Reeb Graph 3 Volume Rendering 211, 235, 251, 589, 623
Regularization 507 Volume Shading 211
Remote Sensing 85 Volume Texture 333
Volume Visualization 27, 45
Scanner 655 Volumetric Graphics 27
Scientific Visualization 251, 623, 639 Voxel 27
Scientific Visualization in Surfaces and Volumes 61
Wavelets 507
