Projecting Tetrahedra without Rendering Artifacts
Martin Kraus
Purdue University
Wei Qiao
Purdue University
David S. Ebert
Purdue University
ABSTRACT
Hardware-accelerated direct volume rendering of unstructured vol-
umetric meshes is often based on tetrahedral cell projection, in par-
ticular, the Projected Tetrahedra (PT) algorithm and its variants.
Unfortunately, even implementations of the most advanced variants
of the PT algorithm are very prone to rendering artifacts.
In this work, we identify linear interpolation in screen coordi-
nates as a cause for significant rendering artifacts and implement
the correct perspective interpolation for the PT algorithm with pro-
grammable graphics hardware. We also demonstrate how to use
features of modern graphics hardware to improve the accuracy of
the coloring of individual tetrahedra and the compositing of the
resulting colors, in particular, by employing a logarithmic scale
for the pre-integrated color lookup table, using textures with high
color resolution, rendering to floating-point color buffers, and al-
pha dithering. Combined with a correct visibility ordering, these
techniques result in the first implementation of the PT algorithm
without objectionable rendering artifacts. Apart from the important
improvement in rendering quality, our approach also provides a test
bed for different implementations of the PT algorithm that allows
us to study the particular rendering artifacts introduced by these
variants.
CR Categories: I.3.3 [Computer Graphics]: Picture/Image
Generation—Display algorithms, Bitmap and framebuffer opera-
tions; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and
Realism—Raytracing
Keywords: volume visualization, volume rendering, cell projec-
tion, projected tetrahedra, perspective interpolation, dithering, pro-
grammable graphics hardware
1 INTRODUCTION AND PREVIOUS WORK
There are several approaches to direct volume rendering of unstruc-
tured meshes, e.g., ray casting and cell projection. However, most
implementations for OpenGL graphics hardware are based on cell
projection; more specifically, the Projected Tetrahedra (PT) algo-
rithm published by Shirley and Tuchman [18].
The PT algorithm exploits the triangle rasterization performance
of graphics hardware by decomposing the projected silhouette of
a tetrahedron into three or four triangles, as in Figure 1. Previous
research has focused on two aspects of this algorithm: the color
computation for these triangles [6, 14, 16, 19] and the computation
of a visibility ordering of the tetrahedra [2, 3, 10]. In this work, we
are not concerned with the latter and assume that a correct visibil-
ity ordering is available. In fact, we employ the extension of the
meshed polyhedra visibility ordering (MPVO) algorithm for non-
convex meshes suggested by Williams [22], which computes a cor-
rect ordering for most tetrahedral meshes.
e-mail: kraus@purdue.edu
e-mail: qiaow@purdue.edu
e-mail: ebertd@purdue.edu
Figure 1: Classification of non-degenerate projected tetrahedra into class 1a, class 1b, and class 2 (top row) and the corresponding decompositions (bottom row).
To compute the colors of triangles generated by the PT algo-
rithm, Shirley and Tuchman [18] suggested computing correct col-
ors only for the triangles’ vertices. Thus, the efficient linear in-
terpolation of vertex colors provided by graphics hardware can be
exploited. Unfortunately, this results in rendering artifacts even for
uniform attenuation coefficients as noted by Stein, Becker, and Max
[19]. Their solution was to interpolate the thickness and the average
attenuation coefficient of the projected tetrahedron for each frag-
ment by means of texture coordinate interpolation. Based on these
two interpolated values, the opacity of each fragment is determined
by a two-dimensional texture lookup. This opacity is either mod-
ulated with a constant color, or with an interpolated vertex color.
Unfortunately, the linear interpolation of the average attenuation
coefficient and the thickness is only correct for orthogonal projec-
tions and leads to rendering artifacts for perspective projections.
An efficient computation of the correct perspective interpolation
has been published by Heckbert and Moreton [7] and independently
by Blinn [1]. Moreover, correct perspective interpolation is usually
offered by modern graphics hardware supporting OpenGL. How-
ever, the perspective interpolation employed in OpenGL [17] can-
not directly cure this problem for the PT algorithm because it was
designed for an interpolation of values on triangles—not within
tetrahedra.
Another well-known disadvantage of the method by Stein et
al. [19] is the restriction of the attenuation coefficients to linear
functions within each tetrahedron. This problem was addressed
by a software-based computation for arbitrary attenuation trans-
fer functions published by Max, Hanrahan and Crawfis [13]. The
availability of hardware-supported three-dimensional texture maps
allowed Röttger, Kraus, and Ertl [16] to implement a generalization
of this method in graphics hardware. This technique is now known
as pre-integrated cell projection. The basic idea is to employ lin-
ear interpolation of texture coordinates to interpolate the thickness
of the tetrahedron and the scalar data value on the front and back
faces of the tetrahedron for each fragment. Hardware-accelerated
three-dimensional texture mapping is then exploited to perform a
lookup of the color for a particular fragment. Note that this particu-
lar implementation of pre-integrated cell projection by Röttger et al.
is also restricted to orthogonal projections and generates rendering
artifacts for perspective projections. Pre-integrated cell projection
was further improved [6, 14]; however, the artifacts caused by per-
spective projections were never addressed.
An alternative to cell projection is ray casting in unstructured
meshes. The basic algorithm for traversing cells of a tetrahedral
mesh along viewing rays and an implementation in software were
published by Garrity [5]. More recently, Weiler et al. [20, 21] pub-
lished a cell projection algorithm based on the idea of ray casting
single tetrahedra in graphics hardware and a hardware-based im-
plementation of a pre-integrated variant of Garrity’s algorithm us-
ing floating-point color buffers. With respect to rendering artifacts,
there are two advantages of the ray casting approach in contrast to
the PT algorithm: the use of floating-point precision to composite
colors and the absence of any interpolation errors due to perspec-
tive projection. In order to achieve a similar rendering quality with
the PT algorithm, we derive the correct perspective interpolation for
projected tetrahedra and discuss its implementation with the help of
programmable graphics hardware in Section 2.
For the coloring of individual tetrahedra, Weiler et al. [20]
employed a pre-integrated lookup table implemented by a three-
dimensional floating-point RGBA texture. Unfortunately, the lack
of trilinear interpolation in these textures and the limited resolu-
tion result in rendering artifacts. Our solution to these problems is
the use of textures with 16 bits per color component and the imple-
mentation of a logarithmic scale for the pre-integrated lookup table.
These enhancements are described in Section 3.
While the compositing of color contributions is performed with
floating-point precision by most ray casters, hardware-accelerated
cell projection with programmable graphics hardware has been re-
stricted to 8-bit fixed-point color components until recently. Even
the currently available hardware support for floating-point color
buffers does not allow us to blend small, overlapping triangle prim-
itives without rendering artifacts. Since the PT algorithm tends to
generate many small triangles, it is prone to these artifacts. In Sec-
tion 4, we show how to avoid them at the cost of rendering perfor-
mance. Moreover, we present an alternative approach for 8-bit color
components similar to the alpha dithering technique suggested by
Williams, Frank, and LaMar [23, 11].
A comparison of the common rendering artifacts of implemen-
tations of the PT algorithm is given in Section 5. The rendering
performance of our enhanced variants of the PT algorithm are also
discussed in this section. Section 6 presents our conclusions and
plans for future work.
2 PERSPECTIVE INTERPOLATION
Before discussing the interpolation of vertex attributes with per-
spective correction for the PT algorithm in Section 2.2, we will
introduce our notation and derive the required equations.
2.1 Interpolation in Normalized Device Coordinates
Our notation is based on the OpenGL specification [17]; in particular, the homogeneous coordinates of a vertex $v_o$ in object space are denoted by $(x_o, y_o, z_o, w_o)^T$. Assuming $w_o$ is not equal to 0, this four-dimensional vector represents a three-dimensional vector $(x_o/w_o, y_o/w_o, z_o/w_o)^T$. The model-view matrix $M$ maps a vector $v_o$ from object space to a vector $v_e = (x_e, y_e, z_e, w_e)^T$ in eye space, i.e., $v_e = M v_o$. The mapping from eye space to clip space is performed by the projection matrix $P$: $v_c = P v_e$ with $v_c = (x_c, y_c, z_c, w_c)^T$. Finally, we define the normalized device coordinates as the components of the four-dimensional vector $v_d = (x_d, y_d, z_d, 1)^T$ obtained by a “perspective division” from $v_c$, i.e., $v_d = v_c / w_c$.

Figure 2: The projection of a linear function $f(v_o)$ onto the view plane results in a nonlinear function. In the depicted case of a linear color interpolation between vertices $v_c^{(1)}$ and $v_c^{(2)}$, the center point between $v_c^{(1)}$ and $v_c^{(2)}$ and its color are not projected to the center point between $v_d^{(1)}$ and $v_d^{(2)}$, as illustrated by the centered dashed line.
For the derivation of the perspective interpolation¹, we have to consider linear functions in object space. In homogeneous coordinates, any scalar function $f(v_o)$ that is linear in the three-dimensional coordinates $x_o/w_o$, $y_o/w_o$, and $z_o/w_o$ is of the form

$$f(v_o) = c_x \frac{x_o}{w_o} + c_y \frac{y_o}{w_o} + c_z \frac{z_o}{w_o} + c_w = c \cdot \frac{v_o}{w_o}$$

with a constant four-dimensional vector $c = (c_x, c_y, c_z, c_w)^T$. Note that the projection to normalized device coordinates will turn this linear function into a nonlinear function as illustrated in Figure 2. Assuming $w_c$ is not equal to 0 and the matrix product $PM$ has an inverse, we can also write:

$$f(v_o) = c \cdot (PM)^{-1} \, \frac{PM v_o}{w_c} \, \frac{w_c}{w_o} .$$

With the constant vector $c_0 = \left( (PM)^{-1} \right)^T c$ and the equality $PM v_o / w_c = v_d$ we obtain

$$f(v_o) \, \frac{w_o}{w_c} = c_0 \cdot v_d .$$

This equation implies that $f(v_o) w_o / w_c$ is a linear function of the normalized device coordinates $x_d$, $y_d$, and $z_d$. Therefore, given any function $f(v_o)$ that is linear in the three-dimensional coordinates $x_o/w_o$, $y_o/w_o$, and $z_o/w_o$, we may linearly interpolate values of $f(v_o) w_o / w_c$ in normalized device coordinates.
An important example is $f(v_o) \equiv 1$. Since 1 is constant, it is also a linear “function” of $x_o/w_o$, $y_o/w_o$, and $z_o/w_o$. In this particular case, the result from above implies that $w_o/w_c$ is a linear function of $x_d$, $y_d$, and $z_d$; therefore, we may linearly interpolate values of $w_o/w_c$ in normalized device coordinates.
These results were exploited by Heckbert and Moreton [7] and
independently by Blinn [1] for the perspective interpolation of at-
tributes between vertices, e.g., color or texture coordinates. If two
vertices are connected by a line, the corresponding attributes of the
two vertices define a linear function on the line segment. Analo-
gously, if three vertices are connected by a triangle, the correspond-
ing attributes define a linear function on the triangle. The domain
¹Compare Equations 3.4 and 3.6 in the OpenGL 1.5 specification [17].
Figure 3: Decomposition of the tetrahedron $(a_o, b_o, c_o, d_o)$ into three smaller tetrahedra corresponding to the triangles generated by the PT algorithm for class 1a (see Figure 1). The new vertex $t_o$ is determined by the intersection of triangle $(a_o, c_o, d_o)$ with the extension of the line from the eye point to $b_o$. The three tetrahedra are $(a_o, b_o, c_o, t_o)$, $(c_o, b_o, d_o, t_o)$, and $(d_o, b_o, a_o, t_o)$.
of these linear functions $f(v_o)$ may be extended to the whole three-dimensional object space without any complications. The projection of these functions is, however, not linear in normalized device coordinates (see Figure 2). Therefore, instead of linearly interpolating vertex attributes, the values of $f(v_o) w_o/w_c$ and $w_o/w_c$ are computed separately for each vertex, and these values are interpolated linearly in normalized device coordinates for each fragment. The value of $f(v_o)$ is then reconstructed by one division per fragment:

$$f(v_o) = \frac{\left( f(v_o) \, \frac{w_o}{w_c} \right)_{\mathrm{interpolated}}}{\left( \frac{w_o}{w_c} \right)_{\mathrm{interpolated}}} .$$

Assuming $w_o$ is equal to 1 (or at least that all $w_o$ are the same for all vertices), this leads directly to Equations 3.4 and 3.6 in the OpenGL 1.5 specification [17].
2.2 Interpolation for Projected Tetrahedra
The original PT algorithm [18] decomposes the (non-degenerate) projection of a tetrahedron into three or four triangles (see Figure 1), computes colors for all triangle vertices, and employs hardware-accelerated triangle rasterization to interpolate the fragment color. Note that a correct perspective interpolation of colors is impossible for this algorithm because each rasterized fragment corresponds to a viewing ray segment through the tetrahedron with a range of $w_c$ coordinates instead of a single $w_c$ coordinate.

Instead of decomposing the projection of a tetrahedron into three or four triangles, we can also decompose the tetrahedron itself into three or four smaller tetrahedra. In this case, the decomposition is performed in object coordinates instead of normalized device coordinates. As illustrated in Figure 3, each of these smaller tetrahedra is projected to one of the triangles of the original PT decomposition. Moreover, each of the smaller tetrahedra features one pair of vertices which are projected to the same two-dimensional vertex in the view plane. In Figure 4, for example, $v_o^{(1f)}$ (the “front” vertex) and $v_o^{(1b)}$ (the “back” vertex) are projected to the same vertex in the view plane. In order to process all three vertices of the triangles in the view plane in exactly the same way (as required by our implementation), we duplicate the other two three-dimensional vertices of the tetrahedron. For example, $v_o^{(2f)}$ and $v_o^{(2b)}$ in Figure 4
Figure 4: The tetrahedron $(d_o, b_o, a_o, t_o)$ from Figure 3 in our notation for triangle vertices. Note that $v_o^{(1f)}$ and $v_o^{(1b)}$ are projected to the same point on the view plane.
are identical copies of one vertex. Analogously, $v_o^{(3f)}$ and $v_o^{(3b)}$ are identical.

Once the six vertices $v_o^{(1f)}$, $v_o^{(1b)}$, $v_o^{(2f)}$, $v_o^{(2b)}$, $v_o^{(3f)}$, and $v_o^{(3b)}$ are computed, either the front facing triangle spanned by the vertices $v_o^{(1f)}$, $v_o^{(2f)}$, and $v_o^{(3f)}$ or the back facing triangle spanned by the vertices $v_o^{(1b)}$, $v_o^{(2b)}$, and $v_o^{(3b)}$ can be rasterized because both triangles will cover the same pixels. During the rasterization of either triangle, we can interpolate any vertex attribute either on the front facing triangle by interpolating between vertices $v_o^{(1f)}$, $v_o^{(2f)}$, and $v_o^{(3f)}$, or on the back facing triangle by interpolating (with the same weights) between vertices $v_o^{(1b)}$, $v_o^{(2b)}$, and $v_o^{(3b)}$. The correct perspective interpolation can be performed as described in Section 2.1.
This approach avoids rendering artifacts caused by an incorrect (nonperspective) interpolation in several important variants of the PT algorithm. For example, in the variant of the PT algorithm published by Stein et al. [19], we can interpolate the attenuation coefficients $\tau^{(f)}$ and $\tau^{(b)}$ with perspective correction on the front facing and the back facing triangle, respectively, while rasterizing either one. Similarly, we can interpolate the scalar data values $s^{(f)}$ and $s^{(b)}$ on the two triangles in the pre-integrated variant of the PT algorithm suggested by Röttger et al. [16].
These two variants of the PT algorithm also require the thickness of the tetrahedron for the rasterized fragment. This thickness $l$ may be computed as the distance (in three-dimensional eye space) between the points on the two triangles corresponding to the rasterized fragment:

$$l = \left| \frac{v_e^{(b)}}{w_e^{(b)}} - \frac{v_e^{(f)}}{w_e^{(f)}} \right| .$$

All coordinates of $v_e^{(f)}/w_e^{(f)}$ and $v_e^{(b)}/w_e^{(b)}$ are linear functions of the three-dimensional object coordinates for any model-view matrix $M$ with a fourth row vector of $(0, 0, 0, 1)^T$; therefore, $v_e^{(f)}/w_e^{(f)}$ and $v_e^{(b)}/w_e^{(b)}$ can be interpolated with perspective correction on the front facing and back facing triangle, respectively, as described above.

Usually all $w_e$ coordinates are 1. Furthermore, we can exploit the fact that the three-dimensional points represented by $v_e^{(f)}$ and $v_e^{(b)}$ are on one line with the origin:

$$l = \left| v_e^{(b)} - v_e^{(f)} \right| = \left| v_e^{(f)} \right| \, \frac{\left| z_e^{(b)} - z_e^{(f)} \right|}{\left| z_e^{(f)} \right|} .$$
Figure 5: Data flow in our implementation of perspective interpolation for the PT algorithm. A tetrahedron is decomposed by the PT algorithm into front facing triangles; for each triangle vertex, the input attributes ($v_o^{(if)}$, $s^{(if)}$, $v_o^{(ib)}$, $s^{(ib)}$) are passed to the vertex program, whose output attributes ($v_d^{(if)}$, $1/w_c^{(if)}$, $s^{(if)}/w_c^{(if)}$, $v_e^{(if)}/w_c^{(if)}$, $1/w_c^{(ib)}$, ...) are interpolated during triangle rasterization; the resulting fragment attributes are processed by the fragment program, which computes the fragment color RGBA($s^{(f)}$, $s^{(b)}$, $l$) that is blended with the color in the frame buffer.
Note that $|v_e^{(f)}|$ denotes the Euclidean vector norm of the three-dimensional eye space vector represented by $v_e^{(f)}$. The latter equation is also employed in our implementation, which is presented in the next section.
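As an illustration, a small C++ sketch of this simplified thickness computation (our own formulation; the input values are hypothetical), assuming all $w_e = 1$:

    // Thickness of a tetrahedron along the viewing ray, exploiting that the
    // front and back eye-space points lie on one line with the origin:
    // l = |v_e_front| * |z_e_back - z_e_front| / |z_e_front|
    #include <cmath>
    #include <cstdio>

    struct Vec3 { float x, y, z; };

    float length(const Vec3& v) { return std::sqrt(v.x*v.x + v.y*v.y + v.z*v.z); }

    float thickness(const Vec3& front, float zBack)
    {
        return length(front) * std::fabs(zBack - front.z) / std::fabs(front.z);
    }

    int main()
    {
        const Vec3 front = {1.0f, 2.0f, -5.0f};   // hypothetical eye-space point
        std::printf("l = %f\n", thickness(front, -7.0f));
        return 0;
    }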
2.3 Implementation with Programmable Graphics Hardware
Our implementation is based on the OpenGL 1.5 ARB extensions for vertex programs and fragment programs [17]. Figure 5 illustrates the data flow for perspective interpolation in our implementation of the PT algorithm for pre-integrated cell projection.

As discussed in Section 2.2, each tetrahedron is decomposed into three or four smaller tetrahedra corresponding to the triangles of the original PT algorithm. For each of the smaller tetrahedra, the six vertices $v_o^{(1f)}$, $v_o^{(1b)}$, $v_o^{(2f)}$, $v_o^{(2b)}$, $v_o^{(3f)}$, and $v_o^{(3b)}$ are computed. Moreover, six corresponding scalar data values $s^{(1f)}$, $s^{(1b)}$, $s^{(2f)}$, $s^{(2b)}$, $s^{(3f)}$, and $s^{(3b)}$ are determined. We rasterize the front facing triangle spanned by the vertices $v_o^{(1f)}$, $v_o^{(2f)}$, and $v_o^{(3f)}$. Apart from these coordinates, each triangle vertex is also provided with the object coordinates of the corresponding “back” vertex and the scalar data values for the front and back vertex. For example, for the $i$-th vertex with object coordinates $v_o^{(if)}$ we specify the scalar data values $s^{(if)}$ and $s^{(ib)}$, and the vector $v_o^{(ib)}$ as additional vertex attributes.

These vertex attributes are the input parameters for our vertex program. For the $i$-th vertex, the vertex program computes clip coordinates $v_c^{(if)} = PM v_o^{(if)}$ and normalized device coordinates $v_d^{(if)} = v_c^{(if)} / w_c^{(if)}$. By performing the perspective division and returning normalized device coordinates instead of clip coordinates, we ensure that OpenGL does not use perspective interpolation but linear interpolation since we specify that $w_c$ is equal to 1.

Apart from the projected position, the output parameters of our vertex program for the $i$-th vertex are $1/w_c^{(if)}$, $s^{(if)}/w_c^{(if)}$, $v_e^{(if)}/w_c^{(if)}$, $1/w_c^{(ib)}$, $s^{(ib)}/w_c^{(ib)}$, and $z_e^{(ib)}/w_c^{(ib)}$. Note that we set all $w_o$ to 1; thus, they do not appear in these quantities.
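A CPU-side C++ sketch of these per-vertex computations follows (the struct and function names are ours; the actual implementation is an ARB vertex program, and the eye-space outputs $v_e^{(if)}/w_c^{(if)}$ and $z_e^{(ib)}/w_c^{(ib)}$ are omitted here for brevity):

    // Per-vertex stage: transform the front and back vertex to clip space
    // and emit the quantities that are interpolated linearly in NDC.
    #include <array>

    using Vec4 = std::array<float, 4>;
    using Mat4 = std::array<Vec4, 4>;   // row-major

    Vec4 mul(const Mat4& m, const Vec4& v)
    {
        Vec4 r{};
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 4; ++j) r[i] += m[i][j] * v[j];
        return r;
    }

    struct VertexOutput {
        Vec4  position;            // v_d with w set to 1, which disables
                                   // OpenGL's own perspective correction
        float invWcF, sOverWcF;    // 1/w_c^(if), s^(if)/w_c^(if)
        float invWcB, sOverWcB;    // 1/w_c^(ib), s^(ib)/w_c^(ib)
    };

    VertexOutput vertexProgram(const Mat4& PM, const Vec4& voF, float sF,
                               const Vec4& voB, float sB)
    {
        const Vec4 vcF = mul(PM, voF), vcB = mul(PM, voB);
        VertexOutput out;
        out.position = { vcF[0]/vcF[3], vcF[1]/vcF[3], vcF[2]/vcF[3], 1.0f };
        out.invWcF = 1.0f/vcF[3];  out.sOverWcF = sF/vcF[3];
        out.invWcB = 1.0f/vcB[3];  out.sOverWcB = sB/vcB[3];
        return out;
    }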
These output vertex attributes are linearly interpolated between the three vertices of each triangle. We denote the results of these interpolations by $1/w_c^{(f)}$, $s^{(f)}/w_c^{(f)}$, $v_e^{(f)}/w_c^{(f)}$, $1/w_c^{(b)}$, $s^{(b)}/w_c^{(b)}$, and $z_e^{(b)}/w_c^{(b)}$, respectively. These are input parameters for our fragment program, which completes the perspective interpolation by dividing interpolated values:

$$s^{(f)} = \frac{s^{(f)}/w_c^{(f)}}{1/w_c^{(f)}}, \quad v_e^{(f)} = \frac{v_e^{(f)}/w_c^{(f)}}{1/w_c^{(f)}}, \quad s^{(b)} = \frac{s^{(b)}/w_c^{(b)}}{1/w_c^{(b)}}, \quad z_e^{(b)} = \frac{z_e^{(b)}/w_c^{(b)}}{1/w_c^{(b)}} .$$
The thickness $l$ of the tetrahedron is computed from these quantities as described in Section 2.2. Note that it is often possible to simplify the computation of $l$ without introducing visible rendering artifacts with the help of the approximation $l \approx |z_e^{(b)} - z_e^{(f)}|$.

This completes the perspective interpolation of $s^{(f)}$ and $s^{(b)}$ and the computation of $l$. Based on these parameters, the color of the fragment is determined by the fragment program as discussed in the next section.
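The per-fragment stage can be sketched as follows (again our own C++ approximation of the ARB fragment program; the thickness uses the approximation mentioned above):

    // Per-fragment stage: one division per quantity recovers the
    // perspective-correct values from the linearly interpolated inputs.
    #include <cmath>

    struct FragmentInput {            // linearly interpolated over the triangle
        float invWcF, sOverWcF;       // 1/w_c^(f), s^(f)/w_c^(f)
        float invWcB, sOverWcB;       // 1/w_c^(b), s^(b)/w_c^(b)
        float zeOverWcF, zeOverWcB;   // z_e^(f)/w_c^(f), z_e^(b)/w_c^(b)
    };

    struct FragmentValues { float sF, sB, l; };

    FragmentValues fragmentProgram(const FragmentInput& in)
    {
        FragmentValues v;
        v.sF = in.sOverWcF / in.invWcF;              // s^(f)
        v.sB = in.sOverWcB / in.invWcB;              // s^(b)
        const float zeF = in.zeOverWcF / in.invWcF;  // z_e^(f)
        const float zeB = in.zeOverWcB / in.invWcB;  // z_e^(b)
        v.l = std::fabs(zeB - zeF);                  // l ~ |z_e^(b) - z_e^(f)|
        return v;   // s^(f), s^(b), and l index the pre-integrated lookup table
    }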
3 ACCURATE COLORING
In the PT algorithm, many tetrahedra can contribute to the color of a
single pixel; therefore, even small color contributions of individual
tetrahedra can sum up to a significant contribution to the final im-
age. Thus, in order to avoid rendering artifacts, the accuracy of the
color computation for a single tetrahedron has to exceed the color
accuracy of the final image.
Fortunately, arithmetic computations in fragment programs may be performed with floating-point precision. Therefore, it is beneficial to replace, for example, the texture lookup for the correct exponential attenuation suggested by Stein et al. [19] by a more accurate computation in a fragment program. For some optical models, the three-dimensional texture lookup for pre-integrated cell projection can also be replaced by a fragment program as suggested by Guthe et al. [6]. In general, however, the pre-integrated lookup can only be replaced by an expensive numerical integration.
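For instance, under the absorption model of Stein et al. [19], the opacity of a ray segment follows directly from the interpolated average attenuation coefficient and the thickness; a one-line C++ sketch of this idea (our formulation, not code from the paper):

    #include <cmath>

    // Opacity of a ray segment of thickness l with average attenuation tau,
    // computed in the fragment stage instead of a 2D texture lookup.
    float segmentOpacity(float tau, float l)
    {
        return 1.0f - std::exp(-tau * l);
    }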
Therefore, instead of replacing the pre-integrated texture lookup,
we improve its accuracy by employing textures with 16-bit color
components (so-called “HILO textures” [9]), which support trilin-
ear interpolation. Since HILO textures are only available with two
color components, we have to split the RGBA lookup texture into
two HILO textures and perform two texture lookups to get all four
color components. Note that we avoid floating-point textures be-
cause they only permit nearest-neighbor “interpolation”.
Apart from the accuracy of the colors tabulated in the three-dimensional lookup texture, we also have to consider its minimum dimensions. The three coordinates for this texture lookup correspond to the scalar data value at the front $s^{(f)}$, the scalar data value at the back $s^{(b)}$, and the thickness $l$ [16]. The texture coordinates are usually computed by a linear mapping of the whole range of $s^{(f)}$, $s^{(b)}$, and $l$, respectively, to the range of texture coordinates. In our implementation, the range of a particular texture coordinate is $[(2n)^{-1}, 1 - (2n)^{-1}]$, where $n$ is the dimension of the texture in the corresponding direction. Note that $(2n)^{-1}$ and $1 - (2n)^{-1}$ specify the coordinates of the centers of the 0-th and the $(n-1)$-th texel, respectively, in OpenGL textures.
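In code, this mapping of a normalized value to the texel-center range reads (a minimal sketch under the stated conventions; the function name is ours):

    // Map t in [0,1] to [1/(2n), 1 - 1/(2n)], i.e., from the center of the
    // 0-th texel to the center of the (n-1)-th texel of an n-texel dimension.
    float texelCenterCoord(float t, int n)
    {
        return (0.5f + t * (n - 1)) / n;
    }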
For the dimensions corresponding to $s^{(f)}$ and $s^{(b)}$, a resolution corresponding to the Nyquist sampling rate of the transfer functions will avoid most visible artifacts. In contrast to the scalar data values, the range of the thickness $l$ depends on the tetrahedral mesh. More specifically, the minimum thickness is zero and the maximum
Figure 6: Mapping of $l \in [0, l_{\max}]$ to texture coordinate $r \in [(2 n_r)^{-1}, 1 - (2 n_r)^{-1}]$ for dimension $n_r = 8$. Note that the mapping is linear instead of logarithmic for $0 \le l \le 2^{-(n_r - 2)} l_{\max} = l_{\max}/64$.
thickness is equal to the length $l_{\max}$ of the longest edge of all cells in eye space. If the model-view matrix does not include any scaling, $l_{\max}$ is also the length of the longest edge in object space. Note that fragments on the silhouette of a tetrahedral cell usually require a thickness very close to zero; thus, the pre-integrated lookup table has to cover the whole range $[0, l_{\max}]$. The dependency of the color on the thickness $l$ is in general a very smooth function since the most relevant dependencies are the exponential attenuation and the linear accumulation of emitted light. Therefore, previous implementations of pre-integrated cell projection have often chosen a rather low resolution for the thickness.
However, this is not appropriate for data sets featuring a very
high ratio between the lengths of the longest and the shortest edges
of the mesh. In these cases, the thickness of many small tetrahedra
will be close to zero. Therefore, on the one hand, these data sets
require a very high resolution of the lookup table for small values
of l; on the other hand, the thickness is still a smooth function for
large values of l; thus, a coarse resolution for large l is sufficient.
Therefore, a logarithmic mapping of the thickness $l$ to the texture coordinate $r$ is more appropriate than a linear mapping. Note that a thickness of $l = 0$ should be mapped to the texture coordinate of the 0-th texel, $r = (2 n_r)^{-1}$, where $n_r$ is the $r$-dimension of the texture. In order to satisfy these constraints, we chose a mapping of $l$ to $r$ which is linear for $0 \le l \le \hat{l}$ with $\hat{l} = 2^{-(n_r - 2)} l_{\max}$ and logarithmic for $l \ge \hat{l}$; see Figure 6 and Equation 1.
$$r = \left( \max\left( \log_2 \frac{l}{\hat{l}}, \, 0 \right) + \min\left( \frac{l}{\hat{l}}, \, 1 \right) \right) \frac{1}{n_r} + \frac{1}{2 n_r} \qquad (1)$$
A straightforward implementation of Equation 1 using the OpenGL
ARB extension for fragment programs requires seven instructions.
(Implementing this mapping by means of a 1D texture lookup
would usually require a prohibitively large texture size.) To com-
pute the pre-integrated lookup table efficiently, we employ an
adapted variant of the incremental pre-integration method [20] that
incrementally computes two-dimensional slices of the table for
incrementally computes two-dimensional slices of the table for $l = 0, \hat{l}, 2\hat{l}, 4\hat{l}, 8\hat{l}, \ldots, 2^{n_r - 2} \hat{l}$.
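A CPU-side C++ sketch of Equation 1 follows (the paper's own implementation uses seven ARB fragment-program instructions; the function name is ours):

    #include <algorithm>
    #include <cmath>

    // Equation 1: linear for l <= lHat, logarithmic for l >= lHat, mapping
    // [0, lMax] into the texel-center range [1/(2*nR), 1 - 1/(2*nR)].
    float thicknessToTexCoord(float l, float lMax, int nR)
    {
        const float lHat    = std::ldexp(lMax, -(nR - 2));       // 2^-(nR-2)*lMax
        const float logPart = std::max(std::log2(l / lHat), 0.0f);
        const float linPart = std::min(l / lHat, 1.0f);
        return (logPart + linPart) / nR + 1.0f / (2 * nR);
    }

For $n_r = 8$, this maps $l = 0$ to $1/16$, $l = l_{\max}/64$ to $3/16$, and $l = l_{\max}$ to $15/16$, matching Figure 6.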
4 ACCURATE COMPOSITING
In order to avoid rendering artifacts in volume rendering algo-
rithms, colors have to be composited with a higher accuracy than
most graphics adapters offer for frame buffers. Unfortunately, the
current hardware support for floating-point color buffers has some
crucial limitations, which prevent their efficient use in the context
of the PT algorithm as discussed in detail in Section 4.1. Section 4.2
describes an approach based on randomized dithering using frame
buffers with 8-bit color components to improve the quality of the
final image.
4.1 Floating-Point Color Buffer
Current graphics adapters (based on NVIDIA’s NV3x and ATI’s
R3xx chipsets) allow fragment programs to write to and read from
floating-point textures at the same time. Although the results of
these texture read operations are undefined, this technique has been
successfully employed to implement color blending for slice-based
volume visualization [8].
For the small triangle primitives of the PT algorithm, however,
the results of these texture read operations are in fact erroneous—
presumably due to caching of texture data. If the compositing com-
putation in the fragment program is based on outdated data, severe
caching artifacts can appear. For the NVIDIA Quadro FX 3000
graphics board used in this work, we found several ways to avoid
all these caching artifacts. The two most important are:
1. rasterizing a large point primitive after each triangle fan, or
2. binding the color buffer texture and sending a fence (see the GL_NV_fence extension [9]) after each triangle fan.²
Although rather slow, the latter approach performed better and was,
therefore, used for our measurements in Section 5.
The recently released NVIDIA GeForce FX 6800 (NV40
chipset) supports rendering to 16-bit floating-point color buffers
with alpha blending; thus, there is no need to simultaneously read
from and write to the same texture. Our preliminary measurements
indicate that this allows us to use floating-point color buffers at in-
teractive frame rates for the PT algorithm.
We found it extremely useful to employ floating-point color
buffers to generate reference images because these visualizations
avoid all artifacts due to color quantization during color composit-
ing. Thus, we are able to reveal rendering artifacts that would oth-
erwise be hidden by color quantization artifacts.
4.2 Alpha Dithering
For frame buffers with very limited color resolution, e.g., 8 bits per
color component, Williams, Frank, and LaMar [23, 11] suggested
alpha dithering as an efficient way to overcome quantization arti-
facts in volume rendering algorithms.
Our variant of alpha dithering customizes the 8-bit color quantization of the color result of the fragment program. The default quantization on our graphics hardware maps an output $\alpha$-component between 0 and 1 to $\lfloor 255\alpha + 1/2 \rfloor / 255$ with the floor function $\lfloor x \rfloor$ denoting the largest integer smaller than or equal to $x$. In contrast to this default quantization, we implement the following quantization at the very end of our fragment program:

$$\alpha \mapsto \begin{cases} \left( \lfloor 255\alpha \rfloor + 1 \right) / 255 & \text{if } 255\alpha - \lfloor 255\alpha \rfloor > q \\ \lfloor 255\alpha \rfloor / 255 & \text{otherwise} \end{cases}$$

with a pseudo-random number $q \in [0, 1]$, which is determined by texture lookups in tables of random numbers. In other words, we round up with a probability equal to the fractional part of $255\alpha$; otherwise we round down.
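A C++ sketch of this randomized rounding (with $q$ drawn here from rand() instead of the random-number textures used in the actual fragment program):

    #include <cmath>
    #include <cstdlib>

    // Dithered 8-bit quantization: round 255*alpha up with a probability
    // equal to its fractional part, down otherwise.
    float ditherQuantize(float alpha)   // alpha in [0,1]
    {
        const float scaled = 255.0f * alpha;
        const float base   = std::floor(scaled);
        const float q      = std::rand() / (float)RAND_MAX;   // q in [0,1]
        return ((scaled - base > q) ? base + 1.0f : base) / 255.0f;
    }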
Our randomized rounding performs very well in many cases;
however, it does not avoid all rendering artifacts since the graphics
²This method was suggested by Nick Triantos (NVIDIA).
Figure 7: Rendering of the NASA blunt fin data set: (a) without HILO lookup textures, (b) without floating-point color buffer, and (c) with both HILO lookup textures and floating-point color buffer.
Table 1: Total rendering times per frame (including cell sorting) and
overhead introduced by our enhancements for the blunt fin data set.
rendering technique/enhancement time in secs
basic pre-integrated PT algorithm (i) 0.195
overhead for perspective interpolation (ii) 0.048
overhead for HILO lookup textures (iii) 0.031
overhead for logarithmic lookup (iv) 0.040
total for frame buffer (i+ii+iii+iv) 0.314
total with alpha dithering 0.621
total for floating-point color buffer 1.781
Table 2: Number of tetrahedra and total frame buffer rendering times
per frame for different data sets.
data set no. tets time in secs tets per sec
heat sink 121,668 0.252 483K
blunt fin 187,318 0.314 597K
cylinder 624,960 0.929 673K
X38 1,943,483 3.07 633K
hardware performs a second quantization after the blending oper-
ation of the fixed-function OpenGL pipeline. Unfortunately, this
blending operation is not customizable; thus, we cannot avoid ar-
tifacts introduced at this point. The only exception to this rule is
purely additive blending. In this case, the result of the blending is
guaranteed to be quantized, and the second quantization is, there-
fore, ineffective. This is also the only case in which a randomized
rounding of the red, green, and blue components is preferable to the
randomized rounding of $\alpha$.
5 RESULTS
5.1 Timings
We tested our implementation on a Windows XP PC with an AMD
Athlon 64 3400+ processor (2.4 GHz), an AGP 8× bus, and an
NVIDIA Quadro FX 3000 graphics adapter with 256 MB of video
memory. As discussed in Section 2.3, our implementation takes ad-
vantage of the programmable vertex and fragment processing pro-
vided by the graphics hardware. Many factors influence the render-
ing performance, for example, the number of tetrahedra, the image
dimensions, and the depth complexity. The transfer function does
Figure 8: Isosurface visualization of the heat sink data set: (a) without and (b) with perspective interpolation.
not affect the rendering performance since different transfer func-
tions are achieved by modifying the pre-integrated lookup table.
Table 1 shows the timing results for the NASA blunt fin data set,
which is decomposed into 187,318 tetrahedra before rendering. All
times were measured for images of 800 × 600 pixels. Since our
system is fragment-bound, additional instructions in the fragment
program increase the rendering time. Note that we could greatly
improve the rendering performance by culling transparent tetrahe-
dra in software; however, for benchmarking purposes we are pro-
jecting all tetrahedra.
Perspective interpolation is implemented using vertex and frag-
ment programs with twelve and eleven arithmetic instructions, re-
spectively (see Section 2.3). The use of HILO lookup textures
requires an additional 3D-texture lookup per fragment (see Sec-
tion 3). The logarithmic lookup is implemented with seven arith-
metic instructions per fragment, as mentioned in Section 3. For
the images in this paper we always employed lookup tables of di-
mensions 256 × 256 × 16 where 16 is the dimension of the texture
corresponding to the thickness l.
Ten arithmetic instructions and two 2D-texture lookups in
pseudo-random number tables are needed per fragment for alpha
dithering (see Section 4.2). Due to the additional operations re-
quired to eliminate the texture caching artifacts as discussed in Sec-
tion 4.1, the color compositing in a floating-point color buffer is
considerably slower than in a frame buffer.
Table 2 presents the rendering times per frame for different data
sets rendered with perspective interpolation and logarithmic HILO
lookup textures. The times are roughly linear in the number of pro-
jected tetrahedra but also depend strongly on the number of raster-
ized fragments.
Figure 9: Rendering of the NASA tapered cylinder data set: (a) with color compositing in the frame buffer, (b) with alpha dithering and color compositing in the frame buffer, and (c) with color compositing in a floating-point color buffer.
Figure 10: Isosurface visualization of the pressure component of the NASA X38 data set: (a) with a uniform lookup texture and (b) with a logarithmic lookup texture.
Our preliminary measurements on an NVIDIA GeForce FX
6800 GT clocked at 350 MHz show only a slight increase in per-
formance. For example, the rendering with all enhancements to
an 8-bit frame buffer is about 22 % faster than with the NVIDIA
Quadro FX 3000. This is presumably due to bottlenecks caused by
the CPU or the AGP bus. There is, however, an important excep-
tion: the rendering with hardware-supported blending to a 16-bit
floating-point color buffer is only less than 1 % slower than the ren-
dering to an 8-bit frame buffer since no additional operations are
necessary to avoid caching artifacts.
5.2 Comparison of Artifacts
One advantage of our implementation is that it allows us to study
artifacts in terms of their causes and remedies. In the following
discussion, we compare different renderings and relate artifacts to
techniques that remove them.
Perspective interpolation of vertex attributes is required in the
PT algorithm since an incorrect linear interpolation of texture co-
ordinates causes erroneous coloring (see Section 3). In Figure 8,
we show a volume visualization of the heat sink data set, which
mimics an isosurface rendering by employing a transfer function
with a sharp peak at the isovalue. Without perspective interpola-
tion, artifacts in the form of cracks and overlaps occur along the
intersections of the isosurface with cell faces, as shown in the inset
in Figure 8a. In contrast, rendering with perspective interpolation
shows no artifacts, as shown in the inset in Figure 8b. Note that all
insets in Figures 7 to 10 are magnified and contrast-enhanced.
An 8-bit per component pre-integrated lookup texture does not
offer sufficient color depth, as mentioned in Section 3. For example,
the structured artifacts shown in Figure 7a result from the coarsely
quantized colors of such a lookup texture. On the other hand, HILO
lookup textures produce artifact-free renderings as shown in Fig-
ure 7c. Color compositing also requires a high color accuracy, as
noted in Section 4.1. Figure 7b shows a rendering using a frame
buffer with only 8 bits per component for color compositing. This
limited color depth results in structured artifacts, which are similar
to those in Figure 7a.
As explained in Section 4.2, alpha dithering alleviates render-
ing artifacts resulting from quantization errors to a certain extent.
Figure 9a shows the NASA tapered cylinder data set rendered with
an 8-bit per component frame buffer. Alpha dithering lessens arti-
facts in inset “1” and almost removes them in inset “2” as shown in
Figure 9b. With floating-point precision for the color compositing,
alpha dithering is unnecessary, as shown in Figure 9c.
As explained in Section 3, a uniform lookup texture is inappro-
priate if the ratio between the lengths of the longest and the shortest
mesh edges is very high, which is the case in the NASA X38 data
set. Very small tetrahedra have thicknesses close to zero; thus, they
require lookup textures with an extremely high resolution for small
thicknesses. Uniform lookup textures cannot offer such a high res-
olution at a feasible size and, therefore, lead to under-sampling
errors for extremely small tetrahedra. This results in darker ren-
derings with substantial edge artifacts (Figure 10a) as opposed to
the artifact-free rendering with a logarithmic lookup texture (Fig-
ure 10b).
6 CONCLUSIONS AND FUTURE WORK
We have identified and cured all major rendering artifacts that are
common in implementations of the PT algorithm. These include in-
correct interpolation, insufficient accuracy and dimensions of pre-
integrated lookup tables, and insufficient accuracy of the frame
buffer used for compositing. With our improvements, the PT al-
gorithm is capable of achieving a rendering quality that was previ-
ously only possible with ray casting approaches.
Our solution to the limited accuracy of frame buffers is the use
of floating-point color buffers, which is well supported only by the
latest graphics hardware. This hardware allows us to provide the
highest rendering quality at interactive frame rates.
Several of our improvements are not restricted to the PT algo-
rithm. For example, the correct perspective interpolation could
also be applied to pre-integrated texture-based volume rendering
[4], and logarithmic lookup textures could be used for hardware-
accelerated ray casting algorithms with an adaptive sampling rate
[15, 20].
In order to further improve the PT algorithm, we plan to integrate
the volume lighting method described by Lum et al. [12] with cor-
rect perspective interpolation. Moreover, we intend to validate the
volume visualizations generated by our system by means of a com-
parison with a software ray caster. This will allow us to study the
effect of incorrect visibility orderings and to compare the numeri-
cal pre-integration with analytic solutions of the volume rendering
integral for piecewise-linear transfer functions.
7 ACKNOWLEDGMENTS
The authors would like to thank Cass Everitt, Randall Frank,
Markus Hadwiger, Mark Kilgard, Eric LaMar, Kirk Riley, Nik
Svakhine, Nick Triantos, Peter Williams, and the anonymous re-
viewers for many helpful discussions, comments, and suggestions.
We would also like to thank Kelly Gaither for providing the NASA
X38 dataset and NVIDIA for providing a pre-release GeForce FX
6800 board. This material is based upon work supported by the
National Science Foundation under Grant Nos. 0222675, 0081581,
0121288, 0196351, and 0328984.
REFERENCES
[1] James F. Blinn. Jim Blinn’s corner: Hyperbolic interpolation. IEEE
Computer Graphics and Applications, 12(4):89–94, 1992.
[2] Paolo Cignoni, Claudio Montani, Donatella Sarti, and Roberto
Scopigno. On the optimization of projective volume rendering. In
R. Scateni, J. van Wijk, and P. Zanarini, editors, Visualization in Sci-
entific Computing ’95, pages 58–71. Springer-Verlag Wien, 1995.
[3] João Comba, James T. Klosowski, Nelson Max, Joseph S. B. Mitchell,
Claudio T. Silva, and Peter L. Williams. Fast polyhedral cell sorting
for interactive rendering of unstructured grids. Computer Graphics
Forum (Proceedings Eurographics ’99), 18(3):369–376, 1999.
[4] Klaus Engel, Martin Kraus, and Thomas Ertl. High-quality pre-
integrated volume rendering using hardware-accelerated pixel shad-
ing. In William Mark and Andreas Schilling, editors, Proceedings
Graphics Hardware 2001, pages 9–16. ACM Press, 2001.
[5] Michael P. Garrity. Raytracing irregular volume data. ACM Computer
Graphics (Proceedings San Diego Workshop on Volume Visualization
1990), 24(5):35–40, 1990.
[6] Stefan Guthe, Stefan Roettger, Andreas Schieber, Wolfgang Strasser,
and Thomas Ertl. High-quality unstructured volume rendering on
the PC platform. In Thomas Ertl, Wolfgang Heidrich, and Michael
Doggett, editors, Proceedings Graphics Hardware 2002, pages 119–
125. ACM Press, 2002.
[7] Paul S. Heckbert and Henry P. Moreton. Interpolation for polygon
texture mapping and shading. In David F. Rogers and Rae A. Earn-
shaw, editors, State of the Art in Computer Graphics: Visualization
and Modeling, pages 101–111. Springer-Verlag, 1991.
[8] Jens Krüger and Rüdiger Westermann. Acceleration techniques for
GPU-based volume rendering. In Greg Turk, Jarke J. van Wijk,
and Robert Moorhead, editors, Proceedings IEEE Visualization 2003,
pages 287–292. IEEE Computer Society Press, 2003.
[9] Mark J. Kilgard. NVIDIA OpenGL Extension Specifications. NVIDIA
Corporation, 2001.
[10] Martin Kraus and Thomas Ertl. Cell-projection of cyclic meshes. In
Thomas Ertl, Kenneth Joy, and Amitabh Varshney, editors, Proceed-
ings IEEE Visualization 2001, pages 215–222. IEEE Computer Soci-
ety Press, 2001.
[11] Eric C. LaMar. On issues of precision for hardware texture-based vol-
ume visualization. In R. F. Erbacher, P. C. Chen, J. C. Roberts, and
Craig M. Wittenbrink, editors, Visual Data Exploration and Analysis,
pages 19–23. SPIE – The International Society for Optical Engineering, 2004.
[12] Eric B. Lum, Brett Wilson, and Kwan-Liu Ma. High-quality lighting
and efficient pre-integration for volume rendering. In Proceedings
Joint Eurographics-IEEE TVCG Symposium on Visualization 2004
(VisSym ’04), pages 25–34, 2004.
[13] Nelson Max, Pat Hanrahan, and Roger Crawfis. Area and volume co-
herence for efficient visualization of 3D scalar functions. ACM Com-
puter Graphics (Proceedings San Diego Workshop on Volume Visual-
ization 1990), 24(5):27–33, 1990.
[14] Stefan Roettger and Thomas Ertl. A two-step approach for interac-
tive pre-integrated volume rendering of unstructured grids. In Roger
Crawfis, Chris Johnson, and Klaus Mueller, editors, Proceedings Vol-
ume Visualization and Graphics Symposium 2002, pages 23–28. ACM
Press, 2002.
[15] Stefan Roettger, Stefan Guthe, Daniel Weiskopf, Thomas Ertl, and
Wolfgang Strasser. Smart hardware-accelerated volume rendering. In
Proceedings Joint Eurographics-IEEE TCVG Symposium on Visual-
ization (VisSym ’03), pages 231–238, 2003.
[16] Stefan Röttger, Martin Kraus, and Thomas Ertl. Hardware-accelerated
volume and isosurface rendering based on cell-projection. In Thomas
Ertl, Bernd Hamann, and Amitabh Varshney, editors, Proceedings
IEEE Visualization 2000, pages 109–116. IEEE Computer Society
Press, 2000.
[17] Mark Segal and Kurt Akeley. The OpenGL Graphics System: A Spec-
ification (Version 1.5). Silicon Graphics, Inc., 2003.
[18] Peter Shirley and Allan Tuchman. A polygonal approximation to di-
rect scalar volume rendering. ACM Computer Graphics (Proceed-
ings San Diego Workshop on Volume Visualization 1990), 24(5):63–
70, 1990.
[19] Clifford M. Stein, Barry G. Becker, and Nelson L. Max. Sorting and
hardware assisted rendering for volume visualization. In Arie Kauf-
man and Wolfgang Krueger, editors, Proceedings 1994 Symposium on
Volume Visualization, pages 83–89. ACM Press, 1994.
[20] Manfred Weiler, Martin Kraus, Markus Merz, and Thomas Ertl.
Hardware-based ray casting for tetrahedral meshes. In Greg Turk,
Jarke J. van Wijk, and Robert Moorhead, editors, Proceedings IEEE
Visualization 2003, pages 333–340. IEEE Computer Society Press,
2003.
[21] Manfred Weiler, Martin Kraus, Markus Merz, and Thomas Ertl.
Hardware-based view-independent cell projection. IEEE Transactions
on Visualization and Computer Graphics, 9(2):163–175, 2003.
[22] Peter L. Williams. Visibility ordering meshed polyhedra. ACM Trans-
actions on Graphics, 11(2):103–126, 1992.
[23] Peter L. Williams, Randall J. Frank, and Eric C. LaMar. Alpha
dithering to correct low-opacity 8 bit compositing errors. Lawrence
Livermore National Laboratory Technical Report UCRL-ID-153185,
March 2003.
... Williams extended Shirley-Tuchman's approach to implement direct projection of other polyhedral cells in their HIAC rendering system [25] and used high accuracy light integration functions to model the light transport through the medium [24]. Recently, with the advent of programmable graphics hardware, a tremendous amount of work has been done in implementing the Shirley-Tuchman algorithm on graphics hardware using the programmable vertex and fragment shader pipelines on the GPUs [21][28] [15]. In all of the above cases, the rendering performance of the projected tetrahedra algorithm is typically proportional to the number of cells to be rendered. ...
... The current rendering technique uses the hardware implemented projected tetrahedron method [28]. We can use the new PT implementation presented in [15] to improve the quality of the images and we can consider to speed up the projection from 4D to 3D by taking advantage of the modern graphics hardware. Also, the current algorithm can be augmented with feature detection techniques to aid the user in identifying useful/interesting intervals in the field. ...
Article
Full-text available
In this paper, we study the interval segmentation and direct rendering of time-varying volumetric data to provide a more effective and interactive volume rendering of time-varying structured and unstructured grids. Our segmentation is based upon intervals within the scalar field between time steps, producing a set of geometrically defined time-varying interval volumes. To construct the time-varying interval volumes, we cast the problem one dimension higher, using a five-dimensional iso-contour construction for interactive computation or segmentation. The key point of this paper is how to render the time-varying interval volumes directly. We directly render the 4D interval volumes by projecting the 4D simplices onto 3D, decomposing the projected 4-simplices to 3-simplices and then rendering them using a modified hardware-implemented projected tetrahedron method. In this way, we can see how interval volumes change with the time in one view. The algorithm is independent of the topology of the polyhedral cells comprising the grid, and thus offers an excellent enhancement to the volume rendering of time-varying unstructured grids. Another advantage of this algorithm is that various volumetric and surface boundaries can be embedded into the time-varying interval volumes.
... Even with GPU acceleration and parallelism, unstructured volume rendering is generally slower than structured volume rendering under similar constraints . Moreover, due to the complexity of the geometry of unstructured data, rendering quality is not only affected by interpolation artifacts (Kraus et al. 2004) but can also be distorted by variations in mesh resolution and structure. ...
Article
3D volume rendering is widely used to reveal insightful intrinsic patterns of volumetric datasets across many domains. However, the complex structures and varying scales of volumetric data can make efficiently generating high-quality volume rendering results a challenging task. Multivariate functional approximation (MFA) is a new data model that addresses some of the critical challenges: high-order evaluation of both value and derivative anywhere in the spatial domain, compact representation for large-scale volumetric data, and uniform representation of both structured and unstructured data. In this paper, we present MFA-DVR, the first direct volume rendering pipeline utilizing the MFA model, for both structured and unstructured volumetric datasets. We demonstrate improved rendering quality using MFA-DVR on both synthetic and real datasets through a comparative study. We show that MFA-DVR not only generates more faithful volume rendering than using local filters but also performs faster on high-order interpolations on structured and unstructured datasets. MFA-DVR is implemented in the existing volume rendering pipeline of the Visualization Toolkit (VTK) to be accessible by the scientific visualization community.
... However, even with GPU acceleration [62], unstructured volume rendering algorithms struggle to render large-scale volumetric data due to the superlinear time complexity with respect to data size. Moreover, their rendering quality is not only affected by linear interpolation artifacts [31] but can also be distorted by the uneven sampling of the 3D volumetric space. ...
Preprint
Full-text available
3D volume rendering is widely used to reveal insightful intrinsic patterns of volumetric datasets across many domains. However, the complex structures and varying scales of datasets make generating a high-quality volume rendering results efficiently a challenging task. Multivariate functional approximation (MFA) is a new data model that addresses some of the key challenges of volume visualization. MFA provides high-order evaluation of values and derivatives anywhere in the spatial domain, mitigating the artifacts caused by the zero- or first-order interpolation commonly implemented in existing volume visualization algorithms. MFA's compact representation improves the space complexity for large-scale volumetric data visualization, while its uniform representation of both structured and unstructured data allows the same ray casting algorithm to be used for a variety of input data types. In this paper, we present MFA-DVR, the first direct volume rendering pipeline utilizing the MFA model, for both structured and unstructured volumetric datasets. We demonstrate improved rendering quality using MFA-DVR on both synthetic and real datasets through a comparative study with raw and compressed data. We show that MFA-DVR not only generates more faithful volume rendering results with less memory footprint, but also performs faster than traditional algorithms when rendering unstructured datasets. MFA-DVR is implemented in the existing volume rendering pipeline of the Visualization Toolkit (VTK) in order to be accessible by the scientific visualization community.
... detalle 0, correspondiente a la data original. Si el dataset es representado con m niveles de detalle, se requiere generar una tabla de O(mn 2 ) entradas, tal y como fue introducido inicialmente para mallas tetraédricas [KRA04]. La solución trivial consiste en basarse en el algoritmo presentado en [LUM04], en donde cada tabla 2D de n 2 entradas es calculada en O(n 2 ). ...
Thesis
Full-text available
Las técnicas multi-resolución son requeridas comúnmente para el despliegue de volúmenes que exceden las capacidades de la tarjeta gráfica, e incluso de la memoria principal. En cada cuadro de imagen, se selecciona la resolución adecuada para las distintas áreas del volumen en función de la cantidad de memoria disponible, y un conjunto de parámetros que pueden estar definidos en el espacio objeto o en el espacio imagen. Debido al ancho de banda limitado entre la memoria principal y la tarjeta gráfica, es posible que la resolución requerida en todas las áreas del volumen no sea satisfecha durante la interacción con el volumen. Así, la representación del volumen deber ser actualizada constantemente entre cuadros de imágenes durante esta interacción, que puede incluir el cambio del punto de vista, de la región de interés y de la función de transferencia. Se presenta un algoritmo voraz que mejora la aproximación cuadro a cuadro, utilizando un conjunto de operaciones de refinamiento y reducción – Split-and-Collapse – sobre las distintas áreas del volumen. Este algoritmo es guiado por una medida de error, que sugiere la selección de la siguiente operación de refinamiento o colapso, basada en la mejora de la aproximación multi-resolución, considerando las restricciones de hardware establecidas. Adicionalmente, se ha implementado un algoritmo óptimo polinomial, que obtiene la representación multi-resolución con mínimo error, con el objetivo de verificar que los resultados con el algoritmo voraz están cercanos al óptimo. Se han considerado técnicas out-of-core compatibles con los requerimientos de páginas de memoria demandados por el algoritmo de Split-and-Collapse. El análisis de complejidad de los algoritmos muestra que no dependen del tamaño del volumen, demostrando que son adecuados para desplegar volúmenes de gran tamaño. El despliegue puede ser realizado tanto con ray casting basado en GPU, como con planos alineados al viewport, con clasificación pre-integrada. En ambos casos se implementan técnicas de aceleración como salto de espacios vacíos, terminación temprana del rayo y muestreo adaptativo, explotando las bondades del hardware gráfico actual, y ajustando adecuadamente los segmentos de rayo en las fronteras de los subvolúmenes o bricks. Particularmente en el caso de ray casting basado en GPU se ha introducido una técnica para la reducción de artefactos entre bricks con distintos niveles de detalle, basada en la mezcla ponderada de niveles de detalle contiguos. Finalmente, se reduce el tiempo de cálculo para construir la tabla de pre-integración 3D, reutilizando integrales entre tablas de pre-integración 2D.
... Shirley and Tuchman introduced Projected Tetrahedra (PT) [Shirley and Tuchman 1990] to compute the rendering contribution of each tetrahedra as it is splatted to the screen. Numerous improvements have been made to the original algorithm since then, such as removing artifacts [Kraus et al. 2004], adding hardware acceleration [Wylie et al. 2002], and accelerating the render-order sort via the GPU [Maximo et al. 2010]. PT has been shown to scale well for distributed rendering of large-scale data [Ma and Crockett 1997]. ...
Conference Paper
Dark matter simulations, performed using N-body methods with a finite set of tracer particles to discretize the initially uniform distribution of mass, are an invaluable method for exploring the formation of the universe. Definining a tetrahedral mesh in phase space-with the tracer particles at initialization serving as vertices-yields a more accurate density field. At later timesteps, the mesh self-intersects to an enormous degree, making pre-sorting impossible. Kaehler et al [2012] visualize the mesh using cell projection, but their method requires order-independent compositing, which limits its flexibility. Our work renders the mesh using state of the art order-independent transparency (OIT) techniques to composite fragments in correct depth order. This also allows us to render variables other than density, such as velocity. We implement a number of OIT optimizations to handle the high depth complexity (on the order of 10⁷ depth layers for 2x10⁹ particles) of the data. Our performance measurements show near-interactive framerates for our hybrid renderer despite the large number of depth layers.
... A 2D table, however, is not sufficient for 2D object-aligned slices. In this case the use of logarithmically-scaled tables [108] might be beneficial. ...
Thesis
Modern numerical simulation and data acquisition techniques create a multitude of different data fields. The interactive visualization of these large, three-dimensional, and often also time-dependent scalar, vector, and tensor fields plays an integral part in analyzing and understanding this data. Although basic visualization techniques vary significantly depending on the type of the respective data fields, one key theme dominates today's visualization research: driven by the need for interactive data inspection and exploration, and by the extraordinary rate of increase of the computational power provided by modern graphics processing units, the consistent application of graphics hardware in all stages of the visualization pipeline has become central to coping with data set sizes that grow at an ever increasing pace and with advancing demands on the accuracy and complexity of visualizations. Contemporary graphics processing units have now reached a level of programmability roughly resembling that of their CPU counterparts. However, there are still important differences that strongly influence the design and implementation of GPU-based visualization algorithms. This thesis addresses the problem of how to efficiently exploit the programmability and parallel processing capabilities of modern graphics processors for the interactive visualization of three-dimensional data fields of varying data complexity and abstraction level. In particular, new methods and GPU-based solutions are presented for high-quality volume ray casting, for the reconstruction of polygonal isosurfaces, and for the point-based visualization of symmetric, second-order tensor fields, such as those obtained by diffusion tensor imaging or resulting from CFD simulations, by means of ellipsoidal glyphs; by combining the mapping and rendering stages on the GPU, these methods result in an improved visualization cycle. Furthermore, a new approach for the topological analysis of noisy vector fields is described. Although this work is focused on a number of specific visualization problems, it also intends to identify general design principles for GPU-based visualization algorithms that may prove useful in the context of topics not covered by this thesis.
... A 2D table, however, is not sufficient for 2D object-aligned slices. In this case the use of logarithmically-scaled tables [108] might be beneficial. ...
Article
A parallel preintegration volume rendering algorithm based on adaptive sampling is proposed in this paper to visualize large-scale scientific data effectively on distributed-memory parallel computers. The algorithm places sampling points adaptively by detecting the extremal points of the data field along the rays, so it can capture the data variation exactly. After the data field has been sampled in a distributed fashion on the CPU cores, the resulting sampling points are sorted by packing ordered runs of sampling points piecewise, and are then composited along each ray using the preintegration technique. A static load-balancing scheme based on information entropy is also proposed to balance the loads of both data reading and ray sampling. In addition, a mixed logarithmic quantization scheme is suggested to quantize the depth distance, shortening the preintegration table while preserving the rendering quality. It is demonstrated that the presented algorithm can show inner features of a data field clearly and achieves a rendering speedup between 1.8 and 4.4 compared with the traditional parallel volume rendering algorithm.
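As a rough illustration of two of the ingredients mentioned above, the sketch below detects extremal points of a densely sampled scalar field along a ray and quantizes a segment length on a logarithmic scale; the function names, the bin count, and the D_MIN/D_MAX range are made-up parameters, not the paper's actual scheme.

```python
import math

# Toy sketch: keep ray samples at local extrema, and index a lookup
# table by segment length on a logarithmic scale.

def adaptive_samples(values):
    """Return indices of the endpoints plus all interior local extrema
    of a scalar field sampled densely along one ray."""
    keep = [0]
    for i in range(1, len(values) - 1):
        if (values[i] - values[i - 1]) * (values[i + 1] - values[i]) < 0:
            keep.append(i)  # slope changes sign: local minimum or maximum
    keep.append(len(values) - 1)
    return keep

D_MIN, D_MAX, BINS = 1e-3, 10.0, 256   # illustrative parameters

def quantize_log(d):
    """Map a segment length d to one of BINS bins on a log scale, so
    short segments get proportionally finer resolution."""
    d = min(max(d, D_MIN), D_MAX)
    t = math.log(d / D_MIN) / math.log(D_MAX / D_MIN)  # normalized 0..1
    return min(int(t * BINS), BINS - 1)

print(adaptive_samples([0.0, 0.4, 1.0, 0.3, 0.6]))  # [0, 2, 3, 4]
print(quantize_log(0.01), quantize_log(1.0))        # 64 192
```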
Thesis
Full-text available
Scalar fields play a fundamental role in many scientific disciplines and applications. The increasing computational power offers scientists and digital artists novel opportunities for complex simulations, measurements, and models that generate large amounts of data. In technical domains, it is important to understand the phenomena behind the data to advance research and development in the application domain. Visualization is an essential interface between the usually abstract numerical data and human operators who want to gain insight. In visual media, in contrast, scalar fields often describe complex materials, and rendering their appearance realistically by means of accurate rendering models and algorithms is of highest interest. Depending on the application focus, the different requirements on a visualization or rendering must be considered in the development of novel techniques. The first part of this thesis presents three novel optical models that account for the different goals of photorealistic rendering and scientific visualization of volumetric data. In the first case, an accurate description of light transport in the real world is essential for realistic image synthesis of natural phenomena. In particular, physically based rendering aims to produce predictive results for real material parameters. This thesis presents a physically based light transport equation for inhomogeneous participating media that exhibit a spatially varying index of refraction. In addition, an extended photon mapping algorithm is introduced that provides a solution of this optical model. In scientific volume visualization, spatial perception and interactive controllability of the visual representation are usually more important than physical accuracy, which offers researchers more flexibility in developing goal-oriented optical models. This thesis presents a novel illumination model that approximates multiple scattering of light in a finite spherical region to achieve advanced lighting effects like soft shadows and translucency. The main benefit of this contribution is an improved perception of volumetric features with full interactivity of all relevant parameters. Additionally, a novel model for mapping opacity to isosurfaces that have a small but finite extent is presented. Compared to physically based opacity, the presented approach offers improved control over occlusion and visibility of such interval volumes. In addition to the visual representation, the continuously growing data set sizes pose challenges with respect to performance and data scalability. In particular, fast graphics processing units (GPUs) play a central role in current and future developments in distributed rendering and computing. For volume visualization, this thesis presents a parallel algorithm that dynamically decomposes image space and distributes the work load evenly among the nodes of a multi-GPU cluster. The presented technique facilitates illumination with volumetric shadows and achieves data scalability with respect to the combined GPU memory in the cluster domain. Distributed multi-GPU clusters are also becoming increasingly important for solving compute-intense numerical problems. The second part of this thesis presents two novel algorithms for efficiently solving large systems of linear equations in multi-GPU environments. Depending on the driving application, linear systems exhibit different properties with respect to the solution set and the choice of algorithm.
Moreover, the special hardware characteristics of GPUs in combination with the rather slow data transfer rate over a network pose additional challenges for developing efficient methods. This thesis presents an algorithm, based on compressed sensing, for solving underdetermined linear systems for the volumetric reconstruction of astronomical nebulae from telescope images. The technique exploits the approximate symmetry of many nebulae, combined with regularization and additional constraints, to define a linear system that is solved with iterative forward and backward projections on a distributed GPU cluster. In this way, data scalability is achieved by combining the GPU memory of the entire cluster, which allows one to automatically reconstruct high-resolution models in reasonable time. Despite their high computational power, the fine-grained parallelism of modern GPUs is problematic for certain types of numerical linear solvers. The conjugate gradient algorithm for symmetric and positive definite linear systems is one of the most widely used solvers. Typically, the method is used in conjunction with preconditioning to accelerate convergence. However, traditional preconditioners are not suitable for efficient GPU processing. Therefore, a novel approach is introduced, specifically designed for the discrete Poisson equation, which plays a fundamental role in many applications. The presented approach builds on a sparse approximate inverse of the matrix to exploit the strengths of the GPU.
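Since the abstract names the preconditioned conjugate gradient method, a generic textbook sketch may help; the apply_M callback stands in for a preconditioner such as the thesis's sparse approximate inverse, but the placeholder used here is only a Jacobi (diagonal) preconditioner, not that construction.

```python
import numpy as np

# Generic preconditioned conjugate gradient for a symmetric
# positive-definite system A x = b.

def pcg(A, b, apply_M, tol=1e-8, max_iter=1000):
    x = np.zeros_like(b)
    r = b - A @ x                 # initial residual
    z = apply_M(r)                # preconditioned residual
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = apply_M(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
jacobi = lambda r: r / np.diag(A)   # diagonal (Jacobi) placeholder
print(pcg(A, b, jacobi))            # approx [0.0909, 0.6364]
```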
Conference Paper
We describe a new dynamic level-of-detail (LOD) technique that allows real-time rendering of large tetrahedral meshes. Unlike approaches that require hierarchies of tetrahedra, our approach uses a subset of the faces that compose the mesh. No connectivity is used for these faces so our technique eliminates the need for topological information and hierarchical data structures. By operating on a simple set of triangular faces, our algorithm allows a robust and straightforward graphics hardware (GPU) implementation. Because the subset of faces processed can be constrained to arbitrary size, interactive rendering is possible for a wide range of data sets and hardware configurations.
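A minimal sketch of the core selection step, under the assumption that each face already carries a precomputed view-dependent importance score (e.g., projected screen area); the scoring itself and all names are illustrative, not the paper's implementation.

```python
# Hypothetical sketch: pick a fixed-size subset of triangle faces by a
# precomputed importance score; no connectivity or hierarchy is needed,
# which is the property the paper exploits.

def select_faces(faces, importance, k):
    """Return the k faces with the highest importance scores."""
    order = sorted(range(len(faces)), key=lambda i: -importance[i])
    return [faces[i] for i in order[:k]]

faces = ["f0", "f1", "f2", "f3"]          # stand-ins for triangles
scores = [0.1, 0.9, 0.4, 0.7]             # e.g., projected screen area
print(select_faces(faces, scores, 2))     # ['f1', 'f3']
```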
Article
Full-text available
This paper describes and analyzes a dithering technique for accurately specifying small values of opacity (α) that would normally not be possible because of the limited number of bits available in the alpha channel of graphics hardware. This dithering technique addresses problems related to compositing numerous low-opacity semitransparent polygons to create volumetric effects with graphics hardware. The paper also describes the causes of, and a possible solution to, artifacts that arise from parallel or distributed volume rendering using bricking on multiple GPUs.
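The idea can be sketched in a few lines: round a sub-quantum opacity up or down at random so that its expected value matches the true opacity; the function below is an illustrative stand-in, not the paper's exact scheme.

```python
import random

# Minimal sketch of alpha dithering: when the true opacity is smaller
# than the smallest representable alpha step (1/255 for an 8-bit
# channel), round it up or down at random so that its *expected* value
# matches the true opacity.

def dither_alpha(a, bits=8):
    levels = (1 << bits) - 1          # 255 for an 8-bit alpha channel
    scaled = a * levels               # e.g. 0.38 levels for a = 0.0015
    low = int(scaled)
    frac = scaled - low
    q = low + (1 if random.random() < frac else 0)
    return q / levels

# Averaged over many fragments the quantization bias vanishes:
n = 100000
avg = sum(dither_alpha(0.0015) for _ in range(n)) / n
print(avg)  # close to 0.0015, whereas plain rounding would give 0.0
```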
Article
Direct volume rendering based on projective methods works by projecting, in visibility order, the polyhedral cells of a mesh onto the image plane, and incrementally compositing the cell's color and opacity into the final image. Crucial to this method is the computation of a visibility ordering of the cells. If the mesh is "well-behaved" (acyclic and convex), then the MPVO method of Williams provides a very fast sorting algorithm; however, this method only computes an approximate ordering in general datasets, resulting in visual artifacts when rendered. A recent method of Silva et al. removed the assumption that the mesh is convex, by means of a sweep algorithm used in conjunction with the MPVO method; their algorithm is substantially faster than previous exact methods for general meshes. In this paper we propose a new technique, which we call BSP-XMPVO, which is based on a fast and simple way of using binary space partitions on the boundary elements of the mesh to augment the ordering produced by MPVO. Our results are shown to be orders of magnitude better than previous exact methods of sorting cells.
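A condensed sketch of the MPVO core that BSP-XMPVO augments: orient an edge across each shared face according to which cell must be drawn first, then topologically sort; the adjacency input and the behind() predicate are simplified placeholders, not the paper's data structures.

```python
from collections import deque

# For each interior face shared by two cells, orient an edge from the
# cell behind the face to the cell in front of it (relative to the
# viewer), then topologically sort (Kahn's algorithm).

def visibility_order(num_cells, shared_faces, behind):
    """shared_faces: list of (cell_a, cell_b); behind(a, b) returns True
    if cell_a must be drawn before cell_b (back-to-front)."""
    adj = [[] for _ in range(num_cells)]
    indeg = [0] * num_cells
    for a, b in shared_faces:
        if not behind(a, b):
            a, b = b, a
        adj[a].append(b)
        indeg[b] += 1
    queue = deque(i for i in range(num_cells) if indeg[i] == 0)
    order = []
    while queue:
        c = queue.popleft()
        order.append(c)
        for d in adj[c]:
            indeg[d] -= 1
            if indeg[d] == 0:
                queue.append(d)
    # len(order) < num_cells signals a visibility cycle (see the cyclic
    # meshes entry further below).
    return order

# Two cells sharing one face, cell 0 behind cell 1 from this viewpoint:
print(visibility_order(2, [(0, 1)], lambda a, b: a == 0))  # [0, 1]
```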
Article
This paper discusses issues with the limited precision of hardware texture-based volume visualization. We describe the compositing OVER operator and how fixed-point arithmetic affects it. We propose two techniques to improve the precision of fixed-point compositing and the accuracy of hardware-based volume visualization. The first technique is to perform dithering of color and alpha values. The second technique, which we call exponent-factoring, captures significantly more numeric resolution than dithering, but can only produce monochromatic images.
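A toy illustration of the precision problem (not the paper's exponent-factoring technique): if each blend truncates to 8 bits, which is one possible fixed-point rounding behavior, repeated low-alpha contributions can stall completely.

```python
# Back-to-front OVER compositing in 8-bit fixed-point arithmetic versus
# floating point, showing where precision is lost.

def over_trunc(dst, src, alpha):
    """8-bit OVER with truncation: all values in 0..255."""
    return (alpha * src + (255 - alpha) * dst) // 255

def over_float(dst, src, alpha):
    return alpha / 255.0 * src + (1.0 - alpha / 255.0) * dst

cf, ci = 0.0, 0
for _ in range(100):            # 100 faint white fragments, alpha = 1/255
    ci = over_trunc(ci, 255, 1)
    cf = over_float(cf, 255, 1)
print(ci, round(cf))            # ci stalls at 1, while cf reaches ~83
```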
Article
An efficient technique is described for rendering volume data which is not organized as a regular, three-dimensional grid. The algorithm works on collections of convex cells with shared faces. Datasets of this form are quite common in such applications as Computational Fluid Dynamics (CFD) and Finite Element Modeling (FEM).
Article
Pre-integrated volume rendering is an effective technique for generating high-quality visualizations. The pre-computed lookup tables used by this method are slow to compute and cannot include truly pre-integrated lighting due to space constraints. The lighting for pre-integrated rendering is therefore subject to the same sampling artifacts as in standard volume rendering. We propose methods to speed up lookup table generation and minimize lighting artifacts. The incremental subrange integration method we describe allows interactive lookup table generation in O(n²) time without the need for approximation or hardware assistance. The interpolated pre-integrated lighting algorithm eliminates discontinuities by linearly interpolating illumination along the view direction. Both methods are applicable to any pre-integrated rendering method, including cell projection, ray casting, and hardware-accelerated algorithms.
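For comparison, here is a sketch of the simpler approximate O(n²) table construction from O(n) prefix integrals, which neglects self-attenuation when integrating color; note that this is the common approximation that the paper's exact incremental subrange integration avoids, and all inputs are toy data.

```python
import math

# Build a 2D pre-integration table in O(n^2) from O(n) prefix integrals,
# ignoring attenuation within one segment when integrating color.

n = 64
d = 1.0                                   # fixed ray-segment length
tau = [0.5 + 0.5 * math.sin(6.28 * i / n) for i in range(n)]  # extinction
col = [i / n for i in range(n)]           # monochromatic emission

# Prefix integrals T[i] = integral of tau, C[i] = integral of col * tau.
T = [0.0] * (n + 1)
C = [0.0] * (n + 1)
for i in range(n):
    T[i + 1] = T[i] + tau[i] / n
    C[i + 1] = C[i] + col[i] * tau[i] / n

alpha_tab = [[0.0] * n for _ in range(n)]
color_tab = [[0.0] * n for _ in range(n)]
for sf in range(n):                       # front scalar value
    for sb in range(n):                   # back scalar value
        lo, hi = min(sf, sb), max(sf, sb) + 1
        span = (hi - lo) / n
        avg_tau = (T[hi] - T[lo]) / span  # average extinction on [sf, sb]
        avg_col = (C[hi] - C[lo]) / span  # average emission on [sf, sb]
        alpha_tab[sf][sb] = 1.0 - math.exp(-avg_tau * d)
        color_tab[sf][sb] = avg_col * d   # no self-attenuation

print(alpha_tab[0][n - 1], color_tab[0][n - 1])
```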
Conference Paper
We present the first algorithm that employs hardware-accelerated cell-projection for direct volume rendering of cyclic meshes, i.e., meshes with visibility cycles. The visibility sorting of a cyclic mesh is performed by an extended topological sorting, which computes and isolates visibility cycles. Measured sorting times are comparable to previously published algorithms, which are, however, restricted to acyclic meshes. In practice, our algorithm is also useful for acyclic meshes, as numerical instabilities can lead to false visibility cycles. Our method includes a simple, hardware-assisted algorithm based on image compositing that renders visibility cycles correctly. For tetrahedral meshes this algorithm allows us to render each tetrahedral cell (whether it is part of a cycle or not) by hardware-accelerated cell-projection. In its basic form our method applies only to convex cyclic meshes; however, we present an exact and a simpler but inexact extension of our method for nonconvex meshes.
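To sketch how visibility cycles can be computed and isolated, one standard option (not necessarily the paper's exact procedure) is Tarjan's strongly connected components on the same "drawn before" graph as in the visibility_order sketch above: each SCC of size greater than one is a visibility cycle, and the condensation of the graph is acyclic and can be sorted as usual.

```python
import sys

# Tarjan's strongly connected components on the "drawn before" graph.
# Each SCC of size > 1 is a visibility cycle needing special handling.

def tarjan_scc(num_cells, adj):
    sys.setrecursionlimit(max(10000, 2 * num_cells))
    index = [None] * num_cells
    low = [0] * num_cells
    on_stack = [False] * num_cells
    stack, sccs, counter = [], [], [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack[v] = True
        for w in adj[v]:
            if index[w] is None:
                visit(w)
                low[v] = min(low[v], low[w])
            elif on_stack[w]:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:       # v is the root of an SCC
            comp = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.append(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in range(num_cells):
        if index[v] is None:
            visit(v)
    return sccs   # SCCs are emitted in reverse topological order

adj = [[1], [2], [0], []]            # cells 0, 1, 2 form a cycle
print(tarjan_scc(4, adj))            # [[2, 1, 0], [3]]
```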