ISEA discrete global grids

Authors:
  • The University of Colorado Boulder

TOPICS IN INFORMATION VISUALIZATION
ISEA Discrete Global Grids
By Dan Carr, Ralph Kahn, Kevin Sahr,
and Tony Olsen
1. Introduction
This article describes a recently proposed standard,
ISEA discrete global grids, for gridding information on
the surface of the earth. The acronym ISEA stands for
icosahedral Snyder equal area. The grid cells not only
have equal areas, they are hexagons when projected
onto an icosahedron! Being an advocate of hexagon
binning, and corresponding graphics, my (Dan) enthu-
siasm is such that I want to call attention to this new
approach.
Jon Kimerling, a geosciences professor, won the Ore-
gon State University (OSU) Milton Harris Award for
his research on the topic. Since I was peripherally
involved in the development of the grids, I asked
Kevin, Ralph and Tony to help me with portions of
this article. They all interacted with Jon in the re-
search and we all shared the desire to promote ISEA
grids. Kevin, an admirer of Buckminster Fuller, de-
veloped many of the ISEA algorithms and graph-
ics. Along with OSU collaborators Mathew Gre-
gory and Larry Hughes, he developed a web site,
http://bufo.geo.orst.edu/tc/firma/gg/,
that contains foldable figures, descriptions and three
reference lists. At the Milton Harris Seminar, Ralph
provided an excellent overview on the relevance to
summarization and presentation of data from the Earth
Observing System (EOS). EOS is a series of NASA
satellites designed to detect and monitor global cli-
mate change, starting in summer 1998. Material from
Ralph's talk is available at the above web site and we
include portions here. Tony was the instigator of the
whole development with his push to develop a globally
consistent environmental sampling methodology. Tony
and his EMAP co-workers helped EPA Regions, states
and nations develop environmental sampling plans using
an EMAP grid that was developed in the early stages
of the research.
The structure of this article is as follows. Section 2
traces some of the history behind the ISEA grids. Section
3 describes the ISEA grid. Section 4 introduces
a potential application, storage of summaries derived
from Earth Observing System sensors. Section 5 dis-
cusses graphics for hexagon grids, Splus algorithms for
low resolution ISEA grid smoothing on the globe, and
a new resolution 9 (about 200,000 cells) binned map of
global elevation data. Section 6 closes with challenges
for future research.
2. The Recent Historical Development of
ISEA Grids
The impetus for the global grid system came from what
many would call an unusual perspective - survey sam-
pling. In 1989, Denis White and Scott Overton, a ge-
ographer and a statistician from Oregon State Univer-
sity, held a workshop in Corvallis, Oregon, to discuss
the geographic requirements for a general survey de-
sign. The survey design would be the foundation for all
surveys conducted as part of the Environmental Moni-
toring and Assessment Program (EMAP) (Messer et al.,
1991; Stevens, 1994). Scott Overton, leader of the sur-
vey design effort, recommended basing the design on
a systematic grid with a random start (Overton et al.,
1990). We all know how to accomplish that for pla-
nar surfaces but when the design must cover the United
States, we are faced with a non-planar surface - the
earth. Jon Kimerling, along with the other geographers
at the workshop, devised a discrete grid system that
satisfied the needs of EMAP at that time (White et al.,
1992). The system used a truncated icosahedron model
of the earth with a triangular point grid applied to the
large hexagon plates. It worked for the contiguous 48
states. However, the initial discrete grid system did not
solve all the underlying issues and the embedded tri-
angular grid structure had elements that were arbitrary.
As an example, the EMAP team also applied the grid
to China, Russia, and Indonesia. The team knew prob-
lems would exist for China and Russia, as a single large
hexagon plate would not cover either country. Although
small in area, Indonesia is stretched out and also is not
covered! The initial discrete grid system had problems
at the boundaries of the plates.
In 1993, Tony Olsen, faced with these inadequacies,
initiated a research effort with Jon Kimerling, Kevin
Sahr, and Denis White in the OSU Geosciences depart-
ment to investigate an alternative discrete global grid
system. Tony required the system to be truly global
and result in an equal area tessellation. He also had a
preference for compact areas, minimal shape distortion,
a triangular point grid, and a hierarchical grid structure
allowing multiple grid densities. These characteristics
would enable global implementation of survey designs
for continuous spatial populations (Stevens, 1997).
Vol.8 No.2/3 Statistical Computing & Statistical Graphics Newsletter 31
My (Dan) formal involvement did not start until Oregon
State researchers held a workshop on discrete global
grids at Santa Barbara in 1994. Others in attendance
were Denis White, Jon Kimerling, Michael Goodchild,
Waldo Tobler, Tony Olsen, Geoff Dutton, Frank Davis,
and David Mark. Many in the group had already de-
veloped their own approaches for global grids. Waldo
Tobler was already using his methodology to show
populations on the globe. Geoff Dutton had developed a
gridding system that modeled the earth as an octahe-
dron with an appropriate map projection. Kimerling
and White presented their icosahedral alternative to the
EMAP (truncated) icosahedron model. There are of
course additional approaches that work more directly
on the globe. (For a recent discussion of distributing
points on a sphere, see Saff and Kuijlaars 1997). All
methods must deal with the fact that there is no perfect
regular partition for the surface of a sphere. One mem-
ber noted that there is always at least one singularity,
as he humorously pointed to the bald spot on his head.
Michael Goodchild suggested that the meeting produce
a list of desirable properties for gridding systems. The
list appears below. Tony knew my objective when I pro-
posed cells being compact (having a small dimension-
less second central moment - see Conway and Sloane
1988). It was my attempt to promote hexagon cells.
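As a rough illustration of the compactness criterion, the
sketch below computes the dimensionless second central
moment of a square and of a regular hexagon. It assumes the
two-dimensional normalization, the polar second moment
about the centroid divided by twice the squared area. The
function names are illustrative only and do not come from
any software mentioned in this article.

```python
import math

def regular_polygon(n_sides, circumradius=1.0):
    """Vertices of a regular polygon centered at the origin, counterclockwise."""
    return [(circumradius * math.cos(2 * math.pi * k / n_sides),
             circumradius * math.sin(2 * math.pi * k / n_sides))
            for k in range(n_sides)]

def dimensionless_second_moment(vertices):
    """G = J / (2 A^2) for a polygon whose centroid is at the origin,
    where A is the area and J the polar second moment about the centroid."""
    area2 = 0.0   # accumulates twice the signed area
    polar = 0.0   # accumulates 12 times the polar second moment
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        polar += cross * (x0 * x0 + x0 * x1 + x1 * x1
                          + y0 * y0 + y0 * y1 + y1 * y1)
    area = area2 / 2.0
    j = polar / 12.0
    return j / (2.0 * area * area)

square = dimensionless_second_moment(regular_polygon(4))
hexagon = dimensionless_second_moment(regular_polygon(6))
print(f"square  {square:.6f}")
print(f"hexagon {hexagon:.6f}")
```

Under this measure a smaller value means a more compact
cell, so the hexagon should come out slightly ahead of the
square.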
At the Santa Barbara workshop Michael Goodchild
proposed a prioritized attribute list for a discrete
global grid system. The elements of the list are: the
domain is the globe (sphere, spheroid); areas
exhaustively cover the domain; areas are equal in size;
areas are compact; areas are equal in shape; areas have
the same number of edges; edges of areas are of equal
length; edges of areas are straight on some projection;
areas form a hierarchy preserving some properties for
$m < n$ areas; each area is associated with only one
point; points are maximally central within areas; points
are equidistant; points form a hierarchy preserving some
properties for $m < n$ points; addresses of points and
areas are regular and reflect other properties.
With methods and evaluation criteria at hand, the group
planned two sessions at the GIS/LIS94 meeting. To
have something to contribute, I, on the spur of the
moment, concocted a method based on projecting 3-D
lattice points near a sphere surface onto the surface. My
subsequent attempts with different lattices, packings,
and notions of "near" did not lead to hexagon patterns
over the whole sphere. The redeeming features of my
talk at GIS/LIS94 turned out to be the color anaglyph
stereo viewgraphs and brevity. The other presentations
carried the two sessions.
After GIS/LIS94, work proceeded on the icosahedron
model (see Kimerling, Sahr, Song, White, and Iltis,
1995). I called the research to the attention of Ralph
Kahn (NASA-JPL) who was looking for better ways to
summarize the global data expected from EOS. Tony
sought to involve Noel Cressie for dealing with spatial
estimation issues.
Jon Kimerling subsequently won the Milton Harris
Award. In May 1997, he held the Milton Harris Award
Symposium on Global Grids: New Approaches to
Global Data Analysis. In addition to presentations by
team members Kevin Sahr and Denis White, he invited
Ralph Kahn (NASA-JPL), Noel Cressie (Iowa State
University), Ross Kiester (USDA-Forest Science Lab-
oratory), Tony Olsen (USEPA-Corvallis) and myself to
make presentations. The following sections cover
selected topics from the Symposium and Kevin's web site.
3. Icosahedral Snyder Equal Area
(ISEA) Grids
The S in ISEA refers to John P. Snyder. He came out of
retirement specifically to address projection problems
with the original EMAP grid (see Snyder, 1992). He
developed the equal area projection that underlies the
gridding system. His work at the U.S. Geological Sur-
vey on map projections is known by all who spend any
time with map projections. John Snyder died this year.
By all reports, he was a modest man who would not
seek to have procedures named after him. Nonetheless,
in honor of his contributions to the field of map
projections, those developing the gridding system have
desired to use his name.
ISEA grids are simple in concept. Begin with a Snyder
Equal Area projection to a regular icosahedron (see the
stereo pairs in Figure 1) inscribed in a sphere. In each
of the 20 equilateral triangle faces of the icosahedron
inscribe a hexagon by dividing each triangle edge into
thirds (see the large gray hexagon in Figure 2). Then
project the hexagon back onto the sphere using the In-
verse Snyder Icosahedral equal area projection. This
yields a coarse-resolution equal area grid called the res-
olution 1 grid. It consists of 20 hexagons on the surface
of the sphere and 12 pentagons centered on the 12 ver-
tices of the icosahedron.
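The counts quoted for the resolution 1 grid can be checked
against the icosahedron itself. The sketch below builds a
regular icosahedron from the standard golden-ratio
coordinates and counts its vertices and faces. It ignores
the orientation chosen for the ISEA grid and does not
implement the Snyder projection; the function names are
illustrative.

```python
import itertools
import math

def icosahedron_vertices():
    """The 12 vertices of a regular icosahedron, scaled onto the unit sphere."""
    phi = (1 + math.sqrt(5)) / 2
    raw = []
    for a, b in itertools.product((-1, 1), repeat=2):
        raw += [(0, a, b * phi), (a, b * phi, 0), (b * phi, 0, a)]
    norm = math.sqrt(1 + phi * phi)
    return [(x / norm, y / norm, z / norm) for x, y, z in raw]

def icosahedron_faces(vertices):
    """Index triples of vertices that are mutually one edge length apart."""
    def dist(i, j):
        return math.dist(vertices[i], vertices[j])

    edge = min(dist(0, j) for j in range(1, len(vertices)))

    def adjacent(i, j):
        return abs(dist(i, j) - edge) < 1e-9

    return [t for t in itertools.combinations(range(len(vertices)), 3)
            if adjacent(t[0], t[1]) and adjacent(t[1], t[2])
            and adjacent(t[0], t[2])]

verts = icosahedron_vertices()
faces = icosahedron_faces(verts)
print(len(verts), "vertices,", len(faces), "faces")
# One hexagon per face plus one pentagon per vertex.
print("resolution 1 cells:", len(faces) + len(verts))
```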
Figure 1. Stereo pairs of a regular icosahedron.
To form higher resolution grids, tessellate each
equilateral triangle in the planar view with more hexagons
and use the inverse projection back to the sphere. The
details of the regular tessellation are as follows:
Always center a hexagon about the center point of the
equilateral triangle. For odd resolution grids, orient the
hexagon so its base is parallel to the base of the
triangle. For even resolution grids, orient the hexagon so a
vertex points at the base of the triangle. (Figure 2 shows
the central hexagons for resolutions 1 and 2 in gray and
black, respectively.) Select the edge length of a
resolution $n+1$ hexagon so it is $1/\sqrt{3}$ times the edge
length of a resolution $n$ hexagon. Thus, the area of a
hexagon reduces by a factor of 3 with each increase in
resolution. As the resolution increases by 1, the
tessellation procedure produces a hexagon centered on each
hexagon vertex and center point of the lower resolution
tessellation.
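The refinement rule can be tried out in the plane. The
sketch below takes a single hexagon, places next-resolution
centers at its center and at its six vertices, and shrinks
the edge length by a factor of the square root of 3. It
works only in the planar view of one cell and the function
names are illustrative.

```python
import math

def hexagon_vertices(cx, cy, edge, rotation):
    """Six vertices of a regular hexagon; rotation (radians) sets the first one."""
    return [(cx + edge * math.cos(rotation + k * math.pi / 3),
             cy + edge * math.sin(rotation + k * math.pi / 3))
            for k in range(6)]

def hexagon_area(edge):
    return 3 * math.sqrt(3) / 2 * edge ** 2

def refine(cx, cy, edge):
    """Centers and edge length of the next-resolution hexagons tied to one
    cell: one at the parent's center and one at each parent vertex."""
    child_edge = edge / math.sqrt(3)
    centers = [(cx, cy)] + hexagon_vertices(cx, cy, edge, 0.0)
    return centers, child_edge

centers, child_edge = refine(0.0, 0.0, 1.0)
area_ratio = hexagon_area(1.0) / hexagon_area(child_edge)
# The central child plus a one-third share of each of the six vertex
# children should account for the parent's area.
covered = (1 + 6 / 3) * hexagon_area(child_edge) / hexagon_area(1.0)
print(len(centers), "child centers, child edge", round(child_edge, 4))
print("area ratio", round(area_ratio, 6), "covered fraction", round(covered, 6))
```

The spacing between the central child and each vertex
child equals the parent edge length, which is the center
spacing expected for hexagons of the smaller edge length.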
As illustrated in Figure 2, the procedure partitions a
lower resolution hexagon cell into one central cell and
six fractional (1/3) cells. This is not as simple as parti-
tioning a large square into exactly four smaller squares.
While the merits of strictly nesting cells within cells
depend on the context, one clear merit is aggregation
simplicity. The ISEA fractional cells create aggregation and
disaggregation problems that are currently under inves-
tigation.
The orientation of the icosahedron relative to the globe
is an important consideration. The selected orientation
for the ISEA grid creates symmetry about the equa-
tor. This is desirable for numerical modeling purposes.
There are always 12 pentagon cells about the vertices of
the icosahedron. The selected orientation places 11 of
the pentagon cells over water areas, so that most land
mass views will be completely composed of hexagons.
Table 1 (taken from Kevin's web site)
gives the number of cells and characteristic hexagon
edge lengths for ISEA grids of increasing resolution.
The advantages of the ISEA grids are (1) they have ir-
regularities (12 pentagon cells) that are minor nuisances
rather than being pathological singularities, (2) they are
suitable for modeling on all parts of the globe includ-
ing the poles, (3) they preserve symmetry about the
equator, (4) they provide an infinite nesting of equal-
area sub-grids, and (5) they provide a basis for uni-
form global density of sampling for data at all spatial
resolutions. The grid facilitates comparisons between
high and low latitude data and high and low spatial-
resolution data. The grid also improves the isotropy
of finite-difference quantities compared to those
calculated for rectangular grid schemes. For example,
Frisch, Hasslacher and Pomeau (1986) note that two-
dimensional Navier-Stokes implementations are opti-
mal with hexagons. Finally, no ambiguity exists about
nearest neighbors as all nearest neighbor cells share an
edge with a reference cell and their distances to the cen-
ter of a reference cell are nearly equal.
4. EOS and the Potential Application of
ISEA Grids
There are many potential applications of ISEA grids.
We are particularly mindful of NASA's Earth Observing
System and the wealth of global earth science data
that it will collect. The EOS AM-1 Platform is sched-
uled for launch in June 1998. The summarization of this
data provides a rapidly approaching opportunity to use
ISEA grids.
Resolution   Number of Cells   Length Scale (km)
         1                32          4,684.2571
         2                92          2,694.2932
         3               272          1,553.6212
         4               812            896.6139
         5             2,432            517.5892
         6             7,292            298.8166
         7            21,872            172.5192
         8            65,612             99.6035
         9           196,832             57.5060
        10           590,492             33.2011
        11         1,771,472             19.1687
        12         5,314,412             11.0670
        13        15,943,232              6.3896
        14        47,829,692              3.6890
        15       143,489,072              2.1299
        16       430,467,212              1.2297
        17     1,291,401,632              0.7100
        18     3,874,204,892              0.4099
Table 1. The number of cells and the characteristic
hexagon edge lengths for ISEA grids of increasing res-
olution.
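The cell counts in Table 1 appear to follow a simple closed
form, 10 times 3 to the power of the resolution, plus 2.
This formula is inferred from the tabulated values rather
than stated in the text, so the sketch below only checks it
against a few rows of the table.

```python
def isea_cell_count(resolution):
    """Cell count inferred from Table 1: 10 * 3**n + 2 (12 pentagons included)."""
    return 10 * 3 ** resolution + 2

# A few values transcribed from Table 1.
table_1 = {1: 32, 2: 92, 3: 272, 5: 2432, 9: 196832, 18: 3874204892}

for res, tabulated in table_1.items():
    print(res, isea_cell_count(res), isea_cell_count(res) == tabulated)
```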
More specifically, ISEA grids are relevant to Level 3
Products in the EOS Data Product Classification. Level
1 Products involve raw radiances with geometric and
radiometric calibration. Level 2 Products are geophysi-
cal parameters at the highest resolution available. These
data sets preserve the non-uniform spatial and temporal
sampling of the satellite instruments. Level 3 products
are globally and temporally uniform data sets. Level
3 products are needed where large-scale, uniform cov-
erage is required (e.g., global-scale budgets, and prob-
lems that depend on data sets from multiple sources).
Various tradeoffs will drive the selection of spatial and
temporal scales chosen for Level 3 standard products so
a multiple-resolution equal-area global grid system is
immediately relevant.
Figure 2. Subdividing the faces of a regular
icosahedron: Gray and black regions represent the central
hexagons for resolutions 1 and 2, respectively.
The massive amount of data and the resolution issues
drive the need for professional algorithms. For example,
one instrument on the platform (MISR) will help
characterize, on a global basis, atmospheric aerosol type and
optical depth, surface bi-directional reflectance
properties, and cloud properties. The amount of data to be
collected from this one sensor is enormous. With a
spatial resolution of 16 values per km$^2$ and 36
channels, a global description will involve about
$3 \times 10^{11}$ basic measurements. The MISR collection
rates will be 40 Gbytes/day of raw data, 300 Gbytes/day
total data, and 15-100 Tbytes/yr for at least 5 years. The
computing tools developed for the graphics in this article
will not handle such data.
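The size of one global description follows from the quoted
resolution and channel count once a surface area is
assumed. The sketch below uses an approximate surface area
for the earth and decimal units for the data rates; both
are assumptions made here, not values from the text.

```python
EARTH_SURFACE_KM2 = 5.1e8   # assumed approximate surface area of the earth
VALUES_PER_KM2 = 16         # spatial resolution quoted in the text
CHANNELS = 36               # channel count quoted in the text

measurements = EARTH_SURFACE_KM2 * VALUES_PER_KM2 * CHANNELS
print(f"basic measurements for one global description: {measurements:.2e}")

# Yearly volume implied by the quoted daily collection rates.
GBYTES_PER_TBYTE = 1000     # assumed decimal units
yearly = {}
for label, gbytes_per_day in (("raw", 40), ("total", 300)):
    yearly[label] = gbytes_per_day * 365 / GBYTES_PER_TBYTE
    print(f"{label}: about {yearly[label]:.1f} Tbytes/yr")
```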
Of course there is the old alternative to handle Level 3
gridding, equal angle grids. The equal angle grid re-
lies on the global latitude-longitude system and uses
a cylindrical map projection. It typically has a
spatial resolution of $1^\circ$ (about 112 km) and sub-grids
based on equal-angle divisions at $0.5^\circ$ (about 56 km)
and $0.25^\circ$ (about 28 km). The advantages of the equal
angle grid are
that the latitude-longitude system is convenient, famil-
iar, and entrenched. Also very important is the fact that
the results are easy to represent in 2-D arrays. How-
ever, the equal angle approach leads to several issues
such as rapidly changing spatial resolution at high
latitudes, non-uniform resolution for fine scales, ambiguity
of nearest neighbor operations and problems in
representing data at multiple scales. The current solutions to
the multiple scale problem are discipline-specific
variations, for example, specialized grids for polar and for
local high-resolution applications. The ISEA approach,
among other things, would provide compatible grids
across disciplines.
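The change of resolution with latitude on an equal angle
grid is easy to quantify. The sketch below assumes a
spherical earth with a mean radius of 6371 km, which gives
a one degree cell height of roughly 111 km, close to the
figure quoted above. The function name is illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical earth

def equal_angle_cell(lat_deg, step_deg=1.0):
    """East-west width and north-south height (km) and area (km^2) of an
    equal-angle cell whose southern edge is at lat_deg."""
    step = math.radians(step_deg)
    south = math.radians(lat_deg)
    north = south + step
    height = EARTH_RADIUS_KM * step
    width = EARTH_RADIUS_KM * step * math.cos(south)
    area = EARTH_RADIUS_KM ** 2 * step * (math.sin(north) - math.sin(south))
    return width, height, area

for lat in (0, 30, 60, 80, 89):
    width, height, area = equal_angle_cell(lat)
    print(f"lat {lat:2d}: width {width:6.1f} km, height {height:5.1f} km, "
          f"area {area:8.1f} km^2")
```

The cell height stays fixed while the width and area
shrink toward the pole, which is the non-uniformity that
an equal area grid avoids.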
Those seeking additional information on alternative
grids and EOS sensors can access Ralph's descriptions at
http://bufo.geo.orst.edu/tc/firma/gg/kahntoc.html.
Of particular interest is an example that shows the huge
discrepancies that can result from changing from one grid
to another and back.
Those seeking more information on Level 2 products
or discussion of problems in validating satellite derived
parameters can start with Kahn et al. (1991).
5. Graphics for Hexagon Cells,
Global Binning and Foldable Figures
Many graphics are available for hexagon cells. Some
of these graphics involve spatial smoothing. The figure
on page 35 (adapted from Yang and Carr, 1995) shows
a breeding bird diversity map based on smoothing to
the previous EMAP grid. The brute force smoothing
of ten year prevalence data for 615 bird species to the
13000 cell grid involved close to 8 million local logis-
tic regressions! Soon after an article on mortality map
smoothing (Carr and Pickle, 1993), Andrew Carr and I
created a point and click Splus function (UNIX only)
for selecting U.S. cancer mortality rates and smoothing
the rates to hexagon grids. The smoother in that
context was loess. This collection of functions is
available as an Splus data.dump file, nchs.dmp, by
anonymous ftp to galaxy.gmu.edu. It is located in
pub/dcarr/newsletter/nchs. While hexagon
cell maps are relatively uncommon, the general notion
of choropleth maps is, of course, not new.
There are several sources for innovative hexagon graph-
ics. Carr et al. (1987), and Carr (1991) present various
density representations and a practical bivariate
generalization of box plots. Kevin and Ross Kiester
(personal communication) have developed an approach for
showing the change from cell to cell by coloring trian-
gles within the hexagons. Papers of Carr (1989), Carr
(1991), and Carr, Olsen and White (1992) address
symbol congestion control with the first showing a stereo
regression diagnostic and the last two papers focusing
attention on maps. The idea is to partition the map (or
plot) using hexagon cells and provide symbols to repre-
sent the summary for each cell with data. For example,
the angle of a ray glyph can represent a continuous vari-
able, such as a trend estimated from a time series. The
rays can point down (below horizontal) for small values
or negative trends and up for large values or positive
trends. The rays can plot on top of confidence arcs that
represent associated confidence bounds. Two rays with
common origin, one pointing to the left and one to the
right can easily represent two continuous variables on a
map.
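A ray glyph of this kind reduces to a mapping from a value
to an angle. The sketch below is one possible mapping, not
the one used in the Splus functions: the range of 60 degrees
above and below horizontal and the function name are
choices made here for illustration.

```python
import math

def ray_glyph(x, y, value, lo, hi, length=1.0, max_angle_deg=60.0):
    """End point of a ray starting at (x, y) whose angle encodes value.
    Values at lo point max_angle_deg below horizontal and values at hi
    the same amount above; values outside [lo, hi] are clipped."""
    frac = (min(max(value, lo), hi) - lo) / (hi - lo)
    angle = math.radians((2.0 * frac - 1.0) * max_angle_deg)
    return x + length * math.cos(angle), y + length * math.sin(angle)

for trend in (-2.0, 0.0, 2.0):
    print(trend, ray_glyph(0.0, 0.0, trend, lo=-2.0, hi=2.0))
```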
Splus derivatives of my 2-D lattice functions now facil-
itate hexagon binning, gray level erosion, smoothing,
hexagon plotting and ray plotting. Familiarity and con-
venience suggested following the conventions in this
software when developing binning, smoothing and dis-
play procedures for global grids. The result is a set of
closely related Splus functions for low resolution grids.
Figure 3. A flattened icosahedron foldable figure for
resolution three.
The task addressed here is binning of global ETOPO
5-minute elevation data into an ISEA resolution 9 grid.
Conceptually, the computation of a cell id for a
latitude-longitude pair involves four steps. First, the Snyder
Equal Area projection produces coordinates in one of
the 20 triangles of the icosahedron. Second, an affine
transformation (one for each of the twenty triangles)
maps the coordinates into a flattened icosahedron
foldable figure as shown for resolution three in Figure 3.
Next, a hexagon index routine (like xy2cell in Splus)
produces a planar cell id. The last step uses a look-up
table (a vector) to convert the planar cell id into a globe
cell id. (Globe cell ids are integers ranging from 1 to the
number of globe cells.) Given globe cell ids,
computation of summary statistics for data falling in the cells
is straightforward. The work was in generating the
planar cell id to globe cell id conversion vector. The
re-indexing omits unused hexagons that cover the Figure 3
rectangle. The re-indexing also accounts for split
foldable figure planar cells (for example those containing
the five left triangle tips) that are really parts of the
same cell on the globe.
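The last two steps, planar hexagon indexing followed by a
table look-up, can be sketched as follows. The projection
and affine steps are omitted. The hexagon routine is a
generic cube-rounding method standing in for xy2cell, and
the nine-entry look-up vector is a made-up example, not
part of the actual re-indexing vector.

```python
import math

def xy_to_hex(x, y, edge):
    """Axial (q, r) index of the pointy-top hexagon containing (x, y).
    Generic cube-rounding routine, a stand-in for a routine such as xy2cell."""
    q = (math.sqrt(3) / 3 * x - y / 3) / edge
    r = (2 / 3 * y) / edge
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Recompute the coordinate with the largest rounding error.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr

def planar_cell_id(q, r, q_min, r_min, n_q):
    """Row-major integer id (starting at 1) within a bounding block of cells."""
    return (r - r_min) * n_q + (q - q_min) + 1

# Hypothetical look-up vector: entry k-1 holds the globe cell id for planar
# cell k, with 0 marking unused planar cells. Two planar cells that map to
# the same globe id represent one globe cell split in the planar view.
planar_to_globe = [0, 1, 2, 3, 4, 1, 5, 6, 0]

def globe_cell_id(x, y, edge=1.0, q_min=-1, r_min=-1, n_q=3):
    q, r = xy_to_hex(x, y, edge)
    return planar_to_globe[planar_cell_id(q, r, q_min, r_min, n_q) - 1]

print(globe_cell_id(0.0, 0.0))            # center cell of the 3 x 3 block
print(globe_cell_id(math.sqrt(3), 0.0))   # neighbor to the right
```

Keeping the look-up as a plain vector means the split and
unused cells cost nothing at binning time; all of the
bookkeeping is paid once when the vector is built.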
Rather than reading the large ETOPO file into Splus, I
modified one of Kevin's programs. The program imports
the resolution 9 re-indexing vector generated in
Splus and then bins on the fly. After reading the binned
results into Splus, I used three additional bookkeeping
vectors to compute hexagon boundaries and colors for
all cells (196832 globe cells and the 832 cloned cells)
in the foldable figure. The figure on page 36 shows
the average elevation for each cell. Jon Kimerling
suggested the basic elevation and depth coloring scheme.
A further refinement requiring additional data would be
to distinguish land hexagons that are slightly below sea
level from ocean floor hexagons. I had problems
producing the whole postscript file for the figure on page
36 so I wrote out pieces and connected them using Unix
tools. A procedure that reads a value for a location and
writes a hexagon directly to a file would be better for
graphics output.
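Binning on the fly needs only a running sum and count per
globe cell. The sketch below illustrates the idea with a
small made-up stream of cell ids and elevations; it is not
Kevin's program, and the class name is hypothetical.

```python
class CellMeans:
    """Running sum and count per globe cell, so data can be binned as read."""

    def __init__(self, n_cells):
        self.total = [0.0] * (n_cells + 1)   # index 0 unused; ids start at 1
        self.count = [0] * (n_cells + 1)

    def add(self, cell_id, value):
        self.total[cell_id] += value
        self.count[cell_id] += 1

    def means(self):
        """Mean per cell; None for cells that received no data."""
        return [t / c if c else None
                for t, c in zip(self.total[1:], self.count[1:])]

# Hypothetical stream of (globe cell id, elevation in meters) pairs.
stream = [(1, 120.0), (3, -4200.0), (1, 80.0), (3, -3800.0), (2, 15.0)]
bins = CellMeans(n_cells=4)
for cell_id, elevation in stream:
    bins.add(cell_id, elevation)
print(bins.means())
```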
My Splus routines for odd resolution ISEA grids are
available via anonymous ftp to
galaxy.gmu.edu.
Change directory to
pub/dcarr/newsletter/isea.
There is a README document describing the vari-
ous functions. For example, one function produces a
globe cell near neighbor pointer matrix (for low
resolutions). Another function uses this matrix for
smoothing values on the globe. (More sophisticated smoothers
could restrict domains to land masses, oceans,
land-ocean boundaries, or address flow constraints.) A script
file shows the process of producing a foldable
icosahedron. The script starts by randomly generating a
vector of 2432 values that implicitly correspond to globe
cells in a resolution 5 grid. After smoothing the values
and converting them into colors for hexagons, the script
plots the hexagons along with tabs for gluing. Creasing
along the lines shown in Figure 3 helps in the
construction. I have made several figures for the holiday season.
Postscript files for different examples and sizes are in
the above directory. Kevin's web site contains more
examples including one of my favorites. The favorite is an
amazing gift from the antiquity of basketry, a six great
circle weave.
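Smoothing with a near neighbor pointer structure can be
sketched in a few lines. The example below is not the
Splus function. It assumes an equal weighting between a
cell and the mean of its neighbors, and it uses a toy ring
of eight cells in place of a real globe neighbor matrix,
where each cell would list six neighbors (five for the
pentagons).

```python
import random

def smooth(values, neighbors, center_weight=0.5):
    """One pass of near neighbor smoothing. neighbors[i] lists the indices
    of the cells that share an edge with cell i."""
    smoothed = []
    for i, value in enumerate(values):
        nbr_mean = sum(values[j] for j in neighbors[i]) / len(neighbors[i])
        smoothed.append(center_weight * value + (1 - center_weight) * nbr_mean)
    return smoothed

# Toy stand-in for the pointer matrix: a ring in which every cell has two
# neighbors.
n_cells = 8
ring = [[(i - 1) % n_cells, (i + 1) % n_cells] for i in range(n_cells)]

random.seed(0)
values = [random.random() for _ in range(n_cells)]
smoothed = smooth(values, ring)
print([round(v, 3) for v in values])
print([round(v, 3) for v in smoothed])
```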
For stereo presentations on a globe, a simple approach
partitions each hexagon into six triangles. The plotting
step then renders triangles whose vertices result from
the inverse Snyder equal area projection.
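The triangle partition itself is simple. In the sketch
below, the optional to_sphere argument is a hypothetical
placeholder for an inverse projection routine, which is not
implemented here; by default the points are left in the
plane.

```python
def hexagon_to_triangles(center, vertices, to_sphere=lambda point: point):
    """Split a hexagon (center plus six ordered vertices) into six triangles
    that share the center, passing every point through to_sphere."""
    c = to_sphere(center)
    pts = [to_sphere(v) for v in vertices]
    n = len(pts)
    return [(c, pts[k], pts[(k + 1) % n]) for k in range(n)]

# A planar hexagon with unit edge length, listed counterclockwise.
half, h = 0.5, 3 ** 0.5 / 2
hexagon = [(1.0, 0.0), (half, h), (-half, h),
           (-1.0, 0.0), (-half, -h), (half, -h)]
triangles = hexagon_to_triangles((0.0, 0.0), hexagon)
print(len(triangles), "triangles; first:", triangles[0])
```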
6. Additional Challenges and Closing
Remarks
This article describes a 1-D indexing system that is vi-
able for modest odd resolution grids. The basic index-
ing is for hexagon cells that cover a rectangle bound-
ing the planar icosahedron view. A re-indexing vector,
whose length is the number of covering cells, removes
the unused and redundant indices. After binning with
the new indices, pre-computed $x$ and $y$ vectors provide
plotting positions for the planar icosahedron view. The
binned results correspond to cells on the globe so a short
subscript vector extracts values corresponding to split
cells in the planar view. The result of concatenating the
two vectors corresponds to the planar $x$ and $y$
coordinates. No doubt a similar approach will work for even
resolution grids but the bookkeeping to handle unused
and split cells will require some work.
When the grid involves billions of cells, the indexing
based on the rectangle bounding the foldable
icosahedron planar view may be too wasteful. A first
challenge is to develop a more efficient indexing system.
Quite possibly this will just cover the twenty triangles
with hexagons and handle the cells that cross the edges
of touching triangles. A second challenge is to move
from a demonstration system to professional quality al-
gorithms for high resolution grids.
There are many tasks to be addressed for a collection of
algorithms to be complete. Tony is interested in index-
ing optimized for subsets of the globe such as the con-
tinental U.S. Perhaps the most crucial task is to provide
fast, conceptually acceptable algorithms for changing
resolutions. As indicated earlier, lack of strictly nested
cells at different resolutions poses a problem. The equal
area projection approach easily adapts to strictly nested
triangles, but that would give up some of the merits of
hexagon cells.
A second challenge area is to consider the use of spa-
tial models in producing cell summaries. For exam-
ple, Ralph has noted that the current procedure for pro-
ducing pixel values for satellite images involves a sim-
ple near neighbor averaging process. Noel Cressie ad-
dressed some of the spatial modeling possibilities in his
talk at the Harris Seminar.
Assuming the computation issues are solved, we will
then face the biggest challenge of all, institutional iner-
tia. Proposing a standard is one thing. Getting scientists
in different nations and different disciplines to use it is
another.
Acknowledgments
Research related to this article was supported by EPA
under cooperative agreement No. CR820820-01-0. The
article has not been subjected to the review of the EPA and
thus does not necessarily reflect the view of the agency
and no official endorsement should be inferred.
References
Carr, D. B. (1989), Discussion of Regression Diag-
nostics with Dynamic Graphics, Technometrics, 31(3),
293-296.
Carr, D. B. (1991), Looking at Large Data Sets Using
Binned Data Plots, Computing and Graphics in
Statistics, eds. A. Buja and P. Tukey, Springer-Verlag, New
York, New York, 7-39.
Carr, D. B., Littlefield, R. J., Nicholson, W. L. and
Littlefield, J. S. (1987), Scatterplot Matrix Techniques
For Large N, Journal of the American Statistical As-
sociation, 82(398), 424-436.
Carr, D. B., Olsen, A. R., and White, D. (1992),
Hexagon Mosaic Maps for Display of Univariate and
Bivariate Geographical Data, Cartography and Geo-
graphic Information Systems, 19(4), 228-236, 271.
Carr, D. B. and Pickle, L. W. (1993), Plot Pro-
duction Issues and Details: Smoothed Cancer Rates
and Hexagon Mosaic Maps, Statistical Computing &
Graphics Newsletter, 4(2), 16-20.
Conway, J. H. and Sloane, N. J. A. (1988), Cover-
ings, Lattices, and Quantizers, Sphere Packings, Lat-
tices and Groups, New York. Springer-Verlag, 56-62.
Frisch, U., Hasslacher, B. and Pomeau, Y. (1986),
Lattice-Gas Automata for the Navier-Stokes
Equation, Physical Review Letters, 56(14), 1505-1508.
Kahn, R., Haskins, R. D., Knighton, J. E., Pursch, A.
and Granger-Gallegos, S. (1991), Validating a Large
Geophysical Data Set: Experiences with Satellite-
Derived Cloud Parameters, Computing Science and
Statistics, Proceedings of the 23rd Symposium on the
Interface, Interface Foundation of North America, Fair-
fax Station, VA, 133-140.
Messer, J. J., Linthurst, R. A., and Overton, W. S.
(1991), An EPA Program for Monitoring Ecological
Status and Trends, Environmental Monitoring and As-
sessment, 17, 67-78.
Overton, W. S., Stevens, D. L. and White, D. (1990),
Design Report for EMAP, Environmental Monitoring
and Assessment Program, EPA 600/3-91/053, U.S.
Environmental Protection Agency, Office of Research and
Development, Washington, D.C.
Saff, E. B. and Kuijlaars, A. B. J. (1997), Distributing
Many Points on a Sphere, Mathematical Intelligencer,
19(1), 5-11.
Snyder, J. P. (1992), An Equal-Area Map Projection
for Polyhedral Globes. Cartographica, 29(1), 10-21.
Stevens, D. L., Jr. (1994), Implementation of a Na-
tional Environmental Monitoring Program, Journal of
Environmental Management, 42, 1-29.
Stevens, D. L., Jr. (1997), Variable Density Grid-
Based Sampling Designs for Continuous Spatial Pop-
ulations, Environmetrics, 8, 167-95.
White, D., Kimerling, A. J., and Overton, W. S. (1992),
Cartographic and Geometric Components of a Global
Sampling Design for Environmental Monitoring, Car-
tography and Geographic Information Systems, 19(1),
5-22.
Yang, K. S., Carr, D. B. and O'Connor, R. J. (1995),
Smoothing of Breeding Bird Survey Data to Produce
National Biodiversity Estimates, Computing Science
and Statistics, Proceedings of the 27th Symposium on the
Interface, Vol. 27, M. M. Meyer and J.L. Rosenberger
(eds.), Interface Foundation of North America, Fairfax
Station, VA, 405-409.
Dan Carr
Institute for Computational Sciences
and Informatics
George Mason University
dcarr@voxel.galaxy.gmu.edu
Ralph Kahn
Jet Propulsion Laboratory
ralph.kahn@jpl.nasa.go
Kevin Sahr
University of Oregon
sahrk@cs.uoregon.edu
Tony Olsen
EPA National Health and Environmental
Effects Research Laboratory
tolsen@mail.cor.epa.gov
Vol.8 No.2/3 Statistical Computing & Statistical Graphics Newsletter 39