Modeling educational software for people with disabilities: theory and practice



Nelson Baloian, Wolfram Luther
Institut für Informatik und Interaktive Systeme
Gerhard-Mercator-Universität Duisburg
Lotharstraße 65, D 47048 Duisburg
{ baloian,luther } @
Jaime Sánchez
Department of Computer Science
University of Chile
Blanco Encalada 2120, Santiago
j sanchez @ dcc
Interactive multimedia learning systems are not suitable for
people with disabilities. They tend to propose interfaces
which are not accessible for learners with vision or auditory
disabilities. Modeling techniques are necessary to map real
world experiences to virtual worlds by using 3D auditory
representations of objects for blind people and visual repre-
sentations for deaf people. In this paper we describe com-
mon aspects and differences in the process of modeling the
real world for applications involving tests and evaluations
of cognitive tasks with people with reduced visual or audi-
tory cues. To validate our concepts, we examine two exist-
ing systems using them as examples: AudioDoom and
Whisper. AudioDoom allows blind children to explore and
interact with virtual worlds created with spatial sound.
Whisper implements a workplace to help people with im-
paired auditory abilities to recognize speech errors. The
new common model considers not only the representation
of the real world as proposed by the system but also the
modeling of the learner's knowledge about the virtual
world. This can be used by the tutoring system to enable the
learner to receive relevant feedback. Finally, we analyze the
most important characteristics in developing systems by
comparing and evaluating them and proposing some rec-
ommendations and guidelines.
Modeling methodologies, Tutoring systems, Sensory dis-
abilities, User adapted interfaces
A large number of educational systems for supporting peo-
ple with disabilities have already been developed. While
most systems targeting the hearing-impaired are oriented
toward training people by developing the necessary skills to
overcome their disabilities, a large proportion of the systems
for blind people aim to considerably increase their
access to current computing resources which are based on
graphic user interfaces, such as games and web navigation.
At first sight, the task of developing computer systems for
visually impaired and hearing-impaired people may be
considered to be of a very different nature because the
restrictions in both cases are opposite. In the first case, the
problem is to construct interfaces which do not rely on graphics.
In the second, on the other hand, graphics are the main - if
not the only - way for the system to communicate with the
user. However, we found a great number of similarities in
the procedure of designing and constructing computer sys-
tems for people with different kinds of disabilities. This is
especially true in the case of developing cognitive systems
the aim of which is the modeling and implementation of
various aspects of the real world by a computer system.
This paper proposes an integrated model for developing
learning systems for people with different kinds of disabili-
ties, which consists of a series of steps and recommenda-
tions to be followed; it considers the common aspects and
outlines the differences. Special attention is paid to the
feedback issue, which seems to be a critical point in exist-
ing systems. To be able to give proper feedback the system
needs to include an intelligent tutor module concerned with
the interaction of the student with the implemented model
of the real world. Problems in constructing learning systems
for people with disabilities arise when the model is to be
presented to the user who is supposed to interact with it. In
systems for people without disabilities this model is trans-
mitted to the learner by using graphics (with or without
animation), sounds and text, taking advantage of the whole
spectrum of the computer's multimedia capabilities. For
people with disabilities this spectrum is limited according to
the type of disability they have. This fact forces system
designers to project all the information the model has to
give to or receive from the student onto the available channel:
the auditory channel for the visually impaired and the visual
channel for the hearing-impaired. Additionally, non-traditional
interaction modes, for example haptic devices, can be used.
The same considerations are valid for the construction of the
learner's model. The common model highlights the importance
of an automatic and accurate measurement of the differences
between the system's model and
the one the student has developed in his mind.
To validate the proposed development model, we will test it
on two existing systems, one designed for blind and the
other for hearing-impaired people: AudioDoom and Whisper,
respectively. AudioDoom allows blind children to
explore and interact with virtual worlds created with spatial
sound. It was inspired by traditional Doom games, where
the player has to move inside a maze discovering the envi-
ronment and solving problems posed by objects and entities
that inhabit the virtual world. In doing so, it emphasizes
sound navigation throughout virtual spaces in order to de-
velop cognitive tasks to enhance spatial orientation skills in
blind children [7, 8, 15, 16, 19, 20].
AudioDoom has been tested with more than forty Chilean
children aged 7-12 in a blind school setting. Primary results
are described in Lumbreras and Sánchez [9]. Long-term
data from a full field study were also reported in
Sánchez [17, 18].
Whisper implements a workplace for hearing-impaired people to
recognize speech errors. Thus, words, sentences, and small
stories are taken from everyday life; learners explore a
typical situation by repeating words or short sentences.
Typical speech errors are displayed in a way appropriate to
the learner. Whisper was presented at the REHA 99 exhibition
in Düsseldorf, Germany, and was evaluated during a six-month
period at special schools for the hearing-impaired,
with the aid of all leading German rehabilitation centers.
The usability of the English version was tested in collaboration
with the National Association for Deaf People in
Dublin/Ireland [23, 34, 25].
Several virtual reality systems and virtual environments
combined with appropriate human-machine interfaces have been
used to enhance the sensory capabilities of people with
sensory disabilities: presentation of graphic information by
text-to-speech, and 3D auditory navigation environments to
construct spatial mental representations or to assist users in
acquiring and developing cognitive skills [21].
A sonic concentration game described in [13] consists of
matching pairs of different levels of basic and derived geo-
metric shapes. To represent geometric shapes it is necessary
to build a two-dimensional sound space. The concept allows
the shape to be rendered by the perception of moving sound
in a special plane. Each dimension corresponds to a musical
instrument and raster points correspond to pairs of frequen-
cies on a scale. Moving horizontally from left to right is
equivalent to a frequency variation of the first instrument,
and moving vertically to a frequency variation of the second instrument.
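The two-dimensional sound space described above can be sketched as follows. This is only an illustration, not the implementation of the game in [13]: the equal-tempered scale, the base frequency of 220 Hz, and the function names are assumptions introduced here.

```python
# Minimal sketch of a 2D sound space: each raster point maps to a pair
# of frequencies, one per instrument. The equal-tempered scale and the
# 220 Hz base pitch are assumptions, not taken from [13].

def raster_to_frequencies(col, row, base_hz=220.0, steps_per_octave=12):
    """Return (first instrument Hz, second instrument Hz) for a raster point.

    Moving horizontally changes only the first frequency,
    moving vertically changes only the second.
    """
    def pitch(step):
        return base_hz * 2 ** (step / steps_per_octave)

    return pitch(col), pitch(row)


def render_shape(points):
    """Turn the raster points of a shape outline into frequency pairs."""
    return [raster_to_frequencies(col, row) for col, row in points]


# Corner points of a square, one octave (12 steps) wide and high.
square = [(0, 0), (12, 0), (12, 12), (0, 12)]

for first_hz, second_hz in render_shape(square):
    print(f"instrument 1: {first_hz:.1f} Hz, instrument 2: {second_hz:.1f} Hz")
```

Playing the resulting pairs in sequence would let a listener follow the outline of the shape as a moving sound.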
VirtualAurea by Sánchez [18] was developed after sound-based
virtual environments proved to trigger the development
of cognitive spatial structures in blind children. VirtualAurea
is a set of spatial sound editors that can be used by
parents and teachers to design an ample variety of spatial
maps such as the inner structure of the school and rooms,
corridors and other structures of a house. Users can also
integrate different sounds associated with objects and entities
in a story. VirtualAurea is being used to develop complex
cognitive spatial and temporal structures in blind children.
Mehida [1] is an intelligent multimedia system for deaf or
hearing-impaired children designed to assist them in ac-
quiring and developing communication skills. Mehida cov-
ers the following types of communication: Finger spelling
(representing the letters of the alphabet using the fingers),
gestures or sign languages, lip reading (understanding spo-
ken language through observing lip motion), and voice
The IBM SpeechViewer III [
snsspv3.html] is a powerful speech and language tool that
transforms spoken words and sounds into imaginative
graphics. The system is intended for people who have
speech, language or hearing disabilities and includes func-
tions for the administration of the learners' data. There are
different types of exercises: The awareness exercises are
simple games concentrating on basic speech parameters and
can be used by very young children. The skill building
exercises ask the learner to work towards achieving a goal
set by the expert and provide feedback on pitch, voicing,
and vowel articulation. The patterning exercises offer a
graphic representation of the learner's and expert's voices
for comparison concerning pitch and loudness.
ISAEUS [6] is a speech training system for deaf and hear-
ing-impaired people on a multilingual basis which has been
developed in the context of an EU-project begun in 1997 by
Jean-Paul Haton (Nancy, France) and other groups in Spain
and Germany.
The Visual Talker [
/FY99/phasel/ph199t02.html] provides a real-time speech-
to-text visual translator system for classroom environments.
The aim of this research is the development of a real-time
speech-to-text system to be used by teachers, their students
with hearing loss and/or speech impairments, and all other
students in classroom settings to provide a 'full participa-
tion' communication system.
Summarizing, we can identify two ways to improve systems
for people with sensory disabilities: by complementing
graphic interfaces with other communication facilities and
by supporting direct interactions with internal models or
representations of virtual models. For blind people, systems
tend to transform graphic output information into a haptic or
audible format. Speech processing is used to generate text
or graphics from acoustic input in order to permit hearing
impaired users to navigate through and communicate with
the model. In the next section we will show how the mod-
eling process should reflect the user's impairments.
Unified modeling pipeline for developing educational
We propose a unified model for creating educational soft-
ware for people with disabilities. The modeling process
starts with the definition of cognitive skills the learner has
to acquire; then it considers the creation of a virtual environment
composed of a navigable world and built by using
an adequate modeling language, dynamic scene objects, and
acting characters. Scenic objects are characterized by
graphic and acoustic attributes; characters' actions are
based on deterministic and non-deterministic plans as in an
interactive hyperstory [9]. The learner explores the virtual
world by interacting with appropriate interfaces and immediately
obtains feedback. The learner's actions, such as
sound reproductions or utterances, are collected, evaluated,
and classified, based on a student's modeling and diagnostic
The modeling pipeline is divided into seven sections.
1. We define cognitive skills in a real world situation, for
example self-motivating activities, drill and practice appli-
cations, problem solving or leisure time occupations.
2. Objects are constructed of geometric primitives or a
combination of these. They are characterized by graphic
and acoustic attributes and grouped into components with
input and output slots. Control elements of the virtual world
are represented by the graphic and acoustic elements,
known as icons and earcons.
3. We develop an internal computer representation and
define a geometric environment and a 2D/3D visual and
acoustic model. Modern object-oriented and message based
modeling languages are powerful tools for building virtual
worlds using a scene graph as data structure. We insist on
the necessity of special editors for teachers and learners to
create synthetic models independently.
[Figure 1, diagram not recoverable from the extraction. Legible
labels: real world; metaphors for impaired children; drill and
practice applications; self-motivating activities; develop editors
for teacher and learner; characterize the maps, find invariants;
computer representation; 2D/3D visual and acoustic model (forms,
colors, sounds, grids); acoustic model for blind children (3D
spatial sounds); visual model for the hearing-impaired (waveform
model); learner's best productions; the learner explores the model
through adequate interfaces (navigating, grasping, localizing,
reproducing, interpreting).]
Figure 1: Model synthesis
4. According to the problem characteristics and the target
users, problem-specific correspondences between graphic
and acoustic attributes or properties are used to reduce the
model to its visual or acoustic projection. The resulting
model can also generate a special editor for impaired learn-
ers. Characteristic model parameters and their domains and
ranges are formally defined.
5. The acoustic representation of the model uses spatial
sound; the visual model uses graphic representations of the
objects. Interaction and navigation are based on visual or
acoustic control elements depending on the case.
6. The learner explores the model by interacting with suit-
able interfaces. This can be done through navigating with-
out changing viewpoint or the use of an internal representa-
tion of the user giving him the illusion of being a part of the
virtual scene. A blind person explores neighboring models
by grasping them, tracking objects or listening to typical
sounds. A hearing-impaired person looks at model primi-
tives like letters, finger or lip representations of phones.
Then the learner explores the object space by navigating
and reproducing the structures. It is imperative to make sure
that conditions during the reconstruction process are always
the same. Therefore, the interfaces involved are calibrated
7. Depending on the particular parameters of the model,
error measures between the internal model and the model
reproduced by the learner are defined. Appropriate image
processing or speech recognition algorithms are necessary
to quantify the error. Learner's actions are collected, evalu-
ated and classified. The outcome is transformed to a user-
adapted aural or visual output.
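Steps 2 and 4 of the pipeline can be illustrated with a small data-structure sketch: objects carry both graphic and acoustic attributes, and the model is reduced to the projection that the learner can perceive. The class name, attribute keys, and the project function are hypothetical and serve only as an example.

```python
# Minimal sketch of steps 2 and 4: scene objects with graphic and
# acoustic attributes, reduced to a visual or an acoustic projection.
# All names and attribute keys are hypothetical.
from dataclasses import dataclass, field


@dataclass
class SceneObject:
    name: str
    graphic: dict = field(default_factory=dict)   # e.g. shape, color
    acoustic: dict = field(default_factory=dict)  # e.g. earcon, volume


def project(objects, channel):
    """Reduce the model to the attributes of one channel."""
    if channel not in ("visual", "acoustic"):
        raise ValueError("channel must be 'visual' or 'acoustic'")
    projected = []
    for obj in objects:
        attributes = obj.graphic if channel == "visual" else obj.acoustic
        projected.append({"name": obj.name, **attributes})
    return projected


model_b = [
    SceneObject("door", {"shape": "box", "color": "brown"}, {"earcon": "creak"}),
    SceneObject("monster", {"shape": "sphere", "color": "green"}, {"earcon": "growl"}),
]

# Acoustic projection for blind learners, visual one for deaf learners.
print(project(model_b, "acoustic"))
print(project(model_b, "visual"))
```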
The modeling work flow
AudioDoom and Whisper are used as exemplary
applications for blind and hearing-impaired users, respectively.
Thus, we will validate our common model by describing
the modeling work flow in both educational systems
in a more detailed and formalized way.
[Figure 2a labels: center of the flying source, box of bullets,
monster, catcher.]
Figure 2a: External model C
Finally, the degree of similarity is derived from the
measure and the result is displayed in a learner-adapted
output and used for updating the student model.
Figure 2b: Reconstructed LEGO-model D
In AudioDoom and Whisper the modeling process follows
the steps introduced above: First, a virtual model B is derived
from the real or fictive world scenario A by means of
abstraction and reduction without taking into consideration
the limitations of the potential users. Then, model B is pro-
jected to an appropriate model C which can be explored by
people with sensory disabilities using available communi-
cation channels. Appropriate editors support the modeling
process. Important model parameters must be identified at
this stage. By interacting with the system, the learner makes
an internal reconstruction of the model C called D. Addi-
tionally, the learner can build an external representation of
D, which has to be evaluated (cp. Figures 2a and 2b). This
can be done by using an appropriate multidimensional
measure m_i(P, X, Y) depending on the objects and their
attributes. The index i represents the parameters, P a learner, X the
computer model, and Y the reconstructed model.
[Figure 3, diagram not recoverable from the extraction. Legible
stages: A: virtual world (Doom game), modeled by hand with special
editors for teachers and learners; the map uses simple geometric
forms and colors. B: visual (and acoustic) computer representation
with parameters i: position, appearance, orientation, order of
distances, volume, height; objects are mapped to sounds and the role
of volume, frequency, and related cues is studied. C: acoustic model
(3D spatial sound); P, a blind person, explores model C (and B) by
mouse/keyboard/joystick/glove, localizing sounds. A multidimensional
measure m_i(P, X, Y) is defined with X in {A, B, C} and Y = D.
D: reconstructed model (LEGO, wood cubes); correspondences between
D and C (B, A) are found, the distance between bricks and internal
objects is measured, and the measured error is transformed to a
user-adapted output.]
Figure 3: The modeling pipeline in AudioDoom
[Figure 4, diagram not recoverable from the extraction. Legible
stages: A: real world (objects, phonemes, words, stories), modeled
by hand with special editors for teachers and learners; the map uses
simple geometric forms, colors, cartoons, and phonetic descriptions.
B: graphic, textual, and phonetic model with parameters rate, stress,
articulation, volume, pitch; FFT and filters are applied, formants
are important; a map between the phonetic description and the graphic
model is defined. C: visual model (frequency-color model); interfaces
are calibrated and customized; a multidimensional measure
m_i(P, X, Y) is defined with X in {C, B} and Y = D; the learner's
best productions are kept in a repository. P produces visual
representations when speaking words or sentences and explores the
models C and B. D: reconstructed model (by speech/acoustic pattern);
correspondences between D and C are found by the eyeball metric,
image processing, or speech error recognition, and m_i is transformed
to user-adapted visual feedback.]
Figure 4: The modeling pipeline in Whisper
The modeling work flow in AudioDoom is represented in
Figure 3. The model of the virtual game world (which in
our application is a simple labyrinth with one main corridor
and two secondary corridors including entities and objects)
is projected on another model consisting only of sounds. At
this stage, the role of volume, frequency, melody, and
rhythm in representing different forms, volumes, and dis-
tances is analyzed. Learners interact with this model by
'virtually walking' through this labyrinth with a keyboard,
mouse, and ultrasonic joystick. Sound-emitting objects help
them to build a mental model of the labyrinth. Finally, they
make concrete replications of mental models with LEGO
blocks and try to rebuild the internal model as it was per-
ceived and imagined after an exploration of the spatial
structure. Different kinds of blocks represent the objects of
the internal model C in a well-defined way.
The concrete representation is checked by a human tutor or
a camera to look for any correspondence with the original
model embedded in the computer. The error measurement
(represented by the function m_i(P, X, Y)) reflects this difference.
The index i stands for different properties of objects
which are part of X and Y, P denotes the learner, Y the
reconstructed model D, and X the internal model C of the
labyrinth. Bricks can be represented by two coordinate
triples of a three-dimensional discrete vector space. Then
the distance between the same object in models C and D
must be calculated using an appropriate vector norm. Summing
up, we obtain a candidate for an error measure mDis-
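As a hedged illustration of such a distance-based error measure, the sketch below represents each brick by two coordinate triples (opposite corners), compares the same object in models C and D, and sums the distances. The choice of the L1 norm, the penalty for missing objects, and all names are assumptions; the paper only requires "an appropriate vector norm".

```python
# Minimal sketch of a distance-based error measure between the internal
# model C and the reconstructed model D. Each brick is given by two
# coordinate triples on a discrete 3D grid. The L1 norm and the penalty
# for missing bricks are assumptions made for this example.

def brick_distance(brick_c, brick_d):
    """L1 distance between corresponding corner triples of two bricks."""
    return sum(
        abs(a - b)
        for corner_c, corner_d in zip(brick_c, brick_d)
        for a, b in zip(corner_c, corner_d)
    )


def distance_error(model_c, model_d, missing_penalty=10):
    """Sum the distances over all objects of C; penalize missing ones."""
    total = 0
    for name, brick_c in model_c.items():
        if name in model_d:
            total += brick_distance(brick_c, model_d[name])
        else:
            total += missing_penalty
    return total


model_c = {"wall": ((0, 0, 0), (4, 1, 1)), "door": ((4, 0, 0), (5, 1, 1))}
# The learner placed the door one grid unit too far to the right.
model_d = {"wall": ((0, 0, 0), (4, 1, 1)), "door": ((5, 0, 0), (6, 1, 1))}

print(distance_error(model_c, model_d))  # prints 2
```

A value of 0 means that every object of C was reconstructed at exactly the same grid position in D.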
The modeling work flow in Whisper is represented in Fig-
ure 4. The modeled entities are nouns or objects which are
named using phonemes and words. This model is projected
on a visual representation using only graphic resources. The
learner explores the visual model of a short story and recon-
structs the visual wave form representation by speaking
words or sentences. Correspondences between the internal
model and its representation are detected by the learner or a
human tutor using the 'eyeball metric' or by means of im-
age processing.
As we can see, despite some differences in object repre-
sentation, interfaces, perception modes, and error measures,
there are important common tasks which should be under-
taken when developing systems for either blind people or
people with hearing disabilities. In fact, by comparing the
two work flow diagrams, we were able to formulate a
model synthesis for developing systems for people with
different kinds of disabilities.
A critical task in the modeling pipeline is the reduction of
the original model to either only graphic or only acoustic
output. This topic has already been discussed in our recent
Visualization for the mind's eye
Now we want to explain two steps of the modeling pipeline
in more detail.
Figure 5: Reconstruction process in AudioDoom
To process an external model and to evaluate error meas-
ures, we will assume that a blind user reconstructs Audio-
Doom's maze structure by using special LEGO blocks.
Each block is individually marked by black bars and dots. A
blind user takes note of the different kinds of blocks, selects
the appropriate ones, and rebuilds the mental image by
using the blocks one by one.
After each step a picture is taken by a digital camera placed
on a fixed position over the scene. We highlight a typical
state of a reconstruction process and indicate the next step
by adding a new LEGO block on the lower wall. Figure 5
shows the situation before and after the next step. Reducing
the colors to black and white makes it possible to apply low
level image processing routines to detect the new LEGO
block. After a calibration of the two pictures, we can localize
the new block through an XOR operation on both images,
which leaves either one or two new dashes or, otherwise, new
dots. Starting from these new picture elements we calculate
the position and type of the new LEGO block. Finally, the
learner's model representation is updated.
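The XOR step can be sketched on two small black-and-white images stored as nested lists. This is a simplified illustration under the assumption that both pictures are already calibrated and binarized; the function names are hypothetical and the marker decoding (bars and dots) is omitted.

```python
# Minimal sketch of localizing a newly added block by XOR-ing two
# calibrated black-and-white images (1 = black mark, 0 = background).
# Decoding the block type from its bars and dots is not shown.

def xor_images(before, after):
    """Pixel-wise XOR: only pixels that changed remain set."""
    return [
        [a ^ b for a, b in zip(row_before, row_after)]
        for row_before, row_after in zip(before, after)
    ]


def bounding_box(diff):
    """Return (top, left, bottom, right) of changed pixels, or None."""
    pixels = [
        (r, c)
        for r, row in enumerate(diff)
        for c, value in enumerate(row)
        if value
    ]
    if not pixels:
        return None
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)


before = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
# Same scene after a block marked with one dash was added at the bottom.
after = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
]

print(bounding_box(xor_images(before, after)))  # prints (3, 2, 3, 4)
```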
This shows that, by certain low-level operations on
successive pictures, the external model can be transferred
into an internal representation which is used to feed an appropriately
defined error measure function m_i(P, C, D). A
degree of fidelity is derived and displayed by text-to-speech.
For a more sophisticated approach using LEGO
RCX robots see [12].
Now we will look at the frequency-color model represent-
ing words in Whisper. From a more technical point of view
we sample the recorded speech signal and execute several
fast Fourier transforms per second to obtain a frequency
representation using an RGB color map with three overlap-
ping triangular characteristics for red, green, and blue
which can be freely modified.
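The color map with three overlapping triangular characteristics can be sketched as below. The center frequencies, the width of the triangles, and the assignment of red to low and blue to high frequencies are assumptions chosen for illustration; in Whisper these characteristics can be freely modified.

```python
# Minimal sketch of a frequency-to-RGB map built from three overlapping
# triangular characteristics. Centers, width, and the red-low/blue-high
# assignment are assumptions, not Whisper's actual settings.

def triangle(freq, center, half_width):
    """Triangular weight: 1 at the center, 0 beyond half_width."""
    return max(0.0, 1.0 - abs(freq - center) / half_width)


def frequency_to_rgb(freq, centers=(1000.0, 2000.0, 3000.0), half_width=1500.0):
    """Map a frequency in Hz to an (R, G, B) triple with values 0..255."""
    return tuple(round(255 * triangle(freq, c, half_width)) for c in centers)


# One color per frequency bin, as would be done for each FFT frame.
for hz in (500.0, 1000.0, 2000.0, 3000.0):
    print(hz, frequency_to_rgb(hz))
```

Because the triangles overlap, neighboring frequencies blend smoothly from one color into the next instead of switching abruptly.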
Figure 6: Frequency-color representation in Whisper
The resulting frequency-color representation is smoothed
and simplified by an appropriate cluster and threshold algorithm
(cp. Figure 6). Whereas the wave amplitude image
provides feedback concerning volume and rate in the
xy-plane, omitted syllables are characterized by a lack of
colored areas, and intonation and articulation by the parameters
hue and saturation as well as the shape of the
curves. Details are described in Hobohm [5] and Gräfe [4].
In recent years it has been shown that the combination of
neural networks with Hidden Markov Models is a powerful
approach to speech recognition [26]. Next, we will include
these techniques in our system by using a phoneme-labeled
speech database providing a morpheme-to-morpheme trans-
position of an oral language into the visual channel. In this
process all structures of the unique oral language are preserved.
However, speaker adaptation with the available amount of
training data is not intended to reduce the
recognition error rate. On the contrary, the system
is intended to detect errors by examining deformed or
omitted phones or erroneous articulation. Here it will be
interesting to search for the most probable phone sequences.
Furthermore, the phones are to be displayed within a sepa-
rate window using an adequate segmentation.
Virtual environments (VE) can be used to simulate aspects
of the real world which are not physically available to users
for a wide variety of reasons. They become more realistic
through multimedia displays which include haptic, visual,
and auditory information. According to Colwell et al. [3]
and Paciello [10], there are several domains in which VEs
can be used to build educational software for people with
disabilities:
In education a virtual laboratory assists students with
physical disabilities in learning scenarios. Possible ap-
plications concern problem solving, strategic games, ex-
ploring spaces or structures, training of speech or hear-
ing capabilities, and working with concrete materials.
Special VE interfaces like head-mounted devices, the
space mouse or gloves are often included.
Training in virtual environments deals with mobility or
cognitive skills in spatial or mental structures.
Communication can be enhanced using gesture recog-
nition, input/output through touching, sign-to-speech or
speech-to-sign translation and special methods of speech
Rehabilitation is possible in the context of physical
therapy - a recovery of manual skills or learning how to
speak or listen to sound can be targeted.
Access to educational systems is facilitated via dual
navigation elements like earcons, icons or haptic devices.
Our idea is to support the visually or hearing-impaired in
building conceptual models of real world situations in the
way that seeing or hearing users do. Our approach is com-
parable to the one introduced by Zajicek et al. [27].
We can identify four important common elements and as-
pects in the modeling process for systems concerning blind
and hearing-impaired persons:
1. The Conceptual model results from mapping the real
or fictive world situation into a computer model which
may use all digital media by applying adequate modeling
2. The Perceptual model is created by developing a
perceptual model and a script for the dynamic changes of
the model. It can be perceived by the learner using only
those information channels which are available to him
and respects the disabilities of the user group. With the
perceptual model, it is important to provide surprising
elements to provoke attention in order to enhance the
perception process. The computer model description
should be based on text. Explanation of graphic objects
should be given in caption form. This text can be pre-
sented for the blind by using a text-to-speech plug-in or a
Braille display, for the deaf by converting it from speech
to text. Intuitive correspondences between graphic and
aural objects must be defined. Attention must be paid to
the fact that only a small number of sounds can be
memorized. Also, melodies that help to identify the ob-
jects should be used.
3. The implementation design - Icons and earcons
should be provided in parallel. If sound or speech is
used, written dialogue for hearing-impaired people
should be provided. If there are animated image se-
quences or videos with sound, subtitles or moving text-
banners should be used. Sound provides a rich set of re-
sources which complement visual access to a virtual
world. The four types of audio examined by Ressler and
Wang [11] are ambient background music, jingles, spo-
ken descriptions, and speech synthesis. Ambient music
can play and vary as the user moves from one room to
another, providing an intuitive location cue. Jingles or
small melodies should characterize special objects and
control elements. Spoken descriptions of objects can play
as the viewer moves closer to an object. Speech synthe-
sizers can read embedded text from the nodes in a scene
graph representation. Recent Web languages provide
Anchor node descriptions, EnvironmentNodes or WorldInfo
nodes. Internet-accessible speech synthesizers supply
easy access to text-to-speech technology.
4. The implementation tools - Special editors or languages
like Java, Java3D, OpenGL, or VRML should be
used. VRML defines a standard language for represent-
ing a directed acyclic graph containing geometric infor-
mation nodes that may be communicated over the Inter-
net and animated and viewed interactively in real-time.
VRML 2.0 [22] also provides behavior, user interaction,
sensors, events, routes and interpolators for interactive
animation. User interaction, time dependent events or
scripts can trigger dynamic changes in some properties
of an object. VRML viewers are available not only as
plug-ins to Internet browsers, but also as interactive objects
that may be embedded into standard Office documents.
However, the current version does not yet support
collaborative scenarios. This is only a succinct descrip-
tion of some tools that are currently available. There are,
of course, many more, but we have focused on those that
are platform independent and based on international
Software architecture and tools for implementation
To support multi-modal interaction we propose an archi-
tectural model which consists of an interaction module
supporting the input and output devices and allowing data
input and representation of the feedback, a data representa-
tion module that transforms collected data into data types
which will be managed by the model implementation on the
server, and an application module that contains the task
scheduler, student modeling and diagnostic subsystems.
[Figure 7, diagram not recoverable from the extraction. Legible
labels: internal model module, server, tutor's editor, model editor,
model repository, data representation module, interaction module.]
Figure 7: Architectural model
The internal model module: Special editors for tutors or
pupils can help in the construction of an internal model. For
hearing-impaired people models can be selected using drag-
and-drop actions on nouns, icons or cartoons in a reposi-
tory. Equivalent representations of these entities should be
given in the form of text or phonetic transcription. Blind
people choose elements from internal worlds by using sen-
sitive boards, pointing or haptic devices. Each internal
model designed can be coded as a VRML scene graph that
is stored in an appropriate database. It should be possible
to add a user avatar and to render the graph with respect
to the viewpoint of this representation.
The interface module: We have seen that the common
procedure for developing interfaces for blind people con-
sists in trying to transform the graphic output into a haptic
or an audible format or the latter into a text or graphic for-
mat through which the user navigates. We believe that in
many cases it is more appropriate to model the interface
first and then try to represent this as a logic structure using
auditory or haptic resources. This idea was first proposed
by Savidis et al. [21] and Ressler and Wang [11], who de-
veloped a methodology for making synthetic virtual worlds
accessible to the visually and physically impaired.
The addition of embedded text, sounds, and further assist-
ing devices, such as speech recognition systems and data
gloves all contribute to more accessible virtual worlds.
Virtual environments (VE) organize objects in hierarchical
data structures. Designers wishing to make virtual worlds
generated by modeling languages like VRML more accessible
to blind people should consider enhancing the syntactic
node elements within the graph structure by
- adding WorldInfo node textual descriptions, both for the
  entire world and for individual objects,
- using varying sounds depending on distances described in
  the EnvironmentNode,
- using the description field of Anchor nodes to implement
  links to other scenes or objects,
- associating Sound nodes to specify the orientation,
  intensity, and localization of sounds, including spoken
  descriptions of objects of interest,
- creating large control areas for alternative input
  devices.
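Several of the enhancements listed above can be sketched as a small generator that emits VRML97 text. This is an illustrative sketch only, not the methodology of [11]: the helper names, the file names and the choice of a Box as an enlarged control area are assumptions. The node and field names (WorldInfo title and info, Anchor url and description, Sound minFront, maxFront, minBack, maxBack, intensity and spatialize, AudioClip url, description and loop) are taken from the VRML97 standard [22].

```python
def vrml_string(text):
    """Quote a Python string as a VRML97 string (escape backslashes and quotes)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def accessible_object(name, description, sound_url, position,
                      near, far, link_url=None, control_size=2.0):
    """Return VRML97 text for one object annotated for non-visual access.

    All parameters are hypothetical. 'near' and 'far' set the radii within
    which the sound is at full intensity and still audible, respectively.
    """
    x, y, z = position
    s = control_size
    lines = [
        "Transform {",
        f"  translation {x} {y} {z}",
        "  children [",
        # Textual description attached to the individual object.
        f"    WorldInfo {{ title {vrml_string(name)} info [ {vrml_string(description)} ] }}",
        # Generously sized geometry as a control area for alternative input devices.
        f"    Shape {{ geometry Box {{ size {s} {s} {s} }} }}",
        # Spatialized sound whose audibility depends on the listener's distance.
        "    Sound {",
        f"      source AudioClip {{ url {vrml_string(sound_url)}"
        f" description {vrml_string(description)} loop TRUE }}",
        f"      minFront {near} minBack {near}",
        f"      maxFront {far} maxBack {far}",
        "      intensity 1 spatialize TRUE",
        "    }",
        "  ]",
        "}",
    ]
    node = "\n".join(lines)
    if link_url is not None:
        # The Anchor description gives a link a textual label for non-visual browsers.
        node = (f"Anchor {{\n  url {vrml_string(link_url)}\n"
                f"  description {vrml_string(description)}\n"
                f"  children [\n{node}\n  ]\n}}")
    return node


def accessible_world(title, summary, objects):
    """Assemble a complete world with a world-level WorldInfo description."""
    header = "#VRML V2.0 utf8"
    info = f"WorldInfo {{ title {vrml_string(title)} info [ {vrml_string(summary)} ] }}"
    return "\n".join([header, info, *objects]) + "\n"


# Hypothetical usage: one door that links to a second scene.
door = accessible_object("door", "A heavy wooden door", "door.wav",
                         (4, 0, -2), near=1, far=10, link_url="corridor.wrl")
world = accessible_world("Room 1", "A square room with one door", [door])
```

Generating the annotations from one description string keeps the textual, spoken and link labels of an object consistent with each other.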
In this paper we have given an overview of the state of the
art in the field of existing educational systems for people
with disabilities such as blindness or deafness, especially
focusing on transferring the real world into appropriate
computer representations; then we introduced a unified
methodology for modeling the real world for these people,
and finally we illustrated important tasks for defining error
measures and adapted output formats.
A critical mass of educational systems has already been
developed for disabled people, which allows some
generalizations and recommendations to be made. The
development of systems for people with disabilities should
no longer appear as isolated handcrafted efforts; instead,
efforts should be made to systematize the construction of
these types of systems. Recent advances in hardware and
software development support our idea and provide hope that
the technological foundation for such systems has already
been laid.
This research and the joint multimedia software development
for the disabled (AudioDoom for blind people and Whisper
for postlingually deaf people) are being carried out by the
authors in the course of a current German-Chilean project
funded by the German ministry BMBF and the Universities of
Duisburg and (Santiago de) Chile. We thank Dr. M.
Mühlenbrock and Dr. D. Willett for their valuable
suggestions and our collaborators Dr. W. Otten and C. Wans
for their persistent and valuable support during the
development and evaluation of Whisper, for discussing the
ideas, and for supplying material from their research for
our purposes.
1. Alonso, F., de Antonio, A., Fuertes, J. L., and Montes,
C. Teaching Communication Skills to Hearing-Impaired
Children. IEEE Multimedia, Vol. 2, No. 4 (1995),
2. Baloian, N., Luther, W. Visualization for the Mind's
eye. Proc. Dagstuhl, Software Visualization 2001,
LNCS 2269, St. Diehl ed., Springer, Berlin (2002).
3. Colwell, Ch., Petrie, H., Kornbrot, D., Hardwick, A.,
Furner, S. Haptic Virtual Reality for Blind Computer
Users. The Third Annual ACM Conference on Assistive
Technologies, April 15-17, 1998, Marina del Rey, Cali-
fornia, USA, 92-99. Available at
4. Gräfe, J. Visualisierung von Sprache und Erkennung
sprechtypischer Parameter und ihre Veränderung bei
Spätertaubten. Diploma-Dissertation, Duisburg (1998).
5. Hobohm, K. Verfahren zur Spektralanalyse und
Mustergenerierung für die Realzeit-visualisierung
gesprochener Sprache. PhD-Dissertation, TU Berlin (1993).
6. ISAEUS. Speech Training for Deaf and Hearing-Impaired
People (1997). Available at News/enw28/
7. Lumbreras, M., Sánchez, J., Barcia, M. A 3D sound
hypermedial system for the blind. Proc. First Europ.
Conf. on Disability, Virtual Reality and Associated
Technologies, Maidenhead, UK (1996), 187-191.
8. Lumbreras, M., Sánchez, J. 3D aural interactive
hyperstories for blind children. International Journal of
Virtual Reality 4 (1) (1998), 20-28.
9. Lumbreras, M., Sánchez, J. Interactive 3D Sound
Hyperstories for Blind Children. CHI '99, Pittsburgh, PA,
USA (1999), 318-325.
10. Paciello, M. Making the Web accessible for the deaf,
hearing and mobility impaired. Yuri Rubinsky Insight
Foundation (1998). Available at
11. Ressler, S. and Wang, Q. Making VRML Accessible
for People with Disabilities. In Proceedings of ASSETS
'98, the Third Annual ACM Conference on Assistive
Technologies, Marina del Rey, California, USA (1998),
12. Ressler, S., Antonishek, B. Integrating Active Tangible
Devices with a Synthetic Environment for Collabora-
tive Engineering. Proc. 2001 Web3D Symposium,
Paderborn, Germany, Febr. 19-22 (2001), 93-100.
13. Roth, P., Petrucci, L., Assimacopoulos, A., Pun, Th.
Concentration Game, an Audio Adaptation for the
blind. CSUN 2000 Conf. Proceedings, Los Angeles,
USA. Available at
s proceedings.html.
14. Sánchez, J. 3D interactive games for blind children.
Proceedings of Technology and Persons with Disabilities,
CSUN 2000 Conference Proceedings, Los Angeles, USA.
15. Sánchez, J. Interactive virtual environments to assist
the cognition of blind children. Proc. VII Iberoamerican
Scientific Conference, CYTED, Panama (2000).
16. Sánchez, J. Interactive virtual environments for blind
children: Usability and cognition. Proceedings 4th
Iberoamerican Congress on Informatics and Education,
RIBIE 2000, Viña del Mar, Chile.
17. Sánchez, J. Usability and cognitive impact of the
interaction with 3D virtual interactive acoustic
environments by blind children. CP&B 2000 (in press).
18. Sánchez, J. Interactive virtual acoustic environments
for blind children. Proceedings of ACM CHI 2001,
Seattle, Washington, April 2-5, 2001, 23-25.
19. Sánchez, J., Lumbreras, M. Virtual Environment
Interaction through 3D Audio by Blind Children. Journal of
Cyberpsychology and Behavior, CP&B 2(2) (1999),
20. Sánchez, J., Lumbreras, M. Usability and Cognitive
Impact of the Interaction with 3D Virtual Interactive
Acoustic Environments by Blind Children. 3rd Conf. on
Disability, VR and Assoc. Technologies, Alghero,
Sardinia, Italy (2000), 67-73.
21. Savidis, A., Stephanidis, C., Korte, A., Crispie, K.,
Fellbaum, K. A generic direct-manipulation 3D-auditory
environment for hierarchical navigation in non-visual
interaction. Proc. ACM ASSETS '96, 117-123.
22. VRML Consortium. VRML97 International Standard
Specification (ISO/IEC 14772-1:1997). Available at
23. Wans, Cl. ILSE: Ein Interaktives Lehr-/Lernsystem für
Spätertaubte. In: Ch. Herzog (Hrsg.): Beiträge zum 8.
Arbeitstreffen der GI-Fachgruppe 1.1.5/7.0.1
"Intelligente Lehr-/Lernsysteme", Blaue Reihe München,
TUM-INFO-08-I9736-150/1.-FI (1997), 133-144.
24. Wans, Cl. An Interactive Multimedia Learning System
for the Postlingually Deaf. Proceedings of ITiCSE'98,
6th Annual Conference on the Teaching of Computing,
3rd Annual Conference on Integrating Technology into
Computer Science Education, ACM, New York, Oxford
(1998).
25. Wans, Cl. Computer-supported hearing exercises and
speech training for hearing-impaired and postlingually
deaf. Assistive Technology Research Series, Vol 6 (1),
IOS Press (1999), 564-568.
26. Willett, D. Beiträge zur statistischen Modellierung und
effizienten Dekodierung in der automatischen
Spracherkennung. PhD-Dissertation, Gerhard-Mercator-
Universität Duisburg (2000).
27. Zajicek M., Powell C., and Reeves C. A Web Naviga-
tion Tool for the Blind. Proc. of ASSETS'98, the 3rd
ACM/SIGCAPH on Assistive Technologies, Los Ange-
les, USA (1998), 204-206.
