Synthesizing Meaningful Feedback for Exploring Virtual Worlds Using a Screen Reader
Abstract
Users who are visually impaired can access virtual worlds, such as Second Life, with a screen reader by extracting a meaningful textual representation of the environment their avatar is in. Because virtual worlds are densely populated with large amounts of user-generated content, users must query their environment iteratively so as not to be overwhelmed with audio feedback. Iterative interaction with virtual worlds, however, is inherently slower. This paper describes our current work on developing a mechanism that synthesizes a more usable and efficient form of feedback using a taxonomy of virtual world objects.
Keywords
Virtual Worlds, Accessibility, Audio I/O
ACM Classification Keywords
H.5.2 [User Interfaces]: Voice I/O
General Terms
Human Factors, Measurement, Design
Copyright is held by the author/owner(s).
CHI 2010, April 10–15, 2010, Atlanta, Georgia, USA.
ACM 978-1-60558-930-5/10/04.
Bugra Oktay
Department of Computer Science and Engineering
University of Nevada, Reno
Reno, Nevada, USA
oktayb@unr.nevada.edu
Eelke Folmer
Department of Computer Science and Engineering
University of Nevada, Reno
Reno, Nevada, USA
efolmer@unr.edu
Introduction
Virtual worlds have enjoyed increasing popularity over the past years, with millions of participating users. The immersive graphics, the large amount of user-generated content, and the social interaction opportunities offered by the greater sophistication of virtual worlds could eventually make for a more interactive and informative World Wide Web. Popular virtual worlds include Second Life [4] and World of Warcraft [1]. The focus of our research is on virtual worlds with user-generated content (which have no elements of combat), as these are increasingly used as cyber learning environments [7].
In Second Life, users control a digital puppet called an avatar, which has human capabilities such as walking and gesturing, through a game-like, third-person interaction mechanism. Until recently [2,3,6,9], virtual worlds were inaccessible to users who are visually impaired, as these virtual worlds are entirely visual and lack any textual representation that can be read with a screen reader or tactile display.
In our previous research, we developed a screen reader accessible interface for virtual worlds called TextSL [2,8]. TextSL allows screen reader users to access Second Life and interact with large numbers of objects and avatars there, using a command-based interface inspired by multi-user dungeon games. Users navigate their avatar using commands such as "move" or "teleport". Users can query their environment using the "describe" command, which lists the number of objects and avatars found within a 10-meter radius, in all directions, around the user's avatar. Objects and avatars can then be queried iteratively (see Figure 1).
Figure 1. It takes two "describe" commands to learn that the cat is brown.
User-generated virtual worlds are densely populated with objects; in Second Life, we found that on average 13 objects can be found within a 10-meter radius around the user's avatar [2]. Providing all the object names as audio feedback may easily overwhelm the user, especially if the names of the objects are long, which motivated the use of a mechanism where users query their environment iteratively.
User studies with TextSL show that a command-based interface is feasible [2]: TextSL allows screen reader users to explore Second Life, communicate with other avatars, and interact with objects with the same success rates as sighted users using the Second Life viewer. (TextSL has been designed to support access to other open source virtual worlds, such as OpenSim [5], once APIs become available.) However, command-based exploration and object interaction is significantly slower in TextSL [2] because users have to query their environment iteratively, and some users found the amount of feedback that TextSL provides overwhelming. The focus of our current research is to synthesize a more meaningful form of feedback that balances (1) minimizing overwhelming feedback against (2) minimizing the amount of interaction required.
Synthesizer
Users who are visually impaired typically use their screen readers at different speech rates, which indicates that screen reader users differ in their ability to process audio feedback. The proposed synthesizer therefore incorporates a user-specified word limit (UWL). Because words vary in length and thus take different amounts of time to be pronounced by a screen reader, in future work the UWL could be combined with a user-specified time limit; however, this would require TextSL to know the speech rate of the screen reader.
The synthesizer executes as follows:
1. Scan and filter objects within a fixed range around the user and compile the found names into the Scanned Word List (SWL).
IF (#SWL > UWL)
2. Group and aggregate the SWL.
ELSE
3. Detail the SWL.
Step 2 focuses on compressing the description to prevent overwhelming the user with feedback; step 3 focuses on reducing the number of "describe" commands that must be given. The synthesizer executes either step 2 or step 3, depending on the number of words generated by step 1. Although nearby avatars are part of the provided feedback and are just as important as objects, the synthesizer focuses only on virtual world objects, because (1) any avatar can be of importance to the user regardless of its properties, so filtering is not applicable; and (2) the number of objects around the user is typically much larger than the number of avatars, so feedback is most effectively synthesized by grouping and aggregating objects.
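As a concrete illustration, the following Python sketch captures this control flow. All names are ours, not TextSL's actual API; the helper functions are sketched, in simplified form, in the following sections (group_and_aggregate stands for the composition of the grouping and aggregation sketches given later).

    def synthesize(objects, properties, uwl, threshold, scan_range=10.0):
        """Top-level synthesizer: step 1 always runs, then step 2 or 3."""
        # Step 1: compile the Scanned Word List (SWL) from nearby objects.
        swl = scan_and_filter(objects, threshold, scan_range)
        word_count = sum(len(name.split()) for name in swl)
        if word_count > uwl:
            # Step 2: compress the description by grouping and aggregation.
            return group_and_aggregate(swl)
        # Step 3: add detail to reduce the number of follow-up queries.
        return detail_objects(swl, properties, uwl)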
Object Scanning and Filtering
The Second Life client displays only objects and avatars that are in front of the user's avatar. To eliminate the need for users to turn in different directions to find out what is around them, TextSL considers all objects in a full 360 degrees around the user, within a user-specified range (the default is 10 meters) (Figure 2).
Figure 2. Scanning of objects around the avatar.
For each object we compute a value according to the following function:

F = (NAME * SIZE * INTERACTION * ROOT) / DISTANCE
where:

NAME: the length of the object's name divided by the average word length. Objects with non-descriptive names like "object" are given the value 0.

SIZE: the size of the object's bounding box.

DISTANCE: the distance in meters to the user, divided by the scanning range.

INTERACTION: 10 if the object allows interaction, 1 if not.

ROOT: 1 if the object is a root object, 0 if it is a sub object.
This function prioritizes objects that (1) have more descriptive names; (2) are closer to the user; (3) are larger; (4) are interactive; and (5) are not sub objects. The latter is to avoid users interacting with parts of a larger object, such as a wheel that is part of a car. As most content creators assume that users can see, they frequently leave the names of objects at their default value ("object"). This is a problem when users query their environment, as a query may return the names of multiple objects called "object", which are meaningless to TextSL users; we identified that 32% of the objects in Second Life are called "object" [2]. Such objects are culled during object scanning. Only objects with a value above a user-specified threshold are compiled into the SWL (see Figure 3).

Figure 3. Sample output of the value function for a number of objects within 10 meters of the user.
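For concreteness, here is a minimal Python sketch of the value function and the filtering step. The object fields, the AVG_WORD_LENGTH constant, and the clamping of very small distances are assumptions on our part, not TextSL's actual data model.

    from dataclasses import dataclass

    AVG_WORD_LENGTH = 5.0  # assumed average word length, in characters

    @dataclass
    class WorldObject:
        name: str           # "object" if left at the default
        bbox_size: float    # size of the bounding box (e.g., its volume)
        distance: float     # distance to the user's avatar, in meters
        interactive: bool   # whether the object allows interaction
        is_root: bool       # True for a root object, False for a sub object

    def value(obj: WorldObject, scan_range: float = 10.0) -> float:
        """F = (NAME * SIZE * INTERACTION * ROOT) / DISTANCE."""
        if obj.name.lower() == "object":     # non-descriptive default name
            return 0.0
        name_term = len(obj.name) / AVG_WORD_LENGTH
        interaction_term = 10.0 if obj.interactive else 1.0
        root_term = 1.0 if obj.is_root else 0.0
        # Normalized distance; clamped to avoid division by zero at 0 m.
        distance_term = max(obj.distance, 0.1) / scan_range
        return (name_term * obj.bbox_size * interaction_term * root_term) / distance_term

    def scan_and_filter(objects, threshold, scan_range=10.0):
        """Step 1: keep names of in-range objects scoring above the threshold."""
        return [o.name for o in objects
                if o.distance <= scan_range and value(o, scan_range) > threshold]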
Grouping and Aggregation
If the number of words in the SWL exceeds the UWL, we reduce it by grouping and aggregation.
Grouping: Objects with the same name are grouped together, e.g., [car, car, dog] → [2 cars, dog]. Grouping incurs no information loss, but it may not significantly reduce the number of words when fewer than three objects share a name. Some savings would be incurred if adjectives were included in the UWL count, but as these are typically very short we choose not to include them. Still, saying "There are 2 cars." makes more sense than saying "There are a car and a car."
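A minimal sketch of the grouping step follows; the naive pluralization is a placeholder, as real object names would need proper inflection.

    from collections import Counter

    def group(names):
        """[car, car, dog] -> ["2 cars", "dog"]; no information is lost."""
        grouped = []
        for name, count in Counter(names).items():
            if count == 1:
                grouped.append(name)
            else:
                grouped.append(f"{count} {name}s")  # naive pluralization
        return grouped

    # group(["car", "car", "dog"]) -> ["2 cars", "dog"]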
Aggregation: Object names are aggregated when they can be determined to be members of the same class. Aggregation requires a taxonomy of virtual world objects. We are currently creating this taxonomy in related work, where we developed a game within Second Life that can help improve the accessibility of virtual worlds, as meta-data for virtual world objects is often missing. In this scavenger hunt game, sighted users tag and label objects, which builds a set of training examples for an automatic classifier that can recognize unnamed objects based on their geometry. The game also helps build a taxonomy of virtual world objects, which we can use to synthesize a more usable form of feedback. The taxonomy is described as a set of rules, e.g., [vehicle → car] or [animal → dog], and these rules may also define subtypes, e.g., [dog → poodle]. A taxonomy created this way is not restricted to Second Life and can be used by any virtual world that has textual descriptions for its objects.
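Such a rule set can be represented as a simple child-to-parent mapping. The fragment below is hypothetical (the actual taxonomy is being collected through the scavenger hunt game).

    # Hypothetical fragment of the taxonomy, stored as child -> parent rules.
    TAXONOMY = {
        "car": "vehicle",
        "bicycle": "vehicle",
        "cat": "animal",
        "dog": "animal",
        "bird": "animal",
        "poodle": "dog",    # subtype rule [dog -> poodle]
        "mastiff": "dog",   # subtype rule [dog -> mastiff]
    }

    def lineage(name, taxonomy=TAXONOMY):
        """A name's ancestor chain, most specific first: poodle -> [dog, animal]."""
        chain = []
        while (name := taxonomy.get(name)) is not None:
            chain.append(name)
        return chain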
Using the taxonomy, we analyze whether any of the object names can be aggregated to the same parent, e.g., [bicycle, car] → [2 vehicles]. A larger reduction in word count can be achieved when objects are aggregated to the highest possible class, e.g., [cat, dog, bird] → [3 animals]. However, this may also yield a much higher level of information loss: [poodle, mastiff] → [2 animals] and [poodle, mastiff] → [2 dogs] are both valid aggregations, but the first has significantly higher information loss and would still require the user to query the object set iteratively, which is what we are trying to avoid. The second transformation may require a more detailed taxonomy of virtual world objects that also includes subtypes, which may be more costly to create. Aggregation transformations are only applied if they reduce the number of words in the SWL. Figure 4 shows example output of the grouping and aggregation steps.
Figure 4. Sample output after Grouping and Aggregating.
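A sketch of the aggregation step, building on the taxonomy sketch above: to limit information loss it picks the most specific common class, and, consistent with the grouping section, the numeral is not counted toward the word count.

    def aggregate(names, taxonomy=TAXONOMY):
        """[poodle, mastiff] -> "2 dogs" rather than "2 animals".
        Returns None when no common class exists or no words are saved."""
        chains = [[n] + lineage(n, taxonomy) for n in names]
        common = set(chains[0]).intersection(*chains[1:])
        if not common:
            return None
        # The most specific common class sits earliest in the chain.
        cls = min(common, key=chains[0].index)
        before = sum(len(n.split()) for n in names)
        after = len(cls.split())   # the numeral "2" is not counted
        return f"{len(names)} {cls}s" if after < before else None

    # aggregate(["poodle", "mastiff"]) -> "2 dogs"
    # aggregate(["cat", "dog", "bird"]) -> "3 animals"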
Detailing
If the number of words in the SWL is below the UWL, then, to reduce the amount of interaction required, we detail the objects in the SWL with specific object information such as color or size, for example: [cat] → [big red cat]. These transformations are only applied as long as they do not exceed the UWL by more than a 10% margin.
In addition to the "describe" command, users can issue a "where" command that indicates where an object is relative to the user's avatar. Spatial information can also be added to objects during detailing to further reduce interaction, e.g., [cat] → [cat in front of you]. We consider four spatial locations (left, right, behind, in front). Adding spatial information requires grouping objects by location to reduce the number of words in the SWL, for example, [cat to your left, car to your left] → [a cat and a car to your left].
Figure 5. Sample output with detailing implemented.
Users can assign priority values to properties (location, size, color) according to their importance. A transformation that details objects is only applied when it can be applied to all objects in the SWL, to ensure consistency of feedback. Figure 5 shows example output of TextSL with detailing implemented.
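A sketch of how spatial information could be added and grouped by location; the four locations follow the text above, while the locations mapping and the exact phrasing are our assumptions.

    def add_spatial_info(names, locations):
        """[cat: left, car: left] -> ["a cat and a car to your left"].

        locations maps object name -> "left" | "right" | "behind" | "front".
        """
        by_location = {}
        for name in names:
            by_location.setdefault(locations[name], []).append(name)
        phrases = []
        for loc, group_names in by_location.items():
            listed = " and ".join(f"a {n}" for n in group_names)
            if loc in ("left", "right"):
                place = f"to your {loc}"
            elif loc == "behind":
                place = "behind you"
            else:
                place = "in front of you"
            phrases.append(f"{listed} {place}")
        return phrases

    # add_spatial_info(["cat", "car"], {"cat": "left", "car": "left"})
    # -> ["a cat and a car to your left"]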
Future Work
Currently, only a limited number of taxonomy rules have been defined manually; these describe a simple taxonomy of virtual world animals and vehicles, which allowed us to implement the proposed synthesizer in TextSL. We seek to collect more labels through our scavenger hunt game, which will allow us to expand the current taxonomy. Once this has been established, we will evaluate the effectiveness and usability of synthesized feedback through a series of user studies in which different forms of synthesizing feedback will be explored.
Conclusion
The large number of objects in virtual worlds poses a significant problem for text-based approaches to making virtual worlds accessible to users who are visually impaired: the amount of feedback provided may overwhelm the user, yet iteratively querying a user's virtual surroundings is slow. We seek to provide more usable forms of information by transforming the feedback about a user's virtual environment into more concise or more descriptive forms using a taxonomy of virtual world objects.
Acknowledgements
This work is supported by NSF Grant IIS-0738921.
References
[1] Blizzard Studios. World of Warcraft. http://www.worldofwarcraft.com
[2] Folmer, E., Yuan, B., Carr, D., Sapre, M. TextSL: A command-based virtual world interface for the visually impaired. Proc. ACM ASSETS '09, pp. 59–66, 2009.
[3] IBM Human Ability and Accessibility Center. Virtual Worlds User Interface for the Blind. http://services.alphaworks.ibm.com/virtualworlds/
[4] Linden Research. Second Life. http://www.secondlife.com/
[5] OpenSim. http://opensimulator.org/
[6] Pascale, M., Mulatto, S., Prattichizzo, D. Bringing haptics to Second Life for visually impaired people. Proc. EuroHaptics 2008, pp. 896–905, 2008.
[7] Robbins, S.S. Immersion and engagement in a virtual classroom: Using Second Life for higher education. EDUCAUSE Learning Initiative Spring 2007 Focus Session, 2007.
[8] TextSL: Screen reader accessible interface for Second Life. http://textsl.org
[9] Trewin, S., Hanson, V., Laff, M., Cavender, A. PowerUp: An accessible virtual world. Proc. ACM ASSETS '08, pp. 171–178, 2008.