Synthesizing Meaningful Feedback for Exploring Virtual Worlds Using a Screen Reader
Abstract
Users who are visually impaired can access virtual
worlds, such as Second Life, with a screen reader by
extracting a meaningful textual representation of the
environment their avatar is in. Since virtual worlds are
densely populated with large amounts of user-
generated content, users must query their environment iteratively so as not to be overwhelmed with audio feedback. However, iterative interaction with virtual worlds is inherently slower. This paper
describes our current work on developing a mechanism
that can synthesize a more usable and efficient form of
feedback using a taxonomy of virtual world objects.
Keywords
Virtual Worlds, Accessibility, Audio I/O
ACM Classification Keywords
H.5.2 [User Interfaces]: Voice I/O
General Terms
Human Factors, Measurement, Design
Copyright is held by the author/owner(s).
CHI 2010, April 10–15, 2010, Atlanta, Georgia, USA.
ACM 978-1-60558-930-5/10/04.
Bugra Oktay
Department of Computer Science and Engineering
University of Nevada, Reno
Reno, Nevada, USA
oktayb@unr.nevada.edu
Eelke Folmer
Department of Computer Science and Engineering
University of Nevada, Reno
Reno, Nevada, USA
efolmer@unr.edu
Introduction
Virtual worlds have enjoyed increasing popularity in recent years, with millions of participating users. The immersive graphics, large amounts of user-generated content, and social interaction opportunities offered by the growing sophistication of virtual worlds could eventually make for a more interactive and informative World Wide Web. Popular virtual worlds include Second Life [4] and World of Warcraft [1]. Our research focuses on virtual worlds with user-generated content (which have no elements of combat), as these are increasingly used as cyber learning environments [7].
In Second Life, users control a digital puppet, called an avatar, with human capabilities such as walking and gesturing, through a game-like, third-person interaction mechanism. Until recently [2,3,6,9], virtual worlds were inaccessible to users who are visually impaired, as these worlds are entirely visual and lack any textual representation that can be read with a screen reader or tactile display.
In our previous research, we developed a screen reader accessible interface for virtual worlds called TextSL [2,8]. TextSL allows screen reader users to access Second Life and interact with the large numbers of objects and avatars found there, using a command-based interface inspired by multi-user dungeon games. Users navigate their avatar using commands such as "move" or "teleport". Users can query their environment using the "describe" command, which lists the number of objects and avatars found within a 360-degree, 10-meter radius around the user's avatar. Objects and avatars can then be queried iteratively (see Figure 1).
Figure 1. It takes two “describe” commands to learn that
the cat is brown.
User-generated virtual worlds are densely populated with objects; e.g., in Second Life we found that on average 13 objects can be found within a 10-meter radius around the user's avatar [2]. Providing all the object names as audio feedback may easily overwhelm the user, especially if the names of the objects are long, which motivated a mechanism where users iteratively query their environment.
User studies with TextSL show that a command-based
interface is feasible [2], as TextSL allows screen reader
users to explore Second Life, communicate with other
avatars, and interact with objects with the same
success rates as sighted users using the Second Life
viewer (TextSL has been designed to support access to
other open source virtual worlds such as OpenSim [5]
once APIs become available). However, command-based exploration and object interaction are significantly slower in TextSL [2], because users have to query their environment iteratively. Some users also found the amount of feedback that TextSL provides overwhelming. The focus of our current
research is to synthesize a more meaningful form of feedback that seeks to balance (1) minimizing overwhelming feedback and (2) minimizing the amount of interaction required.
Synthesizer
Users who are visually impaired typically use their
screen readers at different speech rates, which
indicates that screen reader users have different
abilities to process audio feedback. The proposed synthesizer therefore incorporates a user-specified word limit (UWL). Since words vary in length and thus take different amounts of time to pronounce through a screen reader, the UWL could in future work be combined with a user-specified time limit; however, this would require TextSL to know the speech rate of the screen reader.
The synthesizer executes as follows:
1. Scan and filter objects within a fixed range around the user and compile the found names into the Scanned Word List (SWL).
IF (#SWL > UWL)
   2. Group and aggregate the SWL.
ELSE
   3. Detail the SWL.
Step 2 specifically focuses on compressing the
description to prevent overwhelming the user with
feedback and step 3 focuses on reducing the number of
“describe” commands that must be given. The
synthesizer executes either step 2 or step 3, depending on the number of words generated in step 1. Although nearby avatars are part of the provided feedback and are just as important as objects, the synthesizer only focuses on virtual world objects because (1) any avatar can be of importance to the user regardless of its properties, so filtering is not applicable; and (2) the number of objects around the user
is typically much larger than the number of avatars and
therefore feedback is most effectively synthesized
through grouping and aggregating objects.
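As a rough illustration of this dispatch, the following Python sketch compares the word count of the SWL against the UWL; the two callables stand in for steps 2 and 3 (sketched in later sections) and are not the actual TextSL implementation.

```python
def synthesize(swl, uwl, group_and_aggregate, detail):
    """Dispatch between step 2 and step 3 based on the SWL word count.

    swl is the Scanned Word List produced by step 1; the two callables
    are hypothetical helpers implementing steps 2 and 3.
    """
    word_count = sum(len(name.split()) for name in swl)
    if word_count > uwl:
        return group_and_aggregate(swl)   # step 2: compress the feedback
    return detail(swl)                    # step 3: enrich the feedback
```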
Object Scanning and Filtering
The Second Life client displays only the objects and avatars that are in front of the user's avatar. To eliminate the need to turn in different directions to find out what is there, TextSL considers all objects in a full 360 degrees around the user, within a user-specified range (default 10 meters) (Figure 2).
Figure 2. Scanning of objects around the avatar.
For each object we compute a value according to the
following function:
F = NAME × SIZE × DISTANCE⁻¹ × INTERACTION × ROOT

where:
NAME: The length of the name of the object divided by
the average word length. Objects with non-descriptive
names like “object” are given the value 0.
SIZE: The size of the object's bounding box.
DISTANCE: The distance in meters to the user divided by the scanning range.
INTERACTION: 10 if the object allows interaction and 1 if not.
ROOT: 1 if the object is the root object and 0 if it is a sub object.
This function prioritizes objects that: (1) have more
descriptive names; (2) are closer to the user; (3) are
larger; (4) are interactive; and (5) are not sub objects.
The latter is to avoid users interacting with parts of a larger object, such as a wheel that is part of a car. As most content creators assume
that users can see, they frequently leave the name of
objects to their default value ("object"). This is a problem when users query their environment, as the query may return the names of multiple objects called "object", which are essentially meaningless to TextSL users. We found that 32% of the objects in Second Life are named "object" [2]. Such objects are culled during object scanning. Only objects with a value above a user-specified threshold are compiled into the SWL (see Figure 3).

Figure 3. Sample output of the value function for a number of objects within 10 meters of the user.
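A minimal sketch of this value function and the filtering step, assuming each scanned object exposes its name, bounding-box size, distance to the avatar, and interaction/root flags (the attribute names below are illustrative, not part of the Second Life API):

```python
AVG_WORD_LENGTH = 5.0  # assumed average word length in characters

def object_value(obj, scan_range=10.0):
    """Compute F = NAME * SIZE * DISTANCE^-1 * INTERACTION * ROOT for one object."""
    # NAME: 0 for non-descriptive names such as the default "object"
    name = 0.0 if obj.name.strip().lower() == "object" else len(obj.name) / AVG_WORD_LENGTH
    size = obj.bounding_box_size                        # SIZE: bounding box of the object
    distance = max(obj.distance, 0.01) / scan_range     # DISTANCE: normalized, guarded against zero
    interaction = 10.0 if obj.is_interactive else 1.0   # INTERACTION
    root = 1.0 if obj.is_root else 0.0                  # ROOT: sub objects score 0
    return name * size * (1.0 / distance) * interaction * root

def scan_and_filter(objects, threshold, scan_range=10.0):
    """Step 1: keep only objects whose value exceeds the user-specified threshold."""
    kept = [o for o in objects if object_value(o, scan_range) > threshold]
    return [o.name for o in kept]   # the Scanned Word List (SWL)
```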
Grouping and Aggregation
If the number of words in the SWL exceeds the UWL, we reduce it through grouping and aggregation.
Grouping: Objects with the same name are grouped
together, e.g., [car, car, dog] → [2 cars, dog].
Grouping incurs no information loss, but it may not significantly reduce the number of words if fewer than three objects share a name. Additional savings would be obtained if the count adjectives (the "2" in "2 cars") were included in the UWL count, but as these are typically very short we choose not to count them. Still, saying "There are 2 cars." makes more sense than saying "There are a car and a car."
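A minimal sketch of the grouping step, assuming the SWL is a list of object name strings and using naive pluralization (simply appending an "s"):

```python
from collections import Counter

def group(swl):
    """Group identical names, e.g. ['car', 'car', 'dog'] -> ['2 cars', 'dog']."""
    grouped = []
    for name, count in Counter(swl).items():
        grouped.append(name if count == 1 else f"{count} {name}s")
    return grouped
```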
Aggregation: Object names are aggregated if they can
be determined to be members of the same class.
Aggregation requires a taxonomy of virtual world objects. We are currently creating this taxonomy in related work, in which we developed a scavenger hunt game within Second Life that can help improve the accessibility of virtual worlds, as meta-data for virtual world objects is often missing. In this game, sighted users tag and label objects, which builds a set of training examples for an automatic classifier that can recognize unnamed objects based on their geometry. The game also helps build a taxonomy of virtual world objects, which we can use to aggregate a more usable form of feedback. The taxonomy is described as a set of rules, e.g., [vehicle→car] or [animal→dog], and these rules may also define subtypes, e.g., [dog→poodle]. A taxonomy of objects created this way is not restricted to
Second Life and can be used by any virtual world that has textual descriptions for its objects.
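One plausible way to represent such rules is a simple child-to-parent mapping; the entries below are purely illustrative and are not taken from the taxonomy under construction.

```python
# Hypothetical taxonomy rules: each object name maps to its parent class.
TAXONOMY = {
    "car": "vehicle",
    "bicycle": "vehicle",
    "cat": "animal",
    "dog": "animal",
    "bird": "animal",
    "poodle": "dog",    # subtype rule [dog -> poodle]
    "mastiff": "dog",
}
```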
Using the taxonomy, we analyze whether any of the
object names can be aggregated to the same parent,
e.g., [bicycle, car] → [2 vehicles]. A larger reduction in word count can be achieved when objects are aggregated to the highest possible class, e.g., [cat, dog, bird] → [3 animals]. However, this may also yield a much higher level of information loss: [poodle, mastiff] → [2 animals] and [poodle, mastiff] → [2 dogs] are both valid aggregations, but the first has significantly higher information loss and may still require the user to query the object set iteratively, which is what we are trying to avoid. The second transformation requires a more detailed taxonomy of virtual world objects that also includes subtypes, which may be more costly to create. Aggregation transformations are only applied if they reduce the number of words in the SWL. Figure 4 shows example output of the grouping and aggregation steps.
Figure 4. Sample output after Grouping and Aggregating.
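The sketch below illustrates one way aggregation could work against a child-to-parent mapping such as the one above: names that share a parent class are merged, and the result is kept only if it actually reduces the word count of the SWL. This is a simplification that aggregates only one level up, not the TextSL implementation.

```python
from collections import Counter

def word_count(names):
    """Count words, excluding count adjectives such as the '2' in '2 cars'."""
    return sum(1 for name in names for w in name.split() if not w.isdigit())

def aggregate(swl, taxonomy):
    """Aggregate names to a shared parent, e.g. ['bicycle', 'car'] -> ['2 vehicles']."""
    parents = Counter(taxonomy[name] for name in swl if name in taxonomy)
    aggregated = []
    for name in swl:
        parent = taxonomy.get(name)
        if parent is not None and parents[parent] >= 2:
            continue                      # reported under its parent class instead
        aggregated.append(name)
    for parent, count in parents.items():
        if count >= 2:
            aggregated.append(f"{count} {parent}s")
    # only apply the transformation if it reduces the number of words in the SWL
    return aggregated if word_count(aggregated) < word_count(swl) else swl
```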
Detailing
If the number of words in the SWL is below the UWL, then to reduce the amount of interaction required, we detail the objects in the SWL with specific object information such as color or size, for example [cat] → [big red cat]. Specific transformations are only applied as long as we do not exceed the UWL by more than a 10% margin.
In addition to the "describe" command, users can issue a "where" command that indicates where an object is relative to the user's avatar. Spatial information can also be added to objects during detailing to further reduce interaction, e.g., [cat] → [cat in front of you]. We consider four spatial locations (left, right, behind, in front). Adding spatial information requires grouping objects by location to reduce the number of words in the SWL, for example [cat to your left, car to your left] → [a cat and a car to your left].
Figure 5. Sample output with detailing implemented.
Users can assign priority values to properties (location, size, color) according to their importance. A specific detailing transformation is only applied when it can be applied to all objects in the SWL, to ensure consistency of feedback. Figure 5 shows example output of TextSL with detailing implemented.
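A minimal sketch of the detailing step under these constraints; the property dictionaries and the prefix-only placement of size and color are assumptions for this sketch (spatial information would instead be appended, e.g. "cat in front of you", after grouping objects by location as described above).

```python
def detail(swl, properties, uwl, priorities=("size", "color")):
    """Detail every object in the SWL, e.g. ['cat'] -> ['big red cat'].

    properties maps each name to a dict such as {'size': 'big', 'color': 'red'};
    priorities reflects the user-assigned importance of the properties.
    """
    prefixes = {name: [] for name in swl}
    for prop in priorities:
        # consistency: a property is only added if every object in the SWL has it
        if not all(prop in properties.get(name, {}) for name in swl):
            continue
        candidate = {n: pre + [properties[n][prop]] for n, pre in prefixes.items()}
        total = sum(len(pre) + len(n.split()) for n, pre in candidate.items())
        if total <= uwl * 1.1:          # stay within the UWL plus a 10% margin
            prefixes = candidate
    return [" ".join(prefixes[name] + [name]) for name in swl]
```

For example, detail(["cat"], {"cat": {"size": "big", "color": "red"}}, uwl=10) would yield ["big red cat"] under these assumptions.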
Future Work
Currently, only a limited number of taxonomy rules have been defined manually; these describe a simple taxonomy of virtual world animals and vehicles, which allowed us to implement the proposed synthesizer in TextSL. We seek to collect more labels through our scavenger hunt game, which will allow us to expand the current taxonomy. Once this has been established, we will evaluate the effectiveness and usability of the synthesized feedback through a series of user studies in which different forms of synthesizing feedback will be explored.
Conclusion
The large number of objects in virtual worlds poses a significant problem for text-based approaches to making virtual worlds accessible to users who are visually impaired. The amount of feedback provided may overwhelm the user, and iteratively querying a user's virtual surroundings is consequently slow. We seek to provide more usable forms of information by transforming the feedback about a user's virtual environment into more concise or more descriptive forms using a taxonomy of virtual world objects.
Acknowledgements
This work is supported by NSF Grant IIS-0738921.
References
[1] Blizzard Studios, World of Warcraft,
http://www.worldofwarcraft.com.
[2] Folmer, E., Yuan, B., Carr, D., Sapre, M. TextSL: A Command-Based Virtual World Interface for the Visually Impaired. Proc. ACM ASSETS '09, pp. 59-66, 2009.
[3] IBM Human Ability and Accessibility Center, Virtual
Worlds User Interface for the Blind,
http://services.alphaworks.ibm.com/virtualworlds/
[4] Linden Research, Second Life,
http://www.secondlife.com/
[5] OpenSim, http://opensimulator.org/
[6] Pascale, M., Mulatto, S., Prattichizzo, D. Bringing haptics to Second Life for visually impaired people. Proc. EuroHaptics 2008, pp. 896-905, 2008.
[7] Robbins, S.S. Immersion and engagement in a virtual classroom: Using Second Life for higher education. In EDUCAUSE Learning Initiative Spring 2007 Focus Session, 2007.
[8] TextSL, Screen reader accessible interface for Second Life, http://textsl.org
[9] Trewin, S., Hanson, V., Laff, M., Cavender, A. PowerUp: An accessible virtual world. Proc. ACM ASSETS '08, pp. 171-178, 2008.