Using Abstract Anchors to Aid The Development of Multimedia Applications With Sensory Effects
Raphael Abreu
CEFET/RJ
raphael.abreu@eic.cefet-rj.br
Joel A. F. dos Santos
CEFET/RJ
jsantos@eic.cefet-rj.br
ABSTRACT
Declarative multimedia authoring languages allow authors to combine multiple media objects, generating a range of multimedia presentations. Novel multimedia applications, focusing on improving user experience, extend multimedia applications with multisensory content. The idea is to synchronize sensory effects with the audiovisual content being presented. The usual approach for specifying such synchronization is to mark the content of a main media object (e.g., a main video) indicating the moments when a given effect has to be executed. For example, a mark may represent when snow appears in the main video so that a cold wind may be synchronized with it. Declarative multimedia authoring languages provide a way to mark subparts of a media object through anchors. An anchor indicates its begin and end times (video frames or audio samples) in relation to its parent media object. The manual definition of anchors in the above scenario is both inefficient and error-prone (i) when the main media object size increases, (ii) when a given scene component appears several times, and (iii) when the application requires marking several scene components.
This paper tackles this problem by providing an approach for creating abstract anchors in declarative multimedia documents. An abstract anchor represents (possibly) several media anchors, indicating the moments when a given scene component appears in a media object's content. The author, therefore, is able to define the application behavior through relationships among, for example, sensory effects and abstract anchors. Prior to execution, abstract anchors are automatically instantiated for each moment a given element appears, and relationships are cloned so that the application behavior is maintained.
This paper presents an implementation of the proposed approach using NCL (Nested Context Language) as the target language. The abstract anchor processor is implemented in Lua and uses available APIs for video recognition in order to identify the begin and end times for abstract anchor instances. We also present an evaluation of our approach using a real-world use case.
CCS CONCEPTS
• Applied computing → Markup languages; • Human-centered computing → Hypertext / hypermedia; • Software and its engineering → Translator writing systems and compiler generators;
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
DocEng'17, September 4–7, 2017, Valletta, Malta.
© 2017 ACM. 978-1-4503-4689-4/17/09...$15.00
DOI: http://dx.doi.org/10.1145/3103010.3103014
KEYWORDS
Anchors; Multimedia authoring; Multisensory Content; Mulsemedia; NCL; Video Recognition
ACM Reference format:
Raphael Abreu and Joel A. F. dos Santos. 2017. Using Abstract Anchors to Aid The Development of Multimedia Applications With Sensory Effects. In Proceedings of DocEng'17, September 4–7, 2017, Valletta, Malta, 8 pages.
DOI: http://dx.doi.org/10.1145/3103010.3103014
1 INTRODUCTION
The recent advances in human-computer interaction ([4, 12, 18]) offer many opportunities to enrich the multimedia experience with new features. Since the beginning of this decade there has been significant commercial interest in more immersive technologies (3D displays, VR, etc.). Such interest resulted in increased efforts of the multimedia community to develop new methods to enhance user immersion in multimedia applications [21].
New kinds of immersive multimedia applications have been proposed, giving rise to multiple sensorial media (mulsemedia) applications [9], where traditional media content (text, image, audio, video, etc.) can be related to media objects that target other human senses (e.g., smell, haptics, etc.). To enable these applications, one can use physical sensing devices (sensors) to identify the ambient state (e.g., temperature, room size, user feedback) and actuators to generate sensory effects (e.g., wind, mist, heat) for the user.
Traditional declarative multimedia authoring languages, authoring languages for short, specify interactive multimedia applications focusing on the definition of media object synchronization, independent of their content. Examples of authoring languages are SMIL (Synchronized Multimedia Integration Language) [23] and NCL (Nested Context Language) [11]. In the above scenario, it is interesting to take advantage of those languages' abstractions for media and relationship specification in order to provide synchronization among both traditional and multisensory content.
An approach for synchronizing traditional and multisensory content is to represent sensors and actuators as media objects and create relationships among parts of a main media object (e.g., a main video) and those media objects representing multisensory content. In order to do so, authors have to mark the main media object indicating when, for example, an explosion occurs, so the corresponding sensory effect can be synchronized with it.
In this paper, we call a scene component a given element (rock,
tree, dog, person, etc.) or concept (happy, crowded, dark, etc.) that
appears in the main media object content.
The usual approach for marking when a given scene component appears in a given media object is to execute such media object and create anchors related to those components. Relationships among such anchors and the related multisensory content, therefore, define the intended synchronization.
When the application size grows, or when several scene components shall be synchronized with multisensory content, authors are required to create several anchors. The manual definition of such anchors, however, is not efficient. Moreover, such an approach can be error-prone, given the size of the resulting code. This problem was presented in [22], where the authors emphasize the need for automating this process.
This paper presents an approach for automating the creation of anchors in multimedia authoring languages. Our approach is to provide a way for the author to define abstract anchors in multimedia documents. An abstract anchor represents (possibly) several media anchors, indicating the moments when a given scene component appears in a media object's content. Relationships in the document are defined considering such abstract anchors. Prior to execution, a document with abstract anchors is processed so that abstract anchors are automatically instantiated for each moment a given scene component appears and relationships are cloned so the application behavior is maintained.
The proposed approach was implemented using NCL as the target language. NCL is a standard for digital TV [1] and IPTV [11] services. It provides anchors for media objects, whose definition indicates their begin and end times in relation to their parent media object. In this work, NCL anchors were extended so they can indicate the scene component they refer to. The Abstract Anchor Processor, AAP for short, uses available APIs for video recognition in order to identify when a given scene component appears in the video content. An instance of a given abstract anchor is created for each time the element appears. In sequence, document relationships are cloned for each anchor instance, maintaining the document behavior. AAP was implemented in Lua [10] and is available for download and use at https://github.com/raphael-abreu/NCLAAP.
Using NCL with Abstract Anchors, NCLAA for short, reduces authoring effort, since anchors and document relationships are created only once for each different scene component. In order to support our claim, we present an evaluation of our approach using a real-world use case.
The remainder of the paper is organized as follows. Section 2 presents related work regarding approaches for reducing the authoring effort for multimedia and mulsemedia applications. Section 3 discusses the concept of abstract anchors, their creation in NCL, and the steps for processing abstract anchors. Section 4 presents the implementation of the abstract anchor processor. Section 5 presents our approach evaluation results. Section 6 concludes the paper and presents future work.
2 RELATED WORK
A lot of attention has been devoted to reducing the authoring effort of multimedia and mulsemedia applications. Two common approaches are to provide authoring tools or template languages for those applications.
A template language allows the author to specify reusable components (placeholders) that should later be replaced by instances in the target language. More precisely, templates define generic components and express relationships between generic components that can later be expanded into a target language by a template processor before runtime. The template processor ensures that the generic components are correctly instantiated in the target language. This section presents works focusing on templates for multimedia applications.
XTemplate [6] is a modular approach for creating templates for NCL documents. The proposed template language represents generic components and relationships among them. XTemplate specifies composite templates, which define spatio-temporal semantics to be reused by (possibly) several document compositions. Along with the template specification, a template processor was proposed. The processor receives as input a set of templates and a document using them and returns an NCL-compliant document that can run on any standard NCL player. A similar approach is provided in [16], where the authors propose the TAL template language and its associated processor.
Some template languages support not only placeholders, but also loops and conditions, which are often lacking in declarative multimedia languages. This is the case of Luar [3]. The authors focus on authors with programming expertise, providing a way to embed Lua code in NCL documents. The Luar processor executes the Lua code embedded in the NCL document, producing an NCL-compliant document.
Another approach to reduce the authoring effort is to develop visual authoring tools. These tools help the user by providing a graphical user interface (GUI) that eases or removes the need to write code. In general, such approaches target non-expert authors, aiding the application development.
Examples of authoring tools for multimedia documents are [2, 5, 19, 20]. [2] proposes NCL Composer, an authoring tool presenting to the user a structural, a textual, and a layout view of an NCL document. It allows authors to interact with the document's logical structure by representing media objects as nodes and the relationships among them as edges.
A similar approach is presented in [19], where the NEXT tool is proposed. The difference is that NEXT is focused on templates, also providing a template view where authors may create documents using XTemplate templates.
LimSee [5] also uses templates for document authoring, in a similar approach to the one presented in [19]. Finally, xSMART [20] is used to create wizards to guide the creation of a multimedia document.
In the mulsemedia domain, much of the authoring effort lies in specifying scene components for synchronizing audiovisual content with sensory effects [25]. Usually, the authoring effort is to tie scene components to the sensory effects that a human should experience when they are presented [22], such as feeling cold when a snow scene is presented or feeling heat when a beach scene is presented.
In [24] the authors present an authoring tool designed for authoring mulsemedia applications, called SEVino (Sensory Effect Video Annotation Tool). SEVino provides the author with an interface that presents a video timeline. Such video represents the main audiovisual content with which sensory effects are to be synchronized. The tool creates cells representing sensory effects (e.g., fog, wind, temperature, etc.) and, for a given time interval, users can select a cell representing a sensory effect to be executed. After the authoring phase, the tool generates descriptions compatible with the MPEG-V standard [26], which is a standard for information exchange between the digital world and the real world. The MPEG-V descriptions generated by SEVino represent the sensory effects to be executed on physical devices.
Despite the advances in tools and templates for easing the authoring effort, the process of authoring a mulsemedia application is still very expensive in terms of effort and time, especially when a great deal of synchronization among the audiovisual content and sensory effects is required.
Such a problem gave rise to research proposing semi-automatic or automatic video description. A video description indicates, for each instant of the video, the scene components that are present. Such approaches should require minimal to no author interaction at all for providing a video description, as well as for generating events based on that description.
The SEVino authors have also developed a media player capable of automatically gathering a video description and producing events in the ambient. More specifically, the proposed player can synchronize ambient lighting effects with a video presentation [24]. To achieve such synchronization, the player gathers pixel color information from a video frame (usually the borders) and sends the same color information to a nearby array of LED lights. This player removes the need for the user to specify the lighting effects in the multimedia document; however, the approach is restricted to only one kind of effect, in this case, lighting effects.
The work presented in this paper differs from related work as follows. (i) It enables the author to describe their application abstracting the video description, using abstract anchors. (ii) It enables the author to define abstract anchors for multiple videos in a document, and not just one as in the above approaches. (iii) It enables authors to synchronize any sensory effect with the application, by providing relationships among them and abstract anchors.
Although in this paper we present an approach for video description, the Abstract Anchor Processor (AAP) architecture is independent of the tool used for describing a media object's content. Therefore, it could also be used for defining abstract anchors for audio objects.
3 ABSTRACT ANCHORS
Multimedia applications are described by multimedia documents. A document specification is described using some multimedia authoring language. Common entities for multimedia authoring languages are nodes, representing the document content, and relationships, representing the synchronization to be performed in an application.
Different languages, such as NCL [11], provide temporal anchors for representing a subpart of a node's content. Temporal anchors represent a subpart of a node's content in the time axis, for example, a sequence of frames of a video node or a sequence of samples in an audio node. Usually, temporal anchors are defined by a begin and an end, with respect to the node content.
By allowing the author to define anchors, multimedia languages enable the definition of relationships taking into account parts of a node's content, thus providing fine-grained synchronization.
As discussed in Section 2, template authoring languages enable the user to abstract some steps of the authoring process in favor of a more generic description. After authoring, at processing time, the template processor "fills in the blanks" with document-specific content.
With that in mind, this work enables the author to make use of abstract anchors (NCLAA) to represent subparts of a node's content without explicitly describing them. It is similar to a template approach, in the sense that it enables another level of abstraction in the authoring phase.
An abstract anchor represents (possibly) several different node anchors that are related by the node content being presented while they are active. In our approach, abstract anchors are related to scene components, such that all of their instances represent when the scene component they are associated with is being presented. Figure 1 depicts this idea, where media nodes are represented as circles and node anchors are represented as squares. Dashed lines associate an anchor with a node and solid lines represent document relationships.
Figure 1: Abstract anchor definition and processing
The upper part of Figure 1 presents a document where media video1 has three anchors: sea, snow, and sun. Each anchor represents a given scene component. Relationships among such anchors and the media objects wind effect and heat effect define when those media objects shall be presented.
NCL [11], the target language used in this work, provides the element media for defining nodes representing media objects. It also enables the definition of anchors using the element area, child of element media. Listing 1 presents an example of media and anchor specification.
<media id="video1" src="video.mp4">
  <area tag="sea"/>
  <area tag="sun"/>
</media>
Listing 1: NCL media and anchor specification example
In order to provide the definition of abstract anchors, we extend NCL such that area elements have a new attribute tag. Such attribute indicates the scene component related to that anchor. In the example presented in Listing 1, two abstract anchors are created, one representing the instants when the sea appears in the video and the other representing the instants when the sun appears. Additionally, the author can set the tag to an asterisk (*) if it should match every scene component in a document.
NCL is an event-based language such that synchronization relationships are defined based on events. NCL provides causal relationships such that when an event specified as its condition happens, one or more actions are triggered. Relationships in NCL are defined using link-connector element pairs. Connectors [15] define a general relation that is instantiated by links to a given set of participants. Listing 2 presents an example of link specification.
<link xconnector="onBeginStart">
  <bind role="onBegin" component="video1" interface="sea"/>
  <bind role="start" component="wind"/>
</link>
<link xconnector="onBeginStart">
  <bind role="onBegin" component="video1" interface="sun"/>
  <bind role="start" component="heat"/>
</link>
Listing 2: NCL link specification example
The example presented in Listing 2 defines two links. The first specifies that whenever anchor sea of video1 starts, media wind shall be started. The second specifies that whenever anchor sun of video1 starts, media heat shall be started. Two links are also created to stop the wind and the heat when the related anchor stops. For simplicity, they are not presented in Listing 2.
It is worth noticing that bind elements inside NCL links indicate the participants in a relationship. Attribute component indicates the participant node, and an optional attribute interface restricts the participation to a given node interface, i.e., a node anchor or property. In order to enable links to be defined over abstract anchors, we extend NCL such that attribute interface may indicate a tag attribute value instead of an anchor id.
Prior to execution, a document using abstract anchors shall be processed into a final document following the NCL standard. The processing performed for abstract anchors is similar to that performed for template languages. The first step of the process is to instantiate the abstract anchors for the scene components they specify. The second step is to duplicate links for each instance of a given abstract anchor. The whole process is shown in Figure 1.
The anchor instantiation step is performed using tools for scene recognition, as presented in Section 4.3. It recognizes the time instants a given scene component is presented in the video content and creates anchor instances marked with their temporal definition. Therefore, our approach requires from authors little (or even no) prior knowledge about the media content. The anchors' temporal definition is built entirely from data acquired by the recognition software.
4 ARCHITECTURE
The architecture of the Abstract Anchor Processor (AAP) is depicted
in Figure 2.
AAP receives as input a document containing abstract anchors defined by the author. It parses the document, identifying nodes that define abstract anchors and the links related to them. At this step, the processor also extracts media content from those nodes. For the example in Listing 1, the processor identifies node video1 as a node defining abstract anchors and extracts its content (file video.mp4).
The extracted media content is sent to external software for scene recognition. As can be seen in Figure 2, the recognition software is decoupled from the processor. Such an approach gives more freedom to the author, allowing one to use different scene recognition software. The scene recognition step results in a set of tags (we use the same nomenclature as the scene recognition software; they should not be confused with XML tags) that are equivalent to the ones identified in the abstract anchors defined by the author. These tags represent the scene components along with timing information about when they appear in the video.
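To make the collection step concrete, the following minimal Lua sketch (an illustration under our own assumptions, not the actual AAP code) scans an NCL document with string patterns and gathers, for each media node, its source and the tags of its abstract anchors; a real implementation would rely on a proper XML parser rather than patterns.

-- Sketch of the collection step: find <media> nodes whose <area> children
-- carry a "tag" attribute and record the media source and its tags.
local function collect_abstract_anchors(ncl_source)
  local result = {}  -- maps media id -> { src = ..., tags = { ... } }
  for media_attrs, body in ncl_source:gmatch("<media(.-)>(.-)</media>") do
    local id  = media_attrs:match('id%s*=%s*"(.-)"')
    local src = media_attrs:match('src%s*=%s*"(.-)"')
    local tags = {}
    for tag in body:gmatch('<area%s+tag%s*=%s*"(.-)"') do
      tags[#tags + 1] = tag
    end
    if #tags > 0 then
      result[id] = { src = src, tags = tags }
    end
  end
  return result
end

-- For Listing 1 this would yield:
-- { video1 = { src = "video.mp4", tags = { "sea", "sun" } } }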
4.1 Anchor Instantiation
According to the tags received from the scene recognition software, AAP instantiates the abstract anchors. The process of anchor instantiation is performed as follows. According to the scene components specified in the abstract anchor, the processor checks, in the set of received tags, the time instants when those components were present. It identifies adjacent instants, defining intervals where scene components are present. For each resulting interval, one anchor instance is created. Listing 3 presents the result of the anchor instantiation step for the example in Listing 1.
<media src="video.mp4" id="video1">
  <area id="sea_1" begin="01s" end="09s"/>
  <area id="sea_2" begin="17s" end="19s"/>
  <area id="sun_1" begin="01s" end="19s"/>
  <area id="sun_2" begin="28s" end="32s"/>
</media>
Listing 3: Anchor instantiation step result for the example in Listing 1
Figure 2: Abstract anchor processor architecture
In the example presented in Listing 3, the scene component sea was identified in the video in the intervals [1, 9] and [17, 19] seconds of the video. Thus two anchor instances were created, sea_1 for the first interval and sea_2 for the second one. The same is done for scene component sun, which was identified in the video inside intervals [1, 19] and [28, 32], generating anchor instances sun_1 and sun_2.
It is worth noticing that, in the resulting document, the attribute tag was removed from the anchor instances. Anchor ids, which are mandatory in NCL, are created according to the tag attribute value. In order to keep the output compatible with the NCL standard, each anchor id is also incremented so that it is unique in the whole document.
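As an illustration of this step, the following Lua sketch (an assumption about the implementation, not the actual AAP code) merges the per-second detections reported for a given tag into contiguous intervals and emits one area instance per interval with a unique id.

-- Sketch of the anchor instantiation step: given the seconds in which the
-- recognition software detected a scene component, merge adjacent seconds
-- into intervals and build one <area> element per interval.
local function instantiate_anchors(tag, seconds)
  table.sort(seconds)
  local intervals = {}
  for _, s in ipairs(seconds) do
    local last = intervals[#intervals]
    if last and s == last.finish + 1 then
      last.finish = s                               -- extend current interval
    else
      intervals[#intervals + 1] = { begin = s, finish = s }
    end
  end
  local areas = {}
  for i, it in ipairs(intervals) do
    areas[#areas + 1] = string.format(
      '<area id="%s_%d" begin="%02ds" end="%02ds"/>', tag, i, it.begin, it.finish)
  end
  return areas
end

-- Example: detections of "sea" at seconds 1..9 and 17..19 would yield the
-- two instances sea_1 [01s, 09s] and sea_2 [17s, 19s], as in Listing 3.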
4.2 Link Instantiation
After the anchor instantiation process, AAP is able to instantiate links that refer to abstract anchors.
For each link marked at the beginning of the processing as using an abstract anchor, the processor examines each of its binds in order to determine its target element. Two outcomes are possible:
• The bind targets a media node as a whole or a regular anchor. In that case, nothing has to be done.
• The bind targets an abstract anchor of a media node. In that case, the link has to be duplicated for each instance of the abstract anchor.
This process continues until no link bind targets an abstract anchor. Listing 4 presents the result of the link instantiation step for the example in Listing 2.
<link xconnector="onBeginStart">
  <bind role="onBegin" component="video1" interface="sea_1"/>
  <bind role="start" component="wind"/>
</link>
<link xconnector="onBeginStart">
  <bind role="onBegin" component="video1" interface="sea_2"/>
  <bind role="start" component="wind"/>
</link>
<link xconnector="onBeginStart">
  <bind role="onBegin" component="video1" interface="sun_1"/>
  <bind role="start" component="heat"/>
</link>
<link xconnector="onBeginStart">
  <bind role="onBegin" component="video1" interface="sun_2"/>
  <bind role="start" component="heat"/>
</link>
Listing 4: Link instantiation step result for the example in Listing 2
In the example presented in Listing 4, the first link from Listing 2 was instantiated for both instances of the abstract anchor sea. The resulting links now target anchors sea_1 and sea_2, respectively. The same process was applied to the second link from Listing 2, which was instantiated for anchors sun_1 and sun_2.
It is worth noticing that the steps of anchor instantiation and link instantiation may be executed at distinct moments. It is possible for the author to use AAP to first instantiate the anchors, continue working on the document, and perform the link instantiation step later.
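The following Lua sketch illustrates the duplication step (an assumption about the implementation, not the actual AAP code): links are represented as plain tables, and only the bind that referenced the abstract anchor is rewritten for each anchor instance, keeping every other bind intact.

-- Sketch of the link instantiation step. A link is assumed to be a table
-- { xconnector = ..., abstract_tag = ..., binds = { {role, component, interface}, ... } },
-- where abstract_tag holds the tag value the link originally referenced.
local function instantiate_links(link, anchor_instances)
  local new_links = {}
  for _, instance_id in ipairs(anchor_instances) do
    local clone = { xconnector = link.xconnector, binds = {} }
    for _, bind in ipairs(link.binds) do
      local b = { role = bind.role, component = bind.component,
                  interface = bind.interface }
      -- only the bind that referenced the abstract anchor is rewritten
      if b.interface == link.abstract_tag then
        b.interface = instance_id
      end
      clone.binds[#clone.binds + 1] = b
    end
    new_links[#new_links + 1] = clone
  end
  return new_links
end

-- Duplicating the first link of Listing 2 over { "sea_1", "sea_2" } produces
-- the first two links of Listing 4.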
4.3 Scene recognition
Given a set of abstract anchors previously defined by the author, AAP collects the anchors' tag attribute values along with their parent element's source. The resulting tags must be instantiated with temporal information that identifies when that tag appears in the content. Here we call this process scene recognition.
Scene recognition is achieved by submitting all the tag attribute values to the recognition system, i.e., a system that employs algorithms able to detect scene components in media content (e.g., video, audio, text analysis). Such approaches return a set of tags describing the media content. Although static media (image and text) can also be analysed, this work focuses on continuous media objects, which are frequently used as the basis for sensory effect synchronization.
The scene recognition phase is decoupled from the processor to enable its adaptation to novel ways of recognizing features in any media format. The author can adapt the AAP settings for another recognition system. The only requirement is that the recognition system has to return a list of independent tags with their temporal data, according to the notation used by the processor.
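As an illustration only, the snippet below shows one possible notation for such temporal data as a Lua table; the concrete format is an assumption here, since it depends on the AAP configuration and on the recognition service actually used.

-- Hypothetical per-second tag list returned by a recognition back end.
local recognition_output = {
  { second = 1,  tags = { "sea", "sun" } },
  { second = 2,  tags = { "sea", "sun" } },
  -- ...
  { second = 17, tags = { "sea" } },
  { second = 28, tags = { "sun" } },
}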
In our implementation we used a video recognition API (https://clarifai.com) based on Convolutional Neural Networks (CNNs) [14]. These neural networks have been shown to be an effective method for understanding video content ([13, 28]). Figure 3 shows the result of an image recognition using such software.
Figure 3: Image recognition result
The example in Figure 3 shows a set of tags indicating the scene components present in the image. Each tag is followed by the neural network prediction probability. The API can identify objects (e.g., boat) as well as individual concepts (e.g., reflection).
To recognize video content, the neural network works in a similar way to image recognition. One approach is to treat the video as a series of images. However, as pointed out by [17], this approach does not account for the temporal information between frames and can lead to irrelevant concepts emerging from the scene. Nonetheless, one advantage of this method is that it requires less computation time to analyse the video.
Another approach is to consider the temporal relationship between the frames and deduce the tags by analysing relationships as time passes. An advantage of this method is that it decreases the probability of returning irrelevant tags from the video and keeps only the ones that persist through the entire interval. However, this approach is shown to be difficult to compute [17].
For the video recognition API we used in this work, content description is performed for every second of video content. Therefore, after the instantiation phase, the events described in the multimedia document will also have a one-second time step.
Describing scenes one second at a time may seem to introduce a great deal of delay in the specification of sensory effect synchronization with audiovisual content. However, for mulsemedia applications, works published in the literature show that user perception of a sensory effect happens in a time window of 1 s for haptic effects [27], 2 s for heat effects [7], 3 s for wind effects [7, 27], and 25 s for scent effects [8].
Given the above results, we consider that the content description of a media object with a one-second step should not pose a threat to the user's quality of experience. A future work is to investigate an approach to reduce such a time step.
5 EVALUATION
For the purpose of evaluating our approach, we introduce a usage scenario to highlight how AAP supports the development of a mulsemedia application. We developed an NCL application that combines video and sensory effects to enrich the user experience. The application, called "environments around the world", consists of scenes of different environments that are presented to the user.
A timeline representation of the video content and its synchronization with sensory effects is presented in Figure 4. It presents a set of key frames of the video and three of the tags recognized in that part of the video (images and videos are licensed as Creative Commons CC0 and were obtained from Pixabay, https://pixabay.com). At the moment of each scene, the NCL application starts an actuator to perform a sensory effect related to that scene.
Table 1 describes the sensory effects to be synchronized when a given tag is found in the video. They range from scent effects to wind, heat, and cold effects. The effects also vary in intensity according to the scene components. One should notice that effects can be played at the same time. This occurs when two tags are found in the video at the same time. Thus both area elements related to those tags will be active and, as a consequence of the NCL links, so will the sensory effects.
Table 1: Sensory effects generated by each scene component
Tag      Sensory effects
Summer   wind 50%, heat 50%
Snow     cold 100%
Forest   forest scent 100%, wind 25%
Flower   flower scent 100%, wind 25%
Storm    wind 100%, cold 50%, air humidifier 100%
Sea      wind 50%, heat 50%, air humidifier 50%
Hot      wind 50%, heat 100%
The video was described in NCL with abstract anchors indicating the scene components of interest. They cover components present in all environments. Listing 5 presents the abstract anchor specification.
Figure 4: Sensory effects generated on a video timeline
<media id="video" src="video.mp4">
  <area tag="summer"/>
  <area tag="snow"/>
  <area tag="forest"/>
  <area tag="flower"/>
  <area tag="storm"/>
  <area tag="sea"/>
  <area tag="hot"/>
</media>
Listing 5: NCL abstract anchors for the application "environments around the world"
The behavior of the application is defined by a group of 7 link elements (one for each abstract anchor). Listing 6 presents a link specification for one of the abstract anchors.
1 <link xconnector="onBeginStartSet">
2   <bind role="onBegin" component="video" interface="summer"/>
3   <bind role="start" component="wind">
4     <bindParam name="intensity" value="50%"/>
5   </bind>
6   <bind role="start" component="heat">
7     <bindParam name="intensity" value="50%"/>
8   </bind>
9 </link>
Listing 6: NCL link specification with intensity parameters
The link presented in Listing 6 synchronizes the scene component summer with the sensory effects wind and heat. Both sensory effects are represented as media nodes in the application and correspond to Lua scripts that control the actuators responsible for each effect. The scripts have an intensity parameter whose value is defined in NCL by parameters (lines 4 and 7). The intensity is expressed as a percentage of the maximum intensity the actuator can provide.
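For illustration, the sketch below shows how such a control script might look as an NCLua media object, assuming the Ginga NCLua event module; the send_to_actuator helper is hypothetical and stands for whatever mechanism (serial port, socket, driver) actually drives the physical device.

-- Minimal sketch of an actuator-control script exposed as an NCL media node.
local intensity = 0

local function send_to_actuator(command, value)
  -- placeholder: here the script would talk to the wind (or heat) actuator
end

event.register(function(evt)
  if evt.class ~= "ncl" then return end
  if evt.type == "attribution" and evt.name == "intensity"
     and evt.action == "start" then
    -- the bindParam value (e.g. "50%") arrives as a property attribution
    intensity = tonumber(tostring(evt.value or ""):match("%d+")) or intensity
  elseif evt.type == "presentation" and evt.action == "start" then
    send_to_actuator("on", intensity)    -- anchor started: turn the effect on
  elseif evt.type == "presentation" and evt.action == "stop" then
    send_to_actuator("off", 0)           -- anchor stopped: turn the effect off
  end
end)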
The author of this application, using NCLAA, has to declare 7 abstract anchors and 7 links. The application has a total of 74 lines of code to describe its behavior.
After processing, according to the video content, the document has 45 anchor instances and also 45 link instances. The processed document has a total of 362 lines of code to perform the behavior described with the abstract anchors.
As can be seen in this example, using abstract anchors, the author had to declare around 15% of the resulting number of anchors and links and around 20% of the resulting lines of code. Moreover, without the use of AAP the author would have to not only define the anchors and links, but also carefully watch the video to recognize scene components and their timing in order to describe the anchors and their synchronization with the sensory effects. As intended, we can see a great decrease in the authoring effort with respect to manual authoring.
It is worth noting that the same code described using NCLAA is maintained even if the video length changes. Given that the abstract anchors are not directly related to the video length (and timing), but only to the scene components it contains, the application code does not have to change when the video changes. This result is also favorable to the author, as the number of anchor instances may increase with the video length.
6 CONCLUSION
This paper proposed an approach to describe multimedia applications with abstract anchors. Abstract anchors represent intervals when a given scene component is presented in the media node content. Thus, a mulsemedia application author does not need to have complete knowledge of a node's content for defining its synchronization with other content.
Such an approach is intended to be used in a mulsemedia context, where it is common to perform sensory effect synchronization in relation to audiovisual content. The approach, however, is not restricted to it and can be used for traditional multimedia application specification.
Together with the abstract anchors, the abstract anchor processor (AAP) allows the automatic generation of node anchors based on node content. It gathers information about the document and uses scene recognition software to identify the temporal information for anchors. This approach allows automatic media synchronization to be done based on video recognition.
A positive side effect of our approach is that, given that the abstract anchors are not directly related to the video length (and timing), but only to the scene components it contains, the application code does not have to change in case the video length changes.
Since the AAP processor has broad applications with different media types, a first future work is to integrate audio recognition software into it. The idea is to identify scene components, e.g., according to the background sound, and use such information for anchor instantiation. A use case could be the automatic synchronization of subtitles in NCL applications.
Another future work is to enhance AAP with the ability to infer synonyms of the words used to describe abstract anchors. The current approach for identifying scene concepts can be error-prone. Sometimes it can be difficult to guess which concepts the recognition software can handle. There are several recognition tools available and they may not follow a common standard for concept naming.
Finally, one interesting future work is to improve our approach so that it can be used for live content. AAP would have to be able to perform anchor and link instantiation at runtime. Besides, some kind of caching strategy has to be used for performing the scene recognition step. The challenge in that approach is related to Quality of Experience (QoE) preservation in multimedia applications, which may be compromised by the processing latency of some scene recognition software.
REFERENCES
[1] ABNT. 2011. Digital terrestrial television - Data coding and transmission specification for digital broadcasting - Part 2: Ginga-NCL for fixed and mobile receivers - XML application language for application coding. ABNT NBR 15606-2:2011 standard.
[2] Roberto Gerson A. Azevedo, Eduardo Cruz Araújo, Bruno Lima, Luiz Fernando G. Soares, and Marcelo F. Moreno. 2014. Composer: meeting non-functional aspects of hypermedia authoring environment. Multimedia Tools and Applications 70, 2 (2014), 1199–1228. DOI: http://dx.doi.org/10.1007/s11042-012-1216-8
[3] Diogo Henrique Duarte Bezerra, Denio Mariz Timóteo Sousa, Guido Lemos de Souza Filho, Aquiles Medeiros Filgueira Burlamaqui, and Igor Rosberg Medeiros Silva. 2012. Luar: A Language for Agile Development of NCL Templates and Documents. In Proceedings of the 18th Brazilian Symposium on Multimedia and the Web (WebMedia '12). ACM, New York, NY, USA, 395–402. DOI: http://dx.doi.org/10.1145/2382636.2382718
[4] Carolina Cruz-Neira, Daniel J. Sandin, Thomas A. DeFanti, Robert V. Kenyon, and John C. Hart. 1992. The CAVE: Audio Visual Experience Automatic Virtual Environment. Commun. ACM 35, 6 (June 1992), 64–72. DOI: http://dx.doi.org/10.1145/129888.129892
[5] Romain Deltour and Cécile Roisin. 2006. The LimSee3 multimedia authoring model. In Proceedings of the 2006 ACM Symposium on Document Engineering. ACM, 173–175.
[6] Joel André Ferreira dos Santos and Débora Christina Muchaluat Saade. 2010. XTemplate 3.0: Adding Semantics to Hypermedia Compositions and Providing Document Structure Reuse. In Proceedings of the 2010 ACM Symposium on Applied Computing (SAC '10). ACM, New York, NY, USA, 1892–1897. DOI: http://dx.doi.org/10.1145/1774088.1774490
[7] Felix Hülsmann, Nikita Mattar, and Julia Fröhlich. 2014. Simulating Wind and Warmth in Virtual Reality: Conception, Realization and Evaluation for a CAVE Environment. 11, 10 (2014).
[8] Gheorghita Ghinea and Oluwakemi A. Ademoye. 2010. Perceived synchronization of olfactory multimedia. IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans 40, 4 (2010), 657–663. DOI: http://dx.doi.org/10.1109/TSMCA.2010.2041224
[9] Gheorghita Ghinea, Christian Timmerer, Weisi Lin, and Stephen R. Gulliver. 2014. Mulsemedia: State of the Art, Perspectives, and Challenges. ACM Transactions on Multimedia Computing, Communications, and Applications 11, 1s (2014), 1–23. DOI: http://dx.doi.org/10.1145/2617994
[10] Roberto Ierusalimschy. 2006. Programming in Lua (2nd ed.). Roberto Ierusalimschy.
[11] ITU. 2009. Nested Context Language (NCL) and Ginga-NCL for IPTV services. http://www.itu.int/rec/T-REC-H.761-200904-S. ITU-T Recommendation H.761.
[12] Alejandro Jaimes and Nicu Sebe. 2007. Multimodal human–computer interaction: A survey. Computer Vision and Image Understanding 108, 1–2 (2007), 116–134. DOI: http://dx.doi.org/10.1016/j.cviu.2006.10.019. Special Issue on Vision for Human-Computer Interaction.
[13] Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, and Li Fei-Fei. 2014. Large-Scale Video Classification with Convolutional Neural Networks. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR '14). IEEE Computer Society, Washington, DC, USA, 1725–1732. DOI: http://dx.doi.org/10.1109/CVPR.2014.223
[14] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. 1989. Backpropagation Applied to Handwritten Zip Code Recognition. Neural Comput. 1, 4 (Dec. 1989), 541–551. DOI: http://dx.doi.org/10.1162/neco.1989.1.4.541
[15] D. C. Muchaluat-Saade and L. F. G. Soares. 2002. XConnector & XTemplate: Improving the Expressiveness and Reuse in Web Authoring Languages. The New Review of Hypermedia and Multimedia Journal 8, 1 (2002), 139–169.
[16] Carlos de Salles Soares Neto, Luiz Fernando Gomes Soares, and Clarisse Sieckenius de Souza. 2012. TAL - Template Authoring Language. Journal of the Brazilian Computer Society 18, 3 (2012), 185–199. DOI: http://dx.doi.org/10.1007/s13173-012-0073-7
[17] Joe Yue-Hei Ng, Matthew J. Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, and George Toderici. 2015. Beyond Short Snippets: Deep Networks for Video Classification. CoRR abs/1503.08909 (2015). http://arxiv.org/abs/1503.08909
[18] Sharon Oviatt. 2003. The Human-Computer Interaction Handbook. L. Erlbaum Associates Inc., Hillsdale, NJ, USA, Chapter Multimodal Interfaces, 286–304. http://dl.acm.org/citation.cfm?id=772072.772093
[19] Douglas Paulo de Mattos, Júlia Varanda da Silva, and Débora Christina Muchaluat-Saade. 2013. NEXT: graphical editor for authoring NCL documents supporting composite templates. In Proceedings of the 11th European Conference on Interactive TV and Video. ACM, 89–98.
[20] A. Scherp and S. Boll. 2005. Context-driven Smart Authoring of Multimedia Content with xSMART. In 13th ACM Multimedia.
[21] Y. Sulema. 2016. Mulsemedia vs. Multimedia: State of the art and future trends. In 2016 International Conference on Systems, Signals and Image Processing (IWSSIP). 1–5. DOI: http://dx.doi.org/10.1109/IWSSIP.2016.7502696
[22] Christian Timmerer, Markus Waltl, Benjamin Rainer, and Hermann Hellwagner. 2012. Assessing the quality of sensory experience for multimedia presentations. Signal Processing: Image Communication 27, 8 (2012), 909–916. DOI: http://dx.doi.org/10.1016/j.image.2012.01.016
[23] W3C. 2008. Synchronized Multimedia Integration Language - SMIL 3.0 Specification. http://www.w3c.org/TR/SMIL3. World Wide Web Consortium Recommendation.
[24] Markus Waltl, Benjamin Rainer, Christian Timmerer, and Hermann Hellwagner. 2013. An end-to-end tool chain for Sensory Experience based on MPEG-V. Signal Processing: Image Communication 28, 2 (2013), 136–150. DOI: http://dx.doi.org/10.1016/j.image.2012.10.009
[25] K. Yoon, B. Choi, E. S. Lee, and T. B. Lim. 2010. 4-D broadcasting with MPEG-V. In 2010 IEEE International Workshop on Multimedia Signal Processing. 257–262. DOI: http://dx.doi.org/10.1109/MMSP.2010.5662029
[26] Kyoungro Yoon, Sang-Kyun Kim, Jae Joon Han, Seungju Han, and Marius Preda. 2015. MPEG-V: Bridging the Virtual and Real World (1st ed.). Academic Press.
[27] Zhenhui Yuan, Shengyang Chen, Gheorghita Ghinea, and Gabriel-Miro Muntean. 2014. User Quality of Experience of Mulsemedia Applications. ACM Transactions on Multimedia Computing, Communications, and Applications 11, 1s (2014), 1–19. DOI: http://dx.doi.org/10.1145/2661329
[28] Matthew D. Zeiler and Rob Fergus. 2013. Visualizing and Understanding Convolutional Networks. CoRR abs/1311.2901 (2013). http://arxiv.org/abs/1311.2901
... This is a very costly activity in terms of effort and time, besides being error-prone. Thus, accelerating and simplifying the authoring process is paramount to encourage community adoption of such applications [2]. ...
... These tools provide a sophisticated graphical editing interface for synchronizing a set of media objects with sensory effects. However, they still require a long and relatively complex authoring process, since the aforementioned "media content inspection" is still necessary [2]. ...
... 1 In the current implementation of STEVEML, a cloudbased DNN service for video recognition was used. 2 The chosen neural network API provides a free plan that allows the recognition of 5,000 seconds of video per month. It used the general recognition model which can return over 11,000 different labels. ...
... Para criar uma aplicação mulsemídia é necessário um esforço de autoria para realizar a sincronização de efeitos sensoriais com o conteúdo audiovisual [2]. Isto é, um autor de tais aplicações deve cuidadosamente inspecionar o conteúdo de mídia para identificar e marcar os momentos de início e fim de um dado efeito sensorial. ...
... Isto é, um autor de tais aplicações deve cuidadosamente inspecionar o conteúdo de mídia para identificar e marcar os momentos de início e fim de um dado efeito sensorial. Este processo de autoria manual é custoso e pode induzir a erros [2]. Sendo assim, uma forma de incentivar a autoria de aplicações mulsemídia é diminuir a carga de autoria manual, em especial utilizando sistemas inteligentes que possam automatizar o processo de autoria de efeitos sensoriais. ...
... No contexto de aplicações mulsemídia, grande parte do esforço de autoria está na sincronização temporal dos efeitos sensoriais com conteúdos audiovisuais [2]. Portanto, uma forma fundamental de incentivar autores a criarem aplicações mulsemídia é utilizar ferramentas de autoria que permitem a definição e sincronização de efeitos sensoriais graficamente. ...
Conference Paper
Full-text available
Synchronization of sensory effects with multimedia content is a non-trivial and error-prone task that can discourage authoring of mulsemedia applications. Although there are authoring tools that assist in the specification of sensory effect metadata in an automated way, the forms of analysis used by them are not general enough to identify complex components that may be related to sensory effects. In this work, we present an intelligent component, which allows the semi-automatic definition of sensory effects. This component uses a neural network to extract information from video scenes. This information is used to set sensory effects synchronously to related videos. The proposed component was implemented in STEVE 2.0 authoring tool, helping the authoring of sensory effects in a graphical interface.
... In [9], we define a scene component as, a given element (rock, tree, dog, person, etc.) or concept (happy, crowded, dark, etc.) that appears in the content of a media object. In the application example we presented earlier, scene components may refer to the sun, the beach, trees, flowers, and other elements that may appear in the Rio de Janeiro's sights presented in the touristic program. ...
... In previous work [9], [10], we tackle the problem of automatically recognizing scene components in audiovisual objects, in order to assist the realization of the synchronization task in mulsemedia applications. We proposed an architecture capable of identifying the presence of scene components in video and audio objects and defining, in a semi-supervised manner, the synchronization among sensory effects and an application main video and/or audio. ...
Preprint
Full-text available
In mulsemedia applications, traditional media content (text, image, audio, video, etc.) can be related to media objects that target other human senses (e.g., smell, haptics, taste). Such applications aim at bridging the virtual and real worlds through sensors and actuators. Actuators are responsible for the execution of sensory effects (e.g., wind, heat, light), which produce sensory stimulations on the users. In these applications sensory stimulation must happen in a timely manner regarding the other traditional media content being presented. For example, at the moment in which an explosion is presented in the audiovisual content, it may be adequate to activate actuators that produce heat and light. It is common to use some declarative multimedia authoring language to relate the timestamp in which each media object is to be presented to the execution of some sensory effect. One problem in this setting is that the synchronization of media objects and sensory effects is done manually by the author(s) of the application, a process which is time-consuming and error prone. In this paper, we present a bimodal neural network architecture to assist the synchronization task in mulsemedia applications. Our approach is based on the idea that audio and video signals can be used simultaneously to identify the timestamps in which some sensory effect should be executed. Our learning architecture combines audio and video signals for the prediction of scene components. For evaluation purposes, we construct a dataset based on Google's AudioSet. We provide experiments to validate our bimodal architecture. Our results show that the bimodal approach produces better results when compared to several variants of unimodal architectures.
... The insufficiency of knowledge and guidelines on how to benefit from the computer-aided tools requires a culture building in higher education (HE) systems [15], [16]. The challenge becomes more critical when it is compulsory to shift to partially or completely online learning platforms [17]. ...
Article
One-dimensional (1-D) demonstrations, e.g., the black-box systems, have become popular in teaching materials for engineering modules due to the high complexity of the system's multidimensional (e.g., 2-D and 3-D) identities. The need for multidimensional explanations on how multiphysics equations and systems work is vital for engineering students, whose learning experience must gain a cognitive process understanding for utilizing such multiphysics-focused equations into a pragmatic dimension. The lack of knowledge and expertise in creating animations for visualizing sequent processes and operations in academia can result in an ineffective learning experience for engineering students. This study explores the benefits of animation, which can eventually improve the teaching and student learning experiences. In this article, the use of computer-aided animation tools is evaluated based on their capabilities. Based on their strengths and weaknesses, the study offered some insights for selecting the investigated tools. To verify the effectiveness of animations in teaching and learning, a survey was conducted for undergraduate and postgraduate cohorts and automotive engineering academics. Based on the survey's data, some analytics and discussion have offered more quantitative results. The historic data (2012-2020) analysis has validated the animations efficacy as achievements of the study, where the average mark of both modules has significantly improved, with the reduced rate of failure.
... Such elements may have a virtual anchor, called RecognitionAnchor, that triggers a recognition event when an expected interaction is recognized from the input device. Abreu and Santos [16] propose the AbstractAnchor, which is an anchor type that represents parts of a content node where concepts are detected. Then, during the document parsing, the processor analyses all the media and create the timestamps relative to the time interval where expected concepts are recognized in each media. ...
Article
Full-text available
Recently the Brazilian DTV system standards have been upgraded, called TV 2.5, in order to provide a better integration between broadcast and broadband services. The next Brazilian DTV system evolution, called TV 3.0, will address more deeply this convergence of TV systems not only at low-level network layers but also at the application layer. One of the new features to be addressed by this future application layer is the use of Artificial Intelligence technologies. Recently, there have been practical applications using Artificial Intelligence (AI) deployed to improve TV production efficiency and correlated cost reduction. The success in operationalize and evaluate these applications is a strong indication of the interest and relevance of AI in TV. This paper presents TeleMídia Lab’s future vision on interactive and intelligent TV Systems, with particular focus on edge AI. Edge AI means use in-device capabilities to run AI applications instead of running them in cloud.
... The introduction of multimedia technology into the classroom is an important part of the modernization of education [1]. As a means of teaching organization, multimedia teaching uses multimedia to process text, image, sound, animation, and other information to form visualized teachings of sound, image, picture, and text [2]. It not only stimulates students' interest in learning, but also helps students understand and master the content of art teaching. ...
Article
Full-text available
Symmetries play a vital role in multimedia-aided art teaching activities. The relevant teaching systems designed with a social network, including the optimized teaching methods, are on the basis of symmetry principles. In order to study art teaching, from the perspective of the teaching organization form, combined with the survey method, multimedia-aided art classroom teaching was explained in detail. Based on the symmetrical thinking in art teaching, the multimedia-aided teaching mode of art classroom was discussed. The reasons for the misunderstanding of multimedia-aided art teaching were analyzed, and the core factors affecting the use of multimedia art teaching were found. In art teaching, more real pictures were shown aided by multimedia; students could experience the beauty of symmetrical things in real life and were guided to find the artistic characteristics of these kinds of graphics, analyze them, and summarize them. The results showed that this method enriched the art multimedia teaching theory and improved the efficiency of art teaching. The blind use of multimedia technology by teachers in art classroom teaching was avoided. Therefore, the method can develop individualized teaching, develop students’ potential, and cultivate innovative consciousness and practical ability.
... Abreu and Santos [3] propose the AbstractAnchor, which is an anchor type that represents parts of a content node where concepts are detected. They implement an Abstract Anchor Processor (AAP) that use an API of image classification to analyze video frames. ...
Chapter
Full-text available
Deep learning research has allowed significant advances in several areas of multimedia, especially in tasks related to speech processing, hearing, and computational vision. Particularly, recent usage scenarios in hypermedia domain already use such deep learning tasks to build applications that are sensitive to its media content semantics. However, the development of such scenarios is usually done from scratch. In particular, current hypermedia standards such as HTML do not fully support such kind of development. To support such development, we propose that a hypermedia language should be extended to support: (1) describe learning using structured media datasets; (2) recognize content semantics of the media elements in presentation time; (3) use the recognized semantics elements as events in during the multimedia. To illustrate our approach, we extended the NCL language, and its model NCM, to support such features. NCL (Nested Context Language) is the declarative language for developing interactive applications for Brazilian Digital TV and an ITU-T Recommendation for IPTV services. As a result of the work, it is presented a usage scenario to highlight how the extended NCL supports the development of content-aware hypermedia presentations, attesting the expressiveness and applicability of the model.
Chapter
The Fog of Things (FoT) proposes a paradigm that uses the Fog Computing concept to deploy Internet of Things (IoT) applications. The FoT exploits the processing, storage, and network capacity of local resources, allowing the integration of different devices into a seamless IoT architecture, and it defines the components that compose the FoT paradigm, describing their characteristics. This chapter presents the FoT paradigm and relates it to the IoT architecture, describing the main characteristics and concepts, from sensor and actuator communication to gateways and local and cloud servers. Lastly, this chapter presents the SOFT-IoT platform as a concrete implementation of FoT, which uses a microservice infrastructure distributed across devices in the IoT system.
Chapter
Model-driven Engineering (MDE) is an approach that considers models the main artifacts in software development. Models are generally built using domain-specific languages, such as UML and XML, which are defined by their own metamodels. In this context, this chapter presents the basics of MDE as well as the key frameworks and languages available to support it, providing the necessary background for setting up an environment in which models can be built in accordance with a particular metamodel. Models built in this environment can then be used to document and maintain systems from different domains.
Article
Full-text available
User Quality of Experience (QoE) is of fundamental importance in multimedia applications and has been extensively studied for decades. However, user QoE in the context of emerging multiple-sensorial media (mulsemedia) services, which involve media components different from those of traditional multimedia applications, has not been comprehensively studied. This article presents the results of subjective tests that investigated user perception of mulsemedia content. In particular, the impact of the intensity of certain mulsemedia components, including haptic and airflow effects, on user-perceived experience is studied. Results demonstrate that by making use of mulsemedia the overall user enjoyment levels increased by up to 77%.
Article
Full-text available
Mulsemedia – multiple sensorial media – captures a wide variety of research efforts and applications. This paper presents a historical perspective on mulsemedia work and reviews current developments in the area. These take place across the traditional multimedia spectrum – from virtual reality applications to computer games – as well as in efforts in the arts, gastronomy, and therapy, to mention a few. We also describe standardization efforts, via the MPEG-V standard, and identify future developments and exciting challenges the community needs to overcome.
Conference Paper
Convolutional Neural Networks (CNNs) have been established as a powerful class of models for image recognition problems. Encouraged by these results, we provide an extensive empirical evaluation of CNNs on large-scale video classification using a new dataset of 1 million YouTube videos belonging to 487 classes. We study multiple approaches for extending the connectivity of a CNN in the time domain to take advantage of local spatio-temporal information, and suggest a multiresolution, foveated architecture as a promising way of speeding up training. Our best spatio-temporal networks display significant performance improvements compared to strong feature-based baselines (55.3% to 63.9%), but only a surprisingly modest improvement compared to single-frame models (59.3% to 60.9%). We further study the generalization performance of our best model by retraining the top layers on the UCF-101 Action Recognition dataset and observe significant performance improvements compared to the UCF-101 baseline model (63.3% up from 43.9%).
Article
Convolutional neural networks (CNNs) have been extensively applied to image recognition problems, giving state-of-the-art results on recognition, detection, segmentation, and retrieval. In this work we propose and evaluate several deep neural network architectures to combine image information across a video over longer time periods than previously attempted. We propose two methods capable of handling full-length videos. The first method explores various convolutional temporal feature pooling architectures, examining the design choices that need to be made when adapting a CNN for this task. The second method explicitly models the video as an ordered sequence of frames, employing a recurrent neural network with Long Short-Term Memory (LSTM) cells connected to the output of the underlying CNN. Our best networks exhibit significant performance improvements over previously published results on the Sports-1M dataset (73.1% vs. 60.9%) and on the UCF-101 dataset both with (88.6% vs. 87.9%) and without (82.6% vs. 72.8%) additional optical flow information.
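For intuition on the first of those two methods, the snippet below gives a deliberately simplified, hypothetical illustration of temporal pooling in Lua: per-frame class scores (as a CNN might produce) are averaged into a single video-level prediction. The cited work evaluates far richer pooling architectures and an LSTM variant; the class names and scores here are invented.

    -- Averages per-frame class scores into one video-level score table.
    local function pool_scores(frame_scores)
      local pooled, n = {}, #frame_scores
      for _, scores in ipairs(frame_scores) do
        for class, s in pairs(scores) do
          pooled[class] = (pooled[class] or 0) + s / n
        end
      end
      return pooled
    end

    -- Returns the class with the highest pooled score.
    local function argmax(scores)
      local best_class, best = nil, -math.huge
      for class, s in pairs(scores) do
        if s > best then best_class, best = class, s end
      end
      return best_class, best
    end

    -- Invented per-frame scores for a three-frame clip.
    local frame_scores = {
      { soccer = 0.7, tennis = 0.3 },
      { soccer = 0.6, tennis = 0.4 },
      { soccer = 0.8, tennis = 0.2 },
    }
    print(argmax(pool_scores(frame_scores)))  -- soccer, approximately 0.7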
Conference Paper
Using the Ginga-NCL middleware, interactive multimedia applications for the Brazilian digital TV system are written in NCL (Nested Context Language). Although programming skills are not required when using a declarative authoring language, authors still need at least a basic knowledge of the language in order to develop an application. Aiming at facilitating and spreading the use of NCL, this paper presents a graphical editor that allows authors with no knowledge of the language to develop NCL documents. The proposed editor is called NEXT (NCL Editor Supporting XTemplate). To provide that facility, the editor uses hypermedia composite templates, which represent generic structures for NCL programs and are specified in the XTemplate 3.0 language. In addition, NEXT offers other functionalities, such as creating and editing NCL documents in different views, which facilitate the development of digital TV applications. Those functionalities are provided as a set of plugins, which makes the tool extensible and adaptable to different author skills.
Conference Paper
In the development of applications described in NCL, we have observed the reuse of certain models and document structures, achieved by repeating common code across applications. Thus, we see the need to generalize this kind of NCL development, a need that has also been observed by other developers who aim at reusing the structure of existing documents. This paper introduces Luar, an authoring language for NCL templates. The Luar language was conceived through the analysis of the behavior of iDTV applications. Luar has a template processor developed in the Lua language and a library to maintain and aggregate template collections, sharing them among developers. The entire template system aims to facilitate the design and development of interactive applications described in NCL through reuse.