Automatic Linguistic Reporting in Driving Simulation
Environments
Luka Eciolaza^a, M. Pereira-Fariña^b, Gracian Trivino^a
a European Centre for Soft Computing, Gonzalo Gutierrez Quiros s/n, 33600 Mieres, Asturias, Spain
b Centro de Investigación en Tecnoloxías da Información (CITIUS), Universidade de Santiago de Compostela, Galicia, Spain
Abstract
Linguistic data summarization targets the description of patterns emerging in data
by means of linguistic expressions. Just as human beings do, computers can use
natural language to represent and fuse heterogeneous data in a multi criteria de-
cision making environment. Linguistic data description is particularly well suited
for applications in which there is a necessity of understanding data at different
levels of expertise or human-computer interaction is involved. In this paper, an
application for the linguistic descriptions of driving activity in a simulation en-
vironment has been developed. In order to ensure safe driving practices, all new
onboard devices in transportation systems need to be evaluated. Work performed
in this application paper will be used for the automatic evaluation of onboard de-
vices. Based on Fuzzy Logic, and as a contribution to Computational Theory of
Perceptions, the proposed solution is part of our research on granular linguistic
models of phenomena. The application generates a set of valid sentences describ-
ing the quality of driving. Then a relevancy analysis is performed in order to
compile the most representative and suitable statements in a final report. Real
time-series data from a vehicle simulator have been used to evaluate the perfor-
mance of the presented application in the framework of a real project.
Keywords: Computational theory of perceptions, linguistic description of data,
driving simulators
Corresponding author
Email addresses: luka.eciolaza@softcomputing.es (Luka Eciolaza), martin.pereira@usc.es (M. Pereira-Fariña), gracian.trivino@softcomputing.es (Gracian Trivino)
Preprint submitted to Applied Soft Computing July 4, 2012
1. Introduction
Linguistic summarization represents an attempt to describe patterns emerging in data by means of linguistic expressions. It is intended, in general, for applications in which there is a strong human-machine interaction involving accessing and understanding data, such as supervision and control processes. Linguistic summarization is a data mining or knowledge discovery approach that provides an efficient and human-friendly framework for the analysis of data within a decision support environment.
There is a growing necessity for computational systems capable of provid-
ing linguistic descriptions of phenomena. New technologies allow acquiring and
archiving vast volumes of data about time-evolving phenomena and the amount of
information collected in different fields is overwhelming and ever growing. How-
ever, there is a lack of tools and means for processing and interpreting all this
information using computers. In order to be useful, this information must be ex-
plained in an understandable way, including facts that may be derived from the
data and the background knowledge available about the phenomenon under study.
Human specialists describe their perceptions using natural language (NL). NL al-
lows experts to make imprecise representations that summarize their perceptions of complex phenomena, choosing the most adequate degree of granularity in each circumstance to highlight the relevant aspects and to hide the irrelevant ones.
In our opinion, data should not be simply made accessible or summarized as
graphics, tables and simple linguistic variables, but the needed interpretation ar-
guments and conclusions should be provided and explained using NL. Nowadays,
any organization of data as provided by a computer, either in a numerical, categor-
ical, and/or graphical form, is just a tool that can be employed by human experts
to produce an explanation in NL. Understandable linguistic descriptions of phe-
nomena are provided by human experts, computers just being a tool for storing
and accessing data in a very flexible way. This is clearly a problem as the ratio
data/human experts is growing dramatically, since it is becoming easier and easier
to collect data, but providing a human being with expertise on a certain area re-
mains difficult and expensive. On the other hand, assessment by observation does
not meet the validity and reliability criteria necessary for any objective evaluation.
Generated expert reports should be based on unified criteria and be independent of the subjective variability of different experts [1] [2].
In summary, the formulation of data summaries can be seen as a very complex
and non-trivial data mining task, and there is a clear need for computational sys-
tems able to produce automatic linguistic descriptions of data about phenomena.
In this paper, we present our approach to implement a computational system
capable of providing linguistic descriptions of driving activity in simulation en-
vironments. The growth of vehicle onboard devices has dramatically increased drivers' attention to secondary tasks other than driving. There are many studies [3] [4] [5] analyzing this problem that demonstrate how the integration of onboard systems can drastically increase the risk of committing mistakes while driving, potentially resulting in traffic accidents. Therefore, there is a clear need to evaluate
the onboard systems in order to ensure safe driving practices.
Driving simulators are interesting tools to perform this evaluation. Simula-
tors can provide analysts with various information related to the driver-device
interaction in a variety of situations. Human factor experts are able to assess the
performance and effectiveness of the onboard devices through a variety of prede-
fined experiments and obtain a vast amount of data. Unfortunately, handling this huge amount of information is not easy, and performing a series of repetitive experiments can be very tedious.
Our research line belongs to the field of Fuzzy Logic, where for several years the scientific community has dealt with the problem of obtaining automatic linguistic summaries of all sorts of data [6] [7] [8] [9]. Our approach is based on the Computational Theory of Perceptions (CTP) [10] [11]. CTP provides a framework to develop computational systems with the capacity of computing with the meaning of NL expressions, i.e., with the capacity of computing with imprecise descriptions of phenomena in a similar way to how humans do it. In this paper we build on the concept of the Granular Linguistic Model of a Phenomenon (GLMP) [12] [13]. Our long-term project deals with generating more complex and useful linguistic descriptions than those obtained using existing technology.
State-of-the-art automatic linguistic description techniques in modern training systems based on fuzzy logic [2] consider expert knowledge recorded as simple linguistic variables. In our approach, we generate complete linguistic summary reports through the aggregation of all the information related to the analyzed phenomena. On the other hand, works in the area of Natural Language Generation systems [14] contemplate the generation of complete summary reports; however, they do not consider the quantification of validity degrees for the generated sentences, and therefore it is difficult for them to deal with sensor input data and uncertain information.
The work presented in this article builds upon previous works presented in [12] and [13]. This work represents the last step of a series of prototypes of increasing complexity. Each prototype addressed a different set of problems on the way to the final goal of automating the onboard device evaluation process. Thus, in [12] the quality of driving during selected simulation segments was summarized into a set of fixed sentences that were later compared with questionnaires filled in by experts. In the prototype presented in [13] the generated reports were sequential descriptions of events within dynamic or time-evolving data, and there was an initial stage of validity analysis performed by the GLMP followed by a relevancy analysis to select the appropriate sentences for the final report.
The application presented here is characterized as follows: (a) it is focused on the detection of risk events during the driving activity; (b) it compares intervals of interaction and non-interaction between the driver and the onboard systems; and (c) it generates a linguistic report describing the overall quality of driving and determining the distraction potential of the analyzed onboard systems. The report includes hypotheses about the relation between the detected risk events and the distraction levels, all based on the input/output signals of the simulator.
Once calibrated, the automatically generated document should complement or
replace the evaluation reports generated by human experts.
In this paper, we present the complete linguistic model and report template that describe the interaction between the driver and the onboard devices in a vehicle. We measure the potential distraction levels following the requirements defined within the HITO project; for this purpose, an index of potential distraction is defined. As for the quality assessment of the generated text, we present our approach to evaluating the reports' usability from the point of view of human consumption, identifying different indicators that define the quality of a linguistic report.
Section 2 provides a brief review of the state of the art in the different subjects involved. Section 3 introduces the concept of the Granular Linguistic Model of a Phenomenon. The definitions related to its various components are included, and the process of sentence generation with the respective validity degrees is described. Section 4 describes the report generation task through the compilation of the most relevant sentences. Section 5 explains in detail the different parameters involved in the application; it shows the practical implementation of the GLMP for the report generation, and a definition for the computation of the Index of Potential Distraction is proposed. Sections 6 and 7 describe the experimentation and the validation of the obtained reports, respectively. Finally, Section 8 contains the conclusions.
Figure 1: CABINTEC: The simulator for Intelligent Transportation Systems.
2. Background Information
This section presents a brief review of the various domains involved in this work. Information about state-of-the-art practices within these domains will help the reader better understand the extent and potential of the paper.
2.1. Intelligent Transportation Systems and HITO
Road transportation is changing rapidly due to technological evolution. The field of Intelligent Transportation Systems (ITS) is focused on the minimization of traffic congestion, the optimization of traffic management, and the improvement of road safety in general.
The increasing capability to acquire, archive and share huge amounts of heterogeneous information allows the development of a large number of vehicle onboard systems aimed at improving the overall driving experience, with applications for driving assistance, entertainment or information systems. Apart from traffic management and road safety applications, in the last decade technologies to monitor and control driver conditions, such as fatigue or stress, have attracted increasing interest, and there are numerous examples of these devices on the market [15].
As stated above, the growth of vehicle onboard devices implies an increased risk of distractions, potentially resulting in mistakes while driving. Vehicle onboard systems must be designed with the priority of ensuring and meeting safety requirements, focusing on clear and intuitive interfaces. Therefore, there is a clear need to evaluate the onboard systems in order to ensure safe driving practices.
Driver distraction is an important subject of study, both for research investigating human multitasking abilities and for practical purposes in developing and constraining new onboard devices. See, for example, the research performed in [16] [17], focused on the modeling and prediction of driver distraction. The analysis of driving performance data, such as vehicle deviation from the lane or speed control with respect to an accelerating and braking lead vehicle, can be used to accurately model driver distraction. These works quantify how the vehicle's deviation from the lane center increases during periods of inattention and how the vehicle returns to the lane center during periods of active steering.
This paper is part of a multidisciplinary long-term project with the aim of developing an "Intelligent cabin for road transportation of goods and passengers". Representative parameters of the vehicle, the environment and driver behavior are collected and evaluated in order to know, depending on the nature and criticality of the information, WHAT, HOW, WHEN and WHERE the onboard systems' information should be placed in the driving desk within the vehicle. The project aims to design a safe, usability-oriented and scalable framework for the integration of new onboard devices.
In particular, HITO (Human Interface by Technology Observation) [18] aims
to develop an effective methodology and framework to evaluate onboard devices
used in road transport in order to optimize reliability levels and to guarantee an er-
gonomic and safe workplace design. For this, HITO proposed the development of
a methodology to measure potential distraction levels caused by onboard systems.
This methodology consists of a series of experiments or exercises in a simulation environment designed by experts. Initially, the exercises will be analyzed and reported manually by a team of human factors researchers for the evaluation of each device. Finally, the project will also address the essential task of automating the developed methodology through a software application.
Modern training systems in different areas are now incorporating automated evaluation functionality. Evaluations are based on expert knowledge, and the metrics used are typically recorded as linguistic variables that can take values such as low, medium, high or other comparable terms. The work presented by Riojas et al. [2] is an interesting and up-to-date reference on applying Fuzzy Logic in this field. In our approach, we aim to generate complete linguistic summary reports and not only simple linguistic variables describing the simulation activity.
2.2. Computational Theory of Perceptions
CTP was introduced in Zadeh's seminal paper "From computing with numbers to computing with words - from manipulation of measurements to manipulation of perceptions" [10] and further developed in subsequent papers [11] [19] [20]. CTP provides a framework to implement computational systems with the capacity of computing with the meaning of NL expressions, i.e., with the capacity of computing with imprecise descriptions of the world in a similar way to how humans do it. According to CTP, our perception of the world is granular. A granule underlies the concept of a linguistic variable [19]. A linguistic variable is a variable whose values are linguistic labels, i.e., words or sentences in NL [21]. In this approach, a fuzzy linguistic label can be viewed as a linguistic summary of numerical data, e.g., a set of temperature values is labeled as Medium. The definition of the linguistic label includes the concept of "degree of validity" to describe each element in the set [21].
2.3. Linguistic Summarization of Data
Linguistic summarization [6] [8] [22] is closely related to CTP and hence to the Granular Computing paradigm. According to [23], information granules are conceptual entities which emerge from the needs of humans in a continuous quest for abstraction and summarization of information. Linguistic summarization seeks to process complex information and describe emerging patterns through linguistic expressions and the manipulation of information granules in the form of words. Information granulation in the form of NL is used to describe and understand complex phenomena at different levels of resolution or scales. Similar to human reasoning, NL representation is used for multimodal data fusion.
The idea of linguistic fuzzy quantifiers was introduced by Zadeh in [22]. The
concept of linguistic fuzzy summary was introduced in [6] and further developed
in [8]. A fuzzy linguistic summary is a set of sentences which express knowl-
edge about a situation through the use of fuzzy linguistic summarizers and fuzzy
linguistic quantifiers.
The basic concept of a fuzzy linguistic summary has the general form of a quantified fuzzy proposition [6] [22]: (w, "Q objects in database are S"), where Q is called the quantifier, S is the qualifier, also called the summarizer, and w is the degree of validity of the linguistic clause for representing the meaning in the specific context. For example: (0.7, "sometimes the driving quality has been low"). In recent years, this basic concept has been developed in different ways [24] and used for different applications, e.g., data mining [7], database querying [8], and the description of temporal series [25]. See [26] for a review of the state of research in this field.
In recent years, researchers in the field of computing with words and perceptions have developed an important set of resources to represent the meaning of perceptions for making decisions in specific applications [27] [28]. The basic fuzzy linguistic summary covers a very small part of the possibilities of meaning in NL [29] [30] [31]. Each application presents particular challenges for the computation of the validity degrees of the different types of linguistic expressions.
3. Granular Linguistic Model of a Phenomenon (GLMP)
In this section, we introduce the components of the GLMP, our approach based on CTP for developing computational systems able to generate linguistic descriptions of data. In our research line, one of the contributions of this paper consists of identifying three types of Computational Perceptions (CP), namely Assertive, Derivative and Integrative, which are extensively used here to linguistically model the evolution of phenomena in time.
3.1. Computational Perception
A CP is the computational model of a unit of information acquired by the designer about the phenomenon to be modeled. In general, CPs correspond with specific parts of the phenomenon at certain degrees of granularity. A CP is a couple (A, W) where:

A = (a_1, a_2, ..., a_n) is a vector of linguistic expressions (words or sentences in NL) that represents the whole linguistic domain of the CP. Each a_i describes the value of the CP in each situation with a specific degree of granularity. These sentences can be either simple, e.g., a_i = "The vehicle speed is high", or more complex, e.g., a_i = "During interaction sometimes the manoeuvre execution has been bad".

W = (w_1, w_2, ..., w_n) is a vector of validity degrees w_i ∈ [0, 1] assigned to each a_i in the specific context. w_i is the degree in which a_i is valid to describe a situation.
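To make this definition concrete, the following minimal Python sketch (ours, not taken from the paper's implementation) represents a CP as the couple (A, W); the expressions and validity degrees are purely illustrative.

from dataclasses import dataclass
from typing import List

@dataclass
class CP:
    """Computational Perception: a couple (A, W) of linguistic expressions
    and their validity degrees in [0, 1]."""
    A: List[str]    # linguistic expressions a_1 ... a_n
    W: List[float]  # validity degrees w_1 ... w_n

    def best(self) -> str:
        """Return the most valid expression together with its degree."""
        i = max(range(len(self.W)), key=lambda j: self.W[j])
        return f'({self.W[i]:.2f}, "{self.A[i]}")'

# Illustrative CP about the vehicle speed at one time instant
speed_cp = CP(
    A=["The vehicle speed is Null", "The vehicle speed is Slow",
       "The vehicle speed is Medium", "The vehicle speed is Fast"],
    W=[0.0, 0.2, 0.8, 0.0],
)
print(speed_cp.best())  # -> (0.80, "The vehicle speed is Medium")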
3.2. Perception Mapping (PM)
We use PMs to create and aggregate CPs. There are many types of PMs and this paper explores several of them. A PM is a tuple (U, y, g, T) where:

U is a vector of input CPs, U = (u_1, u_2, ..., u_n), where u_i = (A_{u_i}, W_{u_i}). In the special case of first-order Perception Mappings (1PMs), these are the inputs to the GLMP and they are values z ∈ R either provided by a physical sensor or obtained from a database.

y is the output CP, y = (A_y, W_y) = {(a_1, w_1), (a_2, w_2), ..., (a_{n_y}, w_{n_y})}.

g is an aggregation function employed to calculate the vector of validity degrees assigned to each element in y, W_y = (w_1, w_2, ..., w_{n_y}). It implements the aggregation of input vectors, W_y = g(W_{u_1}, W_{u_2}, ..., W_{u_n}), where W_{u_i} are the degrees of validity of the input perceptions. In Fuzzy Logic many different types of aggregation functions have been developed; for example, g could be implemented using a set of fuzzy rules. In the case of 1PMs, g is built using a set of membership functions as follows:

W_y = (\mu_{a_1}(z), \mu_{a_2}(z), \ldots, \mu_{a_{n_y}}(z)) = (w_1, w_2, \ldots, w_{n_y})

where W_y is the vector of degrees of validity assigned to each a_i, and z is the input data.

T is a text generation algorithm that allows generating the sentences in A_y. In simple cases, T is a linguistic template, e.g., "The temperature in the room is {high | medium | low}".

Figure 2: Example of GLMP.
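As an illustration of a first-order PM, the sketch below (ours) fuzzifies a numerical input z with membership functions and verbalizes it with the temperature template mentioned above; the trapezoidal breakpoints are assumed values chosen only for the example.

from typing import Callable, Dict, List, Tuple

def trapezoid(a: float, b: float, c: float, d: float) -> Callable[[float], float]:
    """Trapezoidal membership function with support [a, d] and core [b, c]."""
    def mu(z: float) -> float:
        if z <= a or z >= d:
            return 0.0
        if b <= z <= c:
            return 1.0
        return (z - a) / (b - a) if z < b else (d - z) / (d - c)
    return mu

class FirstOrderPM:
    """1PM: the input is a raw value z, g is a set of membership functions,
    and T is a linguistic template."""
    def __init__(self, template: str, mus: Dict[str, Callable[[float], float]]):
        self.template = template
        self.mus = mus

    def __call__(self, z: float) -> List[Tuple[str, float]]:
        # W_y = (mu_a1(z), ..., mu_any(z)); A_y is produced by the template T
        return [(self.template.format(label), mu(z)) for label, mu in self.mus.items()]

# Assumed breakpoints (degrees Celsius), only for illustration
room_pm = FirstOrderPM(
    "The temperature in the room is {}",
    {"low": trapezoid(-20, -10, 15, 20),
     "medium": trapezoid(15, 20, 24, 28),
     "high": trapezoid(24, 28, 50, 60)},
)
print(room_pm(26.0))  # partial validity of "medium" and "high"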
3.3. Structure of the GLMP
The GLMP consists of a network of PMs. Each PM receives a set of input
CPs and transmits upwards a CP. We say that each output CP is explained by the
PM using a set of input CPs. In the network, each CP covers specific aspects of
the phenomenon with certain degree of granularity. Fig. 2 shows an example of
GLMP. In this example, at every point in time, the phenomenon can be described
at a very basic level in terms of three variables z_1, z_2, and z_3. These variables are verbalized through the 1PMs {p^1_1, p^1_2, p^1_3}.
Using different aggregation functions and linguistic expressions, the GLMP paradigm allows the designer to computationally model his/her perceptions. In the case of Fig. 2, from the outputs of the 1PMs, two other higher-level descriptions of the phenomenon are derived. These descriptions are given in the form of the computational perceptions CP_4 and CP_5, which are explained by the 2PMs {p^2_4, p^2_5} in terms of CP_1, CP_2, and CP_3. The validity of each item in CP_4 and CP_5 is explained by those items of CP_1, CP_2 and CP_3. Finally, the top-order description of the phenomenon is provided, at the highest level of abstraction, by CP_6, explained by the 2PM {p^2_6} in terms of CP_4 and CP_5. Notice that, by using this structure, one can provide a linguistic description of the phenomenon at different levels, from the very basic level to the highest or most general level of granularity.
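The following compact sketch (ours) illustrates how validity degrees propagate through such a network; the CPs are represented simply as dictionaries of label validities, and the aggregation rules are toy examples, not those of the application.

from typing import Dict

# A CP is represented here as {linguistic label: validity degree};
# a PM is a function that aggregates input CPs into an output CP.
CP = Dict[str, float]

def pm_quality(steering: CP, linearity: CP) -> CP:
    """Toy 2PM: quality is Low if either input is Low (OR as maximum),
    and High only if both inputs are High (AND as minimum)."""
    return {"Low": max(steering["Low"], linearity["Low"]),
            "High": min(steering["High"], linearity["High"])}

def pm_top(quality: CP) -> CP:
    """Toy top-order PM that relabels the second-level perception."""
    return {"The session was bad": quality["Low"],
            "The session was good": quality["High"]}

# Validity degrees coming up from two 1PMs at one time instant
cp_steering = {"Low": 0.2, "High": 0.8}
cp_linearity = {"Low": 0.7, "High": 0.3}

cp_quality = pm_quality(cp_steering, cp_linearity)  # second level
cp_top = pm_top(cp_quality)                         # top level
print(cp_quality)  # {'Low': 0.7, 'High': 0.3}
print(cp_top)      # {'The session was bad': 0.7, 'The session was good': 0.3}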
3.4. Types of CP
For this application, as initially introduced in [13] and inspired by classical Control Theory [32], we focus on the perception of three important characteristics of the evolution of phenomena, namely, the perception of the current state (assertive CP), the perception of the trend of evolution (derivative CP) and the summary of accumulated perceptions (integrative CP).
3.4.1. Assertive CP
It is associated with a linguistic expression of type “Y is A”. It represents the
linguistic fuzzy model of the current state of a characteristic of the phenomenon,
e.g., “The Distance to the Vehicle in front is High ”.
3.4.2. Derivative CP
Derivative CPs correspond to trend analysis information and give insight into how the phenomenon is evolving in time. This helps contextualize the information and may be important for decision making.
The following example sentences clearly show the importance of the derivative information, and how it can completely change the context in which a certain decision must be taken.
“The Distance to the Vehicle in front is Medium ”.
“The Distance to the Vehicle in front is Medium and Increasing ”.
“The Distance to the Vehicle in front is Medium and Rapidly Decreasing ”.
In this work, we only use derivative 1CPs, i.e., they are directly obtained from the
input signals of the different sensors.
In the Derivative PM, U is a time-series signal z = {z(k−l+1), ..., z(k−1), z(k)} of length l, obtained directly from the sensor input data, where k represents the current sample and l is defined by the designer, e.g., to filter the noise of the input signal. z_d is the relative change between samples and it is calculated as follows:

z_d = 100 \times \frac{z(k) - z(k-l+1)}{\bar{z}}

where the average of z is

\bar{z} = \frac{1}{l} \sum_{i=1}^{l} z(k-i+1)

The relative change z_d is directly used to describe the trend of the perceptions linguistically. Here, T is defined by the template: "The value of the attribute is {Rapidly Decreasing | Decreasing | Steady | Increasing | Rapidly Increasing}".
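A minimal sketch (ours) of this Derivative PM in Python is given below; the percentage thresholds used to verbalize z_d are assumptions for illustration, and the crisp labelling stands in for the fuzzy verbalization used in the application.

from typing import List, Tuple

def derivative_cp(window: List[float]) -> Tuple[str, float]:
    """Compute the relative change z_d over a window of l samples,
    window[0] = z(k-l+1) ... window[-1] = z(k), and verbalize it."""
    l = len(window)
    z_bar = sum(window) / l                          # mean of the window
    z_d = 100.0 * (window[-1] - window[0]) / z_bar   # relative change (%)
    # Assumed thresholds (in %), chosen only for illustration
    if z_d < -10:
        label = "Rapidly Decreasing"
    elif z_d < -2:
        label = "Decreasing"
    elif z_d <= 2:
        label = "Steady"
    elif z_d <= 10:
        label = "Increasing"
    else:
        label = "Rapidly Increasing"
    return f"The value of the attribute is {label}", z_d

# Example: the distance to the vehicle in front shrinks over the window
print(derivative_cp([50.0, 48.0, 45.0, 41.0, 36.0]))
# -> ('The value of the attribute is Rapidly Decreasing', -31.8...)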
3.4.3. Integrative CP
The Integrative CP represents the accumulated perception of the phenomenon
over a period of time. The text associated with these perceptions consists of
summary sentences of historical event occurrences, and answers the question of
“Which is usually the state of the Parameter”, i.e.: “Q of Ys are A”.
The typical template for the answer could be:
{Never |A few times |Sometimes |Many times |Most of the time |Always}, the
parameter was {Low |Medium |High}.
The accumulated perception may be very important for decision making. The following examples, combining assertive and integrative sentences, show how the integrative information can completely change the context in which the decision must be taken.
”The Vehicle Linearity is Medium ”.
”The Vehicle Linearity is Medium and Most of the time it has been High”.
”The Vehicle Linearity is Medium and already Sometimes it has been
Medium ”.
In the case of driving quality assessment, the last example sentence could indicate a distraction problem, since it suggests that medium vehicle linearity events are common and therefore that there have been many slight distractions.
The definition of an Integrative PM corresponds to the tuple (U, y, g, T). In this case, U is an input CP over a time window. The designer sets the parameter l defining the length of the time window from which the temporal series of l samples is obtained, U = {u(k−l), ..., u(k)}, where k represents the current sample.
Consider the case of an attribute defined by three linguistic labels, e.g., {Low, Medium, High}. The output CP y = (A_y, W_y) for the Integrative PM will be expressed by eighteen possible sentences {(a_1, w_1), ..., (a_18, w_18)}, the combinations of the six linguistic quantifiers Q = {Never, A few times, Sometimes, Many times, Most of the time, Always} and the input linguistic labels A_i = {Low, Medium, High}.
g is an aggregation function computed as a quantified sentence. There are many different approaches for evaluating quantified sentences. In this work we have used the α-cut based method called GD introduced in [33] instead of the basic approach of quantified fuzzy propositions [6] [22], where the weights w associated with the linguistic expressions are computed as fuzzy cardinalities. The GD method has been used due to its efficiency and non-strict character. The method also fulfills some interesting properties related to relative quantifiers defined in [33].
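The GD method of [33] is not reproduced here; instead, the sketch below (ours) evaluates a quantified sentence with a simpler relative sigma-count and trapezoidal relative quantifiers, whose shapes are assumed, only to show the kind of computation an Integrative PM performs.

from typing import Callable, Dict, List

def relative_sigma_count(weights: List[float]) -> float:
    """Proportion of the time window in which the label held."""
    return sum(weights) / len(weights)

def quantifier(a: float, b: float, c: float, d: float) -> Callable[[float], float]:
    """Trapezoidal relative quantifier defined on a proportion r in [0, 1]."""
    def q(r: float) -> float:
        if r <= a or r >= d:
            return 0.0
        if b <= r <= c:
            return 1.0
        return (r - a) / (b - a) if r < b else (d - r) / (d - c)
    return q

# Assumed quantifier shapes, chosen only for illustration
QUANTIFIERS: Dict[str, Callable[[float], float]] = {
    "Never": quantifier(-0.1, 0.0, 0.0, 0.1),
    "A few times": quantifier(0.0, 0.1, 0.2, 0.35),
    "Sometimes": quantifier(0.2, 0.35, 0.5, 0.65),
    "Many times": quantifier(0.5, 0.65, 0.8, 0.9),
    "Most of the time": quantifier(0.8, 0.9, 0.99, 1.0),
    "Always": quantifier(0.95, 1.0, 1.0, 1.1),
}

# w_Low(t) of the Driving Quality over a time window
w_low = [0.0, 0.1, 0.9, 1.0, 0.8, 0.0, 0.0, 0.2, 0.0, 0.0]
r = relative_sigma_count(w_low)
for name, q in QUANTIFIERS.items():
    if q(r) > 0:
        print(f'({q(r):.2f}, "{name}, the Driving Quality was Low")')
# -> (0.33, "A few times, ...") and (0.67, "Sometimes, ...")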
4. Report Generation
Using the GLMP defined in the previous section, we can generate a set of valid sentences describing the phenomenon at different levels of granularity or detail. The GLMP is built during a design stage in which a corpus of NL expressions that are typically used in the application domain is collected. These expressions describe the relevant features of the analyzed phenomena. The sentences describe each perception at every temporal sample, and the overall states of the perceptions are described through quantified sentences.
Figure 3: Report generation diagram.
A medium-sized GLMP can generate a huge number of sentences describing a particular phenomenon. In the case of the driving simulation sessions analyzed in this work, the number of generated sentences can amount to hundreds of thousands for a normal simulation exercise. It is therefore critical to perform a relevancy analysis in order to select and compile the relevant sentences into one document highlighting the interesting characteristics of a simulation.
The report describing the temporal evolution of a phenomenon is obtained from the instantiation of the input data following a customized report template. The template of the reports is defined considering the particular needs of the users in order to highlight relevant aspects. For the application presented in this paper, a report template has been created in collaboration with human factors experts, and it will be explained in more detail in Section 5.4.
Fig. 3 shows the diagram followed for the automatic report generation. Initially, within the validity analysis, the full set of valid sentences describing the analyzed phenomenon is created. In a second stage, a logic of relevant sentence selection is implemented based on the customized template for the final report. The computational system selects, among the available possibilities, the most suitable linguistic expressions to describe the input data.
In this paper, in order to explain the possible causes of detected incidents, the report template includes the diagnosis of occurring events, which implies the need to solve the inverse problem concerned with fuzzy relations [34]. We have implemented a linguistic approach to present the solutions, where the variables inferred in the diagnosis are explained based on the NL sentences generated in the previous section (see the application example in Section 5.4).
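The relevancy logic itself is template-driven and application-specific; as a plausible simplification (ours, not the project's actual rule), the sketch below keeps, for each CP, its most valid sentence and includes it in the report only if that validity exceeds a threshold.

from typing import Dict, List, Tuple

# Output of the validity analysis: for each CP, its (sentence, validity) pairs
ValidityAnalysis = Dict[str, List[Tuple[str, float]]]

def select_relevant(analysis: ValidityAnalysis, threshold: float = 0.5) -> List[str]:
    """Keep, for each CP, the most valid sentence if it passes the threshold."""
    lines = []
    for sentences in analysis.values():
        sentence, validity = max(sentences, key=lambda sw: sw[1])
        if validity >= threshold:
            lines.append(f"{sentence} (validity {validity:.2f})")
    return lines

analysis = {
    "Driving Quality during interaction": [
        ("During interaction, sometimes the Driving Quality has been Low", 0.7),
        ("During interaction, most of the time the Driving Quality has been Low", 0.1),
    ],
    "Security Distance": [
        ("The Security Distance is Low", 0.3),
        ("The Security Distance is Medium", 0.4),
    ],
}
print("\n".join(select_relevant(analysis)))  # only the first CP passes the threshold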
5. Application: Linguistic description of potential distraction levels of on-
board devices
In the HITO project, driving simulators have been used in order to establish a methodology for the evaluation of new vehicle onboard devices. This methodology analyzes driver-device interaction in a variety of situations in order to assess the index of potential distraction (IPD) of the new device. The designed exercises require a sufficient degree of concentration from the driver and, at the same time, demand interaction with different onboard devices. The amount of data generated in the experiments, with a number of drivers in various different situations, is huge, and so the manual assessment process requires many human resources.
In this work, linguistic descriptions of driving simulation exercises are generated automatically. The automated onboard device evaluation process will save time and resources and it will generate reports based on unified expert criteria. The generated descriptions will either replace or complement manually generated expert reports, and they will be mainly focused on the following aspects:
- Detection of distraction events during the exercise.
- Comparison of intervals with and without interaction between the driver and onboard devices.
- Generation of a linguistic report describing the overall quality of driving, providing an IPD of the analyzed onboard devices.
5.1. Driving simulator: monitored parameters
The vehicle simulator provides different parameters that are used to determine the driving quality and the IPD of the analyzed devices over the simulation exercises. These input parameters are related to the driving, the controls of the vehicle and the simulation environment. They are numerical values z ∈ R described as follows:

z_1 Vehicle Speed: Principal vehicle speed (km/h).
z_2 Lateral Position: Distance from the central point of the vehicle to the right edge of the track where it is (m).
z_3 Track Width: Width of the track of circulation (cm).
z_4 TTLC: Time to cross the line on the edge of the road (in seconds) (Lateral Position/Lateral Speed).
z_5 HE: Angle between the tangents of the vehicle position and the road (degrees).
z_6 Steering Wheel Position: Measured steering wheel turning angle (degrees).
z_7 Percentage of brake usage: Percentage of actuation over the brake pedal (%).
z_8 Percentage of accelerator usage: Percentage of actuation over the acceleration pedal (%).
z_9 Road Slope: Slope or inclination of the road (%).
z_10 Distance to the Vehicle in front: (m).
z_11 Vehicle Overtaking: Overtaking situation of the vehicle (boolean).
z_12 Retarder: Use of retarder, a hydraulic vehicle braking system (%).
z_13 Speed of the Vehicle in front: (km/h).

Figure 4: Examples of signals obtained from the simulator CABINTEC.
Obviously, it is possible to extract a lot of useful information from these data. The detection of risk events and the description of the driving activity through the linguistic summarization of the above data inputs is only a first approach towards the final automated application. For the comparison of interaction and non-interaction intervals, a correlation between the appearance of risk events and the manipulation of the onboard devices is computed. The information provided by the onboard devices may vary depending on the device, but we are always able to determine when the interaction starts and ends within the simulation exercise. Therefore, we created a time-series signal z_int representing the interaction intervals, as follows:

z_int = 1 if interaction is active; 0 if interaction is inactive    (1)
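The sketch below (ours, with an assumed interface) builds z_int from a list of interaction start/end times reported by the onboard device, sampled at the simulator rate of 50 Hz.

from typing import List, Tuple

def interaction_signal(intervals: List[Tuple[float, float]],
                       duration_s: float,
                       sample_rate_hz: float = 50.0) -> List[int]:
    """Build z_int: 1 while the driver interacts with the device, 0 otherwise."""
    n_samples = int(duration_s * sample_rate_hz)
    z_int = [0] * n_samples
    for start, end in intervals:
        k_start = int(start * sample_rate_hz)
        k_end = min(int(end * sample_rate_hz), n_samples)
        for k in range(k_start, k_end):
            z_int[k] = 1
    return z_int

# Hypothetical 600 s exercise with two interaction periods (times in seconds)
z_int = interaction_signal([(95.0, 120.0), (300.0, 340.0)], duration_s=600.0)
print(sum(z_int) / 50.0, "s of accumulated interaction")  # -> 65.0 s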
5.2. GLMP: Index of potential distraction of an onboard device
The GLMP in this application is designed to answer the general question of what the IPD of a given onboard device is (see Fig. 9). This linguistic model is an enhancement of the one presented in [13], where the description of the Driving Quality was performed. In this case the Driving Quality information is combined with the information of the analyzed onboard device in order to describe and evaluate the IPD.
5.2.1. 1CPs
In total, there are 14 input variables for the GLMP: 13 signals from the simulator and one signal defining the interaction with the device. Therefore, there are 14 1CPs, with 1PMs {p^1_1, p^1_2, ..., p^1_14}. Each 1PM is defined by the tuple (U, y, g, T). As an example, the 1PM of Vehicle Speed is developed, where:

U is the input value z_1 provided by the speed sensor.

y is a variable of type 1CP describing the Vehicle Speed. Its value is expressed by linguistic sentences and their corresponding weights of validity as follows: (Null, w_Null), (Slow, w_Slow), (Medium, w_Medium), (Fast, w_Fast). Here, e.g., Slow stands for the complete linguistic expression "The vehicle speed is Slow".

g is the function W_y = (\mu_{Null}(z), \mu_{Slow}(z), \mu_{Medium}(z), \mu_{Fast}(z)), where the \mu_i are membership functions relating the linguistic labels with the sensor's numerical value z.

T is the template "The vehicle speed is {Null | Slow | Medium | Fast}".

Trapezoidal membership functions have been used to cover the domain of values of the different input parameters z_i. The definition of the membership functions has been made using expert knowledge. Fig. 5 shows several examples of the trapezoidal membership functions used.
Figure 5: Examples of trapezoidal linguistic labels used for the fuzzification of 1CPs: Vehicle Speed (km/h) with labels {Null, Slow, Medium, Fast}, Steering Wheel Position (degrees) with labels {Strong Left, Soft Left, Centered, Soft Right, Strong Right}, and Percentage of accelerator usage (%) with labels {Null, Low, Medium, High}.
5.2.2. 2CP
According to the definition provided by the team of human factors experts involved in the HITO project, we describe and evaluate the quality of driving using three parameters, namely, Steering Wheel Control, Vehicle Linearity and Security Distance. These three variables can give insight into the risk events and the distraction levels while driving. They are 2CPs that are derived directly from the 1CPs described above. We used sets of fuzzy IF-THEN rules for each corresponding 2PM (named p^2_1, p^2_2, p^2_3 in Fig. 9).

y^2_1 The Steering Wheel Control is the output of p^2_1, a function of the 1CPs (y_1 ... y_11) and their derivatives (y_d1 ... y_d11). The loss of steering wheel control is defined as an abrupt manoeuvre, unusual in the vehicle direction control, and causing a safety risk. Sudden track changes, big oscillations in the vehicle direction, and out-of-track circulation could all be indicators of a loss of steering wheel control.

y^2_2 The Vehicle Linearity, output of p^2_2, is also a function of the 1CPs (y_1 ... y_11) and their derivatives (y_d1 ... y_d11). It refers to the uniformity of the vehicle trajectory.

y^2_3 Security Distance, output of p^2_3, refers to the distance with respect to the vehicle in front. It is defined depending on the Highway Code, road conditions and environmental conditions. This 2CP is a function of the 1CPs (y_7 ... y_13) and their derivatives (y_d7 ... y_d13).

On the other hand, Driving Quality is also a 2CP in the GLMP. Any deviation in the three parameters mentioned above suggests a distraction problem and a degraded quality of driving.

y^2_4 Driving Quality is the output of p^2_4. It is defined using the CPs (y^2_1, y^2_2, y^2_3). It determines the quality of driving at every instant during the simulation.
Each 2PM is defined by the tuple (U, y, g, T). As an example, the 2PM of Security Distance (y^2_3) is developed, where:

U is a set of input CPs, U = (u_1, ..., u_n), where u_i = (A_i, W_i) are the output CPs of the 1PMs {p^1_7, p^1_8, ..., p^1_13} and their derivatives, U = {y_7, ..., y_13, y_d7, ..., y_d13}.

y is a variable of type 2CP describing the Security Distance. Its possible value is expressed by linguistic sentences and their corresponding weights of validity as follows: (Low, w_Low), (Medium, w_Medium), (High, w_High). Here, e.g., Low stands for "The Security Distance is Low".

g is the aggregation function W_y = g(W_y7, ..., W_y13, W_yd7, ..., W_yd13), where W_y is the vector (w_Low, w_Medium, w_High) of validity degrees of the perception's linguistic labels in y^2_3, and W_yi are the degrees of validity of the input computational perceptions.

T is the template "The Security Distance is {Low | Medium | High}".

The aggregation function W_y = g(W_y7, ..., W_y13, W_yd7, ..., W_yd13) has been implemented using an expert set of fuzzy IF-THEN rules. In these rules, the operator AND has been implemented through the minimum, while the operator OR has been implemented through the maximum. The following rules are an example of some of the rules used to define the Security Distance (see Tables 1 and 2 for a quick reference to the names of the CPs).

- IF (y_7 is Medium) AND (y_13 is Low) AND (y_10 is Medium) AND (y_9 is Strong Descendent) THEN y^2_3 is Low
- IF (y_7 is Medium) AND (y_13 is Medium) AND (y_10 is Very Small) THEN y^2_3 is Low
- IF (y_10 is Very Small) AND (y_13 is Low) AND (y_12 is not 0) THEN y^2_3 is Low
- IF (y_10 is Very Small) AND (y_13 is Low) AND (y_7 is not Null) THEN y^2_3 is Low
- IF (y_10 is Low) AND (y_d3 is Rapidly Decreasing) THEN y^2_3 is Low
- IF (y_10 is Low) AND (y_d3 is Decreasing) THEN y^2_3 is Low
- IF (y_13 is Low) AND (y_10 is Medium) THEN y^2_3 is Medium
- IF (y_10 is Low) AND (y_d10 is Slowly Decreasing) THEN y^2_3 is Medium
- IF (y_10 is Big) THEN y^2_3 is High
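To show how such a rule base is evaluated with AND as minimum and OR as maximum, here is a small sketch (ours); the membership degrees are illustrative values, not simulator data.

from typing import Dict

# Validity degrees of some input 1CP labels at one time instant (illustrative)
w: Dict[str, Dict[str, float]] = {
    "y10": {"Very Small": 0.6, "Medium": 0.0, "Big": 0.0},  # distance to vehicle in front
    "y13": {"Null": 0.1, "Low": 0.8, "Medium": 0.1},        # speed of vehicle in front
    "y7":  {"Null": 0.0, "Low": 0.2, "Medium": 0.7},        # percentage of brake usage
}

def fuzzy_and(*degrees: float) -> float:
    """Fuzzy AND implemented as the minimum."""
    return min(degrees)

# Rule: IF (y10 is Very Small) AND (y13 is Low) AND (y7 is not Null)
#       THEN Security Distance is Low
fired = fuzzy_and(w["y10"]["Very Small"], w["y13"]["Low"], 1.0 - w["y7"]["Null"])

# Rules sharing the same consequent are combined with OR (maximum)
w_security_low = max(fired, 0.0)
print(f"w_Low(Security Distance) = {w_security_low:.2f}")  # -> 0.60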
Fig. 6 shows an example of the evolution of the label Low at the 2CPs (y^2_1, y^2_2, y^2_3 and y^2_4) over a period of time within a particular simulation in which a loss of driving quality happens.

Figure 6: Examples of instantaneous 2CP signals over a period of time. a) Shows the progression of the weights (w_Low) corresponding to the label "a_Low" of the parameters y^2_1, y^2_2, y^2_3. b) Shows the progression of w_Low corresponding to the parameter y^2_4, i.e., Driving Quality.
Finally, two more 2CPs have been defined in order to determine the driving quality with and without interaction activity.

y^2_i5 Driving Quality during interaction. This perception is an integrative CP and it has been computed from the inputs (y^2_4 and y^1_11). It provides the accumulated perception of the driving quality over the active interaction periods during the simulation exercise. The aggregation function g is the aggregation method GD that provides quantified sentences, as described in Section 3.4.3.

y^2_i6 Driving Quality during non-interaction. This perception is also an integrative CP and it has been computed from the inputs (y^2_4 and y^1_11) in order to obtain the accumulated perception of the driving quality while there has been no interaction over the simulation exercise.

Although these CPs are not used directly to define the top-order perception, their output sentences will be required for the final report. Fig. 7 shows an example of the quantified sentences of the integrative CPs referring to the quality of driving.

Figure 7: Example of CP y^2_i5 or y^2_i6. The weights of the quantified sentences and the resulting output sentences.
The most relevant question to be answered by the 2CPs mentioned in this section concerns the reasons behind a low overall driving quality. Thus, a typical description statement generated with them could be something like:
"During interaction, sometimes the driving quality has been low, because the vehicle linearity has been low and ..."
5.2.3. Top order CP: IPD
In this application, the top-order CP is the IPD that the analyzed onboard device has over the driver. This is a more complex type of integrative CP with the following elements:

U is a temporal series obtained from couples of instantaneous values of y^2_4 and y^1_11 along the duration of a simulation session, i.e., the IPD is obtained as a combination of distraction rates over intervals of interaction and non-interaction.

y is a variable of type 2CP describing the IPD. Its possible value is expressed by linguistic sentences and their corresponding weights of validity as follows: (Low, w_Low), (Medium, w_Medium), (High, w_High). Here, e.g., Low stands for "The Index of Potential Distraction is Low".

g is the aggregation function described below.

T is the template "The Index of Potential Distraction is {Low | Medium | High}".

The IPD is calculated as follows:

IPD = \frac{D_1}{D_1 + D_2} \quad (2)
where D_1 and D_2 are the distraction rates of the intervals with and without interaction, respectively.
D_1 is calculated as the weighted sum of the cardinalities of the labels Low and Medium of the driving quality y^2_4 over the active interaction period:

D_1 = k_1 \times \frac{\sum_{t=0}^{l} w_{Low}(t) \, w^{Int}_{Active}(t)}{\sum_{t=0}^{l} w^{Int}_{Active}(t)} + k_2 \times \frac{\sum_{t=0}^{l} w_{Medium}(t) \, w^{Int}_{Active}(t)}{\sum_{t=0}^{l} w^{Int}_{Active}(t)} \quad (3)

where the values k_1 = 1 and k_2 = 1/2 were chosen empirically as part of the aggregation function design. In this equation, w_{Low} corresponds to the validity degree of the linguistic clause "The Driving Quality is Low" of y^2_4, and w^{Int}_{Active} corresponds to the validity degree of the sentence "Interaction is Active" of y^1_14, which is 1 when there is interaction and 0 otherwise, as defined in formula (1). In the same way, D_2 is calculated as follows:

D_2 = k_1 \times \frac{\sum_{t=0}^{l} w_{Low}(t) \, w^{Int}_{Inactive}(t)}{\sum_{t=0}^{l} w^{Int}_{Inactive}(t)} + k_2 \times \frac{\sum_{t=0}^{l} w_{Medium}(t) \, w^{Int}_{Inactive}(t)}{\sum_{t=0}^{l} w^{Int}_{Inactive}(t)} \quad (4)
With this definition of the IPD, the obtained value will be Low while D_1 < D_2, Medium while D_1 is above but still similar to D_2, and High as D_1 gets considerably bigger than D_2 and the driving quality during interaction becomes degraded. Fig. 8 shows the membership functions designed to verbalize the obtained IPD value. Fig. 9 shows the GLMP developed for this application. As a reference, Tables 1 and 2 show the full list of CPs utilized, including the linguistic variables and labels used to describe them.
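A minimal sketch (ours) of formulas (2)-(4), with k_1 = 1 and k_2 = 1/2; the weight series are illustrative and much shorter than a real 50 Hz recording.

from typing import List

def distraction_rate(w_low: List[float], w_medium: List[float],
                     mask: List[float], k1: float = 1.0, k2: float = 0.5) -> float:
    """Weighted cardinalities of the Low/Medium Driving Quality labels restricted
    to the samples selected by `mask`, as in formulas (3) and (4)."""
    total = sum(mask)
    low = sum(wl * m for wl, m in zip(w_low, mask)) / total
    med = sum(wm * m for wm, m in zip(w_medium, mask)) / total
    return k1 * low + k2 * med

def ipd(w_low: List[float], w_medium: List[float], w_int: List[float]) -> float:
    """Index of Potential Distraction, formula (2)."""
    d1 = distraction_rate(w_low, w_medium, w_int)                   # with interaction
    d2 = distraction_rate(w_low, w_medium, [1 - w for w in w_int])  # without interaction
    return d1 / (d1 + d2)

# Illustrative series: the quality degrades while the interaction is active
w_low = [0.0, 0.0, 0.8, 0.9, 0.7, 0.0, 0.1, 0.0]
w_med = [0.1, 0.2, 0.2, 0.1, 0.3, 0.2, 0.1, 0.1]
w_int = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
print(f"IPD = {ipd(w_low, w_med, w_int):.2f}")  # -> 0.91, a High IPD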
Figure 8: Membership functions used for the verbalization of IPD.
Figure 9: GLMP developed to determine the IPD of an onboard device.
5.3. Identification of distraction events
In order to fulfill the report template as required by the final users, we needed to identify the so-called distraction events. Here we describe how to identify these events as specific situation types within the information available in the GLMP.
The values of D_1 and D_2 give an insight into the overall driving quality evaluation during the simulation. However, the identification of particular low driving quality events is also important within the application. These individual distraction events allow users to analyze in depth the reasons behind them. The evaluation statements of the IPD will be accompanied by the most likely reasons for those judgements.
Table 1: Table of 1CPs with the corresponding linguistic variables and labels.

CP (y)   Linguistic Variables                Linguistic Labels (A_y)
y^1_1    Vehicle Speed                       {Null, Slow, Medium, Fast}
y^1_2    Lateral Position                    {Short, Medium, Long}
y^1_3    Track Width                         {Narrow, Medium, Wide}
y^1_4    TTLC                                {Big Left, Small, Big Right}
y^1_5    HE                                  {High Neg, Medium Neg, Low, Medium, High}
y^1_6    Steering Wheel Position             {Strong Left, Soft Left, Centered, Soft Right, Strong Right}
y^1_7    Percentage of brake usage           {Null, Low, Medium, High}
y^1_8    Percentage of accelerator usage     {Null, Low, Medium, High}
y^1_9    Road Slope                          {Strong Descendent, Descendent, Null, Ascendent, Strong Ascendent}
y^1_10   Distance to the Vehicle in front    {Not Measurable, Very Small, Medium, Big}
y^1_11   Vehicle Overtaking                  {Not, Yes}
y^1_12   Retarder                            {0, 1, 2, 3, 4}
y^1_13   Speed of Vehicle in front           {Not Measurable, Null, Low, Medium, High}
y_d      Derivative                          {Rapidly Decreasing, Decreasing, Steady, Increasing, Rapidly Increasing}

The identification of incidences is focused on the 2CPs y^2_5 and y^2_6 (Driving Quality with and without interaction). For this task, a window length (wl) is considered within which the normalized cardinalities of the labels Low and Medium are computed, e.g.:

CARD(Low) = \frac{1}{l} \sum_{j=k-l}^{k} w_{Low}(j) \quad (5)

where k is the current sample, l is the length in samples of the analyzed window, l = wl × sr, and sr is the sample rate. Using expert knowledge, a wl of 10 seconds has been selected in this application. The presence of distraction events is defined as CARD(Low) > 0.3 or CARD(Medium) > 0.6. Note that here we must make the crisp decision of either including or not including the description of an event in the final report.
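A minimal sketch (ours) of formula (5) and the crisp event condition; the series are illustrative, and the window is reduced to l = 5 samples instead of the 10 s × 50 Hz used in the application.

from typing import List, Tuple

def card(weights: List[float], k: int, l: int) -> float:
    """Normalized cardinality of a label over the last l samples, formula (5)."""
    window = weights[max(0, k - l):k + 1]
    return sum(window) / l

def distraction_events(w_low: List[float], w_medium: List[float],
                       l: int) -> List[Tuple[int, float, float]]:
    """Samples at which CARD(Low) > 0.3 or CARD(Medium) > 0.6."""
    events = []
    for k in range(len(w_low)):
        c_low, c_med = card(w_low, k, l), card(w_medium, k, l)
        if c_low > 0.3 or c_med > 0.6:
            events.append((k, c_low, c_med))
    return events

# Illustrative w_Low / w_Medium of the Driving Quality
w_low = [0.0, 0.0, 0.2, 0.9, 1.0, 0.8, 0.1, 0.0, 0.0, 0.0]
w_med = [0.2, 0.3, 0.6, 0.1, 0.0, 0.2, 0.7, 0.8, 0.3, 0.2]
print(distraction_events(w_low, w_med, l=5))  # consecutive samples around the dip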
5.4. Template of the IPD analysis report
As mentioned above, the generation of a report describing the IPD within
a simulation exercise needs to follow a customized report template in order to
highlight the relevant aspects the final user needs.
Table 2: Table of 2CPs with the corresponding linguistic variables and labels.

CP (y)   Linguistic Variables                      Linguistic Labels (A_y)
y^2_1    Steering Wheel Control                    {Low, Medium, High}
y^2_2    Vehicle Linearity                         {Low, Medium, High}
y^2_3    Security Distance                         {Low, Medium, High}
y^2_4    Driving Quality                           {Low, Medium, High}
y^2_5    Driving Quality during interaction        {Low, Medium, High}
y^2_6    Driving Quality during non-interaction    {Low, Medium, High}
y_Top    IPD                                       {Low, Medium, High}
y_i      Integration                               {Never, A few times, Sometimes, Many times, Most of the time, Always}
The report template used for this application represents a typical summarization document defined by the human factors experts within the HITO project. This typical summary focuses on the comparison of the driving quality with and without interaction in order to determine the IPD an onboard device may have. On the other hand, the report focuses on the abnormal events during the driving activity, providing the final user with more information to determine the type of distractions the onboard device may induce.
Our software application generates a linguistic report including hypotheses about the relation between the distraction events and the parameters that triggered each of them.
The template of the report defined for this application consists of the following
sections.
(i) General Observations: This section holds various subsections to describe in detail the different aspects observed during the simulation exercise.
a) Interaction Activity: This subsection gives information about the interac-
tion activity with the analyzed onboard system during the simulation. It
states the number of independent interactions and the accumulated time
of interaction. A figure indicating the periods of interaction is also in-
cluded.
b) Comparison of interaction vs. non-interaction: The quantified sentences obtained from y^2_i5 and y^2_i6 describing the quality of driving are compared, trying to highlight the similarities and differences between them. The following sentences show an example:
- "Both during interaction and non-interaction, Most of the time the Driving Quality has been High".
- "During interaction, Sometimes the Driving Quality has been Low, while during non-interaction A few times the Driving Quality has been Low".
Note that the template provides two different linguistic expressions depending on the results. This subsection also provides a list of the distraction events during the simulation, indicating the number of events and, for each event, its occurrence time and cause, e.g.,
- "During interaction 1 distraction event happened:
Event 1) at 100 seconds: The Vehicle Linearity is Low".
c) Detailed description of events during interaction: In this subsection, the root causes of the events during the interaction are explained individually. Depending on the rules defined in each aggregation function, the GLMP navigates backwards through its branches, solving the inverse problem, in order to deduce which rules have been triggered and by which parameters. Each event has an associated time, and so the generation of statements at the exact time of the event is a straightforward task (we identify the antecedents of the triggered fuzzy rules; a minimal sketch of this back-tracking idea is given at the end of this section), e.g.,
-”Event 1) approximate time at 100 seconds.
The Vehicle Linearity is Low:
* The Vehicle Speed is Medium.
* The Percentage of accelerator usage is High.
* The Vehicle is Not Overtaking.
* The Lateral Position is Rapidly Decreasing.
Each event description is also accompanied by video frames and figures showing the related input parameters.
(ii) Conclusion: This section provides the estimated IPD during the simulation
exercise.
Therefore, the final layout of the generated reports will be dynamic, depending on the number of distraction events detected during the simulation. Fig. 10 shows an example of an event description within one of the generated reports (see an explanation of this text in the next section).
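The inverse-problem navigation of the GLMP is application-specific; the sketch below (ours) only illustrates the back-tracking idea used in subsection (c): at the event instant, the fired rules for the relevant consequent are identified and their antecedents are reported as the explanation. The rule contents and weights are illustrative.

from typing import Dict, List, Tuple

# A rule is (antecedent clauses, consequent clause); a clause is (CP name, label)
Rule = Tuple[List[Tuple[str, str]], Tuple[str, str]]

RULES: List[Rule] = [
    ([("Vehicle Speed", "Medium"), ("Percentage of accelerator usage", "High"),
      ("Lateral Position", "Rapidly Decreasing")], ("Vehicle Linearity", "Low")),
    ([("Distance to the Vehicle in front", "Big")], ("Security Distance", "High")),
]

def explain_event(weights: Dict[Tuple[str, str], float],
                  consequent: Tuple[str, str], threshold: float = 0.5) -> List[str]:
    """Report the antecedents of the rules that fired for the given consequent."""
    lines = []
    for antecedents, rule_consequent in RULES:
        if rule_consequent != consequent:
            continue
        firing = min(weights.get(clause, 0.0) for clause in antecedents)  # AND = min
        if firing >= threshold:
            lines.extend(f"* The {cp} is {label}." for cp, label in antecedents)
    return lines

# Validity degrees of the input perceptions at the event instant (illustrative)
w = {("Vehicle Speed", "Medium"): 0.9,
     ("Percentage of accelerator usage", "High"): 0.8,
     ("Lateral Position", "Rapidly Decreasing"): 0.7,
     ("Distance to the Vehicle in front", "Big"): 0.1}
print("\n".join(explain_event(w, ("Vehicle Linearity", "Low"))))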
Figure 10: Example of event description within a report that was generated during the experimen-
tation.
6. Experimentation
The experimentation was based on data from a series of simulation exercises performed with professional drivers in a variety of settings. Within the project, four drivers conducted 8 exercises each in 4 predefined scenarios. These exercises were designed to evaluate 4 different onboard devices. For the validation of the implemented application, we were asked to focus on the "mobile phone" as onboard device. The exercises performed by the first driver were used to tune the rule sets and parameters defined within the GLMP.
A standard simulation exercise analyzed within the project contains data of 10-15 minutes, and considering the sample rate of the measurements (50 Hz) and the size of the GLMP, the number of sentences generated can amount to hundreds of thousands.
Fig. 10 shows an interesting example of a description of an event that occurred during the execution of a simulation exercise in an inter-urban environment. At some point during this simulation, the driver (of the truck) must overtake a group of cyclists that appears on the road. In the particular case of the figure, the numerical information indicates the presence of an object (the group of cyclists) in front of the vehicle. The speed of the object is low, while the truck is accelerating hard and not braking. The prolongation of this situation over a period of time is the reason why an incident happens. It is worth noting that the description of this particular event would only appear in a final document if the event occurred during a positive interaction period, as specified in the user requirements.
As shown in Fig. 10, the description of a particular event contains the explanation sentences that describe it and conjectures about the possible reasons that caused it. The first graphic shows the first-order parameters around the instant of the event. Secondly, the progression of the incidence-level signal during the same period is shown. Finally, the frame of the simulation video that corresponds to the approximate instant of the incidence is represented (here the group of cyclists can be observed).
When describing the incidence, the rules that triggered the detection of a distraction event are identified. In this sense, the inverse problem concerned with fuzzy relations is investigated here. We propose the diagnosis of occurring incidents through a linguistic approach. At every instant of the simulation, the rules that are activated can be tracked. For example, we can access the combination of parameters in each rule, and the exact variables that trigger and cause the incidence can be reported.
Figure 11: Hierarchical structure of our concept of the quality of a report.
7. Validation
One of the critical problems for the designer of this type of computational application is that of assigning a degree of quality to each automatically generated text. This evaluation is needed to provide him/her with the necessary feedback to improve the final results.
Here, in order to obtain a measure of text quality, the generated report was sent to five experts in human factors in the HITO project, who filled in a form composed of eight questions. This questionnaire is a result of our own research in this specific field [35]. In this section, we introduce the essentials of our approach and we refer the interested reader to the referenced paper. Our sources of inspiration are Pragmatics [36] and Systemic Functional Linguistics (SFL) [37]. From Pragmatics, we take the definition of a good communicative act, and from SFL we take the structure of language as a system.
Fig. 11 shows a hierarchical structure of concepts about the report quality that are explained using the answers obtained from a questionnaire filled in by a human expert. The questions have five possible answers in the form of a numeric evaluation scale in [1, 5]. Table 3 contains our proposal of questions. Question 1 considers subjective relevance, while question 2 considers inter-subjective relevance. Questions 3 and 4 deal with evaluating the truthfulness of a report. With the ratings of the relevance and the truthfulness, we obtain the partial rating of the report with respect to "what the text implicates". Questions 5 and 6 evaluate whether the report uses adequate vocabulary, whether the order of the ideas is the most appropriate, and whether the expressions used are the right ones. Reports of bad quantity contain too many or too few statements with respect to the fact they try to describe; reports of good quantity contain the right statements to understand the fact. To evaluate this aspect, we propose questions 7 and 8.
As mentioned above, for the validation of the implemented application, we were asked to focus on the "mobile phone" as onboard device. This sample gives a good insight into the potential of the developed technology, and the reports generated in these exercises were used to assess the quality of the automatically generated reports. In a typical experimental layout, we selected a group of five people familiar with the HITO project. For each exercise, we gave them: the description of the context of the simulation, the video of the exercise, the automatically generated document and the form to evaluate it. After 45 minutes, they had to fill in the questionnaire. The results of the different reviewers are shown in Table 4, where R1, R2, R3, R4 and R5 denote the reviewers that participated in the experiment. The data in Table 4 are an example of the type of practical results obtained during the development of the project.
Table 3: Questions about the Quality of the generated Reports.

1. Indicate to which degree the content of this report belongs to the application domain of the HITO project.
2. Indicate to which degree you identify the type of results expressed with the type of results you would express yourself.
3. After observing the behavior of the driver in the simulator, do you agree with the assessed global quality?
4. Do you agree with the provided explanations?
5. Indicate to which degree the vocabulary is used correctly.
6. Indicate to which degree the ideas are correctly ordered to facilitate the comprehension of the report.
7. Indicate to which degree the format of the report, including the use of figures and punctuation, is the most adequate.
8. Indicate to which degree you consider that the length of the report is right with respect to its content.
The average denotes each reviewer's global
rating for the report. Two of the five ratings are higher than 4 and the minimum
rating is 3.7. The average is calculated as the arithmetic mean of all the questions;
therefore, all the questions have the same weight in the global rating (see the
sketch after Table 4). It is relevant to note that all the reviewers gave the maximum
rating to question 1, the question that determines whether the content of the report
belongs to the application domain. Question 7 has the worst ratings, followed by
question 2 and question 5.
After analyzing these results, the designers must interpret them in order to im-
prove the HITO report generator. For instance, the ratings of question 7 indicate
that the format of the report and the techniques used for expressing the in-
formation must be improved, because the reviewers consider them not good
enough. Question 2, which analyzes the way the results are expressed, also has a
low global rating, so this is another aspect that the designers should improve
to produce better reports.
Table 4: Average results of the evaluation.

            R1    R2    R3    R4    R5
Question 1   5     5     5     5     5
Question 2   4     3     3     5     2
Question 3   4     4     4     4     3
Question 4   5     4     3     5     3
Question 5   3     3     4     4     4
Question 6   4     4     2     5     4
Question 7   4     3     2     4     3
Question 8   4     4     4     5     5
Average     4.1   3.8   3.4   4.6   3.7
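The following short sketch recomputes the "Average" row of Table 4 from the individual ratings using the equal-weight arithmetic mean described above; the dictionary holding the scores is introduced only for this example.

```python
# Recompute each reviewer's global rating as the arithmetic mean of his/her
# eight question scores (equal weights), using the data of Table 4.

from statistics import mean

ratings = {  # reviewer -> scores for questions 1..8
    "R1": [5, 4, 4, 5, 3, 4, 4, 4],
    "R2": [5, 3, 4, 4, 3, 4, 3, 4],
    "R3": [5, 3, 4, 3, 4, 2, 2, 4],
    "R4": [5, 5, 4, 5, 4, 5, 4, 5],
    "R5": [5, 2, 3, 3, 4, 4, 3, 5],
}

averages = {reviewer: mean(scores) for reviewer, scores in ratings.items()}
print(averages)
# Table 4 reports these means to one decimal place: 4.1, 3.8, 3.4, 4.6, 3.7.
```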
8. Conclusion
Linguistic description of phenomena is a very complex challenge. The ex-
pected results of this new technology will soon be useful for experts dealing with
the monitoring and evaluation of large volumes of data. The exploitation of data
generated by simulators is a remarkable example of these applications. Simula-
tors are widely used in a variety of fields, and in many situations domain experts
are required to observe and evaluate the data acquired in the performed
exercises.
In this paper, based on CTP, we contribute to the automatic evaluation of ve-
hicle onboard devices in a simulation environment. Using our solution, experts
in the field will save time and resources when analyzing all the data generated in
simulation exercises. The system provides objective reports, generated according
to unified criteria, with the interpretation, arguments and conclusions derived from
the data.
We have presented the complete linguistic model and report generation tem-
plate defined in the HITO project. With this application, we fulfill the requirement
of automatic evaluation of onboard devices in road transport environments. With
respect to our previous works, we have extended the linguistic model and report
template in order to meet the final specifications of the project. We have included
a detailed description of our approach to automatic text generation, the definition
of the IPD, and our approach to the quality assessment of the generated text. We
seek to evaluate the usability of the generated report from the users' point of view,
and we have identified different
indicators in a hierarchical form to define the quality of a linguistic report.
Through the use of Assertive CP, Derivative CP and Integrative CP, computa-
tional perceptions inspired by Control Theory, the GLMP has been adapted
to represent the temporal evolution of phenomena.
It is worth noting that in this application we have implemented a solution to
the so-called inverse problem. The computational system navigates through the
GLMP and performs a backward search in order to determine the rules that were
activated and ultimately triggered each event. Thus, conjectures can be established
about the possible reasons (variables) that caused the events.
Interestingly, the GLMP paradigm has been useful for the whole project team,
including the human factors experts, helping to define the new concept of Index of
Potential Distraction. Our experience in the HITO project encourages us to apply
this paradigm in other application fields.
Much future work remains in this area. Here we have presented
a specific solution to a specific problem of linguistic description of data. In the
context of the HITO project, we have created only a first prototype that must be
validated and tuned after a period of practical application. Many research topics
remain unexplored, e.g., developing new types of Perception
Mapping, i.e., new types of sentences; developing mechanisms to select the most
relevant sentences for each situation type; generating different linguistic expressions
according to the experience of each user; and developing new methods
for assessing the quality of the obtained reports.
We think that the presented automatic linguistic description approach can be
applied in a wide variety of domains and in multiple forms. For example, this
technology could be used for driving quality assessment in the form of a new on-
board system. In this case, the evaluation would be performed on-line in real time
and the generated text messages would be converted to voice. We could also use
this technology in training environments: depending on students' performance
with respect to established objectives, customized training plans could be
proposed automatically.
Acknowledgment
This work was supported in part by the Spanish Ministry of Science and In-
novation (grants TIN2008-00040, PSS-370100-2007-12, TIN2008-06890-C02-
01 and TIN2011-29827-C02-02) and the Spanish Ministry for Education (FPU
Fellowship Program).
References
[1] M. Meyer, J. Booker, Eliciting and analyzing expert judgment: A practical
guide, Society for Industrial Mathematics, 2001.
[2] M. Riojas, C. Feng, A. Hamilton, J. Rozenblit, Knowledge Elicitation for
Performance Assessment in a Computerized Surgical Training System, Ap-
plied Soft Computing 11 (2011) 3697–3708.
[3] A. Proper, Intelligent Transportation Systems Benefits: 1999 Update, US
Dept. of Transportation, Federal Highway Administration, ITS Joint Pro-
gram Office, 1999.
[4] L. Tijerina, E. Parmer, M. Goodman, Driver workload assessment of route
guidance system destination entry while driving: A test track study, in:
Proceedings of the 5th ITS World Congress, 1998, pp. 12–16.
[5] L. Chittaro, L. De Marco, Driver distraction caused by mobile devices:
studying and reducing safety risks, in: Proc. International Workshop on
Mobile Technologies and Health: Benefits and Risks, Udine, 2004.
[6] R. R. Yager, A new approach to the summarization of data, Information
Sciences 28 (1982) 69–86.
[7] R. R. Yager, Fuzzy summaries in database mining, in: Proceedings of the 11th
Conference on Artificial Intelligence for Applications, IEEE, 1995, pp. 265–
269.
[8] J. Kacprzyk, R. Yager, S. Zadrozny, A fuzzy logic based approach to linguis-
tic summaries of databases, International Journal of Applied Mathematics
and Computer Science (2000) 813–834.
[9] J. Kacprzyk, S. Zadrozny, Computing with words is an implementable
paradigm: Fuzzy queries, linguistic data summaries and natural language
generation, IEEE Transactions on Fuzzy Systems 18 (2010) 461–472.
[10] L. A. Zadeh, From computing with numbers to computing with words -
from manipulation of measurements to manipulation of perceptions, IEEE
Transactions on Circuits and Systems 45 (1999) 105–119.
[11] L. A. Zadeh, A new direction in AI: Toward a computational theory of per-
ceptions, AI Magazine 22 (2001).
[12] L. Eciolaza, G. Trivino, B. Delgado, J. Rojas, M. Sevillano, Fuzzy linguistic
reporting in driving simulators, in: IEEE Symposium on Computational
Intelligence in Vehicles and Transportation Systems (CIVTS), IEEE, 2011,
pp. 30–37.
[13] L. Eciolaza, G. Trivino, Linguistic reporting of driver behavior: Summary
and event description, in: Proceedings of the 11th International Conference
on Intelligent Systems Design and Applications (ISDA), 2011, pp. 148–153.
[14] E. Reiter, R. Dale, Building natural language generation systems, Cambridge
University Press, 2000.
[15] O. Gusikhin, D. Filev, N. Rychtyckyj, Intelligent Vehicle Systems: Appli-
cations and New Trends, Informatics in Control Automation and Robotics
(2008) 3–14.
[16] D. Salvucci, Predicting the effects of in-car interface use on driver perfor-
mance: An integrated model approach, International Journal of Human-
Computer Studies 55 (2001) 85–107.
[17] D. Salvucci, K. Macuga, Predicting the effects of cellular-phone dialing on
driver performance, Cognitive Systems Research 3 (2002) 95–102.
[18] Hito cabintec project website, http://www.cabintec.net/proyectos-hito.asp,
2011.
[19] L. A. Zadeh, Toward human level machine intelligence - is it achievable?
the need for a paradigm shift, IEEE Computational Intelligence Magazine 1
(2008) 11–22.
[20] J. Lawry, A methodology for computing with words, International Journal
of Approximate Reasoning 28 (2001) 51 – 89.
[21] L. A. Zadeh, The concept of linguistic variable and its application to approx-
imate reasoning, Information sciences 8 (1975) 199–249.
[22] L. A. Zadeh, A computational approach to fuzzy quantifiers in natural lan-
guages, Computing and Mathematics with Applications 9 (1983) 149–184.
[23] A. Bargiela, W. Pedrycz, Granular Computing: An Introduction, Kluwer
Academic Publishers, 2003.
[24] L. Lietard, A new definition for linguistic summaries of data, in: Proceed-
ings of International Conference on Fuzzy Systems (FUZZ-IEEE), IEEE,
2008, pp. 506–511.
[25] R. Castillo-Ortega, N. Marín, D. Sánchez, A fuzzy approach to the linguistic
summarization of time series, Journal of Multiple-Valued Logic and Soft
Computing (2011) 157–182.
[26] J. Kacprzyk, S. Zadrozny, Computing with words and Systemic Functional
Linguistics: Linguistic data summaries and natural language generation, in:
V.N. Huynh et al. (Eds.), Integrated Uncertainty Management and Applications,
AISC, Springer-Verlag Berlin, 2010, pp. 23–36.
[27] L. Martinez, Sensory evaluation based on linguistic decision analysis, Inter-
national Journal of Approximate Reasoning 44 (2007) 148 – 164.
[28] E. Herrera-Viedma, E. Peis, J. M. M. del Castillo, S. Alonso, K. Anaya,
A fuzzy linguistic model to evaluate the quality of web sites that store xml
documents, International Journal of Approximate Reasoning 46 (2007) 226
– 253.
[29] I. Glockner, Fuzzy quantifiers: a computational theory, Springer Verlag,
2006.
[30] F. Díaz-Hermida, A. Bugarín, S. Barro, Definition and classification of semi-
fuzzy quantifiers for the evaluation of fuzzy quantified sentences, Interna-
tional Journal of Approximate Reasoning 34 (2003) 49–88.
[31] S. Peters, D. Westerståhl, Quantifiers in language and logic, Oxford Univer-
sity Press, USA, 2006.
[32] K. Ogata, Modern control engineering, Prentice Hall, 2009.
[33] M. Delgado, D. Sánchez, M. Vila, Fuzzy cardinality based evaluation of
quantified sentences, International Journal of Approximate Reasoning 23
(2000) 23–66.
[34] C. Pappis, M. Sugeno, Fuzzy relational equations and the inverse problem,
Fuzzy sets and Systems 15 (1985) 79–90.
[35] M. Pereira-Fariña, L. Eciolaza, G. Trivino, Quality assessment of linguistic
description of data, in: ESTYLF 2012: XVII Congreso Español sobre
Tecnologías y Lógica Fuzzy, 2012.
[36] K. Korta, J. Perry, Pragmatics, 2011.
[37] M. A. K. Halliday, M. I. M. Matthiessen, Construing Experience through
Meaning: A Language-based Approach to Cognition, Cassell London, 1999.