Automatic Linguistic Reporting in Driving Simulation
Environments
Luka Eciolaza^a, M. Pereira-Fariña^b, Gracian Trivino^a
a European Centre for Soft Computing, Gonzalo Gutierrez Quiros s/n, 33600 Mieres, Asturias, Spain
b Centro de Investigación en Tecnoloxías da Información (CITIUS), Universidade de Santiago de Compostela, Galicia, Spain
Abstract
Linguistic data summarization targets the description of patterns emerging in data
by means of linguistic expressions. Just as human beings do, computers can use
natural language to represent and fuse heterogeneous data in a multi criteria de-
cision making environment. Linguistic data description is particularly well suited
for applications in which there is a necessity of understanding data at different
levels of expertise or human-computer interaction is involved. In this paper, an
application for the linguistic descriptions of driving activity in a simulation en-
vironment has been developed. In order to ensure safe driving practices, all new
onboard devices in transportation systems need to be evaluated. Work performed
in this application paper will be used for the automatic evaluation of onboard de-
vices. Based on Fuzzy Logic, and as a contribution to Computational Theory of
Perceptions, the proposed solution is part of our research on granular linguistic
models of phenomena. The application generates a set of valid sentences describ-
ing the quality of driving. Then a relevancy analysis is performed in order to
compile the most representative and suitable statements in a final report. Real
time-series data from a vehicle simulator have been used to evaluate the perfor-
mance of the presented application in the framework of a real project.
Keywords: Computational theory of perceptions, linguistic description of data,
driving simulators
Corresponding author
Email addresses: luka.eciolaza@softcomputing.es (Luka Eciolaza), martin.pereira@usc.es (M. Pereira-Fariña), gracian.trivino@softcomputing.es (Gracian Trivino)
Preprint submitted to Applied Soft Computing July 4, 2012
1. Introduction
Linguistic summarization represents an attempt to describe patterns emerging in data by means of linguistic expressions. It is intended, in general, for applications in which there is a strong human-machine interaction involving accessing and understanding data, such as supervision and control processes. Linguistic summarization is a data mining or knowledge discovery approach that provides an efficient and human-friendly framework for the analysis of data within a decision support environment.
There is a growing necessity for computational systems capable of provid-
ing linguistic descriptions of phenomena. New technologies allow acquiring and
archiving vast volumes of data about time-evolving phenomena and the amount of
information collected in different fields is overwhelming and ever growing. How-
ever, there is a lack of tools and means for processing and interpreting all this
information using computers. In order to be useful, this information must be ex-
plained in an understandable way, including facts that may be derived from the
data and the background knowledge available about the phenomenon under study.
Human specialists describe their perceptions using natural language (NL). NL al-
lows experts to make imprecise representations that summarize their perceptions of complex phenomena, choosing the most adequate degree of granularity in each circumstance to highlight the relevant aspects and to hide the irrelevant ones.
In our opinion, data should not be simply made accessible or summarized as
graphics, tables and simple linguistic variables, but the needed interpretation ar-
guments and conclusions should be provided and explained using NL. Nowadays,
any organization of data as provided by a computer, either in a numerical, categor-
ical, and/or graphical form, is just a tool that can be employed by human experts
to produce an explanation in NL. Understandable linguistic descriptions of phe-
nomena are provided by human experts, computers just being a tool for storing
and accessing data in a very flexible way. This is clearly a problem as the ratio
data/human experts is growing dramatically, since it is becoming easier and easier
to collect data, but providing a human being with expertise on a certain area re-
mains difficult and expensive. On the other hand, assessment by observation does
not meet the validity and reliability criteria necessary for any objective evaluation.
Generated expert reports should be based on unified criteria and be independent of the subjective variability of different experts [1] [2].
In summary, the formulation of data summaries can be seen as a very complex
and non-trivial data mining task, and there is a clear need for computational sys-
tems able to produce automatic linguistic descriptions of data about phenomena.
In this paper, we present our approach to implement a computational system
capable of providing linguistic descriptions of driving activity in simulation en-
vironments. The growth of vehicle onboard devices has dramatically increased drivers' attention to secondary tasks other than driving. There are many studies [3] [4] [5] analyzing this problem that demonstrate how the integration of onboard systems can drastically increase the risk of committing mistakes while driving, potentially resulting in traffic accidents. Therefore, there is a clear need to evaluate
the onboard systems in order to ensure safe driving practices.
Driving simulators are interesting tools to perform this evaluation. Simula-
tors can provide analysts with various information related to the driver-device
interaction in a variety of situations. Human factor experts are able to assess the
performance and effectiveness of the onboard devices through a variety of prede-
fined experiments and obtain a vast amount of data. Unfortunately, handling this huge amount of information is not easy, and performing a series of repetitive experiments can be very tedious.
Our research line belongs to the field of Fuzzy Logic, where for several years the scientific community has dealt with the problem of obtaining automatic linguistic summaries of all sorts of data [6] [7] [8] [9]. Our approach is based on the Computational Theory of Perceptions (CTP) [10] [11]. CTP provides a framework to develop computational systems with the capacity of computing with the meaning of NL expressions, i.e., with the capacity of computing with imprecise descriptions of phenomena in a similar way to how humans do it. In this paper we build on the concept of the Granular Linguistic Model of a Phenomenon (GLMP) [12] [13]. Our long-term project deals with generating more complex and useful linguistic descriptions than those obtained using existing technology.
State-of-the-art automatic linguistic description techniques in modern training systems based on fuzzy logic [2] consider expert knowledge recorded as simple linguistic variables. In our approach, we generate complete linguistic summary reports through the aggregation of all the information related to the analyzed phenomena. On the other hand, works in the area of Natural Language Generation systems [14] contemplate the generation of complete summary reports; however, they do not consider the quantification of validity degrees for the generated sentences, and therefore it is difficult for them to deal with sensor input data and uncertain information.
The work presented in this article builds upon previous works presented in [12] and [13]. This work represents the last step of a series of prototypes of increasing complexity. Each prototype addressed a different set of problems on the way to the final goal of automating the onboard device evaluation process. Thus, in [12] the quality of driving during selected simulation segments was summarized into a set of fixed sentences that were later compared with questionnaires filled in by experts. In the prototype presented in [13] the generated reports were sequential descriptions of events within dynamic or time-evolving data, and there was an initial stage of validity analysis performed by the GLMP followed by a relevancy analysis to select the appropriate sentences for the final report.
The application presented here is characterized as follows: (a) it is focused on the detection of risk events during the driving activity; (b) it compares intervals of interaction and non-interaction between the driver and the onboard systems; and (c) it generates a linguistic report describing the overall quality of driving and determining the distraction potential of the analyzed onboard systems. The report includes hypotheses about the relation between the detected risk events and the distraction levels, all based on the input/output signals of the simulator.
Once calibrated, the automatically generated document should complement or
replace the evaluation reports generated by human experts.
In this paper, we present the complete linguistic model and report template that describe the interaction between the driver and the onboard devices in a vehicle. We measure the potential distraction levels following the requirements defined within the HITO project; for this purpose, an index of potential distraction is defined. As for the quality assessment of the generated text, we present our approach to evaluating the reports' usability from the point of view of human consumption, identifying different indicators that define the quality of a linguistic report.
Section 2 provides a brief review of the state of the art in the different subjects involved. Section 3 introduces the concept of the Granular Linguistic Model of a Phenomenon. The definitions related to its various components are included, and the process of sentence generation with the respective validity degrees is described. Section 4 describes the report generation task through the compilation of the most relevant sentences. Section 5 explains in detail the different parameters involved in the application; it shows the practical implementation of the GLMP for the report generation, and a definition for the computation of the Index of Potential Distraction is proposed. Sections 6 and 7 describe the experimentation and the validation of the obtained reports, respectively. Finally, Section 8 contains the conclusions.
Figure 1: CABINTEC: The simulator for Intelligent Transportation Systems.
2. Background Information
This section presents a brief review of the various domains involved in this work. Information about state-of-the-art practices within these domains will help the reader better understand the extent and potential of the paper.
2.1. Intelligent Transportation Systems and HITO
Road transportation is changing rapidly due to technological evolution. The field of Intelligent Transportation Systems (ITS) is focused on the minimization of traffic congestion, the optimization of traffic management, and the improvement of road safety in general.
The increasing capability to acquire, archive and share huge amounts of heterogeneous information allows the development of a large number of vehicle onboard systems aimed at improving the overall driving experience, with applications for driving assistance, entertainment or information systems. Apart from traffic management and road safety applications, in the last decade technologies to monitor and control driver conditions, such as fatigue or stress, have attracted increasing interest, and there are numerous examples of these devices on the market [15].
As stated above, the growth of vehicle onboard devices implies an increased risk of distractions, potentially resulting in mistakes while driving. Vehicle onboard systems must be designed with the priority of ensuring and meeting safety requirements, focusing on clear and intuitive interfaces. Therefore, there is a clear need to evaluate the onboard systems in order to ensure safe driving practices.
Driver distraction is an important subject of study, both for research investigating human multitasking abilities and for practical purposes in developing and constraining new onboard devices. See, for example, the research performed in [16] [17], focused on the modeling and prediction of driver distraction. The analysis of driving performance data, such as vehicle deviation from the lane or speed control with respect to an accelerating and braking lead vehicle, can be used to accurately model driver distraction. These works quantify how the vehicle's deviation from the lane center increases during periods of inattention and how the vehicle returns to the lane center during periods of active steering.
This paper is part of a multidisciplinary long-term project with the aim of developing an "Intelligent cabin for road transportation of goods and passengers". Representative parameters of the vehicle, the environment and driver behavior are collected and evaluated in order to know, depending on the nature and criticality of the information, WHAT, HOW, WHEN and WHERE the onboard systems' information should be placed in the driving desk within the vehicle. The project aims to design a safe, usability-oriented and scalable framework for the integration of new onboard devices.
In particular, HITO (Human Interface by Technology Observation) [18] aims
to develop an effective methodology and framework to evaluate onboard devices
used in road transport in order to optimize reliability levels and to guarantee an er-
gonomic and safe workplace design. For this, HITO proposed the development of
a methodology to measure potential distraction levels caused by onboard systems.
This methodology consists of a series of experiments or exercises in a simulation environment designed by experts. Initially, the exercises will be analyzed and reported manually by a team of human factors researchers for the evaluation of each device. Finally, the project will also address the essential task of automating the developed methodology through a software application.
Modern training systems in different areas are now incorporating automated evaluation functionality. Evaluations are based on expert knowledge, and the metrics used are typically recorded as linguistic variables that can take values such as low, medium, high or other comparable terms. The work presented by Riojas et al. [2] is an interesting and up-to-date reference on applying Fuzzy Logic in this field. In our approach, we aim to generate complete linguistic summary reports and not only simple linguistic variables describing the simulation activity.
2.2. Computational Theory of Perceptions
CTP was introduced in Zadeh's seminal paper "From computing with numbers to computing with words - from manipulation of measurements to manipulation of perceptions" [10] and further developed in subsequent papers [11] [19] [20]. CTP provides a framework to implement computational systems with the capacity of computing with the meaning of NL expressions, i.e., with the capacity of computing with imprecise descriptions of the world in a similar way to how humans do it. According to CTP, our perception of the world is granular. A granule underlies the concept of a linguistic variable [19]. A linguistic variable is a variable whose values are linguistic labels, i.e., words or sentences in NL [21]. In this approach, a fuzzy linguistic label can be viewed as a linguistic summary of numerical data, e.g., a set of temperature values is labeled as Medium. The definition of the linguistic label includes the concept of "degree of validity" to describe each element in the set [21].
2.3. Linguistic Summarization of Data
Linguistic summarization [6] [8] [22] is closely related to CTP and hence to the Granular Computing paradigm. According to [23], information granules are conceptual entities which emerge from the needs of humans in a continuous quest for abstraction and summarization of information. Linguistic summarization seeks to process complex information and describe emerging patterns through linguistic expressions and the manipulation of information granules in the form of words. Information granulation in the form of NL is used to describe and understand complex phenomena at different levels of resolution or scales. Similar to human reasoning, NL representation is used for multimodal data fusion.
The idea of linguistic fuzzy quantifiers was introduced by Zadeh in [22]. The
concept of linguistic fuzzy summary was introduced in [6] and further developed
in [8]. A fuzzy linguistic summary is a set of sentences which express knowl-
edge about a situation through the use of fuzzy linguistic summarizers and fuzzy
linguistic quantifiers.
The basic concept of a fuzzy linguistic summary has the general form of a quantified fuzzy proposition [6] [22]: (w, "Q objects in database are S"), where Q is called the quantifier, S is the qualifier, also called the summarizer, and w is the degree of validity of the linguistic clause for representing the meaning in the specific context. For example: (0.7, "sometimes the driving quality has been low"). In recent years, this basic concept has been developed in different ways [24] and used for different applications, e.g., data mining [7], database querying [8], and the description of temporal series [25]. See [26] for a review of the state of research in this field.
In recent years, researchers in the field of computing with words and perceptions have developed an important set of resources to represent the meaning of perceptions for making decisions in specific applications [27] [28]. The basic fuzzy linguistic summary covers a very small part of the possibilities of meaning in NL [29] [30] [31]. Each application presents particular challenges for the computation of the validity degrees of the different types of linguistic expressions.
3. Granular Linguistic Model of a Phenomenon (GLMP)
In this section, we introduce the components of the GLMP, our approach based on CTP for developing computational systems able to generate linguistic descriptions of data. In our research line, one of the contributions of this paper consists of identifying three types of Computational Perceptions (CP), namely Assertive, Derivative and Integrative, which are extensively used here to linguistically model the evolution of phenomena in time.
3.1. Computational Perception
A CP is the computational model of a unit of information acquired by the designer about the phenomenon to be modeled. In general, CPs correspond with specific parts of the phenomenon at certain degrees of granularity. A CP is a couple (A, W) where:

A = (a_1, a_2, ..., a_n) is a vector of linguistic expressions (words or sentences in NL) that represents the whole linguistic domain of the CP. Each a_i describes the value of the CP in each situation with a specific degree of granularity. These sentences can be either simple, e.g., a_i = "The vehicle speed is high", or more complex, e.g., a_i = "During interaction sometimes the manoeuvre execution has been bad".

W = (w_1, w_2, ..., w_n) is a vector of validity degrees w_i ∈ [0, 1] assigned to each a_i in the specific context. w_i is the degree in which a_i is valid to describe a situation.
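To make this definition concrete, the following minimal Python sketch (ours, not taken from the paper's implementation) represents a CP as the couple (A, W); the expressions and validity degrees are purely illustrative.

from dataclasses import dataclass
from typing import List

@dataclass
class CP:
    """Computational Perception: a couple (A, W) of linguistic expressions
    and their validity degrees in [0, 1]."""
    A: List[str]    # linguistic expressions a_1 ... a_n
    W: List[float]  # validity degrees w_1 ... w_n

    def best(self) -> str:
        """Return the most valid expression together with its degree."""
        i = max(range(len(self.W)), key=lambda j: self.W[j])
        return f'({self.W[i]:.2f}, "{self.A[i]}")'

# Illustrative CP about the vehicle speed at one time instant
speed_cp = CP(
    A=["The vehicle speed is Null", "The vehicle speed is Slow",
       "The vehicle speed is Medium", "The vehicle speed is Fast"],
    W=[0.0, 0.2, 0.8, 0.0],
)
print(speed_cp.best())  # -> (0.80, "The vehicle speed is Medium")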
3.2. Perception Mapping (PM)
We use PMs to create and aggregate CPs. There are many types of PMs and this paper explores several of them. A PM is a tuple (U, y, g, T) where:

U is a vector of input CPs, U = (u_1, u_2, ..., u_n), where u_i = (A_{u_i}, W_{u_i}). In the special case of first-order Perception Mappings (1PMs), these are the inputs to the GLMP and they are values z ∈ R either provided by a physical sensor or obtained from a database.

y is the output CP, y = (A_y, W_y) = {(a_1, w_1), (a_2, w_2), ..., (a_{n_y}, w_{n_y})}.

g is an aggregation function employed to calculate the vector of validity degrees assigned to each element in y, W_y = (w_1, w_2, ..., w_{n_y}). It implements the aggregation of input vectors, W_y = g(W_{u_1}, W_{u_2}, ..., W_{u_n}), where W_{u_i} are the degrees of validity of the input perceptions. In Fuzzy Logic many different types of aggregation functions have been developed; for example, g could be implemented using a set of fuzzy rules. In the case of 1PMs, g is built using a set of membership functions as follows:

W_y = (\mu_{a_1}(z), \mu_{a_2}(z), \ldots, \mu_{a_{n_y}}(z)) = (w_1, w_2, \ldots, w_{n_y})

where W_y is the vector of degrees of validity assigned to each a_i, and z is the input data.

T is a text generation algorithm that allows generating the sentences in A_y. In simple cases, T is a linguistic template, e.g., "The temperature in the room is {high | medium | low}".

Figure 2: Example of GLMP.
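As an illustration of a first-order PM, the sketch below (ours) fuzzifies a numerical input z with membership functions and verbalizes it with the temperature template mentioned above; the trapezoidal breakpoints are assumed values chosen only for the example.

from typing import Callable, Dict, List, Tuple

def trapezoid(a: float, b: float, c: float, d: float) -> Callable[[float], float]:
    """Trapezoidal membership function with support [a, d] and core [b, c]."""
    def mu(z: float) -> float:
        if z <= a or z >= d:
            return 0.0
        if b <= z <= c:
            return 1.0
        return (z - a) / (b - a) if z < b else (d - z) / (d - c)
    return mu

class FirstOrderPM:
    """1PM: the input is a raw value z, g is a set of membership functions,
    and T is a linguistic template."""
    def __init__(self, template: str, mus: Dict[str, Callable[[float], float]]):
        self.template = template
        self.mus = mus

    def __call__(self, z: float) -> List[Tuple[str, float]]:
        # W_y = (mu_a1(z), ..., mu_any(z)); A_y is produced by the template T
        return [(self.template.format(label), mu(z)) for label, mu in self.mus.items()]

# Assumed breakpoints (degrees Celsius), only for illustration
room_pm = FirstOrderPM(
    "The temperature in the room is {}",
    {"low": trapezoid(-20, -10, 15, 20),
     "medium": trapezoid(15, 20, 24, 28),
     "high": trapezoid(24, 28, 50, 60)},
)
print(room_pm(26.0))  # partial validity of "medium" and "high"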
3.3. Structure of the GLMP
The GLMP consists of a network of PMs. Each PM receives a set of input
CPs and transmits upwards a CP. We say that each output CP is explained by the
PM using a set of input CPs. In the network, each CP covers specific aspects of
the phenomenon with certain degree of granularity. Fig. 2 shows an example of
GLMP. In this example, at every point in time, the phenomenon can be described
at a very basic level in terms of three variables z_1, z_2, and z_3. These variables are verbalized through the 1PMs {p^1_1, p^1_2, p^1_3}.
Using different aggregation functions and linguistic expressions, the GLMP paradigm allows the designer to computationally model his/her perceptions. In the case of Fig. 2, from the outputs of the 1PMs, two other higher-level descriptions of the phenomenon are derived. These descriptions are given in the form of the computational perceptions CP_4 and CP_5, which are explained by the 2PMs {p^2_4, p^2_5} in terms of CP_1, CP_2, and CP_3. The validity of each item in CP_4 and CP_5 is explained by those items of CP_1, CP_2 and CP_3. Finally, the top-order description of the phenomenon is provided, at the highest level of abstraction, by CP_6, explained by the 2PM {p^2_6} in terms of CP_4 and CP_5. Notice that, by using this structure, one can provide a linguistic description of the phenomenon at different levels, from the very basic level to the highest or most general level of granularity.
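The following compact sketch (ours) illustrates how validity degrees propagate through such a network; the CPs are represented simply as dictionaries of label validities, and the aggregation rules are toy examples, not those of the application.

from typing import Dict

# A CP is represented here as {linguistic label: validity degree};
# a PM is a function that aggregates input CPs into an output CP.
CP = Dict[str, float]

def pm_quality(steering: CP, linearity: CP) -> CP:
    """Toy 2PM: quality is Low if either input is Low (OR as maximum),
    and High only if both inputs are High (AND as minimum)."""
    return {"Low": max(steering["Low"], linearity["Low"]),
            "High": min(steering["High"], linearity["High"])}

def pm_top(quality: CP) -> CP:
    """Toy top-order PM that relabels the second-level perception."""
    return {"The session was bad": quality["Low"],
            "The session was good": quality["High"]}

# Validity degrees coming up from two 1PMs at one time instant
cp_steering = {"Low": 0.2, "High": 0.8}
cp_linearity = {"Low": 0.7, "High": 0.3}

cp_quality = pm_quality(cp_steering, cp_linearity)  # second level
cp_top = pm_top(cp_quality)                         # top level
print(cp_quality)  # {'Low': 0.7, 'High': 0.3}
print(cp_top)      # {'The session was bad': 0.7, 'The session was good': 0.3}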
3.4. Types of CP
For this application, as initially introduced in [13] and inspired by classical Control Theory [32], we focus on the perception of three important characteristics of the evolution of phenomena, namely, the perception of the current state (assertive CP), the perception of the trend of evolution (derivative CP) and the summary of accumulated perceptions (integrative CP).
3.4.1. Assertive CP
It is associated with a linguistic expression of type “Y is A”. It represents the
linguistic fuzzy model of the current state of a characteristic of the phenomenon,
e.g., “The Distance to the Vehicle in front is High ”.
3.4.2. Derivative CP
Derivative CPs correspond to trend analysis information and give insight into how the phenomenon is evolving in time. This helps contextualize the information and may be important for decision making.
The following example sentences clearly show the importance of the derivative information, and how it can completely change the context in which a certain decision must be taken.
“The Distance to the Vehicle in front is Medium ”.
“The Distance to the Vehicle in front is Medium and Increasing ”.
“The Distance to the Vehicle in front is Medium and Rapidly Decreasing ”.
In this work, we only use derivative 1CPs, i.e., they are directly obtained from the
input signals of the different sensors.
In the Derivative PM, U is a time-series signal z = {z(k−l+1), ..., z(k−1), z(k)} of length l, obtained directly from the sensor input data, where k represents the current sample and l is defined by the designer, e.g., to filter the noise of the input signal. z_d is the relative change between samples and it is calculated as follows:

z_d = 100 \times \frac{z(k) - z(k-l+1)}{\bar{z}}

where the average of z is

\bar{z} = \frac{1}{l} \sum_{i=1}^{l} z(k-i+1)

The relative change z_d is directly used to describe the trend of the perceptions linguistically. Here, T is defined by the template: "The value of the attribute is {Rapidly Decreasing | Decreasing | Steady | Increasing | Rapidly Increasing}".
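A minimal sketch (ours) of this Derivative PM in Python is given below; the percentage thresholds used to verbalize z_d are assumptions for illustration, and the crisp labelling stands in for the fuzzy verbalization used in the application.

from typing import List, Tuple

def derivative_cp(window: List[float]) -> Tuple[str, float]:
    """Compute the relative change z_d over a window of l samples,
    window[0] = z(k-l+1) ... window[-1] = z(k), and verbalize it."""
    l = len(window)
    z_bar = sum(window) / l                          # mean of the window
    z_d = 100.0 * (window[-1] - window[0]) / z_bar   # relative change (%)
    # Assumed thresholds (in %), chosen only for illustration
    if z_d < -10:
        label = "Rapidly Decreasing"
    elif z_d < -2:
        label = "Decreasing"
    elif z_d <= 2:
        label = "Steady"
    elif z_d <= 10:
        label = "Increasing"
    else:
        label = "Rapidly Increasing"
    return f"The value of the attribute is {label}", z_d

# Example: the distance to the vehicle in front shrinks over the window
print(derivative_cp([50.0, 48.0, 45.0, 41.0, 36.0]))
# -> ('The value of the attribute is Rapidly Decreasing', -31.8...)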
3.4.3. Integrative CP
The Integrative CP represents the accumulated perception of the phenomenon
over a period of time. The text associated with these perceptions consists of
summary sentences of historical event occurrences, and answers the question of
“Which is usually the state of the Parameter”, i.e.: “Q of Ys are A”.
The typical template for the answer could be:
{Never |A few times |Sometimes |Many times |Most of the time |Always}, the
parameter was {Low |Medium |High}.
The accumulated perception may be very important for decision making. The following examples, combining assertive and integrative sentences, show how the integrative information can completely change the context in which the decision must be taken.
”The Vehicle Linearity is Medium ”.
”The Vehicle Linearity is Medium and Most of the time it has been High”.
”The Vehicle Linearity is Medium and already Sometimes it has been
Medium ”.
In the case of driving quality assessment, the last example sentence could indicate a distraction problem, since it suggests that medium vehicle linearity events are common and therefore that there have been many slight distractions.
The definition of an Integrative PM corresponds to the tuple (U, y, g, T). In this case, U is an input CP over a time window. The designer sets the parameter l defining the length of the time window from which the temporal series of l samples is obtained, U = {u(k−l), ..., u(k)}, where k represents the current sample.
Consider the case of an attribute defined by three linguistic labels, e.g., {Low, Medium, High}. The output CP y = (A_y, W_y) for the Integrative PM will be expressed by eighteen possible sentences {(a_1, w_1), ..., (a_18, w_18)}, the combinations of the six linguistic quantifiers Q = {Never, A few times, Sometimes, Many times, Most of the time, Always} and the input linguistic labels A_i = {Low, Medium, High}.
g is an aggregation function computed as a quantified sentence. There are many different approaches for evaluating quantified sentences. In this work we have used the α-cut based method called GD introduced in [33] instead of the basic approach of quantified fuzzy propositions [6] [22], where the weights w associated with the linguistic expressions are computed as fuzzy cardinalities. The GD method has been used due to its efficiency and non-strict character. The method also fulfills some interesting properties related to relative quantifiers defined in [33].
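The GD method of [33] is not reproduced here; instead, the sketch below (ours) evaluates a quantified sentence with a simpler relative sigma-count and trapezoidal relative quantifiers, whose shapes are assumed, only to show the kind of computation an Integrative PM performs.

from typing import Callable, Dict, List

def relative_sigma_count(weights: List[float]) -> float:
    """Proportion of the time window in which the label held."""
    return sum(weights) / len(weights)

def quantifier(a: float, b: float, c: float, d: float) -> Callable[[float], float]:
    """Trapezoidal relative quantifier defined on a proportion r in [0, 1]."""
    def q(r: float) -> float:
        if r <= a or r >= d:
            return 0.0
        if b <= r <= c:
            return 1.0
        return (r - a) / (b - a) if r < b else (d - r) / (d - c)
    return q

# Assumed quantifier shapes, chosen only for illustration
QUANTIFIERS: Dict[str, Callable[[float], float]] = {
    "Never": quantifier(-0.1, 0.0, 0.0, 0.1),
    "A few times": quantifier(0.0, 0.1, 0.2, 0.35),
    "Sometimes": quantifier(0.2, 0.35, 0.5, 0.65),
    "Many times": quantifier(0.5, 0.65, 0.8, 0.9),
    "Most of the time": quantifier(0.8, 0.9, 0.99, 1.0),
    "Always": quantifier(0.95, 1.0, 1.0, 1.1),
}

# w_Low(t) of the Driving Quality over a time window
w_low = [0.0, 0.1, 0.9, 1.0, 0.8, 0.0, 0.0, 0.2, 0.0, 0.0]
r = relative_sigma_count(w_low)
for name, q in QUANTIFIERS.items():
    if q(r) > 0:
        print(f'({q(r):.2f}, "{name}, the Driving Quality was Low")')
# -> (0.33, "A few times, ...") and (0.67, "Sometimes, ...")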
4. Report Generation
Using the GLMP defined in the previous section, we can generate a set of valid sentences describing the phenomenon at different levels of granularity or detail. The GLMP is built during a design stage in which a corpus of NL expressions that are typically used in the application domain is collected. These expressions describe the relevant features of the analyzed phenomena. The sentences describe each perception at every temporal sample, and the overall states of the perceptions are described through quantified sentences.
Figure 3: Report generation diagram.
A medium-sized GLMP can generate a huge number of sentences describing a particular phenomenon. In the case of the driving simulation sessions analyzed in this work, the number of generated sentences can amount to hundreds of thousands for a normal simulation exercise. It is therefore critical to perform a relevancy analysis in order to select and compile the relevant sentences into one document highlighting the interesting characteristics of a simulation.
The report describing the temporal evolution of a phenomenon is obtained from the instantiation of the input data following a customized report template. The template of the reports is defined considering the particular needs of the users in order to highlight relevant aspects. For the application presented in this paper, a report template has been created in collaboration with human factors experts, and it will be explained in more detail in Section 5.4.
Fig. 3 shows the diagram followed for the automatic report generation. Initially, within the validity analysis, the full set of valid sentences describing the analyzed phenomenon is created. In a second stage, a logic of relevant sentence selection is implemented based on the customized template for the final report. The computational system selects, among the available possibilities, the most suitable linguistic expressions to describe the input data.
In this paper, in order to explain the possible causes of detected incidents, the report template includes the diagnosis of occurring events, which implies the need to solve the inverse problem concerned with fuzzy relations [34]. We have implemented a linguistic approach to present the solutions, where the variables inferred in the diagnosis are explained based on the NL sentences generated in the previous section (see the application example in Section 5.4).
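The relevancy logic itself is template-driven and application-specific; as a plausible simplification (ours, not the project's actual rule), the sketch below keeps, for each CP, its most valid sentence and includes it in the report only if that validity exceeds a threshold.

from typing import Dict, List, Tuple

# Output of the validity analysis: for each CP, its (sentence, validity) pairs
ValidityAnalysis = Dict[str, List[Tuple[str, float]]]

def select_relevant(analysis: ValidityAnalysis, threshold: float = 0.5) -> List[str]:
    """Keep, for each CP, the most valid sentence if it passes the threshold."""
    lines = []
    for sentences in analysis.values():
        sentence, validity = max(sentences, key=lambda sw: sw[1])
        if validity >= threshold:
            lines.append(f"{sentence} (validity {validity:.2f})")
    return lines

analysis = {
    "Driving Quality during interaction": [
        ("During interaction, sometimes the Driving Quality has been Low", 0.7),
        ("During interaction, most of the time the Driving Quality has been Low", 0.1),
    ],
    "Security Distance": [
        ("The Security Distance is Low", 0.3),
        ("The Security Distance is Medium", 0.4),
    ],
}
print("\n".join(select_relevant(analysis)))  # only the first CP passes the threshold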
5. Application: Linguistic description of potential distraction levels of on-
board devices
In the HITO project, driving simulators have been used in order to establish a methodology for the evaluation of new vehicle onboard devices. This methodology analyzes driver-device interaction in a variety of situations in order to assess the index of potential distraction (IPD) of the new device. The designed exercises require a sufficient degree of concentration from the driver and, at the same time, demand interaction with different onboard devices. The amount of data generated in the experiments, with a number of drivers in various different situations, is huge, and so the manual assessment process requires many human resources.
In this work, linguistic descriptions of driving simulation exercises are generated automatically. The automated onboard device evaluation process will save time and resources and it will generate reports based on unified expert criteria. The generated descriptions will either replace or complement manually generated expert reports, and they will be mainly focused on the following aspects:
- Detection of distraction events during the exercise.
- Comparison of intervals with and without interaction between the driver and onboard devices.
- Generation of a linguistic report describing the overall quality of driving, providing an IPD of the analyzed onboard devices.
5.1. Driving simulator: monitored parameters
The vehicle simulator provides different parameters that are used to determine the driving quality and the IPD of the analyzed devices over the simulation exercises. These input parameters are related to the driving, the controls of the vehicle and the simulation environment. They are numerical values z ∈ R described as follows:

z_1 Vehicle Speed: Principal vehicle speed (km/h).
z_2 Lateral Position: Distance from the central point of the vehicle to the right edge of the track where it is (m).
z_3 Track Width: Width of the track of circulation (cm).
z_4 TTLC: Time to cross the line on the edge of the road (in seconds) (Lateral Position/Lateral Speed).
z_5 HE: Angle between the tangents of the vehicle position and the road (degrees).
z_6 Steering Wheel Position: Measured steering wheel turning angle (degrees).
z_7 Percentage of brake usage: Percentage of actuation over the brake pedal (%).
z_8 Percentage of accelerator usage: Percentage of actuation over the acceleration pedal (%).
z_9 Road Slope: Slope or inclination of the road (%).
z_10 Distance to the Vehicle in front: (m).
z_11 Vehicle Overtaking: Overtaking situation of the vehicle (boolean).
z_12 Retarder: Use of retarder, a hydraulic vehicle braking system (%).
z_13 Speed of the Vehicle in front: (km/h).

Figure 4: Examples of signals obtained from the simulator CABINTEC.
Obviously, it is possible to extract a lot of useful information from these data. The detection of risk events and the description of the driving activity through the linguistic summarization of the above data inputs is only a first approach towards the final automated application. For the comparison of interaction and non-interaction intervals, a correlation between the appearance of risk events and the manipulation of the onboard devices is computed. The information provided by the onboard devices may vary depending on the device, but we are always able to determine when the interaction starts and ends within the simulation exercise. Therefore, we created a time-series signal z_int representing the interaction intervals, as follows:

z_int = 1 if interaction is active; 0 if interaction is inactive    (1)
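The sketch below (ours, with an assumed interface) builds z_int from a list of interaction start/end times reported by the onboard device, sampled at the simulator rate of 50 Hz.

from typing import List, Tuple

def interaction_signal(intervals: List[Tuple[float, float]],
                       duration_s: float,
                       sample_rate_hz: float = 50.0) -> List[int]:
    """Build z_int: 1 while the driver interacts with the device, 0 otherwise."""
    n_samples = int(duration_s * sample_rate_hz)
    z_int = [0] * n_samples
    for start, end in intervals:
        k_start = int(start * sample_rate_hz)
        k_end = min(int(end * sample_rate_hz), n_samples)
        for k in range(k_start, k_end):
            z_int[k] = 1
    return z_int

# Hypothetical 600 s exercise with two interaction periods (times in seconds)
z_int = interaction_signal([(95.0, 120.0), (300.0, 340.0)], duration_s=600.0)
print(sum(z_int) / 50.0, "s of accumulated interaction")  # -> 65.0 s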
5.2. GLMP: Index of potential distraction of an onboard device
The GLMP in this application is designed to answer the general question of what the IPD of a given onboard device is (see Fig. 9). This linguistic model is an enhancement of the one presented in [13], where the description of the Driving Quality was performed. In this case the Driving Quality information is combined with the information of the analyzed onboard device in order to describe and evaluate the IPD.
5.2.1. 1CPs
In total, there are 14 input variables for the GLMP: 13 signals from the simulator and one signal defining the interaction with the device. Therefore, there are 14 1CPs, with 1PMs {p^1_1, p^1_2, ..., p^1_14}. Each 1PM is defined by the tuple (U, y, g, T). As an example, the 1PM of Vehicle Speed is developed, where:

U is the input value z_1 provided by the speed sensor.

y is a variable of type 1CP describing the Vehicle Speed. Its value is expressed by linguistic sentences and their corresponding weights of validity as follows: (Null, w_Null), (Slow, w_Slow), (Medium, w_Medium), (Fast, w_Fast). Here, e.g., Slow stands for the complete linguistic expression "The vehicle speed is Slow".

g is the function W_y = (\mu_{Null}(z), \mu_{Slow}(z), \mu_{Medium}(z), \mu_{Fast}(z)), where the \mu_i are membership functions relating the linguistic labels with the sensor's numerical value z.

T is the template "The vehicle speed is {Null | Slow | Medium | Fast}".

Trapezoidal membership functions have been used to cover the domain of values of the different input parameters z_i. The definition of the membership functions has been made using expert knowledge. Fig. 5 shows several examples of the trapezoidal membership functions used.
Figure 5: Examples of trapezoidal linguistic labels used for the fuzzification of 1CPs: Vehicle Speed (km/h) with labels {Null, Slow, Medium, Fast}, Steering Wheel Position (degrees) with labels {Strong Left, Soft Left, Centered, Soft Right, Strong Right}, and Percentage of accelerator usage (%) with labels {Null, Low, Medium, High}.
5.2.2. 2CP
According to the definition provided by the team of human factors experts involved in the HITO project, we describe and evaluate the quality of driving using three parameters, namely, Steering Wheel Control, Vehicle Linearity and Security Distance. These three variables can give insight into the risk events and the distraction levels while driving. They are 2CPs that are derived directly from the 1CPs described above. We used sets of fuzzy IF-THEN rules for each corresponding 2PM (named p^2_1, p^2_2, p^2_3 in Fig. 9).

y^2_1 The Steering Wheel Control is the output of p^2_1, a function of the 1CPs (y_1 ... y_11) and their derivatives (y_d1 ... y_d11). The loss of steering wheel control is defined as an abrupt manoeuvre, unusual in the vehicle direction control, and causing a safety risk. Sudden track changes, big oscillations in the vehicle direction, and out-of-track circulation could all be indicators of a loss of steering wheel control.

y^2_2 The Vehicle Linearity, output of p^2_2, is also a function of the 1CPs (y_1 ... y_11) and their derivatives (y_d1 ... y_d11). It refers to the uniformity of the vehicle trajectory.

y^2_3 Security Distance, output of p^2_3, refers to the distance with respect to the vehicle in front. It is defined depending on the Highway Code, road conditions and environmental conditions. This 2CP is a function of the 1CPs (y_7 ... y_13) and their derivatives (y_d7 ... y_d13).

On the other hand, Driving Quality is also a 2CP in the GLMP. Any deviation in the three parameters mentioned above suggests a distraction problem and a degraded quality of driving.

y^2_4 Driving Quality is the output of p^2_4. It is defined using the CPs (y^2_1, y^2_2, y^2_3). It determines the quality of driving at every instant during the simulation.
Each 2PM is defined by the tuple (U, y, g, T). As an example, the 2PM of Security Distance (y^2_3) is developed, where:

U is a set of input CPs, U = (u_1, ..., u_n), where u_i = (A_i, W_i) are the output CPs of the 1PMs {p^1_7, p^1_8, ..., p^1_13} and their derivatives, U = {y_7, ..., y_13, y_d7, ..., y_d13}.

y is a variable of type 2CP describing the Security Distance. Its possible value is expressed by linguistic sentences and their corresponding weights of validity as follows: (Low, w_Low), (Medium, w_Medium), (High, w_High). Here, e.g., Low stands for "The Security Distance is Low".

g is the aggregation function W_y = g(W_y7, ..., W_y13, W_yd7, ..., W_yd13), where W_y is the vector (w_Low, w_Medium, w_High) of validity degrees of the perception's linguistic labels in y^2_3, and W_yi are the degrees of validity of the input computational perceptions.

T is the template "The Security Distance is {Low | Medium | High}".

The aggregation function W_y = g(W_y7, ..., W_y13, W_yd7, ..., W_yd13) has been implemented using an expert set of fuzzy IF-THEN rules. In these rules, the operator AND has been implemented through the minimum, while the operator OR has been implemented through the maximum. The following rules are an example of some of the rules used to define the Security Distance (see Tables 1 and 2 for a quick reference to the names of the CPs).

- IF (y_7 is Medium) AND (y_13 is Low) AND (y_10 is Medium) AND (y_9 is Strong Descendent) THEN y^2_3 is Low
- IF (y_7 is Medium) AND (y_13 is Medium) AND (y_10 is Very Small) THEN y^2_3 is Low
- IF (y_10 is Very Small) AND (y_13 is Low) AND (y_12 is not 0) THEN y^2_3 is Low
- IF (y_10 is Very Small) AND (y_13 is Low) AND (y_7 is not Null) THEN y^2_3 is Low
- IF (y_10 is Low) AND (y_d3 is Rapidly Decreasing) THEN y^2_3 is Low
- IF (y_10 is Low) AND (y_d3 is Decreasing) THEN y^2_3 is Low
- IF (y_13 is Low) AND (y_10 is Medium) THEN y^2_3 is Medium
- IF (y_10 is Low) AND (y_d10 is Slowly Decreasing) THEN y^2_3 is Medium
- IF (y_10 is Big) THEN y^2_3 is High
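To show how such a rule base is evaluated with AND as minimum and OR as maximum, here is a small sketch (ours); the membership degrees are illustrative values, not simulator data.

from typing import Dict

# Validity degrees of some input 1CP labels at one time instant (illustrative)
w: Dict[str, Dict[str, float]] = {
    "y10": {"Very Small": 0.6, "Medium": 0.0, "Big": 0.0},  # distance to vehicle in front
    "y13": {"Null": 0.1, "Low": 0.8, "Medium": 0.1},        # speed of vehicle in front
    "y7":  {"Null": 0.0, "Low": 0.2, "Medium": 0.7},        # percentage of brake usage
}

def fuzzy_and(*degrees: float) -> float:
    """Fuzzy AND implemented as the minimum."""
    return min(degrees)

# Rule: IF (y10 is Very Small) AND (y13 is Low) AND (y7 is not Null)
#       THEN Security Distance is Low
fired = fuzzy_and(w["y10"]["Very Small"], w["y13"]["Low"], 1.0 - w["y7"]["Null"])

# Rules sharing the same consequent are combined with OR (maximum)
w_security_low = max(fired, 0.0)
print(f"w_Low(Security Distance) = {w_security_low:.2f}")  # -> 0.60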
Fig. 6 shows an example of the evolution of the label Low at the 2CPs (y^2_1, y^2_2, y^2_3 and y^2_4) over a period of time within a particular simulation in which a loss of driving quality happens.

Figure 6: Examples of instantaneous 2CP signals over a period of time. a) Shows the progression of the weights (w_Low) corresponding to the label "a_Low" of the parameters y^2_1, y^2_2, y^2_3. b) Shows the progression of w_Low corresponding to the parameter y^2_4, i.e., Driving Quality.
Finally, two more 2CPs have been defined in order to determine the driving quality with and without interaction activity.

y^2_i5 Driving Quality during interaction. This perception is an integrative CP and it has been computed from the inputs (y^2_4 and y^1_11). It provides the accumulated perception of the driving quality over the active interaction periods during the simulation exercise. The aggregation function g is the aggregation method GD that provides quantified sentences, as described in Section 3.4.3.

y^2_i6 Driving Quality during non-interaction. This perception is also an integrative CP and it has been computed from the inputs (y^2_4 and y^1_11) in order to obtain the accumulated perception of the driving quality while there has been no interaction over the simulation exercise.

Although these CPs are not used directly to define the top-order perception, their output sentences will be required for the final report. Fig. 7 shows an example of the quantified sentences of the integrative CPs referring to the quality of driving.

Figure 7: Example of CP y^2_i5 or y^2_i6. The weights of the quantified sentences and the resulting output sentences.
The most relevant question to be answered by the 2CPs mentioned in this section concerns the reasons behind a low overall driving quality. Thus, a typical description statement generated with them could be something like:
"During interaction, sometimes the driving quality has been low, because the vehicle linearity has been low and ..."
5.2.3. Top order CP: IPD
In this application, the top-order CP is the IPD that the analyzed onboard device has over the driver. This is a more complex type of integrative CP with the following elements:

U is a temporal series obtained from couples of instantaneous values of y^2_4 and y^1_11 along the duration of a simulation session, i.e., the IPD is obtained as a combination of distraction rates over intervals of interaction and non-interaction.

y is a variable of type 2CP describing the IPD. Its possible value is expressed by linguistic sentences and their corresponding weights of validity as follows: (Low, w_Low), (Medium, w_Medium), (High, w_High). Here, e.g., Low stands for "The Index of Potential Distraction is Low".

g is the aggregation function described below.

T is the template "The Index of Potential Distraction is {Low | Medium | High}".

The IPD is calculated as follows:

IPD = \frac{D_1}{D_1 + D_2} \quad (2)
where D_1 and D_2 are the distraction rates of the intervals with and without interaction, respectively.
D_1 is calculated as the weighted sum of the cardinalities of the labels Low and Medium of the driving quality y^2_4 over the active interaction period:

D_1 = k_1 \times \frac{\sum_{t=0}^{l} w_{Low}(t) \, w^{Int}_{Active}(t)}{\sum_{t=0}^{l} w^{Int}_{Active}(t)} + k_2 \times \frac{\sum_{t=0}^{l} w_{Medium}(t) \, w^{Int}_{Active}(t)}{\sum_{t=0}^{l} w^{Int}_{Active}(t)} \quad (3)

where the values k_1 = 1 and k_2 = 1/2 were chosen empirically as part of the aggregation function design. In this equation, w_{Low} corresponds to the validity degree of the linguistic clause "The Driving Quality is Low" of y^2_4, and w^{Int}_{Active} corresponds to the validity degree of the sentence "Interaction is Active" of y^1_14, which is 1 when there is interaction and 0 otherwise, as defined in formula (1). In the same way, D_2 is calculated as follows:

D_2 = k_1 \times \frac{\sum_{t=0}^{l} w_{Low}(t) \, w^{Int}_{Inactive}(t)}{\sum_{t=0}^{l} w^{Int}_{Inactive}(t)} + k_2 \times \frac{\sum_{t=0}^{l} w_{Medium}(t) \, w^{Int}_{Inactive}(t)}{\sum_{t=0}^{l} w^{Int}_{Inactive}(t)} \quad (4)
With this definition of the IPD, the obtained value will be Low while D_1 < D_2, Medium while D_1 is above but still similar to D_2, and High as D_1 gets considerably bigger than D_2 and the driving quality during interaction becomes degraded. Fig. 8 shows the membership functions designed to verbalize the obtained IPD value. Fig. 9 shows the GLMP developed for this application. As a reference, Tables 1 and 2 show the full list of CPs utilized, including the linguistic variables and labels used to describe them.
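A minimal sketch (ours) of formulas (2)-(4), with k_1 = 1 and k_2 = 1/2; the weight series are illustrative and much shorter than a real 50 Hz recording.

from typing import List

def distraction_rate(w_low: List[float], w_medium: List[float],
                     mask: List[float], k1: float = 1.0, k2: float = 0.5) -> float:
    """Weighted cardinalities of the Low/Medium Driving Quality labels restricted
    to the samples selected by `mask`, as in formulas (3) and (4)."""
    total = sum(mask)
    low = sum(wl * m for wl, m in zip(w_low, mask)) / total
    med = sum(wm * m for wm, m in zip(w_medium, mask)) / total
    return k1 * low + k2 * med

def ipd(w_low: List[float], w_medium: List[float], w_int: List[float]) -> float:
    """Index of Potential Distraction, formula (2)."""
    d1 = distraction_rate(w_low, w_medium, w_int)                   # with interaction
    d2 = distraction_rate(w_low, w_medium, [1 - w for w in w_int])  # without interaction
    return d1 / (d1 + d2)

# Illustrative series: the quality degrades while the interaction is active
w_low = [0.0, 0.0, 0.8, 0.9, 0.7, 0.0, 0.1, 0.0]
w_med = [0.1, 0.2, 0.2, 0.1, 0.3, 0.2, 0.1, 0.1]
w_int = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
print(f"IPD = {ipd(w_low, w_med, w_int):.2f}")  # -> 0.91, a High IPD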
Figure 8: Membership functions used for the verbalization of IPD.
Figure 9: GLMP developed to determine the IPD of an onboard device.
5.3. Identification of distraction events
In order to fulfill the report template as required by the final users, we needed to identify the so-called distraction events. Here we describe how to identify these events as specific situation types within the information available in the GLMP.
The values of D_1 and D_2 give an insight into the overall driving quality evaluation during the simulation. However, the identification of particular low driving quality events is also important within the application. These individual distraction events allow users to analyze in depth the reasons behind them. The evaluation statements of the IPD will be accompanied by the most likely reasons for those judgements.
Table 1: Table of 1CPs with the corresponding linguistic variables and labels.

CP (y)   Linguistic Variables                Linguistic Labels (A_y)
y^1_1    Vehicle Speed                       {Null, Slow, Medium, Fast}
y^1_2    Lateral Position                    {Short, Medium, Long}
y^1_3    Track Width                         {Narrow, Medium, Wide}
y^1_4    TTLC                                {Big Left, Small, Big Right}
y^1_5    HE                                  {High Neg, Medium Neg, Low, Medium, High}
y^1_6    Steering Wheel Position             {Strong Left, Soft Left, Centered, Soft Right, Strong Right}
y^1_7    Percentage of brake usage           {Null, Low, Medium, High}
y^1_8    Percentage of accelerator usage     {Null, Low, Medium, High}
y^1_9    Road Slope                          {Strong Descendent, Descendent, Null, Ascendent, Strong Ascendent}
y^1_10   Distance to the Vehicle in front    {Not Measurable, Very Small, Medium, Big}
y^1_11   Vehicle Overtaking                  {Not, Yes}
y^1_12   Retarder                            {0, 1, 2, 3, 4}
y^1_13   Speed of Vehicle in front           {Not Measurable, Null, Low, Medium, High}
y_d      Derivative                          {Rapidly Decreasing, Decreasing, Steady, Increasing, Rapidly Increasing}

The identification of incidences is focused on the 2CPs y^2_5 and y^2_6 (Driving Quality with and without interaction). For this task, a window length (wl) is considered within which the normalized cardinalities of the labels Low and Medium are computed, e.g.:

CARD(Low) = \frac{1}{l} \sum_{j=k-l}^{k} w_{Low}(j) \quad (5)

where k is the current sample, l is the length in samples of the analyzed window, l = wl × sr, and sr is the sample rate. Using expert knowledge, a wl of 10 seconds has been selected in this application. The presence of distraction events is defined as CARD(Low) > 0.3 or CARD(Medium) > 0.6. Note that here we must make the crisp decision of either including or not including the description of an event in the final report.
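A minimal sketch (ours) of formula (5) and the crisp event condition; the series are illustrative, and the window is reduced to l = 5 samples instead of the 10 s × 50 Hz used in the application.

from typing import List, Tuple

def card(weights: List[float], k: int, l: int) -> float:
    """Normalized cardinality of a label over the last l samples, formula (5)."""
    window = weights[max(0, k - l):k + 1]
    return sum(window) / l

def distraction_events(w_low: List[float], w_medium: List[float],
                       l: int) -> List[Tuple[int, float, float]]:
    """Samples at which CARD(Low) > 0.3 or CARD(Medium) > 0.6."""
    events = []
    for k in range(len(w_low)):
        c_low, c_med = card(w_low, k, l), card(w_medium, k, l)
        if c_low > 0.3 or c_med > 0.6:
            events.append((k, c_low, c_med))
    return events

# Illustrative w_Low / w_Medium of the Driving Quality
w_low = [0.0, 0.0, 0.2, 0.9, 1.0, 0.8, 0.1, 0.0, 0.0, 0.0]
w_med = [0.2, 0.3, 0.6, 0.1, 0.0, 0.2, 0.7, 0.8, 0.3, 0.2]
print(distraction_events(w_low, w_med, l=5))  # consecutive samples around the dip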
5.4. Template of the IPD analysis report
As mentioned above, the generation of a report describing the IPD within
a simulation exercise needs to follow a customized report template in order to
highlight the relevant aspects the final user needs.
Table 2: Table of 2CPs with the corresponding linguistic variables and labels.

CP (y)   Linguistic Variables                      Linguistic Labels (A_y)
y^2_1    Steering Wheel Control                    {Low, Medium, High}
y^2_2    Vehicle Linearity                         {Low, Medium, High}
y^2_3    Security Distance                         {Low, Medium, High}
y^2_4    Driving Quality                           {Low, Medium, High}
y^2_5    Driving Quality during interaction        {Low, Medium, High}
y^2_6    Driving Quality during non-interaction    {Low, Medium, High}
y_Top    IPD                                       {Low, Medium, High}
y_i      Integration                               {Never, A few times, Sometimes, Many times, Most of the time, Always}
The report template used for this application represents a typical summarization document defined by the human factors experts within the HITO project. This typical summary focuses on the comparison of the driving quality with and without interaction in order to determine the IPD an onboard device may have. On the other hand, the report focuses on the abnormal events during the driving activity, providing the final user with more information to determine the type of distractions the onboard device may induce.
Our software application generates a linguistic report including hypotheses about the relation between the distraction events and the parameters that triggered each of them.
The template of the report defined for this application consists of the following
sections.
(i) General Observations: This section holds various subsections to describe in detail the different aspects observed during the simulation exercise.
a) Interaction Activity: This subsection gives information about the interac-
tion activity with the analyzed onboard system during the simulation. It
states the number of independent interactions and the accumulated time
of interaction. A figure indicating the periods of interaction is also in-
cluded.
b) Comparison of interaction vs. non-interaction: The quantified sentences obtained from y^2_i5 and y^2_i6 describing the quality of driving are compared, trying to highlight the similarities and differences between them. The following sentences show an example:
- "Both during interaction and non-interaction, Most of the time the Driving Quality has been High".
- "During interaction, Sometimes the Driving Quality has been Low, while during non-interaction A few times the Driving Quality has been Low".
Note that the template provides two different linguistic expressions depending on the results. This subsection also provides a list of the distraction events during the simulation, indicating the number of events and, for each event, its occurrence time and cause, e.g.,
- "During interaction 1 distraction event happened:
Event 1) at 100 seconds: The Vehicle Linearity is Low".
c) Detailed description of events during interaction: In this subsection, the root causes of the events during the interaction are explained individually. Depending on the rules defined in each aggregation function, the GLMP navigates backwards through its branches, solving the inverse problem, in order to deduce which rules have been triggered and by which parameters. Each event has an associated time, and so the generation of statements at the exact time of the event is a straightforward task (we identify the antecedents of the triggered fuzzy rules; a minimal sketch of this back-tracking idea is given at the end of this section), e.g.,
-”Event 1) approximate time at 100 seconds.
The Vehicle Linearity is Low:
* The Vehicle Speed is Medium.
* The Percentage of accelerator usage is High.
* The Vehicle is Not Overtaking.
* The Lateral Position is Rapidly Decreasing.
Each event description is also accompanied by video frames and figures showing the related input parameters.
(ii) Conclusion: This section provides the estimated IPD during the simulation
exercise.
Therefore, the final layout of the generated reports will be dynamic, depending on the number of distraction events detected during the simulation. Fig. 10 shows an example of an event description within one of the generated reports (see an explanation of this text in the next section).
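The inverse-problem navigation of the GLMP is application-specific; the sketch below (ours) only illustrates the back-tracking idea used in subsection (c): at the event instant, the fired rules for the relevant consequent are identified and their antecedents are reported as the explanation. The rule contents and weights are illustrative.

from typing import Dict, List, Tuple

# A rule is (antecedent clauses, consequent clause); a clause is (CP name, label)
Rule = Tuple[List[Tuple[str, str]], Tuple[str, str]]

RULES: List[Rule] = [
    ([("Vehicle Speed", "Medium"), ("Percentage of accelerator usage", "High"),
      ("Lateral Position", "Rapidly Decreasing")], ("Vehicle Linearity", "Low")),
    ([("Distance to the Vehicle in front", "Big")], ("Security Distance", "High")),
]

def explain_event(weights: Dict[Tuple[str, str], float],
                  consequent: Tuple[str, str], threshold: float = 0.5) -> List[str]:
    """Report the antecedents of the rules that fired for the given consequent."""
    lines = []
    for antecedents, rule_consequent in RULES:
        if rule_consequent != consequent:
            continue
        firing = min(weights.get(clause, 0.0) for clause in antecedents)  # AND = min
        if firing >= threshold:
            lines.extend(f"* The {cp} is {label}." for cp, label in antecedents)
    return lines

# Validity degrees of the input perceptions at the event instant (illustrative)
w = {("Vehicle Speed", "Medium"): 0.9,
     ("Percentage of accelerator usage", "High"): 0.8,
     ("Lateral Position", "Rapidly Decreasing"): 0.7,
     ("Distance to the Vehicle in front", "Big"): 0.1}
print("\n".join(explain_event(w, ("Vehicle Linearity", "Low"))))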
Figure 10: Example of event description within a report that was generated during the experimen-
tation.
6. Experimentation
The experimentation was based on data from a series of simulation exercises performed with professional drivers in a variety of settings. Within the project, four drivers conducted 8 exercises each in 4 predefined scenarios. These exercises were designed to evaluate 4 different onboard devices. For the validation of the implemented application, we were asked to focus on the "mobile phone" as onboard device. The exercises performed by the first driver were used to tune the rule sets and parameters defined within the GLMP.
A standard simulation exercise analyzed within the project contains data of 10-15 minutes, and considering the sample rate of the measurements (50 Hz) and the size of the GLMP, the number of sentences generated can amount to hundreds of thousands.
Fig. 10 shows an interesting example of a description of an event that occurred during the execution of a simulation exercise in an inter-urban environment. At some point during this simulation, the driver (of the truck) must overtake a group of cyclists that appears on the road. In the particular case of the figure, the numerical information indicates the presence of an object (the group of cyclists) in front of the vehicle. The speed of the object is low, while the truck is accelerating hard and not braking. The prolongation of this situation over a period of time is the reason why an incident happens. It is worth noting that the description of this particular event would only appear in a final document if the event occurred during a positive interaction period, as specified in the user requirements.
As shown in Fig. 10, the description of a particular event contains the explanation sentences that describe it and conjectures about the possible reasons that caused it. The first graphic shows the first-order parameters around the instant of the event. Secondly, the progression of the incidence-level signal during the same period is shown. Finally, the frame of the simulation video that corresponds to the approximate instant of the incidence is represented (here the group of cyclists can be observed).
When describing the incidence, the rules that triggered the detection of a distraction event are identified. In this sense, the inverse problem concerned with fuzzy relations is investigated here. We propose the diagnosis of occurring incidents through a linguistic approach. At every instant of the simulation, the rules that are activated can be tracked. For example, we can access the combination of parameters in each rule, and the exact variables that trigger and cause the incidence can be reported.
Figure 11: Hierarchical structure of our concept of the quality of a report.
7. Validation
One of the critical problems for the designer of this type of computational application is that of assigning a degree of quality to each automatically generated text. This evaluation is needed to provide him/her with the necessary feedback to improve the final results.
Here, in order to obtain a measure of text quality, the generated report was sent to five experts in human factors in the HITO project, who filled in a form composed of eight questions. This questionnaire is a result of our own research in this specific field [35]. In this section, we introduce the essentials of our approach and we refer the interested reader to the referenced paper. Our sources of inspiration are Pragmatics [36] and Systemic Functional Linguistics (SFL) [37]. From Pragmatics, we take the definition of a good communicative act, and from SFL we take the structure of language as a system.
Fig. 11 shows a hierarchical structure of concepts about the report quality that are explained using the answers obtained from a questionnaire filled in by a human expert. The questions have five possible answers in the form of a numeric evaluation scale in [1, 5]. Table 3 contains our proposal of questions. Question 1 considers subjective relevance, while question 2 considers inter-subjective relevance. Questions 3 and 4 deal with evaluating the truthfulness of a report. With the ratings of the relevance and the truthfulness, we obtain the partial rating of the report with respect to "what the text implicates". Questions 5 and 6 evaluate whether the report uses adequate vocabulary, whether the order of the ideas is the most appropriate, and whether the expressions used are the right ones. Reports of bad quantity contain too many or too few statements with respect to the fact they try to describe; reports of good quantity contain the right statements to understand the fact. To evaluate this aspect, we propose questions 7 and 8.
As mentioned above, for the validation of the implemented application, we were asked to focus on the "mobile phone" as onboard device. This sample gives a good insight into the potential of the developed technology, and the reports generated in these exercises were used to assess the quality of the automatically generated reports. In a typical experimental layout, we selected a group of five people familiar with the HITO project. For each exercise, we gave them: the description of the context of the simulation, the video of the exercise, the automatically generated document and the form to evaluate it. After 45 minutes, they had to fill in the questionnaire. The results of the different reviewers are shown in Table 4, where R1, R2, R3, R4 and R5 denote the reviewers that participated in the experiment. The data in Table 4 are an example of the type of practical results obtained during the development of the project.
Table 3: Questions about the Quality of the generated Reports.

1. Indicate to which degree the content of this report belongs to the application domain of the HITO project.
2. Indicate to which degree you identify the type of results expressed with the type of results you would express yourself.
3. After observing the behavior of the driver in the simulator, do you agree with the assessed global quality?
4. Do you agree with the provided explanations?
5. Indicate to which degree the vocabulary is used correctly.
6. Indicate to which degree the ideas are correctly ordered to facilitate the comprehension of the report.
7. Indicate to which degree the format of the report, including the use of figures and punctuation, is the most adequate.
8. Indicate to which degree you consider that the length of the report is right with respect to its content.
The average denotes each reviewer's global
rating for the report. Two of the five ratings are higher than 4 and the minimum
rating is 3.7. The average is calculated as the arithmetic mean of all the questions;
therefore, all the questions have the same weight in the global rating (see the
sketch after Table 4). It is relevant to note that all the reviewers gave the maximum
rating to question 1, the question that determines whether the content of the report
belongs to the application domain. Question 7 has the worst ratings, followed by
question 2 and question 5.
After analyzing these results, the designers must interpret them in order to im-
prove the HITO report generator. For instance, the ratings of question 7 indicate
that the format of the report and the techniques used for expressing the in-
formation must be improved, because the reviewers consider them not good
enough. Question 2, which analyzes the way the results are expressed, also has a
low global rating, so this is another aspect that the designers should improve
to produce better reports.
Table 4: Average results of the evaluation.

            R1    R2    R3    R4    R5
Question 1   5     5     5     5     5
Question 2   4     3     3     5     2
Question 3   4     4     4     4     3
Question 4   5     4     3     5     3
Question 5   3     3     4     4     4
Question 6   4     4     2     5     4
Question 7   4     3     2     4     3
Question 8   4     4     4     5     5
Average     4.1   3.8   3.4   4.6   3.7
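The following short sketch recomputes the "Average" row of Table 4 from the individual ratings using the equal-weight arithmetic mean described above; the dictionary holding the scores is introduced only for this example.

```python
# Recompute each reviewer's global rating as the arithmetic mean of his/her
# eight question scores (equal weights), using the data of Table 4.

from statistics import mean

ratings = {  # reviewer -> scores for questions 1..8
    "R1": [5, 4, 4, 5, 3, 4, 4, 4],
    "R2": [5, 3, 4, 4, 3, 4, 3, 4],
    "R3": [5, 3, 4, 3, 4, 2, 2, 4],
    "R4": [5, 5, 4, 5, 4, 5, 4, 5],
    "R5": [5, 2, 3, 3, 4, 4, 3, 5],
}

averages = {reviewer: mean(scores) for reviewer, scores in ratings.items()}
print(averages)
# Table 4 reports these means to one decimal place: 4.1, 3.8, 3.4, 4.6, 3.7.
```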
8. Conclusion
Linguistic description of phenomena is a very complex challenge. The ex-
pected results of this new technology will soon be useful for experts dealing with
the monitoring and evaluation of large volumes of data. The exploitation of data
generated by simulators is a remarkable example of these applications. Simula-
tors are widely used in a variety of fields, and in many situations domain experts
are required to observe and evaluate the data acquired in the performed
exercises.
In this paper, based on CTP, we contribute to the automatic evaluation of ve-
hicle onboard devices in a simulation environment. Using our solution, experts
in the field will save time and resources when analyzing all the data generated in
simulation exercises. The system provides objective reports, generated according
to unified criteria, with the interpretation, arguments and conclusions derived from
the data.
We have presented the complete linguistic model and report generation tem-
plate defined in the HITO project. With this application, we fulfill the requirement
of automatic evaluation of onboard devices in road transport environments. With
respect to our previous works, we have extended the linguistic model and report
template in order to meet the final specifications of the project. We have included
a detailed description of our approach to automatic text generation, the definition
of the IPD, and our approach to the quality assessment of the generated text. We
seek to evaluate the usability of the generated report from the users' point of view,
and we have identified different
indicators in a hierarchical form to define the quality of a linguistic report.
Through the use of Assertive CP, Derivative CP and Integrative CP, computa-
tional perceptions inspired by Control Theory, the GLMP has been adapted
to represent the temporal evolution of phenomena.
It is worth noting that in this application we have implemented a solution to
the so-called inverse problem. The computational system navigates through the
GLMP and performs a backward search in order to determine the rules that were
activated and ultimately triggered each event. Thus, conjectures can be established
about the possible reasons (variables) that caused the events.
Interestingly, the GLMP paradigm has been useful for the whole project team,
including the human factors experts, helping to define the new concept of Index of
Potential Distraction. Our experience in the HITO project encourages us to apply
this paradigm in other application fields.
Much future work remains in this area. Here we have presented
a specific solution to a specific problem of linguistic description of data. In the
context of the HITO project, we have created only a first prototype that must be
validated and tuned after a period of practical application. Many research topics
remain unexplored, e.g., developing new types of Perception
Mapping, i.e., new types of sentences; developing mechanisms to select the most
relevant sentences for each situation type; generating different linguistic expressions
according to the experience of each user; and developing new methods
for assessing the quality of the obtained reports.
We think that the presented automatic linguistic description approach can be
applied in a wide variety of domains and in multiple forms. For example, this
technology could be used for driving quality assessment in the form of a new on-
board system. In this case, the evaluation would be performed on-line in real time
and the generated text messages would be converted to voice. We could also use
this technology in training environments: depending on students' performance
with respect to established objectives, customized training plans could be
proposed automatically.
Acknowledgment
This work was supported in part by the Spanish Ministry of Science and In-
novation (grants TIN2008-00040, PSS-370100-2007-12, TIN2008-06890-C02-
01 and TIN2011-29827-C02-02) and the Spanish Ministry for Education (FPU
Fellowship Program).
References
[1] M. Meyer, J. Booker, Eliciting and analyzing expert judgment: A practical
guide, Society for Industrial Mathematics, 2001.
[2] M. Riojas, C. Feng, A. Hamilton, J. Rozenblit, Knowledge Elicitation for
Performance Assessment in a Computerized Surgical Training System, Ap-
plied Soft Computing 11 (2011) 3697–3708.
[3] A. Proper, Intelligent Transportation Systems Benefits: 1999 Update, US
Dept. of Transportation, Federal Highway Administration, ITS Joint Pro-
gram Office, 1999.
[4] L. Tijerina, E. Parmer, M. Goodman, Driver workload assessment of route
guidance system destination entry while driving: A test track study, in:
Proceedings of the 5th ITS World Congress, 1998, pp. 12–16.
[5] L. Chittaro, L. De Marco, Driver distraction caused by mobile devices:
studying and reducing safety risks, in: Proc. International Workshop on
Mobile Technologies and Health: Benefits and Risks, Udine, 2004.
[6] R. R. Yager, A new approach to the summarization of data, Information
Sciences 28 (1982) 69–86.
[7] R. R. Yager, Fuzzy summaries in database mining, in: Proceedings of the 11th
Conference on Artificial Intelligence for Applications, IEEE, 1995, pp. 265–
269.
[8] J. Kacprzyk, R. Yager, S. Zadrozny, A fuzzy logic based approach to linguis-
tic summaries of databases, International Journal of Applied Mathematics
and Computer Science (2000) 813–834.
[9] J. Kacprzyk, S. Zadrozny, Computing with words is an implementable
paradigm: Fuzzy queries, linguistic data summaries and natural language
generation, IEEE Transactions on Fuzzy Systems 18 (2010) 461–472.
[10] L. A. Zadeh, From computing with numbers to computing with words -
from manipulation of measurements to manipulation of perceptions, IEEE
Transactions on Circuits and Systems 45 (1999) 105–119.
[11] L. A. Zadeh, A new direction in AI: Toward a computational theory of per-
ceptions, AI Magazine 22 (2001).
[12] L. Eciolaza, G. Trivino, B. Delgado, J. Rojas, M. Sevillano, Fuzzy linguistic
reporting in driving simulators, in: IEEE Symposium on Computational
Intelligence in Vehicles and Transportation Systems (CIVTS), IEEE, 2011,
pp. 30–37.
[13] L. Eciolaza, G. Trivino, Linguistic reporting of driver behavior: Summary
and event description, in: Proceedings of the 11th International Conference
on Intelligent Systems Design and Applications (ISDA), 2011, pp. 148–153.
[14] E. Reiter, R. Dale, Building natural language generation systems, Cambridge
University Press, 2000.
[15] O. Gusikhin, D. Filev, N. Rychtyckyj, Intelligent Vehicle Systems: Appli-
cations and New Trends, Informatics in Control Automation and Robotics
(2008) 3–14.
[16] D. Salvucci, Predicting the effects of in-car interface use on driver perfor-
mance: An integrated model approach, International Journal of Human-
Computer Studies 55 (2001) 85–107.
[17] D. Salvucci, K. Macuga, Predicting the effects of cellular-phone dialing on
driver performance, Cognitive Systems Research 3 (2002) 95–102.
[18] Hito cabintec project website, http://www.cabintec.net/proyectos-hito.asp,
2011.
[19] L. A. Zadeh, Toward human level machine intelligence - is it achievable?
the need for a paradigm shift, IEEE Computational Intelligence Magazine 1
(2008) 11–22.
[20] J. Lawry, A methodology for computing with words, International Journal
of Approximate Reasoning 28 (2001) 51 – 89.
[21] L. A. Zadeh, The concept of linguistic variable and its application to approx-
imate reasoning, Information sciences 8 (1975) 199–249.
[22] L. A. Zadeh, A computational approach to fuzzy quantifiers in natural lan-
guages, Computing and Mathematics with Applications 9 (1983) 149–184.
[23] A. Bargiela, W. Pedrycz, Granular Computing: An Introduction, Kluwer
Academic Publishers, 2003.
[24] L. Lietard, A new definition for linguistic summaries of data, in: Proceed-
ings of International Conference on Fuzzy Systems (FUZZ-IEEE), IEEE,
2008, pp. 506–511.
[25] R. Castillo-Ortega, N. Marín, D. Sánchez, A fuzzy approach to the linguistic
summarization of time series, Journal of Multiple-Valued Logic and Soft
Computing (2011) 157–182.
[26] J. Kacprzyk, S. Zadrozny, Computing with words and Systemic Functional
Linguistics: Linguistic data summaries and natural language generation, in:
V.N. Huynh et al. (Eds.), Integrated Uncertainty Management and Applications,
AISC, Springer-Verlag Berlin, 2010, pp. 23–36.
[27] L. Martinez, Sensory evaluation based on linguistic decision analysis, Inter-
national Journal of Approximate Reasoning 44 (2007) 148 – 164.
[28] E. Herrera-Viedma, E. Peis, J. M. M. del Castillo, S. Alonso, K. Anaya,
A fuzzy linguistic model to evaluate the quality of web sites that store xml
documents, International Journal of Approximate Reasoning 46 (2007) 226
– 253.
[29] I. Glockner, Fuzzy quantifiers: a computational theory, Springer Verlag,
2006.
[30] F. Díaz-Hermida, A. Bugarín, S. Barro, Definition and classification of semi-
fuzzy quantifiers for the evaluation of fuzzy quantified sentences, Interna-
tional Journal of Approximate Reasoning 34 (2003) 49–88.
[31] S. Peters, D. Westerståhl, Quantifiers in language and logic, Oxford Univer-
sity Press, USA, 2006.
[32] K. Ogata, Modern control engineering, Prentice Hall, 2009.
[33] M. Delgado, D. Sánchez, M. Vila, Fuzzy cardinality based evaluation of
quantified sentences, International Journal of Approximate Reasoning 23
(2000) 23–66.
[34] C. Pappis, M. Sugeno, Fuzzy relational equations and the inverse problem,
Fuzzy sets and Systems 15 (1985) 79–90.
[35] M. Pereira-Fariña, L. Eciolaza, G. Trivino, Quality assessment of linguistic
description of data, in: ESTYLF 2012: XVII Congreso Español sobre
Tecnologías y Lógica Fuzzy, 2012.
[36] K. Korta, J. Perry, Pragmatics, 2011.
[37] M. A. K. Halliday, M. I. M. Matthiessen, Construing Experience through
Meaning: A Language-based Approach to Cognition, Cassell London, 1999.