
The Importance of Visualization and Interaction in the Anomaly Detection Process

Innovative Approaches
of Data Visualization and
Visual Analytics
Mao Lin Huang
University of Technology, Sydney, Australia
Weidong Huang
CSIRO, Australia
A volume in the Advances in Data Mining
and Database Management (ADMDM)
Book Series
Published in the United States of America by
Information Science Reference (an imprint of IGI Global)
701 E. Chocolate Avenue
Hershey PA 17033
Tel: 717-533-8845
Fax: 717-533-8661
E-mail: cust@igi-global.com
Web site: http://www.igi-global.com
Copyright © 2014 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in
any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher.
Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or
companies does not indicate a claim of ownership by IGI Global of the trademark or registered trademark.
Library of Congress Cataloging-in-Publication Data
British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.
All work contributed to this book is new, previously-unpublished material. The views expressed in this book are those of the
authors, but not necessarily of the publisher.
Innovative approaches of data visualization and visual analytics / Mao Lin Huang and Weidong Huang, editors.
pages cm
Summary: “This book evaluates the latest trends and developments in force-based data visualization techniques, addressing
issues in the design, development, evaluation, and application of algorithms and network topologies”-- Provided by
publisher.
ISBN 978-1-4666-4309-3 (hardcover) -- ISBN 978-1-4666-4310-9 (ebook) -- ISBN 978-1-4666-4311-6 (print & perpetual
access) 1. Information visualization. I. Huang, Mao Lin. II. Huang, Weidong, 1968-
QA76.9.I52I56 2014
001.4’226--dc23
2013011317
This book is published in the IGI Global book series Advances in Data Mining and Database Management (ADMDM)
(ISSN: 2327-1981; eISSN: 2327-199X)
Managing Director: Lindsay Johnston
Editorial Director: Joel Gamon
Production Manager: Jennifer Yoder
Publishing Systems Analyst: Adrienne Freeland
Development Editor: Christine Smith
Assistant Acquisitions Editor: Kayla Wolfe
Typesetter: Travis Gundrum
Cover Design: Jason Mull
Copyright © 2014, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
Chapter 7
DOI: 10.4018/978-1-4666-4309-3.ch007
The Importance of Visualization
and Interaction in the
Anomaly Detection Process
ABSTRACT
Large volumes of heterogeneous data from multiple sources need to be analyzed during the surveillance
of large sea, air, and land areas. Timely detection and identification of anomalous behavior or any threat
activity is an important objective for enabling homeland security. While it is worth acknowledging that
many existing mining applications support identification of anomalous behavior, autonomous anomaly
detection systems for area surveillance are rarely used in the real world since these capabilities and ap-
plications present two critical challenges: they need to provide adequate user support and they need to
involve the user in the underlying detection process. Visualization and interaction play a crucial role in
providing adequate user support and involving the user in the detection process. Therefore, this chapter
elaborates on the role of visualization and interaction in the anomaly detection process, using the surveil-
lance of sea areas as a case study. After providing a brief description of how operators identify conflict
traffic situations and anomalies, the anomaly detection problem is characterized from a data mining
point of view, suggesting how operators may enhance the process through visualization and interaction.
Maria Riveiro
Informatics Research Centre, University of Skövde, Skövde, Sweden
INTRODUCTION
Exploring, analyzing and making decisions based
on vast amounts of data are complex tasks that are
carried out on a daily basis. People, both in their
business and private lives, walk the path from
data to decision using diverse means of support.
While purely automatic or purely visual analysis
methods are used and continue to be developed,
the complex nature of many real-world problems
makes it indispensable to include humans in the
data analysis process.
Automatic analysis methods cannot be applied
to ill-defined problems. Furthermore, some
real-world problems require dynamic adaptation
of the analysis solution, which is very difficult
to handle by automatic means (Keim et al.,
2009). Visual analysis methods exploit human
creativity, knowledge, intuition and experience
to solve problems at hand. While visualization
approaches generally give very good results for
small data sets, they fail when the required data
for solving the problem is too large to be captured
by a human analyst (Keim et al., 2009).
The surveillance of large sea areas normally
requires the analysis of huge volumes of heterogeneous,
multidimensional and dynamic data in
order to improve vessel traffic safety and efficiency
and to protect the environment (Kharchenko &
Vasylyev, 2002). Human operators may be overwhelmed
by the data, by the traditional manual
methods of data analysis or by other factors, like
time pressure, high stress, inconsistencies or the
imperfect and uncertain nature of the information.
Identifying anomalous behavior or situations that
might need further investigation can support
operators who monitor large sea areas by reducing
their cognitive load.
While it is worth acknowledging that many
existing mining applications support identifica-
tion of anomalous behavior, autonomous anomaly
detection systems for area surveillance are rarely
used in real-world settings. We claim that anomaly
detection systems present, among others, two key
challenges: they need to provide adequate user
support and they need to involve the user in the
underlying detection process. Although these
aspects cannot be considered independently, they
present distinctive characteristics and demand
different solutions. The first challenge concerns
the necessity of providing adequate user support
during the whole process of detecting and identifying
anomalous behavior, allowing a true discourse
with the information. This issue includes
deepening our understanding of the human analytical
and decision-making processes. Because
anomaly detection is a complex and ill-defined
problem, user involvement is needed. The
second challenge involves the study of adequate
ways of interacting with and visualizing the underlying
data mining layers. Human expert knowledge is
very valuable in these cases, as it can be used to
guide the anomaly detection process, for example,
reducing the search space, updating knowledge
expert rules or refining normal models derived
from the data. We believe that the visualization
of the data and the data mining process, as well
as the availability of interaction techniques, play
a crucial role in such involvement.
Thus, this chapter aims to: (1) review anomaly
detection methods used in the maritime domain,
with specific emphasis on the challenges they
present from a user’s perspective, (2) discuss the
role that visualization and interaction play in the
anomaly detection process, (3) identify leverage
points where the use of visualization and inter-
action could make a positive difference, and (4)
present examples of how some of the challenges
encountered have been tackled in current research
carried out at our research center.
The remainder of the chapter is structured as
follows: the following section briefly explores the
use of visualization and interaction in data mining.
The role of visualization and interaction in mari-
time anomaly detection is discussed afterwards.
Then, a review of relevant anomaly detection
approaches applied to the maritime anomaly de-
tection problem is presented. Based on field work
carried out at various maritime control centers,
we provide a brief description of how maritime
operators monitor traffic. Enhancements of the
anomaly detection process using visualization/
interaction and examples are introduced thereafter.
Finally, conclusions are outlined.
THE ROLE OF VISUALIZATION AND
INTERACTION IN DATA MINING
Data Mining (DM) is defined as the process of
identifying or discovering useful and as yet undiscovered
knowledge from real-world data
(Hand et al., 2001). Data mining is often placed
in the broader context of Knowledge Discovery
in Databases (KDD). KDD is an iterative process
consisting of data preparation and cleaning, hypothesis
generation (the phase in which data mining is mainly used),
and interpretation and analysis. The
CRISP-DM (CRoss Industry Standard Process for
Data Mining) model (Shearer, 2000) describes
the data mining process in general, specifying
the following phases and tasks: (1) business
understanding (determine business objectives,
situation assessment, determine data mining goal,
produce project plan), (2) data understanding
(collect initial data, describe data, explore data,
verify data quality), (3) data preparation (data set
description, select data, clean data, construct data,
integrate data, format data), (4) modeling (select
modeling technique, generate test design, build
model, assess model), (5) evaluation (evaluate
results, review process, determine next steps) and
(6) deployment (plan deployment, plan monitoring
and maintenance, produce final report). Here, the
CRISP-DM is used as a framework to describe
the anomaly detection process. Other descrip-
tions of the data mining process can be found
in the literature, such as the model presented by
Harrison-John (1997), which describes the data
mining process as a cyclic process of seven stages:
problem definition, data extraction, data cleansing,
data engineering, mining algorithm application
and analysis of results, where the emphasis is on
the data selection and parameter selection tasks.
The integration of DM and information visu-
alization techniques has received a lot of atten-
tion in recent years, since automatic data mining
approaches only work well for well-defined and
specific problems (Kerren et al., 2007). Numer-
ous authors (e.g. Keim [2002] and Fayyad et al.
[2002]) recognize the need to tightly include the
human in the exploration process.
Visualization can contribute to the data min-
ing process in three ways: it can represent the
results of complex computational algorithms, it
can depict the data mining process and it can be
used to discover complex patterns which cannot
be detected automatically but can be detected by the
powerful human visual system (visual data mining).
Visual data mining focuses on integrating the
user in the knowledge discovery process using
effective and efficient visualization techniques
and interaction capabilities. A classification of
visual data mining methods regarding data type,
visualization technique and the interaction/dis-
tortion technique can be found in Keim (2002).
Additionally, significant examples of the use of
data mining and data visualization can be found
in Fayyad et al. (2002). In Meneses and Grinstein
(2001), the authors present a description of the
data mining process incorporating visualization
as a component. Visualization allows users and
analysts to interact with several entities involved
in the data mining cycle.
Interaction is a core component of the analy-
sis and knowledge discovery process. Users can
interact with the data in many different ways
(Fayyad et al., 2002): selecting sources of data,
browsing, querying, sampling, selecting graphical
representations, and so forth. But users may also
interact with the underlying data mining process,
selecting input parameters, selecting algorithms,
validating models, modifying thresholds, and so
forth. Nevertheless, examples of interactions be-
tween users and entities that are part of any data
mining process are not common in the literature.
THE ROLE OF VISUALIZATION
AND INTERACTION IN
ANOMALY DETECTION
Anomaly detection methods have been used
in multiple areas, like network security, video
surveillance, human activity monitoring, etc.
The majority of published work on anomaly
detection focuses on the technological aspects:
new methods and combinations of methods, additional
improvements of existing methods, reduction
of false alarms, correlations among alarms, etc.
Publications regarding the use of anomaly de-
tection methods in real environments or human
factors studies regarding anomaly detection are
scarce. Although interaction, usability, cognitive task
analysis and acceptability are not normally addressed
within anomaly detection research, visualization
has received more attention.
The majority of the examples regarding the use
of visualization to enhance anomaly detection are
published in the area of network security. Even
though an exhaustive review of the use of visualization
for network security is beyond the scope
of this chapter, we outline here some examples
where visualization has been used for enhancing
the anomaly detection process.
Axelsson (2005) addresses the problem of false
alarms within intrusion detection and proposes
four different visualization approaches to aid
the operator to correctly identify false (and true)
alarms. Likewise, Mansmann (2008) devotes his
dissertation to the use of visualization for moni-
toring, detecting and interpreting security threats.
New scalable visualization metaphors for detailed
analysis of large network time series are presented:
a hierarchical map of the IP address space, graph-
based approaches for tracking behavioral changes
of hosts and higher-level network entities and the
application of Self Organizing Maps (SOMs) to
analyze both structured network protocol data
and unstructured information, e.g., textual con-
text of email messages. Other examples of novel
visualization approaches for network traffic that
support intrusion detection are presented in Onut
et al. (2004), Teoh et al. (2004), Muelder et al.
(2005), Livnat et al. (2005), and Cai and de M.
Franco (2009).
Onut et al. (2004) present two types of graphi-
cal views for information extracted at the network
layer: services behavior view (behavior of the
internal/external hosts with respect to a certain set
of services) and category view (hosts are sorted
with respect to a particular relevant attribute,
like number of IPs used). In Teoh et al. (2004),
the authors describe an integration of visual and
automated data mining methods for discovering
and investigating anomalies in Internet routing.
The analysis tool presents different components
that complement each other, where visualization
and interaction are key to support user involve-
ment. Muelder et al. (2005) employ visualization
to detect scans interactively, while Livnat et al.
(2005) suggest a novel paradigm for visual correlation
of network alerts from disparate logs that
facilitates and promotes situational awareness in
complex network environments. This approach is
based on the notion that an alert must possess three
attributes, namely, what, when, and where. Cai and
de M. Franco (2009) exploit both interaction and
visualization to reveal real-time network anoma-
lous events. Glyphs are defined with multiple
network attributes and clustered with a recursive
optimization algorithm for dimensional reduction.
The user’s visual latency time is incorporated
into the recursive process so that it updates the
display and the optimization model according to
a human-based delay factor.
Despite the extensive number of examples of
the application of visualization to anomaly detec-
tion in network security, few examples exist outside
this domain. An exception is the work presented
in Iwata and Saito (2004), which proposes an anomaly
detection method that visualizes data in 2- or
3-dimensional space based on the probabilities of
belonging to each component of the model and the
probability of not belonging to any component
(i.e., of being an anomaly). For evaluation purposes,
the method is applied to an artificial time series.
ANOMALY DETECTION METHODS
FOR MARITIME TRAFFIC
It is hard to clarify what exactly anomaly detection
means. Anomaly is a many-sided concept and it
is normally associated with terms like abnormal,
unusual, irregular, rare, deviation, strange, ille-
gal, threat, atypical, inconsistent, etc. Many data
mining techniques analyze data in order to find
behavioral anomalies. Behavioral anomalies are
defined as deviations from the normal behavior.
Here, an anomaly is defined from a user (opera-
tor or organization) point of view, as events or
situations that need to be detected and identified
(see Riveiro et al. [2009] for a detailed discussion).
A classification and examples of sea traffic
anomalies from the point of view of operators and
practitioners are provided in Roy (2008).
Most of the published work regarding anomaly
detection, as previously shown, relates to intrusion
detection applications for network traffic. Algo-
rithms used in the detection of intrusions/attacks
are traditionally classified into three main groups
(Patcha & Park, 2007): anomaly (referring only
to data-driven approaches), signature or hybrid.
Systems based on anomaly detection schemes
(data-driven approaches) look for abnormalities
in the traffic, assuming that something that is
abnormal is probably suspicious. Such detectors
are based on what constitutes normal behavior
and what percentage of the activity we want or are
allowed/willing to flag as abnormal. Signature-
based approaches look for predefined patterns in
the data. Hybrid approaches combine data-driven and
knowledge-driven approaches.
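
The idea of choosing what percentage of the activity to flag as abnormal can be made concrete with a small sketch. It assumes that a data-driven detector has already assigned each training observation a normality score (higher means more normal); the function name `threshold_for_fraction` and the synthetic scores are illustrative and do not come from any of the reviewed systems.

```python
import numpy as np

def threshold_for_fraction(training_scores, fraction_flagged):
    """Score below which roughly `fraction_flagged` of the training data falls."""
    return float(np.quantile(training_scores, fraction_flagged))

# Hypothetical normality scores for 100 training observations: 0.01, 0.02, ..., 1.00
scores = np.arange(1, 101) / 100.0

# The operator is willing to flag about 5% of the activity as abnormal.
threshold = threshold_for_fraction(scores, 0.05)
flagged = scores[scores < threshold]
```

Lowering the fraction reduces the number of alarms an operator must inspect, at the cost of missing more of the unusual activity.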
In the civil security domain, anomaly detection
is not as mature as it is in the network security
arena. To the best of our knowledge, anomaly
detection and behavioral analysis approaches ap-
plied to sea surveillance have not been covered in
previously published anomaly detection reviews
and, in particular, no review includes any analysis
regarding human factors.
This section presents a review of anomaly
detection approaches for sea surveillance. The
objective is to analyze where human involvement
is needed and how visualization and interaction
might facilitate anomaly detection. The classifi-
cation and description of each method includes
information regarding: (1) detection method (data
or knowledge driven), (2) nature of data analyzed,
and (3) usage frequency (real-time continuous
monitoring or periodic analysis). Moreover, for
each method, we provide a brief description of its
fundamentals and analyze the following aspects
(if they apply): (1) input parameters, (2) normal
model and rule set, (3) a description of the detec-
tion process, (4) output, and (5) explanation of
the detections.
Data-Driven Anomaly Detection
In this category, approaches used within maritime
anomaly detection can be classified as statistical
(parametric and non-parametric) and machine
learning based (e.g. Bayesian networks, neural
networks or clustering techniques).
Statistical Parametric
Kraiman et al. (2002) present an anomaly detector
processor, which exploits multisensor tracking and
surveillance data to identify interesting events.
The authors demonstrate the detector within a
Vessel Traffic Service (VTS) environment, using
input data regarding vessel type, speed, location,
report time and heading, as well as environmental
information such as tides, wind speed and direction
(nonetheless, examples shown in the article are
limited to position and speed values). The detec-
tion approach is a statistical parametric method,
based on a combination of SOMs and Gaussian
Mixture Models (GMM). The parameters of the
Gaussian distributions (mean and covariance
matrices) can be estimated from the available
training data using SOMs. Each node of the grid
is characterized by an N-dimensional Gaussian
probability function, where the means are given
by the final values of the nodes and the variances
are given by the dispersion of the training data
around each node. Therefore, the baseline profile
or normal model is a multidimensional likelihood
function that is used to estimate the probability
value of a new observation. Bayes’ rule is then
applied to the likelihood to calculate the probability
value of obtaining such an observation. In order to
do so, the user must supply the percentage of
the training data that is anomalous (an important
input parameter).
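
A minimal sketch of the likelihood-plus-Bayes step is given below. It is one plausible reading of the description above, not the implementation of Kraiman et al. (2002): the mixture parameters are given directly rather than estimated with a SOM, anomalous observations are assumed to follow a uniform density over a bounded feature space, and all names and numbers are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate Gaussian at point x."""
    x, mean, cov = np.asarray(x, float), np.asarray(mean, float), np.asarray(cov, float)
    d = x.size
    diff = x - mean
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)

def mixture_likelihood(x, weights, means, covs):
    """Likelihood of x under the normal model (a Gaussian mixture)."""
    return sum(w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs))

def anomaly_posterior(x, weights, means, covs, prior_anomalous, anomaly_density):
    """Bayes' rule: P(anomalous | x), with a flat density assumed for anomalies."""
    p_normal = mixture_likelihood(x, weights, means, covs) * (1.0 - prior_anomalous)
    p_anom = anomaly_density * prior_anomalous
    return p_anom / (p_anom + p_normal)

# Hypothetical normal model: two components in a (position, speed) feature space.
weights = [0.6, 0.4]
means = [[0.0, 10.0], [5.0, 12.0]]
covs = [np.eye(2), np.eye(2) * 2.0]
prior_anomalous = 0.01         # user input: 1% of the training data is anomalous
anomaly_density = 1.0 / 400.0  # assumption: uniform over a 20 x 20 feature box

typical = anomaly_posterior([0.2, 10.1], weights, means, covs,
                            prior_anomalous, anomaly_density)
unusual = anomaly_posterior([15.0, 2.0], weights, means, covs,
                            prior_anomalous, anomaly_density)
```

The sketch shows why the anomalous percentage matters as an input: it acts as the prior in Bayes' rule, so raising it increases the posterior for every observation and therefore the number of reported anomalies.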
The detector based on this approach presents
a Graphical User Interface (GUI) to facilitate
operator interaction. Even if the functionality of
the GUI is not described in Kraiman et al. (2002),
the following input parameters can be determined:
attributes used during the training phase, weight of
the attributes, characteristics of the SOM (number
of nodes and training radius), percentage of train-
ing data that is anomalous, threshold for reported
anomalies and width of the temporal window for
cumulative probability calculation. The output
consists of a plot of cumulative probability of
anomaly versus time and an explanation that
characterizes the anomaly (showing, in percent,
how the different attributes have contributed to
it). Anomalous vessels are displayed
in red over the geographical area. The detector
was trained over one week of traffic data, but no
information regarding the performance of the
detector is given.
The method described in Laxhammar (2008)
is similar to Kraiman et al.’s approach, but in
this case the normal model representing vessel
behavior is built using a combination of a greedy
version of the Expectation-Maximization (EM)
algorithm and GMM. Here, EM is used to estimate
the parameters, mean and covariance, needed
to combine the Gaussian distributions. Since
the classical EM algorithm is very sensitive to
initialization (it may converge to a local optimum
different from the global one), a greedy version
is proposed. Instead of starting randomly,
the greedy EM builds the optimal mixture model by
adding new components one at a time (support
for such initialization and component weights are
input parameters). Another input parameter is the
maximum number of mixture components. The
method is tested over real maritime traffic data
from Swedish waters, where position, speed and
course are considered. Latitude and longitude are
discretized. One week of data was used for train-
ing and one week for validation (EM requires a
validation set during training).
Statistical Non-Parametric
Ristic et al. (2008) present a statistical non-para-
metric analysis of vessel motion patterns, in ports
and waterways, using Automatic Identification
System (AIS) data. The detection is carried out
using adaptive Kernel Density Estimation (KDE).
The variables used are position (two dimensions)
and velocity (two dimensions). The suggested
solution assumes that the AIS data has been pre-
processed and patterns have been extracted (these
patterns constitute the baseline used during the
detection process).
The normal model is, thus, a collection of
motion patterns extracted from historical AIS
data. The necessary input parameters (even if
they are not specifically pointed out in the paper)
are type of kernel (‘normal’ is usually the default
value), smoothing parameter (bandwidth of the
kernel-smoothing window) and threshold value.
The threshold determines the probability value of
an alarm, establishing the border between two
hypotheses (normal, H0, or abnormal vessel
behavior, H1). The output is the outcome of the
classification (H0 or H1).
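
The roles of the kernel, the bandwidth and the threshold can be illustrated with a short sketch. For brevity it uses a fixed-bandwidth Gaussian kernel rather than the adaptive KDE of Ristic et al. (2008), and the motion pattern, bandwidth and threshold values are invented for the example.

```python
import numpy as np

def gaussian_kde_density(point, samples, bandwidth):
    """Fixed-bandwidth Gaussian kernel density estimate at `point`."""
    samples = np.asarray(samples, float)
    point = np.asarray(point, float)
    n, d = samples.shape
    sq_dist = np.sum((samples - point) ** 2, axis=1) / bandwidth ** 2
    kernel = np.exp(-0.5 * sq_dist) / ((2 * np.pi) ** (d / 2) * bandwidth ** d)
    return float(kernel.mean())

def classify(point, samples, bandwidth, threshold):
    """Return 'H0' (normal) if the estimated density reaches the threshold, else 'H1'."""
    density = gaussian_kde_density(point, samples, bandwidth)
    return "H0" if density >= threshold else "H1"

# Hypothetical motion pattern: (x, y, vx, vy) samples along a straight lane.
rng = np.random.default_rng(0)
lane = np.column_stack([
    np.linspace(0.0, 10.0, 200),   # x advances along the lane
    rng.normal(0.0, 0.2, 200),     # y stays near the lane centre
    rng.normal(5.0, 0.3, 200),     # vx
    rng.normal(0.0, 0.1, 200),     # vy
])

on_lane = classify([5.0, 0.0, 5.0, 0.0], lane, bandwidth=0.5, threshold=1e-4)
off_lane = classify([5.0, 4.0, -5.0, 0.0], lane, bandwidth=0.5, threshold=1e-4)
```

A vessel moving along the lane in the usual direction falls in a dense region and is classified as H0, while a vessel far off the lane and moving in the opposite direction receives a density close to zero and is classified as H1.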
The existing publications concerning this
method do not contain information regarding
how to create the normal model or baseline. It is
problematic to define motion patterns, since there
are multiple origins, destinations and connecting
paths in maritime traffic data. Moreover,
non-parametric methods like KDE require large
amounts of representative data of normal behavior,
compared to traditional parameterized approaches.
Clustering and Outlier Detection
Vessel motion baseline profiles can also be built
considering trajectories. Similar vessel trajectories
are grouped, thereby modeling regular traffic
routes. Deviations from such routes are considered
anomalous (Euclidean distances between clusters
and trajectories may be used as metrics). An
example application in the maritime domain is
the work presented in Dahlbom and Niklasson
(2007). The authors focus on the use of a trajectory
clustering algorithm over maritime traffic in order
to create normal sea lanes, not on the problem of
detecting anomalies. Simulated radar readings of
vessel traffic along the southern coast of Sweden
are used in the experiments. The authors discuss
the problems the clustering algorithm presents
regarding matching incoming trajectories to
clusters. The authors argue that prefix matching
is not suitable for coastal surveillance and propose
the use of splines.
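
The general idea of flagging deviations from learned routes with a Euclidean metric can be sketched as follows. This is not the clustering or spline-based matching of Dahlbom and Niklasson (2007): a route is simply assumed to be available as a polyline (for instance, a cluster centre), and the function names and the deviation limit are hypothetical.

```python
import math

def point_to_segment(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def deviation_from_route(trajectory, route):
    """Largest distance from any trajectory point to the route polyline."""
    return max(
        min(point_to_segment(p, a, b) for a, b in zip(route, route[1:]))
        for p in trajectory
    )

def is_anomalous(trajectory, routes, max_deviation):
    """Flag a trajectory that is not close to any learned route."""
    return all(deviation_from_route(trajectory, r) > max_deviation for r in routes)

routes = [[(0, 0), (10, 0), (20, 5)]]          # one learned sea lane
following = [(1, 0.2), (5, -0.1), (12, 1.2)]   # stays near the lane
straying = [(1, 0.2), (5, 3.5), (9, 7.0)]      # drifts away from it

follows_lane = is_anomalous(following, routes, max_deviation=1.0)
strays = is_anomalous(straying, routes, max_deviation=1.0)
```

Because the whole trajectory is compared with each route, this sketch suits periodic analysis; matching partial, incoming trajectories in real time is the harder problem the authors discuss.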
Rhodes’ research group (BAE Systems) has
extensively studied the problem of learning normal
vessel motion patterns (see Rhodes et al. (2005);
Bomberger et al. (2006); Rhodes et al. (2007a)).
The presented approaches are applied to harbor
areas and both simulated and real AIS data are
analyzed. Position (latitude and longitude) and
velocity (course and speed) are considered. The
discretization of both features (position and ve-
locity) is necessary. The system takes real-time
tracking information and uses continuous on-the-
fly learning that enables concurrent recognition of
patterns of current motion states. In Rhodes et al.
(2005), the learning approach combines an unsu-
pervised clustering algorithm (Fuzzy ARTMAP
neural network) and a supervised mapping and
labeling algorithm. Extensions of this approach
can be found in Bomberger et al. (2006); Rhodes
et al. (2007a). Even if the authors claim that opera-
tor intervention is not necessary, they agree that
operators or analysts can help teach the model
via simple point and click actions, increasing the
speed and performance of the learning phase.
Bayesian Inference
A Bayesian Network (BN) is a graphical model
that encodes probabilistic relationships among
variables of interest (Patcha & Park, 2007). The
graphical model conveys information regarding
causal relations and interdependencies between
variables. A BN is a suitable approach to anomaly
detection, since it can be used when there is a need
to combine prior knowledge with data (Patcha &
Park, 2007). Moreover, due to their transparency,
human domain experts are able to validate and
improve BNs.
An example of the application of BN to the
maritime anomaly detection problem is provided
in Johansson and Falkman (2007). Synthetic data
is used during the experimental phase (simulated
radar readings). The variables used are x, y, head-
ing, speed, heading, speed and vesseltype. The
feature space is discretized. The BN represents the
underlying probability distribution of the data, as-
suming that we can construct such representation.
Based on the data, first the structure of the graph
is built and then the conditional probabilities are
estimated. Two important input parameters are,
as in other approaches, the size of the window
(number of most recent samples) that averages the
probability value over time and the threshold used
to flag an alarm (a trade-off between the detection
rate, or recall, and precision, which is determined by
the number of false positives).
The normal model is thus the BN built from data
and the output is a joint probability value P(x,
y,heading,speed,speed,heading,vesseltype).
When anomalous behavior is detected using this
approach, no further information or explanation
is provided, meaning that no feature or group of
features is suggested as the rationale behind the alarm.
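
The two input parameters mentioned above, the window size and the alarm threshold, interact as in the following sketch. It assumes that the detector emits one joint probability value per sample and that an alarm is flagged when the windowed average drops below the threshold; the class name and the numbers are illustrative and are not taken from Johansson and Falkman (2007).

```python
from collections import deque

class WindowedAlarm:
    """Average the last `window` probabilities and alarm below `threshold`."""

    def __init__(self, window, threshold):
        self.values = deque(maxlen=window)  # keeps only the most recent samples
        self.threshold = threshold

    def update(self, probability):
        """Add the latest joint probability; return True if an alarm is flagged."""
        self.values.append(probability)
        average = sum(self.values) / len(self.values)
        return average < self.threshold

detector = WindowedAlarm(window=3, threshold=0.05)
stream = [0.30, 0.25, 0.01, 0.02, 0.01, 0.28]
alarms = [detector.update(p) for p in stream]
```

With a window of three samples, a single low value does not trigger an alarm; only the sustained run of low probabilities does. A larger window suppresses more false alarms but delays detection.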
Knowledge-Driven
Anomaly Detection
The majority of the few anomaly detection ca-
pabilities implemented in real maritime control
centers are rule (signature or misuse) based
systems. Such systems allow operators to create
simple rules that will trigger an alarm (e.g. IF
<vessel in shallow waters> THEN <danger of
grounding>).
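
Such an IF/THEN rule can be sketched in a few lines. The data fields, the `Rule` structure and the 2 m clearance used to interpret "shallow waters" are assumptions made for illustration; operational systems define these through their own rule editors.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Vessel:
    name: str
    draught_m: float      # how deep the hull sits in the water
    water_depth_m: float  # charted depth at the vessel's position

@dataclass
class Rule:
    condition: Callable[[Vessel], bool]  # the IF part
    alarm: str                           # the THEN part

def evaluate(rules: List[Rule], vessel: Vessel) -> List[str]:
    """Return the alarm text of every rule whose condition holds."""
    return [rule.alarm for rule in rules if rule.condition(vessel)]

# Hypothetical reading of "vessel in shallow waters": under 2 m of clearance.
rules = [Rule(lambda v: v.water_depth_m - v.draught_m < 2.0, "danger of grounding")]

safe = evaluate(rules, Vessel("A", draught_m=6.0, water_depth_m=30.0))
risky = evaluate(rules, Vessel("B", draught_m=6.0, water_depth_m=7.0))
```

Each alarm is traceable to the rule that produced it, but the system can only detect situations that someone has already anticipated and encoded.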
Initial steps toward a more elaborate anomalous
situation detector, i.e., one that considers combinations
of events over time, are presented in Edlund et al. (2006).
Based on an
agent framework and using an ontology geared
toward sea surveillance, the authors describe a
rule-based situation assessment system that ana-
lyzes situations developing over time. Rules are
created by experts using the rule editor agent. In
order to create new rules, experts select known ob-
jects from a list, choose their relation (approaching,
leaving, inharborarea) and connect them in time.
The ontology is based on a previously published
core ontology for situation awareness. No user
interaction with the ontology is supported. The
rule editor GUI and the detector (the reasoner agent)
are under development.
Hybrid Approaches
Hybrid approaches to anomaly detection combine
both data-driven and knowledge-based methods,
overcoming some of the drawbacks of each
particular method (high false alarm rate in the
data-driven case and the possibility of detecting
only known patterns in the knowledge-based
case). An example of a compound approach to
the maritime anomaly detection problem is the
detector implemented in SeeCoast (Seibert et
al., 2006). The detector applies rule-based and
learning-based pattern recognition algorithms to
alert operators to illegal, threatening and anomalous
vessel activities. SeeCoast extends the detection capability
of the learning-based pattern module described
above (see Bomberger et al. (2006); Rhodes et
al. (2007a,b)) using a rule-based track activity
analysis. The rule-based component implements a
three-stage approach to rule-building and match-
ing: domain modeling, pattern definition and pat-
tern matching. In the domain modeling stage, an
ontology is built describing the data sources and
the attributes of data reports (e.g., fused tracks as
a data type, with velocity as a data field). In the
pattern definition stage, operators use a GUI to
create patterns based on the ontology (e.g. <any
track whose location is within a restricted area>).
A GUI allows operators to script patterns, walk-
ing the operator through a series of selections
and questions that use information about the data
environment to simplify the process. Operators
can also create patterns from templates that only
require specification of key inputs. In the pattern
matching stage, operators select a set of patterns
to be monitored, and the system then generates
alerts for matching instances. A snapshot of
the flagged vessel assists the operator in deciding
on further actions (thus offering explanation
capabilities).
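The pattern definition and pattern matching stages can be illustrated with a minimal Python sketch. It is not SeeCoast's implementation: the `Track` fields, the rectangular `RESTRICTED_AREA` and the `match_patterns` helper are hypothetical names introduced only to show how a pattern such as <any track whose location is within a restricted area> could be expressed as a predicate and monitored.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Track:
    """A fused track report; the field names are illustrative assumptions."""
    track_id: str
    lat: float
    lon: float
    speed_knots: float


# Hypothetical restricted area as a bounding box:
# (min_lat, max_lat, min_lon, max_lon).
RESTRICTED_AREA = (57.60, 57.70, 11.60, 11.80)


def in_area(track: Track, area: Tuple[float, float, float, float]) -> bool:
    min_lat, max_lat, min_lon, max_lon = area
    return min_lat <= track.lat <= max_lat and min_lon <= track.lon <= max_lon


# Pattern definition: each pattern is a named predicate over a track.
PATTERNS: Dict[str, Callable[[Track], bool]] = {
    "in_restricted_area": lambda t: in_area(t, RESTRICTED_AREA),
}


def match_patterns(tracks: List[Track], selected: List[str]) -> List[Tuple[str, str]]:
    """Pattern matching: return (track_id, pattern_name) alerts for the selected patterns."""
    return [(t.track_id, name) for t in tracks for name in selected if PATTERNS[name](t)]


tracks = [Track("A", 57.65, 11.70, 8.0), Track("B", 57.90, 11.70, 12.0)]
alerts = match_patterns(tracks, ["in_restricted_area"])
```

Keeping patterns as named predicates mirrors the idea that operators select which patterns to monitor, while the domain model (here, the `Track` fields) stays separate from the rules.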
SeeCoast is a complex and powerful port
security and monitoring system that, in addition to
the anomaly detector module, includes video
processing to detect, classify and track vessels;
multi-sensor track correlation of video track data
with radar and AIS tracks; ship size classification,
display enhancements for improved situational
awareness and forensic analysis.
HOW DO EXPERTS MONITOR
MARITIME TRAFFIC?
In maritime transportation, traffic control is car-
ried out by both coastal and port Vessel Traffic
Services (VTS), whose centers aim to improve
vessel traffic safety and efficiency, safeguard
human life at sea, as well as protect the maritime
environment, adjacent shore areas, work sites, and
offshore installations from the possible adverse
effects of marine traffic. Three maritime control
centers were visited during our field work. Such
centers offer their services 365 days/year and 24
hours/day. The essential sources of data used for
monitoring maritime traffic are radar data, Au-
tomatic Identification System (AIS) messages,
VHF radio, Closed Circuit TV cameras (CCTV),
harbor planning and administrative information,
data bases with historical information about the
vessels, telephone and fax, and meteo/hydro
equipment (weather reports and marine currents
information). The VTS operators interviewed have
lengthy maritime and seagoing experience and
receive education in accordance with the Interna-
tional Association of Marine Aids to Navigation
and Lighthouse Authorities guidelines.
VTS operators use various surveillance sys-
tems. The systems are customized for each center
and display real-time radar and AIS data (sometimes
referred to as the ‘common operating maritime
picture’) that serve as a basis for carrying out
main tasks such as monitoring and information
services. The visualization and interaction capa-
bilities of the systems used are quite limited. The
main visualization consists of a geographical map
where vessels are displayed using different icons
and colors. Speed vectors and navigational infor-
mation are displayed over the background map.
Other graphical representations are not provided
(no abstract representations, links between enti-
ties, or 3D visualizations are available). Selection
and zooming in/out are provided as interaction
methods. The systems used at the VTS centers
allow some manual identification of anomalies.
For example, the operators can make queries that
show all the vessels exceeding a certain speed
value or crossing a particular borderline. These
functionalities, which may be considered anomaly
detectors, are rarely used, since they must be car-
ried out manually (operators stated that they are
time-consuming procedures) and do not cover
many of the situations the operators are interested
in detecting (see Figure 1).
VTS operators need to identify, in a timely manner,
possible traffic-conflict situations emerging in
the surveyed area and respond appropriately.
Examples of such situations are vessel collisions
and groundings in the port and entrance areas.
Moreover, personnel interviewed in these centers
would appreciate support in detecting vessels
navigating through restricted zones, vessels not
following the established sea lanes, vessels not
following the normal route with regard to the
reported destination, cargo of special interest,
vessels carrying dangerous goods sailing close to
passenger ships or protected areas, vessels with
a history of being involved in illegal activities,
suspicious flag or port, fishing or recreational
craft approaching traffic separation zones, etc.
Figure 1. VTS Gothenburg, Sweden. The figures
depict two working areas in the control room,
illustrating the environment, tools, and systems used.
Despite slight differences among the three
visited centers, the actual process of finding
anomalous behavior and conflict situations can
be summarized in five stages: (1) overview
(monitor and explore): continuous control of the
traffic in real-time, using radar, VHF radio, and
AIS information; (2) if something is unusual or
unfamiliar (operators normally base their judgment
on their experience), detailed information must be
obtained, for example by zooming into the area and starting
VHF radio communication with the vessel of interest;
(3) waiting time: operators usually wait a reason-
able period of time, observing how the situation
develops. At this stage, operators might listen to
VHF radio communications among vessels, to
increase their understanding of the situation; (4)
more detail (focus): if the situation has not become
normal, they intensify the dialog with the vessel
of interest or try to obtain more data using, for
example, additional information stored in data
bases; (5) taking action: if they believe that an
incident has occurred, they take action, alerting
other organizations and reporting the situation.
This basic pattern, or loop, describes the typical
overall process. Operators move back and forth
between these stages, for example, between stage
3 (waiting time) and 4 (more detail). The stages
vary in length and some stages include several
sub-loops.
USING VISUALIZATION
AND INTERACTION IN
ANOMALY DETECTION
Considering the insights gained during our vis-
its to maritime control centers and the review
presented in one of the previous sections, the
anomaly detection process can be divided into
on-line and off-line processing (see Figure 2).
On-line processing refers to the real-time analysis
of the incoming data, whereas off-line
processing refers to the establishment of normal
models from (training) data and rules that are
used during the on-line detection process. Both
processes resemble typical data mining cycles
and are, obviously, interconnected.
Figure 2. Using visualization and interaction to support user involvement in the anomaly detection process
We argue that visualization and interaction are
key to improving anomaly detection performance
in general. In particular, they are key to performing
an adequate analysis of the data, constructing
understandable normal models, updating and
validating such models, and creating useful and
comprehensible output that can not only generate
suitable responses from operators but also improve
the whole anomaly detection process. Figure 2 points
out where visualization and interaction could make
a positive difference.
Data Visualization
Data visualization supports the understanding of
the data and the interaction between the analyst/
operator and the dataset during the preprocessing
phase. There is a wide variety of techniques to
visualize both low and multidimensional datasets
(e.g. pixel-based techniques, scatter-plots, parallel
coordinates, geometric projections and icon-based
methods). Keim (2002) reviews and provides a
classification of visualizations based on the data
type to be visualized, the visualization technique
and the interaction and distortion technique.
In order to select appropriate visualization
techniques, the spatial and temporal nature of
the information in the maritime domain should
be taken into account. An interesting example
of the visualization of vessel tracks is the work
presented in Willems et al. (2009). Analyzing AIS
data, the authors present an overlay map that shows
where sea lanes, anchoring zones or slow moving
vessels are located.
The visualization of the data may also sup-
port the analyst while cleaning, selecting and
transforming the data. A common problematic
phase that influences the detector performance
in many of the anomaly detection methods re-
viewed (see Laxhammar (2008); Bomberger et
al. (2006)) is the discretization and normalization
of the feature space. Proper representations of the
data regarding how the discretization affects the
construction of the baseline behavior and how
samples are distributed over the feature space are
needed. Moreover, other aspects that should be
considered in this case are, for example, how to
represent inconsistencies in the data, uncertainty,
quality, reliability, etc.
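The effect of discretization on the baseline can be sketched in a few lines of Python. The speed samples, the bin widths and the helper names `discretize` and `support` are hypothetical and serve only to show how the same observation can look normal or unusual depending on the chosen cell size.

```python
from collections import Counter

# Synthetic speed samples in knots (illustrative values, not AIS data):
# one group of slow vessels and one group of faster vessels.
speeds = [4.8, 5.1, 5.3, 11.9, 12.2, 12.4]


def discretize(values, bin_width):
    """Count how many samples fall into each cell of width `bin_width`."""
    return Counter(int(v // bin_width) for v in values)


def support(value, histogram, bin_width):
    """Number of training samples sharing the cell of `value` (0 if the cell is empty)."""
    return histogram[int(value // bin_width)]


fine = discretize(speeds, 2.0)     # 2-knot cells keep the two groups apart
coarse = discretize(speeds, 20.0)  # a 20-knot cell merges everything

new_speed = 8.0  # a speed that lies between the two groups
fine_support = support(new_speed, fine, 2.0)
coarse_support = support(new_speed, coarse, 20.0)
```

With the fine grid the new observation falls into an empty cell, whereas the coarse grid places it in the most populated one. A histogram view of both grids side by side is one simple representation that would make this dependence visible to the analyst.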
Parameter Visualization
Parameter visualization supports the interaction
between the analyst/operator and the process of
selecting, tuning and optimizing input values to the
on-line and off-line processes involved. Parameter
selection and tuning requires the exploration of
several alternatives (Meneses & Grinstein, 2001)
and it is a complex optimization problem.
Statistical anomaly detection methods require
the selection and tuning of multiple parameters
(e.g. learning rates, type of kernel function,
smoothing values and number of Gaussian mix-
tures). The reviewed approaches do not make
clear the correlation between domain features and
parameter setting values, and parameters seem to
be selected in a more or less ad hoc manner. One
arduous task in all the reviewed anomaly detec-
tion methods is tuning the anomaly threshold.
The threshold value balances the detection rate
(recall) and the number of false positives or false
alarms (precision). Another delicate matter is the
selection of the sliding window size that averages
probability/likelihood values that are compared
to the threshold. If the window size is too small,
the system will be sensitive to data or sensor er-
ror, while a too large value may hide anomalies.
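The interplay between window size and threshold can be made concrete with a small Python sketch. The log-likelihood scores, the threshold value and the function name `windowed_alarms` are assumptions chosen for illustration; they do not come from any of the reviewed detectors.

```python
from collections import deque


def windowed_alarms(log_likelihoods, window_size, threshold):
    """Return the indices at which the mean of the last `window_size`
    log-likelihood values falls below `threshold`."""
    window = deque(maxlen=window_size)
    alarms = []
    for i, value in enumerate(log_likelihoods):
        window.append(value)
        if len(window) == window_size and sum(window) / window_size < threshold:
            alarms.append(i)
    return alarms


# Synthetic scores: an isolated glitch at index 3 (e.g. a sensor error)
# and a sustained anomaly at indices 8 to 11.
scores = [-1.0, -1.0, -1.0, -9.0, -1.0, -1.0, -1.0, -1.0,
          -6.0, -6.0, -6.0, -6.0, -1.0, -1.0]
THRESHOLD = -4.0

small = windowed_alarms(scores, 1, THRESHOLD)    # reacts to the glitch as well
medium = windowed_alarms(scores, 3, THRESHOLD)   # ignores the glitch, catches the anomaly
large = windowed_alarms(scores, 12, THRESHOLD)   # averages the anomaly away
```

Plotting the averaged score against the threshold for several window sizes is the kind of view that could help an analyst see, for a particular dataset, which setting separates sensor noise from sustained deviations.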
Visualization and interaction can be used to
understand the parameter selection and tuning
optimization process for a particular dataset,
providing comprehensible views of the impact
that these steps have in the final detection stages.
Unfortunately, the visualization of parameter
selection and tuning processes has been largely
overlooked by the anomaly detection research
community (an exception is the work presented
in Meneses and Grinstein [2001]).
Model Visualization
Model visualization supports the comprehension
and interaction with normal models and rules
embedded in the system. Such visual representa-
tions may support the creation, validation and
update phases. The analyst/operator may be able to
compare models, communicate them to colleagues
and evaluate if they match his/her understanding
of the world.
The representation of normal models built
from data has hardly received any attention by the
research community. An exception is Rheingans
and desJardins (2000), where the authors describe
a set of visualization methods that help users to
understand and analyze the behavior of learned
models (the article focuses on classification tasks
using BN).
Knowledge-based approaches normally use
a set of rules that represent situations that are
of interest to an analyst/operator (unlike data-driven
approaches, these signatures represent the
‘anomalous’ behavior). Visual and interactive
representations of rules provide a natural way of
understanding, creating, validating, updating and pruning them.
In contrast to the lack of proposals regarding
visualizations of induced models, extensive work
has been done on the representation of rules (most
of the publications refer to signatures extracted
from large data sets). For example, a framework
for mining and analyzing large rule sets through
visualization is presented in Bruzzese and Davino
(2008). Our review of knowledge-based
approaches revealed another important matter related to
the creation of rules that may benefit from the use
of visualization and interaction: the necessity
of constructing proper ontologies that represent
objects, concepts, events and relationships (see
Seibert et al. (2006)).
Detection Visualization
Detection visualization supports the understanding
of the whole process, from data to alarms.
This process is a continuous hypothesis genera-
tion and testing cycle that involves all the aspects
previously seen. An example of the visualization
of the detection process can be seen in Kraiman
et al. (2002). The GUI shows data, probability
values vs. time, alarms and explanations.
Outcome Visualization
Outcome visualization refers to the representa-
tion of triggered alarms. Visual representations
of alarms should support their analysis, in order
to find, for example, correlations among them.
Monitoring generated alarms is normally a
challenging activity. In our visits to maritime
centers, operators have highlighted the necessity
of keeping interactive lists of alarms (ordered by
importance). Operator response to the generated
alerts (acknowledgment/rejection) may be used
as a teaching signal to the detector, thus refining
its performance.
Explanations
Jensen et al. (1995) claim that decision support
systems should have features for explaining how
they have come up with their recommendations
in order to support the decision maker as well as
increase his/her confidence in the system. The
ability to explain the reasoning behind an alarm
is of great importance in order for an operator
to fully accept the advice the system provides.
Despite this fact, the amount of research devoted
to this subject is relatively sparse and most of the
reviewed work does not tackle this issue. One of
the reasons is that it might be difficult to point
out which features have triggered the alarm or it
might be difficult to construct or communicate the
evidence. For example, the outcome of data-driven
statistical approaches is normally a probability
value per observation that represents P(features
considered). It might not be possible to point
out which feature or features are the cause of the
alarm. In this case, we need additional methods
that investigate further the outcomes generated.
Limitations
A relevant aspect that we have not discussed dur-
ing the analysis of the anomaly detection process
is the different users and roles that might use the
anomaly detection capability. The daily operator
(that monitors on-line traffic) might not be able
to create or update normal models or rules, due
to time constraints, policies or lack of background
knowledge. On the other hand, analysts may be
able to maintain and configure the system regard-
ing their off-line workings, selection and tuning
of parameters, update models, thresholds, etc.
EXAMPLES
In this section we illustrate some of the aspects
discussed in the previous section with examples
from our own research within the maritime domain.
The objective of this section is to demonstrate how
visualization can enhance the anomaly detection
process, focusing on some of the steps of the
process presented above.
The first example illustrates how parameter and
model visualization can support the selection of
methods that match the problem. Figure 3 (inspired
by the study presented in Laxhammar et al. (2009))
shows how two different anomaly detection ap-
proaches model two parallel vessel trajectories.
The left peak is calculated using GMM and the
right peak is calculated using KDE. The GMM
peak is unimodal, hiding the separation between
the parallel sea lanes while the bimodal KDE sat-
isfactorily captures the separation between them.
Hence, we may conclude that the KDE method models
the analyzed vessel trajectories more accurately.
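The qualitative difference between the two estimates can be reproduced with a small, self-contained Python sketch. It is a simplification and not the setup of Laxhammar et al. (2009): a single Gaussian component stands in for the unimodal GMM estimate, the cross-track positions are synthetic, and the kernel bandwidth of 0.2 is an arbitrary assumption.

```python
import math
from statistics import mean, pstdev

# Synthetic cross-track positions (arbitrary units) for two parallel lanes
# centred at -1.0 and +1.0; illustrative values, not AIS data.
samples = [-1.2, -1.1, -1.0, -0.9, -0.8, 0.8, 0.9, 1.0, 1.1, 1.2]

MU = mean(samples)
SIGMA = pstdev(samples)


def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def single_gaussian(x):
    """One Gaussian fitted to all samples (the simplest case of a GMM)."""
    return gaussian_pdf(x, MU, SIGMA)


def kde(x, bandwidth=0.2):
    """Kernel density estimate with a Gaussian kernel centred on every sample."""
    return sum(gaussian_pdf(x, s, bandwidth) for s in samples) / len(samples)


def count_modes(density, grid):
    """Count strict local maxima of `density` evaluated on `grid`."""
    values = [density(x) for x in grid]
    return sum(
        1 for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


grid = [i * 0.05 for i in range(-60, 61)]  # -3.0 to 3.0
gaussian_modes = count_modes(single_gaussian, grid)
kde_modes = count_modes(kde, grid)
```

In this toy setting the single Gaussian places its only peak in the empty water between the lanes, so a vessel sailing there would be scored as highly normal, while the kernel estimate keeps one peak per lane. Plotting both curves over the same axis gives the kind of comparison shown in Figure 3.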
The second example concerns model visualiza-
tion. Figure 4 presents a visualization of normal
vessel behavioral models built from real AIS data.
In order to build such a model, we have used a
statistical method that combines SOM and GMM.
The biggest challenge we faced while trying
to represent the normal model was that
we would need an eight-dimensional space to
represent such a probability density function (we
use eight vessel features: position, speed, course
over ground, heading, length, width and draught).
We projected the probability function onto a
two-dimensional map. High probability values are
represented in red, while blue represents lower
probability values. These visualizations allow
comprehension of normal vessel behavior built
from data, supporting validation and improvement
of such models.
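The color coding of probability values can be sketched as follows. The linear blue-to-red ramp and the function name `probability_to_rgb` are assumptions made for illustration; the exact color scale and projection used to produce Figure 4 are not specified here.

```python
def probability_to_rgb(p, p_min, p_max):
    """Map a probability value to an (R, G, B) triple on a linear
    blue (low) to red (high) ramp. Values outside the range are clamped."""
    if p_max <= p_min:
        raise ValueError("p_max must be greater than p_min")
    t = min(1.0, max(0.0, (p - p_min) / (p_max - p_min)))
    return (round(255 * t), 0, round(255 * (1.0 - t)))


# A tiny 2 x 2 grid of already projected probability values (synthetic).
cells = [[0.0, 0.2],
         [0.8, 1.0]]
image = [[probability_to_rgb(p, 0.0, 1.0) for p in row] for row in cells]
```

Each cell of the two-dimensional map then receives one color, which can be drawn as an overlay on a geographical background.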
Figure 3. Two parallel vessel trajectories and their model estimations using GMM and KDE probability
density functions. A full comparison between these two methods can be found in Laxhammar et al. (2009).
The last example, Figure 5, shows how explanations
can be visualized using trees. In this case,
BNs are used to find anomalous vessels hidden
in AIS data. In order to generate comprehensible
explanations from BN outcomes, we have tested
two algorithms, Explanation Tree and Causal
Explanation Tree. The tree in Figure 5 shows
which of the features are selected as the best causes
of the anomaly when the BN flagged a vessel as
suspicious (in this particular case, the hidden
abnormality is a vessel speeding in a slow-moving
area). More details on the application of these
algorithms can be found in Helldin and Riveiro
(2009).
Figure 4. Visualizations of normal behavioral models for cargo (left), tanker (middle) and passenger
(right) vessels. The models are calculated using a combination of SOMs and GMMs from real AIS
data along the Swedish west coast. The following features are considered: position, speed, course over
ground, heading, length, width and draught. The probability values are projected over a geographical
map (Google Earth).
Figure 5. Explanation visualization: A Causal Explanation Tree (CET) explains the inference made
by a BN
CONCLUSION
Current anomaly detection capabilities and tools
provide very limited possibilities to incorporate
any expert knowledge or any user input at all. In our
opinion, designers and developers underestimate
the benefits of human involvement in the anomaly
detection process. The necessity of such involve-
ment can be seen from two perspectives. Firstly,
anomaly detection systems for sea surveillance are
not used autonomously in the real world. We need
to provide adequate support for human decision
makers, making the anomaly detection process
transparent and trustworthy. Secondly, since the
anomaly detection problem is hard to solve in
an automatic manner (it normally generates a high
number of false alarms due to its complexity), we
need to include expert knowledge in the loop in
order to improve the detector’s performance.
Based on a review of anomaly detection
methods applied to maritime traffic data, this
chapter examines the anomaly detection process,
highlighting where visualization and interaction
can be used to support human involvement, thus
enhancing the process. The analysis presented here
may inform the design of future anomaly detec-
tion systems when fully automatic approaches are
not viable and human participation is needed. We
would like to facilitate the design of interfaces
that support human involvement and are properly
integrated in the overall KDD process. The feedback
that analysts/operators can provide to these
processes can hardly be obtained by other means.
REFERENCES
Axelsson, S. (2005). Understanding Intrusion
Detection Through Visualization. (Ph.D. thesis).
Goteborg, Sweden: Chalmers University of
Technology.
Bomberger, N., Rhodes, B., Seibert, M., & Wax-
man, A. (2006). Associative learning of vessel
motion patterns for maritime situation awareness.
In Proceedings of 9th International Conference
on Information Fusion. New Brunswick, NJ:
IEEE Press.
Bruzzese, D., & Davino, C. (2008). Visual mining
of association rules. In Visual Data Mining (103–
122). Berlin: Springer-Verlag. doi:10.1007/978-
3-540-71080-6_8.
Cai, Y. & de M. Franco, R. (2009). Interactive
visualization of network anomalous events. In:
Computational Science, 5544, 450–459. Berlin:
Springer.
Dahlbom, A., & Niklasson, L. (2007). Trajectory
clustering for coastal surveillance. In Proceedings
of the 10th International Conference on Informa-
tion Fusion. QC, Canada: IEEE Press.
Demšar, U. (2006). Data Mining of Geospatial
Data: Combining Visual and Automatic Meth-
ods. (Ph.D. thesis). Stockholm, Royal Institute
of Technology (KTH).
Edlund, J., Gronkvist, M., Lingvall, A., & Svies-
tins, E. (2006). Rule-based situation assessment
for sea surveillance. In Proceedings of SPIE Con-
ference on Multisensor, Multisource Information
Fusion: Architectures, Algorithms and Applica-
tions, 624, 1–11. Bellingham, WA: SPIE Press.
Fayyad, U., Grinstein, G., & Wierse, A. (Eds.).
(2002). Information visualization in data mining
and knowledge discovery. San Francisco: Morgan
Kaufmann Publishers Inc.
Hand, D. J., Mannila, H., & Smyth, P. (2001).
Principles of data mining. Adaptive computation
and machine learning. Cambridge, MA: The
MIT Press.
Harrison-John, G. (1997). Enhancements to the
Data Mining Process. (Ph.D. thesis). Stanford,
CA, Stanford University.
Helldin, T., & Riveiro, M. (2009). Explanation
methods for bayesian networks: review and ap-
plication to a maritime scenario. In: 3rd Annual
Skövde Workshop on Information Fusion Topic,
11–16. New Brunswick, NJ: IEEE Press.
Iwata, T., & Saito, K. (2004). Visualization
of anomaly using mixture model. In Knowl-
edge-Based Intelligent Information and En-
gineering System, 624–631. Berlin: Springer.
doi:10.1007/978-3-540-30133-2_82.
Jensen, F., Aldenryd, S., & Jensen, K. (1995).
Sensitivity analysis in bayesian networks. In
Symbolic and Quantitative Approaches to Reason-
ing and Uncertainty, 243–250. Berlin: Springer.
doi:10.1007/3-540-60112-0_28.
Johansson, F., & Falkman, G. (2007). Detection of
vessel anomalies–A bayesian network approach.
In Proceedings of the 3rd International Confer-
ence on Intelligent Sensors, Sensor Networks,
and Information Processing. New Brunswick,
NJ: IEEE Press.
Keim, D. (2002). Information visualization and
visual data mining. IEEE Transactions on Vi-
sualization and Computer Graphics, 7(1), 1–8.
doi:10.1109/2945.981847.
Keim, D. A., Mansmann, F., & Thomas, J. (2009).
Visual analytics: How much visualization and how
much analytics. SIGKDD Explorations, 11(2).
Kerren, A., Stasko, J., Fekete, J.-D., & North, C.
(2007). Workshop report: Information visualiza-
tion–human-centered issues in visual represen-
tation, interaction, and evaluation. Information
Visualization, 6, 189–196.
Kharchenko, V., & Vasylyev, V. (2002). Applica-
tion of the intellectual decision making system
for vessel traffic control. In Proceedings of 14th
International Conference on Microwaves, Radar,
and Wireless Communications, 2, 639–642. New
Brunswick, NJ: IEEE Press.
Kraiman, J. B., Arouh, S. L., & Webb, M. L.
(2002). Automated anomaly detection processor.
In Sisti & Trevisani (Eds.), Proceedings of SPIE:
Enabling Technologies for Simulation Science VI
(128–137). Bellingham, WA: SPIE Press.
Laxhammar, R. (2008). Anomaly detection for
sea surveillance. In Proceedings of the 11th In-
ternational Conference on Information Fusion,
47–54. Cologne, Germany: IEEE Press.
Laxhammar, R., Falkman, G., & Sviestins, E.
(2009). Anomaly detection in sea traffic-A com-
parison of the gaussian mixture model and the
kernel density estimator. In Proceedings of the
12th International Conference on Information Fu-
sion, 756–763. New Brunswick, NJ: IEEE Press.
Livnat, Y., Agutter, J., Moon, S., Erbacher, R. F., &
Foresti, S. (2005). A visual paradigm for network
intrusion detection. In Proceedings of the 2005
IEEE Workshop on Information Assurance and
Security, 92–99. New Brunswick, NJ: IEEE Press.
Mansmann, F. (2008). Visual Analysis of Network
Traffic: Interactive Monitoring, Detection, and
Interpretation of Security Threats. (Ph.D. thesis).
Konstanz, Germany, Universität Konstanz.
Meneses, C. J., & Grinstein, G. G. (2001). Visu-
alization for enhancing the data mining process.
Proceedings of
the Society for Photo-Instrumentation Engineers,
4384, 126–137. Bellingham, WA: SPIE Press. doi:10.1117/12.421066.
Muelder, C., Ma, K.-L., & Bartoletti, T. (2005).
Interactive visualization for network and port scan
detection. In Proceedings of 2005 Recent Advances
in Intrusion Detection, 1–20. New Brunswick,
NJ: IEEE Press.
Onut, I. V., Zhu, B., & Ghorbani, A. A. (2004). A
novel visualization technique for network anomaly
detection. In Proceedings of the 2nd Annual Con-
ference on Privacy, Security, and Trust, 167–174.
New York: ACM Press.
Patcha, A., & Park, J.-M. (2007). An overview
of anomaly detection techniques: Existing solu-
tions and latest technological trends. Computer
Networks, 51(12), 3448–3470. doi:10.1016/j.
comnet.2007.02.001.
Rheingans, P., & desJardins, M. (2000). Visual-
izing high-dimensional predictive model quality.
Proceedings
of IEEE Visualization, 2000, 493–496. New Brunswick, NJ: IEEE Press.
Rhodes, B., Bomberger, N., Seibert, M., & Wax-
man, A. (2005). Maritime situation monitoring and
awareness using learning mechanisms. Military
Communications Conference, 1, 646–652. New
Brunswick, NJ: IEEE Press.
Rhodes, B., Bomberger, N., & Zandipour, M.
(2007a). Probabilistic associative learning of
vessel motion patterns at multiple spatial scales
for maritime situation awareness. In: 10th Inter-
national Conference on Information Fusion, 1–8.
Rhodes, B. J., Bomberger, N. A., Zandipour,
M., Waxman, A. M., & Seibert, M. (2007b).
Cognitively-inspired motion pattern learning &
analysis algorithms for higher-level fusion and
automated scene understanding. In Military Com-
munications Conference (MILCOM 2007), 1–6.
New Brunswick, NJ: IEEE Press.
Ristic, B., Scala, B. L., Morelande, M., & Gordon,
N. (2008). Statistical analysis of motion patterns
in AIS data: Anomaly detection and motion
prediction. In Proceedings of 11th International
Conference of Information Fusion. New Bruns-
wick, NJ: IEEE Press.
Riveiro, M., Falkman, G., & Ziemke, T. (2008).
Improving maritime anomaly detection and situ-
ation awareness through interactive visualization.
In Proceedings of 11th International Conference
on Information Fusion, 47–54. New Brunswick,
NJ: IEEE Press.
Riveiro, M., Falkman, G., Ziemke, T., &
Kronhamn, T. (2009). Reasoning about anoma-
lies: A study of the analytical process of detecting
and identifying anomalous behavior in maritime
traffic data. In Tolone & Ribarsky (Eds.), SPIE
Defense, Security, and Sensing. Visual Analytics
for Homeland Defense and Security. Volume 7346.
Orlando, FL: SPIE Press.
Roy, J. (2008). Anomaly detection in the maritime
domain. In Proceedings of SPIE, Volume 6945,
69450W 1–14. Bellingham, WA: SPIE Press.
Seibert, M., Rhodes, B. J., Bomberger, N. A.,
Beane, P. O., Sroka, J. J., et al., & Tillson, R.
(2006). SeeCoast port surveillance. In Proceed-
ings of SPIE, Volume 6204: Photonics for Port
and Harbor Security II. Orlando, FL: SPIE Press.
Shearer, C. (2000). The CRISP-DM model: The
new blueprint for data mining. Journal of Data
Warehousing, 5(4), 13–22.
Teoh, S. T., Zhang, K., Tseng, S., Ma, K., & Wu,
S. F. (2004). Combining visual and automated data
mining for near-realtime anomaly detection and
analysis in BGP. In Proceedings of the 2004 ACM
Workshop on Visualization and Data Mining for
Computer Security, 35–44. New York: ACM Press.
Thomas, J., & Cook, K. (Eds.). (2005). Illumi-
nating the Path: The Research and Development
Agenda for Visual Analytics. Los Alamitos, CA:
IEEE Computer Society.
Willems, N., Wetering, H. V. D., & Wijk, J. J.
V. (2009). Visualization of vessel movements.
Computer Graphics Forum, 28(3), 959–966.
doi:10.1111/j.1467-8659.2009.01440.x.
KEY TERMS AND DEFINITIONS
Anomaly Detection: Process of discovering
anomalies in a data set. Such a process normally
compares the data of interest with a simplified
description or model of the normality in order to
find mismatches.
Anomaly: In this chapter an anomaly is defined
from a user (operator or organization) point of
view, as exceptional objects, events or situations
that need to be detected and identified. We define
the term anomalous as a property, meaning “not
conforming to what might be expected because of
the class or type to which it belongs or the laws
that govern its existence, in a given situation or
context”.
Behavioral Anomaly: An anomaly that im-
plies a deviation from the normal behavior.
Predictive Data Mining: Class or type of data
mining processes used to predict some response
of interest. Predictive data mining is employed to
identify a model or a set of models from the data
that can be used to predict, for example, the value
of a particular attribute (Demšar, 2006). Statisti-
cal analysis, classification, and decision trees
techniques are used to produce such outcomes.
Predictive data mining techniques are used for
anomaly detection.
Visual Analytics: Analytical reasoning sup-
ported by highly interactive visual interfaces
(Thomas and Cook, 2005). Visual analytics strives
to facilitate the analytical reasoning process by
creating software that maximizes the human ca-
pacity to perceive, understand, and reason about
complex, dynamic data and situations.
... However, there is no exact definition of a maritime anomaly. Anomaly is usually associated with many terms from which terms like abnormality, abnormal, atypical, deviation, exceptional, illegal, inconsistent, irregular, benign, threat, not explained, incongruous, outlier, atypical, peculiar, rare, special, strange, threat, threatening, unusual, unnatural, improper, etc. could be mentioned here (Riveiro et al., 2008(Riveiro et al., , 2018Roy, 2008;Martineau and Roy, 2011;Riveiro, 2014). There could be various types of anomaly in marine traffic. ...
... Some could be related to abnormal movement like irregular, illegal and other anomalous appearances which are usually detected based on vessel trajectory analysis (Fu et al., 2017;Venskus et al., 2019). Amro et al. (2022) mentioned some anomalies like sudden unexpected or abnormal change in speed over ground (SOG), under-reporting, over-reporting, etc. Deviations from regular traffic routes are also considered anomalous (Riveiro, 2014). Maritime traffic anomaly can be attributed to a single ship, a convoy or all ships in a specified area (Riveiro, 2014). ...
... Amro et al. (2022) mentioned some anomalies like sudden unexpected or abnormal change in speed over ground (SOG), under-reporting, over-reporting, etc. Deviations from regular traffic routes are also considered anomalous (Riveiro, 2014). Maritime traffic anomaly can be attributed to a single ship, a convoy or all ships in a specified area (Riveiro, 2014). It can be anything abnormal or unwanted or illegal from the normal or desired ship activities (Liu, 2015). ...
Article
Full-text available
This study summarises the scenario of maritime traffic anomalies, like the increased congestion and U-turn of ships caused by the ship grounding in the Suez Canal in March 2021. Here, satellite automatic identification system based ship trajectories, and Sentinel-1 and Sentinel-2 images based ship positions are analysed after subdividing the study area into seas, lakes and canals. The results show that the blockage affected the maritime traffic for more than three weeks, waiting ship numbers increased from 5 to 122, and daily one to three ships made a U-turn between 23 and 31 March in the Gulf of Suez. Ship density also increased to more than double in Bitter Lakes with a minimum waiting time of 7 days. Hence, to avoid such prolonged waiting of ships, we propose a warning method based on the sharp speed decrease rate, U-turn and congestion.
Article
This article reviews the literature in the search for the theories and perspectives of knowledge discovery and data visualization. The literature review gives an overview of knowledge discovery; Knowledge Discovery in Databases (KDD); Knowledge Discovery in Textual Databases (KDT); an overview of data visualization; the significant perspectives on data visualization; data visualization and big data; and data visualization and statistical literacy. Knowledge discovery is the process of searching for hidden knowledge in the massive amounts of data that individuals are technically capable of generating and storing. Data visualization is an easy way to convey concepts in a universal manner. Organizations that utilize knowledge discovery and data visualization are more likely to find both the knowledge and the information they need when they need them. The findings present valuable insights and further understanding of the way in which knowledge discovery and data visualization efforts should be focused.
Conference Paper
Neurobiologically inspired algorithms have been developed to continuously learn behavioral patterns at a variety of conceptual, spatial, and temporal levels. In this paper, we outline our use of these algorithms for situation awareness in the maritime domain. Our algorithms take real-time tracking information and learn motion pattern models on-the-fly, enabling the models to adapt well to evolving situations while maintaining high levels of performance. The constantly refined models, resulting from concurrent incremental learning, are used to evaluate the behavior patterns of vessels based on their present motion states. At the event level, learning provides the capability to detect (and alert) upon anomalous behavior. At a higher (inter-event) level, learning enables predictions, over pre-defined time horizons, to be made about future vessel location. Predictions can also be used to alert on anomalous behavior. Learning is context-specific and occurs at multiple levels: for example, for individual vessels as well as classes of vessels. Features and performance of our learning system using recorded data are described.
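The idea of a normalcy model that is refined incrementally and then used to score the present motion state can be illustrated with a much simpler stand-in. The sketch below is not the neurobiologically inspired algorithm described in the abstract. It assumes a single feature (speed in knots), uses Welford's online mean and variance update, and the class name `RunningNormalcy` is hypothetical.

```python
import math


class RunningNormalcy:
    """Online mean/variance of one feature (Welford's update).

    A deliberately simple stand-in for an incrementally learned
    normalcy model; the feature and the scoring rule are assumptions.
    """

    def __init__(self) -> None:
        self.n = 0       # number of observations seen so far
        self.mean = 0.0  # running mean
        self.m2 = 0.0    # running sum of squared deviations

    def update(self, x: float) -> None:
        """Fold one new observation into the model."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def zscore(self, x: float) -> float:
        """Distance of x from the learned mean, in standard deviations."""
        if self.n < 2:
            return 0.0  # not enough data to judge
        std = math.sqrt(self.m2 / (self.n - 1))
        return 0.0 if std == 0.0 else abs(x - self.mean) / std


# Learn from observed speeds, then score a new report.
model = RunningNormalcy()
for speed in [10.0, 11.0, 9.0, 10.0, 10.0]:
    model.update(speed)
is_anomalous = model.zscore(25.0) > 3.0  # the 3-sigma cut-off is an assumption
```

Keeping one such model per context (for example, per vessel class or per grid cell) would mirror the abstract's point that learning is context-specific, although the systems described there learn far richer motion patterns.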
Article
The goal of visual analytical tools is to support the analytical reasoning process, maximizing human perceptual, understanding and reasoning capabilities in complex and dynamic situations. Visual analytics software must be built upon an understanding of the reasoning process, since it must provide appropriate interactions that allow a true discourse with the information. In order to deepen our understanding of the human analytical process and guide developers in the creation of more efficient anomaly detection systems, this paper investigates the human analytical process of detecting and identifying anomalous behavior in maritime traffic data. The main focus of this work is to capture the entire analysis process that an analyst goes through, from the raw data to the detection and identification of anomalous behavior. Three different sources are used in this study: a literature survey of the science of analytical reasoning, requirements specified by experts from organizations with an interest in port security, and user field studies conducted in different marine surveillance control centers. Furthermore, this study elaborates on how to support the human analytical process using data mining, visualization and interaction methods. The contribution of this paper is twofold: (1) within visual analytics, to contribute to the science of analytical reasoning with a practical understanding of users' tasks in order to develop a taxonomy of interactions that support the analytical reasoning process and (2) within anomaly detection, to facilitate the design of future anomaly detection systems when fully automatic approaches are not viable and human participation is needed.
Article
SeeCoast extends the US Coast Guard Port Security and Monitoring system by adding capabilities to detect, classify, and track vessels using electro-optic and infrared cameras, and also uses learned normalcy models of vessel activities in order to generate alert cues for the watch-standers when anomalous behaviors occur. SeeCoast fuses the video data with radar detections and Automatic Identification System (AIS) transponder data in order to generate composite fused tracks for vessels approaching the port, as well as for vessels already in the port. Then, SeeCoast applies rule-based and learning-based pattern recognition algorithms to alert the watch-standers to unsafe, illegal, threatening, and other anomalous vessel activities. The prototype SeeCoast system has been deployed to Coast Guard sites in Virginia. This paper provides an overview of the system and outlines the lessons learned to date in applying data fusion and automated pattern recognition technology to the port security domain.
Conference Paper
In order to achieve greater situation awareness it is necessary to identify relations between individual entities and their immediate surroundings, neighboring entities and important landmarks. The idea is that long-term intentions and situations can be identified by patterns of more rudimentary behavior, in essence situations formed by combinations of different basic relationships. In this paper we present a rule-based situation assessment system that utilizes both COTS and in-house software. It is built upon an agent framework that speeds up development, since it takes care of many of the infrastructural issues of such a communication-intensive application, and a rule-based reasoner that can reason about situations that develop over time. The situation assessment system is developed to be simple, but structurally close to an operational system, with connections to outside data sources and graphical editors and data displays. It is developed with a specific simple sea-surveillance scenario in mind, which we also present, but the ideas behind the system are general and are valid for other areas as well.
Article
Robust exploitation of tracking and surveillance data will provide an early warning and cueing capability for military and civilian Law Enforcement Agency operations. This will improve dynamic tasking of limited resources and hence operational efficiency. The challenge is to rapidly identify threat activity within a huge background of noncombatant traffic. We discuss development of an Automated Anomaly Detection Processor (AADP) that exploits multi-INT, multi-sensor tracking and surveillance data to rapidly identify and characterize events and/or objects of military interest, without requiring operators to specify threat behaviors or templates. The AADP has successfully detected an anomaly in traffic patterns in Los Angeles, analyzed ship track data collected during a Fleet Battle Experiment to detect simulated mine laying behavior amongst maritime noncombatants, and is currently under development for surface vessel tracking within the Coast Guard's Vessel Traffic Service to support port security, ship inspection, and harbor traffic control missions, and to monitor medical surveillance databases for early alert of a bioterrorist attack. The AADP can also be integrated into combat simulations to enhance model fidelity of multi-sensor fusion effects in military operations.
Article
Defence R&D Canada is developing a Collaborative Knowledge Exploitation Framework (CKEF) to support the analysts in efficiently managing and exploiting relevant knowledge assets to achieve maritime domain awareness in joint operations centres of the Canadian Forces. While developing the CKEF, anomaly detection has been clearly recognized as an important aspect requiring R&D. An activity has thus been undertaken to implement, within the CKEF, a proof-of-concept prototype of a rule-based expert system to support the analysts regarding this aspect. This expert system has to perform automated reasoning and output recommendations (or alerts) about maritime anomalies, thereby supporting the identification of vessels of interest and threat analysis. The system must contribute to a lower false alarm rate and a better probability of detection in drawing the operators' attention to vessels worthy of it. It must provide explanations as to why the vessels may be of interest, with links to resources that help the operators dig deeper. Mechanisms are necessary for the analysts to fine-tune the system, and for the knowledge engineer to maintain the knowledge base as the expertise of the operators evolves. This paper portrays the anomaly detection prototype and describes: the knowledge acquisition and elicitation session conducted to capture the know-how of the experts; the formal knowledge representation enablers and the ontology required for aspects of the maritime domain that are relevant to anomaly detection, vessels of interest, and threat analysis; the prototype high-level design and implementation on the service-oriented architecture of the CKEF; and other findings and results of this ongoing activity.
Article
Visualization has proved to be a suitable paradigm for the analysis and exploration of datasets. In the data mining cycle, visualization has been mainly focused on data visualization and output generation. However, besides datasets, many other entities need to be explored and understood by users and analysts. In this paper, we describe the role of visualization in the data mining process, and we present a model to support the interaction between users and data mining entities. We discuss visualizations of datasets, parameter spaces of data mining algorithms, models induced from datasets, and patterns generated by the application of data mining algorithms to datasets. We have developed a Java-based testbed that implements the extended data mining model with visual support to interact with datasets, models, parameter spaces, and patterns. Experimental results based on several public datasets, data mining algorithms, multidimensional visualization techniques, and other novel visualizations clearly show the benefits of integrating visualization into the data mining process.
Article
From 28 May to 1 June 2007, a seminar on ‘Information Visualization–Human-Centered Issues in Visual Representation, Interaction, and Evaluation’ took place at the International Conference and Research Center for Computer Science, Dagstuhl Castle, Germany. One important aim of this seminar was to bring together researchers and practitioners from Information Visualization and related fields, as well as from application areas, for lively discussion and interaction. The seminar allowed critical reflection on actual research efforts, the state of the field, evaluation challenges, and other important topics. This report summarizes the event.