A Framework for Automatic Detection of
Abandoned Luggage in Airport Terminal
Grzegorz Szwoch, Piotr Dalka, Andrzej Czyżewski
Gdansk University of Technology, Multimedia Systems Department
80-233 Gdansk, Poland, Narutowicza 11/12
e-mail: greg@sound.eti.pg.gda.pl
Abstract A framework for automatic detection of events in a video stream
transmitted from a monitoring system is presented. The framework is based on the
widely used background subtraction and object tracking algorithms. The authors
elaborated an algorithm for detection of left and removed objects based on mor-
phological processing and edge detection. The event detection algorithm collects
and analyzes data of all the moving objects in order to detect events defined by
rules. A system was installed at the airport for detecting abandoned luggage. The
results of the tests indicate that the system generally works as expected, but the
low-level modules currently limit the system performance in some problematic
conditions. The proposed solution may supplement the existing monitoring sys-
tems in order to improve the detection of security threats.
1. Introduction
Automatic event detection in surveillance systems is becoming a necessity. The enormous number of video cameras used in facilities such as shopping malls, public transport stations and airports makes it impossible for human operators to watch and analyze all video streams in real time. Such systems are typically used only as a forensic tool rather than a preventive or interceptive one. Therefore, it is easy to miss harmful activities such as theft, robbery, vandalism, fighting or luggage abandonment, as well as frequent events that may be dangerous, such as unauthorized presence in restricted areas.
In the last few years many publications regarding automatic video surveillance systems have been presented. These systems are usually focused on a single type of human activity. Events regarding human behavior may be divided into three main groups. The first group contains activities that do not involve other persons or objects, such as loitering [1] or sudden human pose changes, like going from standing to lying down, which might indicate a pedestrian collapse [2]. The second group includes neutral human interactions, like walking together, approaching, ignoring, meeting and splitting [3], and violent ones, such as fist fighting, kicking or hitting with objects [4]. The last group contains human activities that are related to the environment. This includes intrusion or trespassing [5], wrong direction of movement [6], vandalism [7] and luggage abandonment.
Determining object stationarity and finding an object left behind (e.g. a backpack or briefcase) is a task critical to the safety and security of public transport passengers and shopping mall customers. Abandoned luggage detection was the main topic of the PETS (Performance Evaluation of Tracking and Surveillance) Workshop in 2006. The majority of papers presented there employ background subtraction to detect foreground objects, which are classified as newly appeared stationary objects using simple heuristics [8] or a Bayesian inference framework [9]. Other methods regarding this topic may be found in the literature. Spagnolo et al. classify objects as abandoned or removed by matching the boundaries of static foreground regions [10]. Another solution divides a video frame into blocks that are classified as background or foreground; a non-moving foreground block is assumed to be stationary [5]. This method is robust against frequent dynamic occlusions caused by moving people.
This paper presents a framework for the detection of a wide range of events in video monitoring systems, using rules defined by the system operator. The framework is based on the widely used background subtraction and object tracking algorithms and adds procedures developed especially for the presented system, as described in Section 2. An example application of the system at the airport for the detection of abandoned luggage is briefly presented in Section 3, and the paper ends with conclusions and an indication of areas for future development.
2. Framework Design and Implementation
2.1 System Overview
The framework for automatic event detection is composed of several modules, as depicted in Fig. 1. The low-level algorithms extract information on moving objects from camera images. A detailed explanation of these algorithms lies beyond the scope of this paper, hence only the general idea is described here. Moving objects are detected in the camera frames using a background modeling method based on the Gaussian Mixture Model [11] (five distributions were used for each pixel). The results of background modeling are processed by detecting and removing shadow pixels (based on the color and luminance of the pixels) and by performing morphological operations on the detected objects in order to remove small areas and to fill holes inside the objects. Next, movements of the detected objects (blobs) are tracked in successive image frames using a method based on Kalman filters [12]. The state of each tracked object (tracker) in each frame is described by an eight-element vector containing its position, velocity and the changes in position and velocity. The state of each Kalman filter is updated for each image frame, so the movement of each object is tracked continuously.
The next blocks of the analysis system were designed for the purpose of the presented framework. The task of the classification module is to assign detected moving objects to classes (human, luggage, etc.). The event detection module analyzes the current and past states of objects and evaluates rules in order to check whether any of the defined events occurred. The results of event detection are sent to the framework output for further processing (visualization, logging, camera steering). The remaining part of this paper discusses the relevant modules of the framework in detail.
[Fig. 1 block diagram: camera frames → background subtraction → object tracking → object classification → event detection (with rules as an additional input) → detection results]
Fig. 1. General block diagram of the proposed framework for the automatic event detection
2.2 Resolving the Splitting Trackers
The main problem that had to be resolved in the object tracking procedure implemented in the discussed framework was the handling of 'splitting objects', e.g. when a person leaves their luggage and walks away. In this situation, the tracker that was assigned to the person carrying the luggage has to track further movements of the same person, and a new tracker has to be created for the luggage. The following approach is proposed for this task. First, groups of matching trackers and blobs are formed. Each group contains all the blobs that match at least one tracker in the group and all the trackers that match at least one blob in the group. A match is defined as at least one pixel common to the bounding boxes of the blob and the tracker. Within a single group, blobs matching each tracker are divided into subgroups (in some cases a subgroup may contain only one blob) separated by a distance larger than a threshold value. If all the blobs matching a given tracker form a single subgroup, the state of the tracker is updated with the whole blob group. If there is more than one subgroup of blobs matching the tracker, it is necessary to select one subgroup and assign the tracker to it. In order to find the subgroup that matches the tracker, two types of object descriptors are used – color
and texture. Color descriptors are calculated using a two-dimensional chrominance histogram of the image representing the object. The texture descriptors are calculated from the same image, using a gray-level co-occurrence matrix [13]. Five texture descriptors are used (contrast, energy, mean, variance and correlation) for three color channels and four directions of pixel adjacency, resulting in a vector of 60 parameters describing a single object's appearance. In order to find a subgroup of blobs matching the tracker, the vector of texture descriptors for the tracker, D_T, is compared with the corresponding vector D_B calculated for the subgroup of blobs, using the formula:

S_T = 1 − (1/N) Σ_{i=1..N} |D_T,i − D_B,i| / max(D_T,i, D_B,i)  (1)
where N is the number of elements in both vectors. The final similarity measure is a weighted sum of the texture similarity S_T and the color histogram similarity S_C (equal to the correlation coefficient of the histograms of the tracker and the group of blobs):

S = W_T · S_T + W_C · S_C  (2)
where the values of the weighting coefficients (W_T = 0.75, W_C = 0.25) were found empirically. The subgroup of blobs with the largest S value is used to update the state of the tracker. After each tracker in the group is processed, the remaining (unassigned) blobs are used to construct new trackers. As a result, when a person leaves their luggage, the existing tracker follows the person and a new tracker is created for the left luggage.
2.3 Detection of Left or Removed Objects
For the purpose of event detection, each tracker has to be assigned to a proper class (human, vehicle, luggage, etc.). In the test system intended to work at the airport, a simplified classification is used: each object is classified either as a human or as luggage, based on an analysis of the object's velocity and the variability of its size and shape. It is assumed that luggage (as a separate object) remains stationary and does not change its dimensions significantly (some fluctuations in the size and shape of objects are inevitable due to inaccuracies of the background subtraction procedure). In order to increase the accuracy of this simple classifier, a number of past states of the object are taken into account together with its current state. The averaged parameters (speed, changes in size and shape) are compared with thresholds; if the thresholds are exceeded, the object is classified as a human, otherwise it is classified as luggage. In further stages of system development, the classification procedure will be expanded so that more object classes are defined.
The main problem with this approach is that, due to the nature of the background subtraction algorithm, leaving an object in the scene causes the same effect as removing an object that was a part of the background (e.g. luggage that remained stationary for a prolonged time). In both cases a new tracker is created, containing either the left object or a 'hole' in the background (remaining after the object was taken). The system has to decide whether the detected object was left or taken. This is achieved by examining the content (image) of the newly created tracker. It is expected that the edges of a left object are located close to the blob's border, while no distinct edges are present in the case of a taken object (provided that the background is sufficiently smooth).
The proposed procedure works as follows. First, the grayscale image B of the object (blob) and its mask M (having non-zero values for pixels belonging to the blob and zero values otherwise) are processed by the Canny detector in order to find the edges. The result of edge detection in the mask (E_M) is processed by morphological dilation in order to increase the detection margin:

E_Md = E_M ⊕ SE  (3)
where SE is a 7 × 7 structuring element. Next, the result of the dilation is combined with E_B, the result of edge detection in the image B:

R = E_Md ∩ E_B  (4)

and the resulting image is dilated using the same structuring element:

R_d = R ⊕ SE.  (5)
Finally, a measure of the difference between the object and the background is calculated as:

D = N_R / N_M  (6)

where

N_R = CNZ(R_d),  N_M = CNZ(E_Md)  (7)

and CNZ() is a function that counts the number of non-zero pixels in a grayscale image. If the blob represents a left object, D is expected to be significantly larger than for a removed object. Therefore, the analyzed object is classified as a left object if D > T_cl, or as a taken object otherwise, where T_cl is the detection threshold for classification. A proper selection of the threshold allows for accurate detection of taken and left objects regardless of the background (which may also contain edges) and of errors in background subtraction.
Fig. 2 presents an example of the procedure described above, for the left and removed object cases (T_cl = 0.6). It may be seen that the proposed procedure allows for a proper distinction between taken and left objects based on the value of the D measure.
Fig. 2. Example of detection of left and taken objects using the procedure described in the text
2.4 Event Detection
The event detection module utilizes data acquired from the previous modules in order to detect events defined with rules. Detected events may refer to simple cases (an object entering or leaving an area, crossing a barrier, stopping, moving, etc.) as well as to more complex situations (abandoned luggage, theft and others). In this section of the paper, the detection of an abandoned luggage scenario is used as an example.
The event detector stores a history of each tracker's states (in the experiments, the last five states were used). The object state contains all the parameters needed for event detection (the position of an object, its velocity and direction of movement, type, values of descriptors, etc.). Event rules are tested against all past states. If a rule is fulfilled for a defined number of states (e.g. three out of five), an event is detected. This approach allows for time filtering of instantaneous events, reducing the number of false-positive decisions. The framework allows a user to add new rules; each rule is analyzed in parallel, based on the same tracker data.
An example rule for the detection of abandoned luggage may be formulated in plain English as follows: if an object of type 'human' leaves an object of type 'luggage', the human moves away from the luggage to a distance d and does not approach the luggage for a time period t, then an 'abandoned luggage' event is detected. An implementation of this rule in the event detector is presented in Fig. 3. Only the objects classified as left luggage are processed by the rule. For each frame, the distance d between the luggage and its 'parent' (the person that left the luggage) is calculated. If d exceeds the threshold T_d (or the parent has left the screen) for a time period T_t, an alarm is sent to the system output. If the person goes back to the luggage (d < T_d) before the time T_t passes, the counter t is reset.
The rule may be extended with additional conditions, e.g. the event may be de-
tected only if the person and/or the luggage are detected in the defined area or if
the person crosses the defined barrier. The main idea of the event detection mod-
ule presented here remains valid for more complex scenarios. Other rules (stolen
object, loitering, etc.) may operate in a similar manner.
Fig. 3. Block diagram of the abandoned luggage detector described in the text
3. Test Results
The framework for automatic event detection described in Section 2 was implemented in the C++ programming language, using the OpenCV library [14] for performing low-level image operations and implementing the Kalman filters. The first version of the system for the detection of abandoned luggage is currently being tested at the Poznan-Lawica airport in Poland. A camera is mounted in the arrivals hall, at a height of 3.20 meters, overlooking the area in which most of the abandoned luggage cases were observed. The area visible to the camera is calibrated using Tsai's method [15] in order to analyze distances between objects in meters instead of pixels. The system runs on a PC with the Fedora Linux operating system, a 2.5 GHz quad-core processor and 4 GB of memory, and is able to process 10 video frames per second in real time (resolution 1600 × 900 pixels, frame rate limit imposed by the camera).
A thorough evaluation of the accuracy of the proposed abandoned luggage detector requires ground-truth data, which has not been collected yet. Therefore, an extensive quantitative analysis of the system performance remains future work. Based on initial tests and a visual evaluation of the results provided by the system, it may be stated that the accuracy of the detection is satisfactory in good and moderate conditions (stable light, moderately busy scene). An example of proper detection of abandoned luggage is presented in Fig. 4. The lighting conditions in this example are not optimal, therefore the results of background subtraction and object tracking are inaccurate to some degree. However, the algorithms described in this paper generally work as expected. The tracker representing the person walking away from the luggage (Fig. 4a and 4b) matches the two blobs separated
by a distance larger than the threshold (30 pixels). The similarity factors calculated for the tracker and both blobs (using Eq. 1 for the texture similarity S_T, the correlation coefficient for the color histogram similarity S_C, and Eq. 2 for the total similarity) are: S_T = 0.83, S_C = 0.86, S = 0.84 for the tracker compared with the 'person blob', and S_T = 0.51, S_C = 0.99, S = 0.64 for the tracker compared with the 'luggage blob'. Although the result calculated for the color histograms is invalid in this case, the final similarity measure is correct and allows for the proper assignment of the existing tracker to the person, while a new tracker is created for the left luggage (Fig. 4c). This new tracker is correctly classified as a left object, based on the edge measure calculated using Eq. 6: D = 0.71 (T_cl = 0.6). After the distance between the person and the luggage exceeds the defined threshold (d = 3 m) for the defined time period (t = 15 s), the event detector detects that the luggage has been abandoned (Fig. 4d) and sends an alarm to its output.
Conditions that may cause event misdetection by the system are related mainly to the inaccuracy of the background subtraction module caused by changes in lighting. In the presented example, sunlight falling through the glass walls of the hall caused reflections on the floor that disturbed the background subtraction procedure and resulted in the creation of numerous false trackers. Another problem is related to object tracking in crowded scenes, with a large number of moving objects overlapping each other for a prolonged time. In such situations, the number of erroneous decisions made by the tracking procedure increases significantly. As a result, the event detector is fed with invalid data and fails to provide the expected results. These kinds of problems will be addressed in future research.
Fig. 4. Example result of tests of the automatic detector of the abandoned luggage, performed at
the Poznan-Lawica airport: (a) person with luggage, (b) person leaving luggage, (c) new tracker
created for the abandoned luggage, (d) event rule matched – abandoned luggage detected
4. Conclusions
An extensible framework for rule-based event detection was proposed and its application in a real-life scenario at the airport was initiated. The tests performed so far have shown that the algorithms developed for this system (detection of left and removed objects, event detection based on defined rules) allow for the detection of defined events, such as abandoned luggage, under normal conditions (stable light, moderately crowded scene). The results are promising; however, the accuracy of the system decreases in less favorable conditions. Therefore, in order to develop a fully working system, future research will focus on improving the accuracy of the background subtraction and object tracking algorithms in difficult conditions. The fully developed system for abandoned
luggage detection may provide a helpful tool for improving the level of public safety in airport terminals and other public areas. Moreover, the framework is designed in a way that allows straightforward extension with other analysis modules, thus it may be deployed in both existing and new security systems.
Acknowledgments Research is subsidized by the Polish Ministry of Science and Higher
Education within Grant No. R00 O0005/3 and by the European Commission within FP7 project
“INDECT” (Grant Agreement No. 218086). The authors wish to thank the staff of the Poznan-
Lawica airport for making the tests of the system possible.
References
[1] Bird ND, Masoud O, Papanikolopoulos NP, Isaacs A (2005) Detection of loitering individuals in public transportation areas. IEEE Trans. Intell. Transp. Syst., 6(2): 167–177.
[2] Lee MW, Nevatia R (2007) Body part detection for human pose estimation and tracking. Proc. Motion Video Comput., p. 23.
[3] Blunsden S, Andrade E, Fisher R (2007) Non-parametric classification of human interaction. Proc. 3rd Iberian Conf. Pattern Recog. Image Anal., 347–354.
[4] Datta A, Shah M, Lobo NDV (2002) Person-on-person violence detection in video data. Proc. of the 16th Int. Conf. on Pattern Recognition, 1: 433–438.
[5] Black J, Velastin SA, Boghossian B (2005) A real time surveillance system for metropolitan railways. Proc. IEEE Conf. Adv. Video Signal Based Surveillance, 189–194.
[6] Kang S, Abidi B, Abidi M (2004) Integration of color and shape for detecting and tracking security breaches in airports. Proc. 38th Annu. Int. Carnahan Conf. Security Technol., 289–294.
[7] Ghazal M, Vazquez C, Amer A (2007) Real-time automatic detection of vandalism behavior in video sequences. Proc. IEEE Int. Conf. Syst., Man, Cybern., 1056–1060.
[8] Auvinet E, Grossmann E, Rougier C, Dahmane M, Meunier J (2006) Left-luggage detection using homographies and simple heuristics. Proc. of IEEE Int. Workshop on Performance Evaluation of Tracking and Surveillance, 51–58.
[9] Lv F, Song X, Wu B, Singh VK, Nevatia R (2006) Left luggage detection using Bayesian inference. Proc. of IEEE Int. Workshop on Performance Evaluation of Tracking and Surveillance, 83–90.
[10] Spagnolo P, Caroppo A, Leo M, Martiriggiano T, D'Orazio T (2006) An abandoned/removed objects detection algorithm and its evaluation on PETS datasets. Proc. IEEE Int. Conf. Video Signal Based Surveillance, p. 17.
[11] KadewTraKuPong P, Bowden R (2003) A real time adaptive visual surveillance system for tracking low-resolution colour targets in dynamically changing scenes. J. of Image and Vision Computing, 21(10): 913–929.
[12] Welch G, Bishop G (2004) An introduction to the Kalman filter. Technical report TR 95-041, Department of Computer Science, University of North Carolina.
[13] Hall-Beyer M (2007) The GLCM Tutorial. Available online: http://www.fp.ucalgary.ca/mhallbey/tutorial.htm
[14] Bradski G, Kaehler A (2008) Learning OpenCV: Computer Vision with the OpenCV Library. O'Reilly Media.
[15] Tsai R (1987) A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE J. Robotics Automat., RA-3(4).