Article

Localization and Trajectory Reconstruction in Surveillance Cameras with Nonoverlapping Views

Abstract

This paper proposes a method that localizes two surveillance cameras and simultaneously reconstructs object trajectories in 3D space. The method is an extension of the Direct Reference Plane method, which formulates the localization and the reconstruction as a system of linear equations that is globally solvable by Singular Value Decomposition. The method assumes static, synchronized cameras, smooth trajectories, known camera internal parameters, and a known rotation between the cameras in a world coordinate system. The paper describes the method in the context of self-calibrating cameras, where the internal parameters and the rotation can be obtained jointly by assuming a man-made scene with orthogonal structures. Experiments with synthetic and real-image data show that the method can recover the camera centers with an error of less than half a meter, even in the presence of a 4-meter gap between the fields of view.
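A minimal sketch of the linear-algebraic core, assuming the formulation reduces (as in the Direct Reference Plane approach) to a homogeneous system Ax = 0 solved in least squares by SVD; the matrix A below is a random placeholder standing in for the stacked constraints relating camera centers and 3D trajectory points:

```python
import numpy as np

def solve_homogeneous(A):
    """Least-squares solution of A x = 0 subject to ||x|| = 1 via SVD.

    The minimizer is the right singular vector associated with the
    smallest singular value of A.
    """
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]

# Placeholder: in the actual method, A stacks the linear constraints
# built from image measurements of the moving object.
A = np.random.randn(40, 10)
x = solve_homogeneous(A)
print(np.linalg.norm(A @ x), np.linalg.norm(x))  # small residual, unit norm
```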
... Many methods have been developed for detecting motion in these videos. The existing methodologies discussed in this paper, from [2], [3], [4], [5] and [6], exhibit various difficulties and drawbacks. The fast biologically inspired algorithm in [6] has a limited range of detectable velocities and recognizes only a single object. The trajectory and localization method in [3] requires multiple cameras and perimeter marking and detects only a single object. The tracking algorithm in [4] suffers a frame delay in detection. ...
... They were all designed around a learning process, i.e., a pattern is designed and then used for training. The works include human action recognition [2], surveillance cameras with non-overlapping views [3], tracking algorithms [4], pedestrian detection [5] and the biologically inspired algorithm [6]. Each of these methods is designed for a particular scenario. ...
Article
Motion detection in videos has been addressed with various methodologies. Existing systems are based on edge detection and detect motion as a single object by taking the movement of edges into account. But in sensitive applications such as satellite imaging, cancer-cell or medical imaging systems, the movement of sub-objects must also be taken into account for efficient decision making. A new methodology has therefore been developed in this project for detecting the movement of multiple objects and sub-objects in sensitive video applications. Felzenszwalb et al. developed a methodology for multiple object detection in static images; their Iterated Training Algorithm (ITA) has been modified and applied here to videos, where the movements of sub-objects are detected. Webcam video is fed as input, and the performance measures of sensitivity and number of frames with detected motion are visualized. The performance measures show that the proposed ITA performs better than the existing methods. The Multiple Instance method performed better than ITA, but it requires more training than the proposed method. On the whole, this paper validates the advantages of the proposed methodology.
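The ITA itself is not reproduced in the abstract; as a point of reference, a minimal frame-differencing motion detector on webcam input (a generic baseline, not the paper's algorithm) might look like this:

```python
import cv2

# Minimal webcam motion detector by frame differencing: a frame is
# flagged when enough pixels change between consecutive gray frames.
cap = cv2.VideoCapture(0)          # assumes a webcam at index 0
ok, frame = cap.read()
prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) if ok else None
while ok:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray, prev)
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    if cv2.countNonZero(mask) > 500:   # tunable sensitivity threshold
        print("motion detected")
    prev = gray
cap.release()
```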
... Human Trajectory Prediction (TP), which aims to predict the movement of pedestrians, has recently become a research hotspot. Many effective TP solutions have been proposed for this task, and they have been widely applied in areas such as autonomous driving (Luo, Cai, Bera, Hsu, Lee and Manocha, 2018) and surveillance systems (Pflugfelder and Bischof, 2010). ...
Preprint
Full-text available
Trajectory Prediction (TP) is an important research topic in computer vision and robotics. Recently, many stochastic TP models have been proposed for this problem and have achieved better performance than traditional models with deterministic trajectory outputs. However, these stochastic models generate a number of future trajectories of varying quality. They lack a self-evaluation ability, that is, the ability to examine the rationality of their prediction results, and thus fail to guide users in identifying high-quality candidates. This hinders them from performing at their best in real applications. In this paper, we make up for this defect and propose TPAD, a novel TP evaluation method based on trajectory Anomaly Detection (AD). In TPAD, we first combine Automated Machine Learning (AutoML) techniques with experience from the AD and TP fields to automatically design an effective trajectory AD model. We then use the learned trajectory AD model to examine the rationality of the predicted trajectories and screen out good TP results for users. Extensive experimental results demonstrate that TPAD can effectively identify near-optimal prediction results, improving the practical application effect of stochastic TP models.
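TPAD's AutoML-designed anomaly detector is not specified in the abstract; a hand-made stand-in score can nonetheless illustrate the idea of ranking candidate trajectories by plausibility:

```python
import numpy as np

def anomaly_score(traj):
    """Toy anomaly score: mean squared second difference (jerkiness).
    TPAD learns its scoring model; this hand-made proxy only
    illustrates ranking candidate trajectories by plausibility."""
    acc = np.diff(traj, n=2, axis=0)
    return float((acc ** 2).mean())

# candidates: K stochastic predictions, each T x 2 (x, y per time step)
rng = np.random.default_rng(0)
candidates = [np.cumsum(rng.normal(size=(12, 2)), axis=0) for _ in range(5)]
best = min(candidates, key=anomaly_score)   # screened-out "good" result
```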
... Some authors propose a calibration method based on a compound target consisting of two planar calibration targets that are fixed together [92,93]. To calibrate multiple cameras, some authors use the motion in the scene [94][95][96][97][98][99][100]. In addition, in order to establish the correspondence between different cameras, they track the movement of targets in the scene [98,99]. ...
Article
Automated surveillance systems observe the environment using cameras. The observed scenario is then analysed with respect to motion detection, crowd behaviour, individual behaviour, and interactions between individuals, crowds and their surrounding environment. These automatic systems accomplish a multitude of tasks, including detection, interpretation, understanding, recording and creating alarms based on the analysis. Recent studies have achieved enhanced monitoring performance while avoiding possible human failures by manipulating different features of these systems. This paper presents a comprehensive review of such video surveillance systems and the components used with them. A description of the architectures used is followed by the analyses most required in these systems. To give a complete view of the bigger picture, existing surveillance systems are compared in terms of characteristics, advantages, and difficulties, which are tabulated in this paper. In addition, future trends are discussed, charting a path toward upcoming research directions.
... 2) Testing a microphone array in a quiet environment to capture a drone's acoustic emission for time-delay beamforming and calculating its azimuth and elevation. 3) Using advanced acoustic analysis algorithms, such as ESPRIT [18], MVDR [19], and MUSIC [20], to calculate the azimuth and elevation of several drones flying near a microphone array. 4) Extracting data from the drone's acoustic emission, such as spectrograms or Mel Frequency Cepstral Coefficients (MFCC) [21], to identify the drone versus other noise sources in the environment and provide detection cues to the EO/IR system. ...
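As a small illustration of the feature-extraction step mentioned above, MFCCs can be computed with librosa; the file name "drone.wav" and the choice of 13 coefficients are placeholders:

```python
import librosa
import numpy as np

# MFCC features from a recording, as used for drone-vs-noise
# identification. "drone.wav" is a placeholder file name.
y, sr = librosa.load("drone.wav", sr=None, mono=True)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

# Summarize frames into one fixed-length vector for a classifier.
features = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])
print(features.shape)  # (26,)
```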
Experiment Findings
Use of a 360° hemispherical camera system with a mesh of sound-intensity probes to coarsely locate drones and differentiate them from other aerial objects.
Article
In many measurement applications using multiple cameras, there is often a non-overlapping field of view (nFOV) between the cameras. Such systems benefit greatly from unifying all cameras in a single coordinate frame by means of common spatial constraints. However, it is very difficult to establish common spatial constraints for accurate and quick calibration of the extrinsic parameters between cameras in cases of long working distances, asymmetric working angles or limited working space. To overcome these issues, we propose a flexible calibration method using a camera rig, which can be modeled as a hand-eye calibration problem and solved with only a few sets of calibration images. To further improve the calibration accuracy, the camera intrinsic parameters, lens distortion coefficients, and extrinsic parameters are optimized simultaneously, similar to binocular calibration. However, this can destabilize the many parameters involved. We therefore extend the epipolar constraint to two images with nFOV and add it to the optimization objective function together with the re-projection error, which ensures the basic consistency of the intrinsic parameters before and after optimization and reduces the fluctuation caused by the number of images involved in calibration. Experiments and applications show that the proposed methods are feasible and effective.
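The paper's full pipeline is not reproduced here, but the hand-eye formulation it builds on can be sketched with OpenCV's calibrateHandEye on synthetic, self-consistent poses; all transforms below are random placeholders for the real rig and target poses:

```python
import cv2
import numpy as np

def rt_to_mat(rvec, t):
    """Build a 4x4 homogeneous transform from a rotation vector and translation."""
    M = np.eye(4)
    M[:3, :3] = cv2.Rodrigues(rvec)[0]
    M[:3, 3] = t
    return M

rng = np.random.default_rng(1)
X = rt_to_mat(rng.normal(size=3) * 0.2, rng.normal(size=3))  # camera-in-rig (unknown)
T = rt_to_mat(rng.normal(size=3) * 0.2, rng.normal(size=3))  # target-in-world (fixed)

R_rig, t_rig, R_cam, t_cam = [], [], [], []
for _ in range(6):
    A = rt_to_mat(rng.normal(size=3), rng.normal(size=3))    # rig pose per shot
    B = np.linalg.inv(A @ X) @ T      # implied target pose in the camera
    R_rig.append(A[:3, :3]); t_rig.append(A[:3, 3])
    R_cam.append(B[:3, :3]); t_cam.append(B[:3, 3])

R_x, t_x = cv2.calibrateHandEye(R_rig, t_rig, R_cam, t_cam)
print(np.allclose(R_x, X[:3, :3], atol=1e-4))  # recovers the synthetic X
```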
Chapter
Multi-view multi-object tracking algorithms are expected to resolve the persistent issues of multi-object tracking within a single camera. However, the inconsistency of camera videos in most surveillance systems hampers re-identifying and jointly tracking targets across different views. As a crucial task in multi-camera tracking, assigning targets from one view to another is treated as an assignment problem. This paper presents an alternative approach based on Unbalanced Optimal Transport for the unbalanced assignment problem. In each view, the targets' positions and appearances are projected onto a learned metric space, and an Unbalanced Optimal Transport algorithm is then applied to find the optimal assignment of targets between pairs of views. Experiments on common multi-camera databases show the superiority of our proposal over the heuristic approach on MOT metrics.
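A toy version of the unbalanced assignment step, using the POT library with plain Euclidean positions in place of the chapter's learned metric space:

```python
import numpy as np
import ot  # POT: Python Optimal Transport

rng = np.random.default_rng(0)
view_a = rng.uniform(size=(5, 2))               # 5 target embeddings in view A
view_b = np.vstack([view_a[:3] + 0.01,          # 3 shared targets, plus
                    rng.uniform(size=(2, 2))])  # 2 seen only in view B
M = ot.dist(view_a, view_b)                     # pairwise squared Euclidean costs
a = np.full(5, 1.0 / 5)                         # uniform mass on each side
b = np.full(5, 1.0 / 5)

# reg: entropic smoothing; reg_m: marginal relaxation, which lets mass
# stay unmatched (the "unbalanced" part that tolerates unshared targets).
P = ot.unbalanced.sinkhorn_unbalanced(a, b, M, reg=0.05, reg_m=1.0)
matches = P.argmax(axis=1)                      # greedy readout of the plan
print(matches[:3])                              # shared targets map to 0, 1, 2
```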
Chapter
In this paper, we propose a novel sensor-fusion-based method to eliminate the errors of MEMS IMUs and reconstruct the trajectory of quadrotor drones. MEMS IMUs are widely used in quadrotor drones and other mobile devices. Unfortunately, they carry many inherent errors, which lead to poor trajectory reconstruction. To solve this problem, an error model for the accelerometer signals of MEMS IMUs is established, in which the error consists of a bias component and a noise component. First, a low-pass filter with downsampling is applied to reduce the noise component. Then, the bias component is detected and eliminated dynamically with the assistance of other sensors. Finally, the trajectory of the drone is reconstructed by integrating the calibrated accelerometer data. We apply our trajectory reconstruction method to the Parrot AR.Drone 2.0, which employs a low-cost MEMS IMU, and the experimental results prove its effectiveness. The method can in principle be applied to any other mobile device equipped with a MEMS IMU.
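A one-dimensional sketch of the described accelerometer pipeline on a synthetic signal; the paper removes the bias dynamically with other sensors, which is approximated here by a constant-bias estimate:

```python
import numpy as np
from scipy.signal import butter, filtfilt
from scipy.integrate import cumulative_trapezoid

fs = 200.0                                    # assumed IMU sample rate, Hz
t = np.arange(0, 10, 1 / fs)
true_acc = 0.5 * np.sin(0.8 * t)              # synthetic motion
acc_raw = true_acc + 0.05 + 0.2 * np.random.randn(t.size)  # bias + noise

b, a = butter(4, 5.0 / (fs / 2))              # 5 Hz low-pass for the noise term
acc_f = filtfilt(b, a, acc_raw)
acc_f -= acc_f.mean()                         # crude static-bias removal; the
                                              # paper does this dynamically
                                              # with the help of other sensors
vel = cumulative_trapezoid(acc_f, t, initial=0.0)
pos = cumulative_trapezoid(vel, t, initial=0.0)   # reconstructed trajectory
```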
Conference Paper
Full-text available
This paper addresses two issues related to the simultaneous calibration of a network of imaging sensors and the recovery of the trajectory of a single target moving among them. The non-overlapping fields of view for the cameras do not cover the entire scene, resulting in times for which no measurements are available. A Bayesian framework is imposed on the problem in order to compute the MAP (maximum a posteriori) estimate for both the trajectory of the target and the translation and rotation of each camera within the global scene. First, three model order reduction techniques that decrease the dimension of the search space and the number of terms in the objective function are presented, thereby reducing the computational requirements of the search algorithm used to solve the optimization problem. Next, the problem of finding a solution that is consistent with the set of observation times is addressed, so that the target's estimated state does not fall within the field of view of the sensor network at a time for which no measurement is available. Three techniques that treat the missing measurements as additional inequality or equality constraints within the MAP optimization framework are presented.
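A toy one-dimensional version of the MAP idea, treating one missing measurement as an inequality constraint; the quadratic data and smoothness terms, and scipy's SLSQP solver, are simplified stand-ins for the paper's objective and search algorithm:

```python
import numpy as np
from scipy.optimize import minimize

# Jointly estimate a 1-D track x_0..x_{T-1} and a sensor offset s.
# Times 3..5 are unobserved; the missing detection at t = 4 becomes an
# inequality constraint keeping the track outside the sensor's FOV
# (one side of the non-convex "outside" condition, for simplicity).
T = 8
meas = {0: 0.0, 1: 1.1, 2: 1.9, 6: 6.2, 7: 7.0}
fov_left = 3.5                     # left edge of the sensor's FOV
t_miss = 4

def neg_log_post(z):
    x, s = z[:T], z[T]
    data = sum((x[t] + s - m) ** 2 for t, m in meas.items())
    smooth = np.sum(np.diff(x, n=2) ** 2)    # constant-velocity prior
    return data + 10.0 * smooth + 0.1 * s ** 2

cons = [{"type": "ineq", "fun": lambda z: fov_left - z[t_miss]}]
res = minimize(neg_log_post, np.zeros(T + 1), method="SLSQP",
               constraints=cons)
print(res.x[:T])                   # estimated trajectory
```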
Chapter
Full-text available
This chapter introduces intrinsic camera calibration with focus on video surveillance. Calibration of camera-specific parameters such as the focal length is mandatory for metrological problems, for example, measuring a vehicle’s speed. However, it also improves target classification, target detection, and target tracking. Geometry has become important in multi-camera systems that hand over and track objects across cameras. We present the basic geometric concept behind calibration and show which information about the cameras, the scene, and the images is necessary to realize automatic methods. Self-calibration will be a key technology for the practical deployment of future smart video cameras.
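As a baseline for the target-based calibration that the chapter contrasts with self-calibration, the standard checkerboard procedure in OpenCV looks as follows; the image path and the 9x6 inner-corner count are placeholders:

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)                        # inner corners of the board (assumed)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_pts, img_pts, size = [], [], None
for path in glob.glob("calib/*.png"):   # placeholder path to board images
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_pts.append(objp)
        img_pts.append(corners)
        size = gray.shape[::-1]

rms, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)
print("focal lengths (px):", K[0, 0], K[1, 1])
```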
Conference Paper
Full-text available
In this paper, we present a wide-area surveillance system that detects, tracks and classifies moving objects across multiple cameras. At the single-camera level, tracking is performed using a voting-based approach that utilizes color and shape cues to establish correspondence. The system uses the single-camera tracking results along with the relationship between camera field of view (FOV) boundaries to establish correspondence between views of the same object in multiple cameras. To this end, a novel approach is described to find the relationships between the FOV lines of cameras. The proposed approach combines tracking in cameras with overlapping and/or non-overlapping FOVs in a unified framework, without requiring explicit calibration. The proposed algorithm has been implemented in a real-time system. The system uses a client-server architecture and runs at 10 Hz with three cameras.
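The FOV-line estimation is specific to the paper, but the color cue behind the correspondence voting can be illustrated with hue histograms; the patches below are random stand-ins for detected target regions:

```python
import cv2
import numpy as np

def hue_hist(patch):
    """Normalized hue histogram of a target patch (a simple color cue)."""
    hsv = cv2.cvtColor(patch, cv2.COLOR_BGR2HSV)
    h = cv2.calcHist([hsv], [0], None, [32], [0, 180])
    return cv2.normalize(h, h).flatten()

# Illustrative appearance vote between targets seen in two cameras.
patch_a = np.random.randint(0, 255, (64, 32, 3), np.uint8)
patch_b = np.random.randint(0, 255, (64, 32, 3), np.uint8)
score = cv2.compareHist(hue_hist(patch_a), hue_hist(patch_b),
                        cv2.HISTCMP_CORREL)   # higher means more similar
```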
Article
This paper describes a visual object detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the "Integral Image", which allows the features used by our detector to be computed very quickly. The second is a learning algorithm, based on AdaBoost, which selects a small number of critical visual features and yields extremely efficient classifiers [6]. The third contribution is a method for combining classifiers in a "cascade" which allows background regions of the image to be quickly discarded while spending more computation on promising object-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems [18, 13, 16, 12, 1]. Implemented on a conventional desktop, face detection proceeds at 15 frames per second.
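The integral image itself is well defined and easy to reproduce: each entry stores the sum of all pixels above and to the left, so any rectangular sum takes four lookups:

```python
import numpy as np

def integral_image(img):
    """Summed-area table: ii[y, x] = sum of img[:y, :x] (exclusive),
    padded with a zero row/column so box sums need no boundary checks."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] in O(1) using four table lookups."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
assert box_sum(ii, 1, 1, 3, 3) == img[1:3, 1:3].sum()
```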