Obstacle Detection Using Millimeter-wave Radar
and Its Visualization on Image Sequence
Shigeki SUGIMOTO, Hayato TATEDA, Hidekazu TAKAHASHI, and Masatoshi OKUTOMI
Department of Mechanical and Control Engineering,
Graduate School of Science and Engineering, Tokyo Institute of Technology,
2-12-1 O-okayama, Meguro-ku, Tokyo, 152-8552 Japan
Abstract

Sensor fusion of millimeter-wave radar and a camera is beneficial for advanced driver assistance functions such as obstacle avoidance and Stop&Go. However, millimeter-wave radar has low directional resolution, which engenders low measurement accuracy of object position and difficulty of calibration between radar and camera.

In this paper, we first propose a calibration method between millimeter-wave radar and CCD camera using homography. The proposed method does not require estimation of rotation and translation between them, or intrinsic parameters of the camera. Then, we propose an obstacle detection method which consists of an occupancy-grid representation, and a segmentation technique which divides data acquired by radar into clusters (obstacles); thereafter we display them as an image sequence using calibration results. We demonstrate the validity of the proposed methods through experiments using sensors that are mounted on a vehicle.
1. Introduction
In recent years, radar-based driver assistance systems
such as Adaptive Cruise Control (ACC) have been intro-
duced to the market by several car manufacturers. Most
of these systems rely on millimeter-wave radar for obtaining information about the vehicle's environment. In general use, a millimeter-wave radar is mounted on the front of a vehicle. It measures the distance and relative velocity to targets in front of the vehicle by scanning in a horizontal plane. Compared with other long-range radars (e.g., laser radar), millimeter-wave radar offers the advantage of higher reliability in bad weather conditions.
Notwithstanding, most of these systems are designed for high-speed driving. A millimeter-wave radar provides relatively high distance resolution, but it has low directional (azimuth/elevation) resolution. This directional resolution is sufficient for ACC during high-speed driving because it can be assumed that the vehicle is cruising in a low-traffic-density area. Furthermore, the positions of objects observed by the radar are limited to the space in front of the vehicle.
Many moving objects, such as vehicles, pedestrians, and bicycles, exist in crowded urban areas. It is extremely difficult to detect these objects and measure their positions accurately using a radar with low directional resolution.
In contrast to millimeter-wave radar, a camera provides high spatial resolution but low accuracy in estimating the distance to an object. The high spatial resolution of the camera can compensate for the low directional resolution of the radar, and the high distance resolution of the radar can compensate for the camera's low accuracy in distance estimation. Thereby, millimeter-wave radar and camera can be mutually supportive: their sensor fusion offers benefits for more advanced driver assistance functions such as obstacle avoidance and Stop&Go.
For sensor fusion of millimeter-wave radar and camera, calibration of their locations is an important issue because car design demands flexibility in sensor placement. A calibration method should be simple and easy for mass production. However, in past research on the sensor fusion of radar and camera (e.g., [1][2]), the sensor locations are strictly constrained and the calibration method is not explicitly described.
We propose a calibration method between millimeter-
wave radar and a CCD camera. Generally, calibration of
radar and a camera requires estimation of the transforma-
tion between sensors’ coordinates. The proposed method
simply estimates the homography that describes the transformation between a radar plane (the plane scanned by the radar)
and an image plane. Using the calibration result, we can
visualize the objects’ information acquired by the radar on
an image sequence.
We also propose an obstacle detection method, which consists of an occupancy-grid representation of radar data and a segmentation step whose resultant clusters correspond to obstacles. The cluster information, including distance, width, and relative velocity, is then displayed on the corresponding image frames.
Figure 1. Geometry of radar and camera: radar coordinates $(x_r, y_r, z_r)$, camera coordinates $(x_c, y_c, z_c)$, image coordinates (u, v), the radar plane $\Pi_r$, the image plane $\Pi_i$, the rotation R and translation t between the sensors, the intrinsic matrix A, and the homography H.
In the remainder of this paper, the calibration method is
described in Section 2. Sections 3 and 4 explain our method
of radar data segmentation and visualization, respectively.
Section 5 shows experimental results.
2. Calibration between Radar and Camera
We suppose that the radar scans in a plane, called the 'radar plane'. As shown in Fig. 1, let $(x_r, y_r, z_r)$ and $(x_c, y_c, z_c)$ be the radar and camera coordinates, respectively, and let (u, v) be the image plane coordinates. Using homogeneous coordinates, we can describe the transformation between $(x_r, y_r, z_r, 1)$ and (u, v, 1) as follows:
$$
\omega \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= P \begin{bmatrix} x_r \\ y_r \\ z_r \\ 1 \end{bmatrix},
\qquad P = A\,[R \mid t] \tag{1}
$$

In the above equation, the 3 × 3 matrix R and the 3 × 1 vector t denote, respectively, the rotation and translation between the sensor coordinate systems; the 3 × 3 matrix A denotes the intrinsic camera parameters, and ω is an unknown constant. Generally, calibration between the two sensors requires estimation of the 3 × 4 matrix P, or all of R, t, and A. Instead, we describe the transformation between the radar plane $\Pi_r$ and the image plane $\Pi_i$, as described below.
Considering that all radar data come from somewhere on the radar plane ($y_r = 0$), equation (1) is converted to

$$
\omega \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= H \begin{bmatrix} x_r \\ z_r \\ 1 \end{bmatrix} \tag{2}
$$

where H is a 3 × 3 homography matrix. By estimating H, the transformation between the radar plane $\Pi_r$ and the image plane $\Pi_i$ is determined without solving for R, t, and A.
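To make the role of eq. (2) concrete, the following minimal sketch (Python with NumPy) maps a radar-plane point $(x_r, z_r)$ to image coordinates (u, v) by applying a homography and dividing out the unknown scale ω. The numeric matrix is a placeholder for illustration only, not a calibration result from this paper.

```python
import numpy as np

def radar_to_image(H, x_r, z_r):
    """Map a radar-plane point (x_r, z_r) to image coordinates (u, v) via eq. (2)."""
    p = H @ np.array([x_r, z_r, 1.0])   # homogeneous image point (w*u, w*v, w)
    return p[0] / p[2], p[1] / p[2]     # divide by the unknown scale w

# Example with a placeholder homography (not a real calibration result).
H = np.array([[12.0,  -1.5, 320.0],
              [ 0.3, -10.0, 400.0],
              [ 0.0,  -0.02,  1.0]])
u, v = radar_to_image(H, x_r=2.0, z_r=15.0)
```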
Figure 2. Decision process for the radar plane: a corner reflector is moved across the radar plane, and the maximum intensity extracted from each acquired frame forms an intensity sequence.
We estimate H by least squares using more than four corresponding data sets of (u, v) and $(x_r, z_r)$.
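As one concrete way to carry out this least-squares step, the sketch below estimates H with the standard direct linear transform (DLT) from NumPy arrays of corresponding points. It is a generic illustration under the assumption of at least four correspondences, not the authors' exact implementation.

```python
import numpy as np

def estimate_homography(radar_pts, image_pts):
    """DLT estimate of H such that (u, v, 1) ~ H (x_r, z_r, 1).

    radar_pts: (N, 2) array of (x_r, z_r); image_pts: (N, 2) array of (u, v); N >= 4.
    """
    A = []
    for (x, z), (u, v) in zip(radar_pts, image_pts):
        A.append([x, z, 1, 0, 0, 0, -u * x, -u * z, -u])
        A.append([0, 0, 0, x, z, 1, -v * x, -v * z, -v])
    A = np.asarray(A)
    # The least-squares solution is the right singular vector of A
    # associated with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]   # fix the overall scale
```

In practice, the point coordinates would usually be normalized before building the design matrix, or an off-the-shelf routine such as OpenCV's cv2.findHomography could be used; both yield the same 3 × 3 matrix up to scale.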
Determination of Corresponding Data Sets
Generally, a millimeter-wave radar has an azimuth/elevation beam width of more than several degrees, which may result from its antenna directivity. This causes the low directional resolution of the radar; therefore, determining accurate reflection positions is difficult. However, we can expect the beam center to have the maximum amplitude; that is, an object at the crossing point of the radar plane yields the maximum reflection intensity.
We use a millimeter-wave radar that outputs the radial distance r, angle θ, relative radial velocity v, and reflection intensity for every reflection, and acquires many reflection data for each scan. As shown in Fig. 2, we observe radar reflections and acquire frame data while moving a small corner reflector up and down so that it crosses the radar plane. To determine the reflector's reflection point in each acquired frame, the signal with maximum intensity is extracted (its radial distance r and angle θ are also recorded); thereby we obtain an intensity sequence. From the intensity sequence, we detect local intensity peaks to decide the crossing points of the radar plane. The radii and angles corresponding to the intensity peaks are converted into Cartesian coordinates by $x_r = r\cos\theta$ and $z_r = r\sin\theta$.
The image sequence is acquired simultaneously by the camera. We extract the image frames that correspond to the intensity peaks. Then, the reflector's position (u, v) on each image frame is estimated by a template-matching algorithm. In this way, data sets of (u, v) and $(x_r, z_r)$ are obtained. They represent corresponding positions on the image plane and the radar plane in eq. (2).
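The sketch below illustrates this correspondence-collection step (Python/NumPy), assuming per-frame arrays of the strongest reflection and per-frame reflector positions already obtained from the synchronized image frames. The simple three-point peak test and the prominence threshold are illustrative assumptions; the paper states only that local intensity peaks mark crossings of the radar plane.

```python
import numpy as np

def collect_correspondences(intensity, radii, angles, reflector_uv, min_prominence=10.0):
    """Build (x_r, z_r) <-> (u, v) pairs from per-frame maximum-intensity returns.

    intensity, radii, angles: per-frame values of the strongest reflection
        (intensity, radial distance r, angle theta in radians).
    reflector_uv: per-frame reflector position (u, v) in the image,
        e.g. obtained by template matching.
    """
    radar_pts, image_pts = [], []
    for k in range(1, len(intensity) - 1):
        is_peak = (intensity[k] > intensity[k - 1] and
                   intensity[k] > intensity[k + 1] and
                   intensity[k] - min(intensity[k - 1], intensity[k + 1]) > min_prominence)
        if is_peak:                      # reflector crossed the radar plane at frame k
            r, th = radii[k], angles[k]
            radar_pts.append((r * np.cos(th), r * np.sin(th)))   # x_r, z_r as in Section 2
            image_pts.append(reflector_uv[k])
    return np.array(radar_pts), np.array(image_pts)
```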
3. Segmentation of Radar Data
Data in each radar frame are sparse and spread over the radar plane. They include many errors caused by diffractions, multiple reflections, and Doppler-shift calculation failures. In addition, slanted or small objects might be missed because of their weak reflections. For robust object detection under such conditions, we process the radar data as follows.
Figure 3. Cluster visualization: each cluster is displayed with its position, distance, and relative velocity.
Occupancy Grid Representation
We use an occupancy grid representation to reduce the influence of these errors. The radar plane is divided into small grid cells, each of which holds two values: a probability that an object occupies the cell, and a relative velocity. The probability is calculated from the normalized intensity of the signal lying in the cell. The errors from the various influences become inconspicuous when neighboring and past grid values are taken into account.
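The paper does not give numeric parameters for the grid, so the following sketch shows only one plausible form of the update (Python/NumPy): the polar (range × angle) cell layout, grid size, intensity normalization, and blending factor alpha are assumptions made for illustration.

```python
import numpy as np

class OccupancyGrid:
    """Occupancy grid over the radar plane: each cell stores an existence
    probability (from normalized intensity) and a relative velocity."""

    def __init__(self, n_range=60, n_angle=40, alpha=0.6):
        self.prob = np.zeros((n_range, n_angle))   # existence probability per cell
        self.vel = np.zeros((n_range, n_angle))    # relative velocity per cell
        self.alpha = alpha                         # weight of the current frame

    def update(self, cells, intensities, velocities):
        """cells: (N, 2) integer (range_bin, angle_bin) indices of this frame's returns."""
        frame_prob = np.zeros_like(self.prob)
        frame_vel = np.zeros_like(self.vel)
        norm = intensities / (intensities.max() + 1e-9)   # normalized intensity in [0, 1]
        for (i, j), p, v in zip(cells, norm, velocities):
            frame_prob[i, j] = max(frame_prob[i, j], p)
            frame_vel[i, j] = v
        # Blend with past values so isolated errors become inconspicuous;
        # spatial smoothing over neighboring cells could be added similarly.
        self.prob = self.alpha * frame_prob + (1 - self.alpha) * self.prob
        self.vel = np.where(frame_prob > 0, frame_vel, self.vel)
```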
Segmentation at Each Frame
After removing grid cells that have a small existence probability, segmentation of each radar frame is accomplished by a nearest-neighbor clustering method in a 3-D feature space defined by the grid position (r, θ) and its relative velocity v.
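A minimal greedy nearest-neighbor clustering over the occupied cells might look like the sketch below (Python/NumPy); the distance threshold and the feature scaling that balances range, angle, and velocity are assumptions, not values from the paper.

```python
import numpy as np

def cluster_cells(features, thresh=1.0, scale=(1.0, 5.0, 0.5)):
    """Greedy nearest-neighbor clustering in the (r, theta, v) feature space.

    features: (N, 3) array of (range, angle, relative velocity) for occupied cells.
    Two cells join the same cluster when their scaled distance is below `thresh`.
    """
    f = features * np.asarray(scale)          # weight the heterogeneous features
    labels = -np.ones(len(f), dtype=int)
    next_label = 0
    for i in range(len(f)):
        if labels[i] >= 0:
            continue
        labels[i] = next_label
        stack = [i]
        while stack:                           # grow the cluster from nearest neighbors
            k = stack.pop()
            d = np.linalg.norm(f - f[k], axis=1)
            for j in np.where((d < thresh) & (labels < 0))[0]:
                labels[j] = next_label
                stack.append(j)
        next_label += 1
    return labels
```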
Tracking Clusters
A segmented cluster is tracked over time based on the overlap of clusters in consecutive frames; that is, two clusters that share a significantly large number of grid cells are associated with each other. Before associating them, the position of the previous cluster can be updated by a prediction method such as the Kalman filter, which was applied to millimeter-wave radar data in [3].
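The overlap test could be implemented along the lines of the following sketch (Python), where clusters are represented as sets of occupancy-grid cells; the overlap-ratio threshold is an assumption, and the optional Kalman-filter prediction step mentioned above is omitted.

```python
def associate_clusters(prev_clusters, curr_clusters, min_overlap=0.3):
    """Match clusters between consecutive frames by shared occupancy-grid cells.

    prev_clusters, curr_clusters: lists of sets of (range_bin, angle_bin) cells.
    Returns a list of (prev_index, curr_index) pairs.
    """
    matches = []
    for i, prev in enumerate(prev_clusters):
        best_j, best_ratio = None, min_overlap
        for j, curr in enumerate(curr_clusters):
            shared = len(prev & curr)
            ratio = shared / max(1, min(len(prev), len(curr)))  # fraction of the smaller cluster
            if ratio > best_ratio:
                best_j, best_ratio = j, ratio
        if best_j is not None:
            matches.append((i, best_j))
    return matches
```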
4. Visualization
Radar reflections come from various objects in a scene.
By visualizing the information about the clusters extracted
by the above method, we can easily understand the objects,
e.g. their nature, location, and velocity.
As shown in Fig. 3, the clusters in every radar frame are visualized by drawing semitransparent rectangles on the corresponding image frame. Object information is represented by the following elements; a minimal drawing sketch follows the list.
Object position: the rectangle position, which is decided by the transformed position of the cluster.

Distance to the object: the rectangle height, which is decided by a value inversely proportional to the distance of the cluster.

Object width: the rectangle width, which is decided by the left-most and right-most signals of the cluster.

Relative velocity of the object: the length and direction of an arrow at the lower part of the rectangle; the length is determined by the relative velocity of the cluster, while upward and downward arrows represent leaving and approaching objects, respectively.
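As an illustration of these drawing rules, the sketch below overlays a single cluster on an image frame using OpenCV (Python). The constants that map distance to rectangle height and relative velocity to arrow length are illustrative assumptions; the paper specifies only the qualitative rules listed above.

```python
import cv2
import numpy as np

def draw_cluster(frame, u, v, width_px, distance_m, rel_velocity, k_h=3000.0, k_v=5.0):
    """Overlay a semitransparent rectangle and a velocity arrow for one cluster.

    (u, v): cluster position transformed onto the image by H.
    width_px: pixel width spanned by the cluster's left-most and right-most signals.
    """
    overlay = frame.copy()
    h = int(k_h / max(distance_m, 1.0))            # height inversely proportional to distance
    x0, y0 = int(u - width_px / 2), int(v - h)
    x1, y1 = int(u + width_px / 2), int(v)
    cv2.rectangle(overlay, (x0, y0), (x1, y1), (0, 255, 0), thickness=-1)
    frame[:] = cv2.addWeighted(overlay, 0.4, frame, 0.6, 0)   # semitransparent fill
    # Upward arrow (decreasing image y) for a leaving object, downward for approaching.
    tip = (int(u), int(y1 - k_v * rel_velocity))
    cv2.arrowedLine(frame, (int(u), y1), tip, (0, 0, 255), 2)
    return frame
```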
5. Experimental Results

We mount the radar and the camera at the front of the vehicle as shown in Fig. 4. This section presents a calibration result for the two sensors, along with segmentation and visualization results using real radar/image frame sequences observed in urban areas.

Figure 4. Car-mounted sensors
5.1. Calibration
Fig. 5(a) shows an example of the intensity sequence described in Section 2. The 46 data sets, which represent positions on the radar plane and the image plane, are shown in Fig. 5(b) and Fig. 5(c), respectively. We estimated the homography matrix H using these data sets. Fig. 5(d) shows positions on the radar plane (radius between 10 m and 50 m, angle between -10 and 10 degrees) transformed to the image plane using H.

Fig. 5(a) indicates that the radar fails to acquire the correct reflection intensity of the reflector at some frames, which reflects the limited stability of radar observation. The extracted points on both planes are influenced by this instability. However, the calibration result in Fig. 5(d) reasonably reflects the actual sensor arrangement, i.e., the radar is located above the camera, and the scanning directions of the radar are nearly parallel to the y axis of the image.
Figure 5. Calibration result: (a) intensity sequence (power vs. frames), (b) calibration points on the radar plane (x[m], y[m]), (c) calibration points on the image (x[pix], y[pix]), (d) calibration result (radar-plane grid from 10 m to 50 m and from -10 to 10 degrees transformed onto the image).
5.2. Segmentation and Visualization
Figs. 6 and 7 show examples of acquired radar/image frames for low-speed driving in urban areas. Each left figure shows the acquired radar data and segmentation results (clusters are indicated by ellipses) in Cartesian coordinates. The image frame corresponding to each radar frame is shown on the right. The two vertical lines on the left and right parts of the image frame indicate the right-most and the left-most limits of the radar's scanning angle, respectively. We processed radar data within 30 m of the vehicle.
In the image of Fig.6, there are four vehicles (leaving,
standing, and two oncoming). Their radar reflections are
divided into clusters correctly; the clusters are visualized
effectively on the image. The arrows at the lower part of the
rectangles indicate the correct direction and relative velocity
of the objects. In the image of Fig.7, three objects (a parked
vehicle, a walking girl, an obstacle) are also detected and
visualized satisfactorily.
In the image frame of Fig. 6, the cluster of the standing vehicle appears too wide; this results from the low directional resolution of the radar. If a larger signal-intensity threshold were used to remove noise, the cluster's width could be made smaller. However, we use a small threshold because reflection intensities from pedestrians are relatively weak, and a small number of data in a cluster tends to cause tracking errors.
Figure 6. Segmentation and visualization (Scene 1)

Figure 7. Segmentation and visualization (Scene 2)

Visualization results show that the positions of the clusters, which are transformed onto the images by the homography matrix H, do not always represent the correct object positions in the image. However, by visualizing the cluster information, we can easily understand not only an object's position, relative velocity, and size, but also what exists there.
6. Summary and Future Work
We proposed a calibration method between millimeter-
wave radar and a CCD camera using homography. We also
segmented radar data into clusters and visualized them on
an image sequence. In experimental results, we obtained a
good calibration result and the clusters were segmented and
visualized effectively on images.
Observation errors in radar data increase especially at
low speed driving in crowded areas. Therefore, image pro-
cessing approaches such as region and/or motion segmen-
tation would be necessary for accurate obstacle detection
for urban driving. Future work will develop sensor fusion
techniques using these proposed calibration method and the
image processing approaches.
References

[1] Aufrere R., Mertz C., and Thorpe C., "Multiple Sensor Fusion for Detecting Location of Curbs, Walls, and Barriers," IEEE Intelligent Vehicles Symposium, pp. 126-131, 2003.

[2] Mockel S., Scherer F., and Schuster P. F., "Multi-Sensor Obstacle Detection on Railway Tracks," IEEE Intelligent Vehicles Symposium, pp. 42-46, 2003.

[3] Meis U. and Schneider R., "Radar Image Acquisition and Interpretation for Automotive Applications," IEEE Intelligent Vehicles Symposium, pp. 328-332, 2003.