Vis Comput
DOI 10.1007/s00371-013-0911-4
ORIGINAL ARTICLE
Railroad online: acquiring and visualizing route panoramas
of rail scenes
Shengchun Wang · Siwei Luo · Yaping Huang · Jiang Yu Zheng · Peng Dai · Qiang Han
© Springer-Verlag Berlin Heidelberg 2013
Abstract  Patrol-type surveillance is performed everywhere, from police city patrols to railway inspection. Unlike static cameras or sensors distributed in a space, such surveillance has the benefits of low cost, long distance, and efficiency in detecting infrequent changes. The challenges, however, are how to archive the daily recorded videos in limited storage space and how to build a visual representation for quick and convenient access to the archived videos. We tackle these problems by acquiring and visualizing route panoramas of rail scenes. We analyze the relation between the train motion and the video sampling, as well as constraints such as resolution, motion blur, and stationary blur, to obtain a desirable panoramic image. The generated route panorama is a continuous image with complete and non-redundant scene coverage and a compact data size, which can easily be streamed over the network for fast access, maneuvering, and automatic retrieval in railway environment monitoring. We then visualize the railway scene based on the route panorama rendering for interactive navigation, inspection, and scene indexing.

This work is supported by the National Nature Science Foundation of China (61272354, 61273364, 61105119) and the Fundamental Research Funds for the Central Universities (2012JBM039, 2011JBZ005).

S. Wang · S. Luo · Y. Huang (B)
Beijing Key Lab of Traffic Data Analysis and Mining,
Beijing Jiaotong University, Beijing, China
e-mail: yphuang@bjtu.edu.cn

J. Y. Zheng
Department of Computer and Information Science,
Indiana University Purdue University Indianapolis,
Indianapolis, USA
e-mail: jzheng@cs.iupui.edu

P. Dai · Q. Han
Infrastructure Inspection Research Institute,
China Academy of Railway Sciences, Beijing, China
e-mail: daipeng_iic@qq.com
Keywords  Route panorama · Video visualization · Forward motion video · Railway safety
1 Introduction
Nowadays, the patrol train is widely utilized to ensure railway safety. Cameras installed at various spots of a moving train can be used for different railway inspection tasks, such as track profile measurement, rail track defect detection, bolt detection, and the inspection of the overhead contact system [1,6,7,11]. However, as far as we know, few works focus on railway environmental surveillance. Trains run in closed environments fenced by guardrails for security assurance, and any unexpected matter, such as missing communication units or bolts on the track, broken fences, or unpredictable objects falling into the rail area or hanging on the wires above the rails, can lead to disastrous consequences. Keeping the trains free from accidents is an extremely urgent task.
It is desirable to install a camera in front of the patrol train to capture the whole video along the railway for environment surveillance. Since multiple videos are captured over the entire railway during the daily patrol, the challenge is how to archive the recorded videos in a small storage space. Imagine that every inspector had to review dozens of hours of video when starting work: this would be time-consuming, labor-intensive, and expensive. In addition, post-hoc video analysis is still infeasible beyond manual searching and discrimination. Therefore, the goal of our research is to acquire critical information from such large-capacity videos automatically for proper prognosis.
Fig. 1 Video visualization for the railway inspection
Video visualization, which aims at revealing useful information from raw captured video, provides a viable solution and generates a new visual representation on which more simplified analysis can be performed (see Fig. 1). With the development of computing power and graphics technology, video visualization has played an important role in many application domains, such as sports, entertainment, medicine, and surveillance, making video examination quicker, more accurate, and more concise. Video visualization does not mean fully automated decisions on video content; rather, it condenses the content of the video, extracts the characteristics and events of concern, and then provides an auxiliary utility that reduces the users' burden of browsing the video and assists them in making decisions [2]. In most cases, the output of video visualization is either a large collection of images, such as a video abstract, or a single composite image, such as a panorama. Hence, it is more feasible to run automatic decision algorithms on this visual output than on the video itself.
Currently, the railway environment is examined manually by trained inspectors who view surveillance videos. Human inspection is slow, subjective, and subject to locational and temporal restrictions. Video visualization, on the other hand, generates a compact data set, and the small-size output can readily be streamed over the network under current bandwidth. Network sharing allows remote access to the data whenever and wherever needed, so the inspection process can be shared more flexibly among the inspectors.
This work tackles the visualization of the forward motion video by rendering a full route panorama around the railway into a virtual interactive scene. The video frames are sampled continuously with four sampling strips at a velocity calculated from the train speed and other obtained parameters. The generated route panorama is a continuous image with complete and non-redundant scene coverage and a compact data size compared to video, so that it can serve as an index for the rapid localization of objects on the railway and can be streamed quickly over the network for resource sharing. The virtual representation has many advantages in data storage, browsing, and examination. In the future, it will be used for railway safety checking, railway facility inspection, and virtual sightseeing from the train.
The rest of the paper is organized as follows. Related works are described in Sect. 2. Section 3 summarizes the overall framework of the video visualization. Section 4 analyzes the panorama acquisition and formulates the panorama sampling. The route panorama-based modeling and rendering are given in Sect. 5. Experimental results are demonstrated in Sect. 6. Section 7 concludes the paper and discusses further work.
2 Related works
2.1 Visualization along a long route
Panorama imaging is an effective approach to acquiring a wide-angle view of physical space. It is now easy to obtain realistic panoramas that let users browse virtually and experience immersive scenes, and it is widely used in video conferencing, aerial photography, military monitoring, and virtual view rendering.
The panorama was first put forward by the Irish painter Robert Barker, while digital panoramas were generated in the 1990s [13,14]. There are different versions of this concept according to applications and production methods. Local panoramas with wide fields of view are generated from an imaging device with a single optical center. They can be composed from a rotating video or stitched from images [3] to yield 360° cylindrical or spherical views. Walk-through systems such as Google StreetView have mapped panoramic texture to cubes or mapped panoramas onto the polygonal geometry of 3D city models [8].
Another type of panorama is generated by moving a camera along a path. Such a route panorama [15] extends the image unlimitedly in space. It can be extracted from a video by connecting pixel lines from consecutive frames with a fixed slit (a pixel line) as push-broom imaging [4,9,10,13,14], or with a dynamic X-slit [12,17]. The fixed slit achieves parallel-perspective projection, while the dynamic slit generates a multi-perspective or near-perspective projection depending on the depth from the camera path. However, how to generate the virtual scene from a forward motion video is rarely addressed in previous work. Zheng et al. [18] further produced a complete scene tunnel showing the entire scenes around the path for urban area visualization, which can be applied to city navigation and monitoring, and which inspires the goal of this work. Such a method requires a high camera frame rate or a slow vehicle speed. An alternative approach for the route panorama is to stitch wider strips [16] or strips with a dynamic width [5] for a one-side view of a street front. It requires a dominant depth layer for matching and stitching consecutive patches, but fails in scenes with a variety of depths. In addition, computing such a variation of strip width requires a sufficient number of features in the scene for matching frames in depth estimation, which is not always available in open and wide railway scenes. The complex processing and its cost are not suitable for fast train motion either.
2.2 Our contribution
Our work is the first to apply the route panorama, similar to the scene tunnel [18], to the railway. Special conditions on railways, such as the fast train speed, the smooth camera motion, and the area of interest around the rail, are considered in building the route panorama. The camera faces forward to record a video with lower image velocities than a side-viewing camera at an ordinary frame rate (25 fr/s), to cope with the fast movement of the trains and to reduce motion blur in the video. We determine the strip width in video frames according to the train speed and known geometric constraints to avoid matching frames, because the images may not have sufficiently salient features for stitching, due to the simplicity of railway scenes and the repetitive patterns on the rail bed. Generating a route panorama over a long distance is also very challenging: the longest video covers a 2,000 km train run in 8 h, so the sampling process has to be robust. The method we propose here is fast and concise, producing a desirable route panorama without complex image matching and stitching; instead, it relies on the train speed estimated and calibrated from other sensors.
3 The overview of online visualization system
3.1 The structure of the real railway environment
The railway infrastructure provides the following properties: (1) the train moves at an almost constant speed locally on a smooth track; (2) the monitored environment, such as the poles, fences, and track in the rail area, has almost standard depths, intervals, and structure; and (3) landscapes outside the rail area and weather conditions are less important than the rails, but provide additional information for reference. In addition, the camera FOV is sufficiently large to contain the information surrounding the rail. The camera parameters, including focal length and image resolution, are known or calibrated.
The discrete video frames contain overlapping scenes with non-uniform resolution at different depths. Highly repetitive rail patterns in consecutive frames make it hard to distinguish the part of the track currently being monitored from the part already examined. Searching the video itself for suspect spots on the track and its surroundings is therefore inefficient. A feasible solution is to generate the route panorama from the video and render it into a 3D virtual scene for further examination.

Fig. 2 Diagram of the route panorama generation (video sequence → strip extraction → strips → stitching)
3.2 The generation of route panorama
The general process of route panorama generation is shown in Fig. 2: the just-sampling [19] strip is extracted from each frame of the video sequence, and the strips from consecutive frames are stitched into a panoramic image. "Just-sampling" means a perfect connection of scenes between two successive frames, with neither information loss nor pixel overlap. Panorama generation for the railway scene follows this strip stitching method, but here it is applied to a forward motion video at a relatively high speed.
To acquire the route panorama of the railway scene, we capture a forward motion video sequence and sample a rectangular ring with a dynamic width so as to cover the four side scenes of the rail, as colored in Fig. 3. Four route panoramas, covering the sky wires, the left and right fence-poles, and the ground rails, are obtained for constructing a box tube along the rail direction.
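As a concrete sketch of this sampling step (with hypothetical pixel coordinates for the two rectangles; the actual ring has a dynamic width derived in Sect. 4), the four strips of the rectangular ring could be cut from a frame as follows:

```python
import numpy as np

def extract_ring_strips(frame, outer, inner):
    """Cut the four strips of the sampling ring between an outer and an
    inner axis-aligned rectangle, mirroring the paper's St, Sb, Sl, Sr.

    frame        : H x W (x C) image array
    outer, inner : (x0, y0, x1, y1) pixel rectangles, inner inside outer
    """
    ox0, oy0, ox1, oy1 = outer
    ix0, iy0, ix1, iy1 = inner
    return {
        "St": frame[oy0:iy0, ox0:ox1],  # top strip: sky and wires
        "Sb": frame[iy1:oy1, ox0:ox1],  # bottom strip: ground and rails
        "Sl": frame[iy0:iy1, ox0:ix0],  # left strip: fences and poles
        "Sr": frame[iy0:iy1, ix1:ox1],  # right strip: fences and poles
    }
```

Consecutive rings, transformed and concatenated, then form the four route panoramas.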
3.3 The construction of online virtual railway scene
As depicted in Fig. 3, to visualize railway scenes from a long video, the streaming server transmits the panorama data over the network according to requests from the clients. A virtual railway scene is then rendered on the client by projecting the panoramas onto a box tube, which allows a global view, random access to spots, free speed maneuvering, and streaming faster than the real running time of the train. The fast rendering also serves as a visual index for zooming into a particular frame to examine details.
The generated panoramas are stored in a database with an index of geographical position for fast information locating, and are transmitted over the network on request for the visualization of the railway scene. This scheme is efficient for sharing the virtual railway online, since the route panorama is characterized by small size, abundant information, and easy rendering.

Fig. 3 Structural diagram of the visualization from forward motion video (panorama database, streaming server, network, rendering client)
4 Route panorama sampling
4.1 Fixing the vanishing point
As depicted in Fig. 4, the camera is fixed on the train. From the close range of the tracks and fences in the camera view, a vanishing point Q, or equivalently the focus of expansion (FOE) of the optical flow, is detected by extending track lines and road edges, even if the track may curve in the distance ahead on a turning rail. This vanishing point detection can be performed over a long video sequence, and the accurate position of Q can be voted from the results of all the frames.
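The voting step can be sketched as a least-squares intersection of the detected track and road edges per frame, followed by a median over frames (a minimal sketch; the paper does not specify the estimator, so the line parameterization and the median vote are our assumptions):

```python
import numpy as np

def line_intersection_lsq(lines):
    """Least-squares intersection of 2D lines, each given as (a, b, c)
    with a*x + b*y = c; an estimate of the vanishing point Q (FOE)
    from extended track lines and road edges in one frame."""
    A = np.array([[a, b] for a, b, _ in lines], dtype=float)
    c = np.array([c for _, _, c in lines], dtype=float)
    q, *_ = np.linalg.lstsq(A, c, rcond=None)
    return q  # (x0, y0) in image coordinates

def vote_vanishing_point(per_frame_lines):
    """Estimate Q in every frame, then take the per-coordinate median
    over the sequence, which is robust to frames where the track curves."""
    estimates = [line_intersection_lsq(lines) for lines in per_frame_lines]
    return np.median(np.array(estimates), axis=0)
```

The median vote tolerates a minority of frames with curved track, since their biased estimates fall outside the central cluster.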
Besides the structure lines stretching ahead, we also focus on two other sets of structure lines: vertical structures such as poles, and horizontal lines orthogonal to the track, such as the sleepers under the rails. If the camera is not set exactly in the forward direction, the vanishing point Q is not at the image center. If the camera faces down slightly to observe a larger road area, vertical poles will not be parallel in the image frame, and their vanishing point Qv will converge far outside the frame at a non-infinite position. Using the position of Q, which indicates the absolute heading direction, and the camera focal distance f, we estimate the image position of Qv according to the properties of perspective projection. Moreover, if the camera faces slightly aside to observe nearby rails, the lines orthogonal to the track will have a third vanishing point Qh on their extensions.

Fig. 4 Route panorama rendering on a tube box. Rectangular rings in the image will be converted to skewed 3D patches
For these reasons, the standard structural lines in the 3D rail space may not be horizontal and vertical in the video frame, but have small angles from the image axes. In principle, the rim edges of the sampling ring should ideally align with the projections of structure lines through the vanishing points Qv and Qh, respectively. However, if the train speed is not extremely high, and thus the rims are not very wide, the rim edges can still reasonably be approximated as parallel.
We therefore design our sampling region simply as a rectangular ring for the route panoramas, rather than trying to align the ring with the structure lines in the frame, and leave the correction of distortion to the post-modeling and rendering stage. This avoids camera pose estimation at the beginning and pixel sampling over an irregular shape in the frame, so as to achieve real-time data collection.
As depicted in Fig. 5a, through the point Q, four radial
lines are located to pass through the top and bottom points
of the poles on both sides. They divide the frame into four regions corresponding to the vertical side planes and the horizontal
planes of ground and sky, respectively. The sampling region is composed of two rectangles, named the outer rectangle and the inner rectangle. The selection of the outer rectangle should balance the motion blur and the resolution of the resulting route panorama, and the position of the inner rectangle must guarantee that the strips sampled from consecutive frames cover the rail area perfectly, neither overlapping nor missing scenes, i.e., just-sampling, during the train motion. In other words, the 3D positions sampled by the inner rectangle must move onto the positions sampled by the outer rectangle within one frame interval as the train moves.

Fig. 5 The construction of the just-sampling stitching area. a A closed environment for sampling the rectangular patches. b A wide environment at a railroad switch, where the rectangle is not exactly on the 3D box
4.2 Constructing the sampling region
One problem in constructing the rectangular sampling region is the existence of various depths in each frame. As shown in Fig. 6, we divide them into three ranges: close range, middle range, and far range. If we construct the just-sampling region for the middle range, nearby objects such as fences and poles become narrowed, leaving scenes uncovered, while distant scenes such as trees and mountains become stretched because their strips overlap. The distortion introduced by this multi-perspective sampling is called stationary blur [19] on distant scenes. Since the objects we are most concerned about are all at close range, we should find the layer of interest and perform the just-sampling there. In general, it is tedious to segment such a layer in the video. Fortunately, we can solve the problem using geometric priors such as the known positions of poles and fences.
Fig. 6 Stationary blur introduced by various depth sampling
The image velocity v on the rail track is obtained from the train speed V to ensure this just-sampling, as detailed in the next section. After determining the bottom line of the inner rectangle, the other three lines are fixed accordingly with the radial lines. The four strips between the two rectangles, denoted as St, Sb, Sl, and Sr, are sampled from the frame as shown in Fig. 5a. At the two vertical rims of the sampling rectangle, the angles between the rims and the slanted projections of vertical features such as poles, denoted by βl and βr, are also computed according to the vanishing point Qv and the rim positions. They will be used in the route panorama rendering.
4.3 Image velocity estimation for just-sampling rings
As shown in Fig. 7, O-XYZ is the train coordinate system, where the Z axis points in the train moving direction, translating at speed V, and the Y axis is perpendicular to the ground. The camera coordinate system is o-xyz, with tilt, pan, and roll changes from the directions of O-XYZ. Usually, the camera axis may point slightly to one side, toward another rail track, to obtain a wider view of the rail bed. We capture a video segment as the train moves on straight and smooth tracks; the camera roll can be considered zero under such an ideal situation. We determine the image velocity v at the bottom strip over the rail track for the just-sampling span. This further guarantees the just-sampling spans on the other sides of the sampling rectangle.
Fig. 7 The relationship between the image velocity v and the train speed V to realize the just-sampling requirement on the ground

From the vanishing point Q(x0, y0, f) of the rail (or the focus of expansion of the train motion in the video frame), the camera directions, including the tilt ϕ and the pan θ from the train moving direction, are calculated as

$$\varphi=\arctan\frac{y_0}{f},\qquad \theta=\arctan\frac{x_0}{\sqrt{f^2+y_0^2}} \tag{1}$$
where f is the calibrated camera focal length and C(0, 0, f) is the image center. The outer sampling line l1 and the inner sampling line l2 are determined as

$$l_1: A_1(x_{a1}, y_1, f)\,B_1(x_{b1}, y_1, f);\qquad l_2: A_2(x_{a2}, y_2, f)\,B_2(x_{b2}, y_2, f) \tag{2}$$

under the camera coordinate system o-xyz. The two image lines in the train coordinate system O-XYZ are denoted as

$$l_1: A_1(X_{a1}, Y_1, Z_{a1})\,B_1(X_{b1}, Y_1, Z_{b1});\qquad l_2: A_2(X_{a2}, Y_2, Z_{a2})\,B_2(X_{b2}, Y_2, Z_{b2}) \tag{3}$$

which can be obtained via the transformation from system o-xyz to system O-XYZ through

$$\begin{cases} A_1: (X_{a1}, Y_1, Z_{a1}) = (x_{a1}, y_1, f)\,M(\varphi)M(\theta)\\ B_1: (X_{b1}, Y_1, Z_{b1}) = (x_{b1}, y_1, f)\,M(\varphi)M(\theta)\\ A_2: (X_{a2}, Y_2, Z_{a2}) = (x_{a2}, y_2, f)\,M(\varphi)M(\theta)\\ B_2: (X_{b2}, Y_2, Z_{b2}) = (x_{b2}, y_2, f)\,M(\varphi)M(\theta) \end{cases} \tag{4}$$
where M(ϕ) and M(θ) are the rotation matrices

$$M(\varphi)=\begin{bmatrix}1&0&0\\0&\cos\varphi&\sin\varphi\\0&-\sin\varphi&\cos\varphi\end{bmatrix},\qquad M(\theta)=\begin{bmatrix}\cos\theta&0&-\sin\theta\\0&1&0\\\sin\theta&0&\cos\theta\end{bmatrix} \tag{5}$$
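A minimal numeric sketch of the row-vector transform in Eqs. (4)–(5) (the function names are ours):

```python
import numpy as np

def M_phi(phi):
    """Tilt rotation M(phi) of Eq. (5)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c,   s  ],
                     [0.0, -s,  c  ]])

def M_theta(theta):
    """Pan rotation M(theta) of Eq. (5)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c,   0.0, -s ],
                     [0.0, 1.0, 0.0],
                     [s,   0.0, c  ]])

def image_to_train(p, phi, theta):
    """Row-vector product (X, Y, Z) = (x, y, f) M(phi) M(theta), Eq. (4)."""
    return np.asarray(p, dtype=float) @ M_phi(phi) @ M_theta(theta)
```

With this convention the Y component after the tilt, y cos ϕ − f sin ϕ, depends only on y and f, which is why both endpoints of each line in Eq. (3) share the same Y value.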
The plane of sight through l1 is thus determined from its normal N1 = OA1 × OB1 = (Xa1, Y1, Za1) × (Xb1, Y1, Zb1) and the camera focus O. The plane has an intersection line L1 with the rail surface on the ground, which can be obtained by enforcing Y = −H, where H is the height of the camera above the ground:

$$\begin{cases}[X\;\;Y\;\;Z]\begin{bmatrix}Y_1 Z_{b1}-Z_{a1}Y_1\\ Z_{a1}X_{b1}-X_{a1}Z_{b1}\\ X_{a1}Y_1-Y_1X_{b1}\end{bmatrix}=0;\\[1ex] Y=-H.\end{cases} \tag{6}$$
This is further detailed by computing (4) from the image coordinates of l1 as

$$\begin{cases}[X\;\;Y\;\;Z]\begin{bmatrix}(y_1\cos\varphi-f\sin\varphi)\sin\theta\\ -(y_1\sin\varphi+f\cos\varphi)\\ (y_1\cos\varphi-f\sin\varphi)\cos\theta\end{bmatrix}=0;\\[1ex] Y=-H.\end{cases} \tag{7}$$
which can be expanded as follows to yield its direction and its intercept with the rail track (X = 0):

$$Z=-\tan\theta\cdot X+\frac{f\cos\varphi+y_1\sin\varphi}{(f\sin\varphi-y_1\cos\varphi)\cos\theta}\cdot H \tag{8}$$
In the same way as (8), the inner rectangle has the image distance v from the outer rectangle (i.e., y2 = y1 + v) at the bottom lines, which is the image velocity of the shift between consecutive frames when the train moves at speed V. The bottom line l2 is projected onto the rail and ground surface as L2:

$$\begin{cases}[X\;\;Y\;\;Z]\begin{bmatrix}(y_2\cos\varphi-f\sin\varphi)\sin\theta\\ -(y_2\sin\varphi+f\cos\varphi)\\ (y_2\cos\varphi-f\sin\varphi)\cos\theta\end{bmatrix}=0;\\[1ex] Y=-H.\end{cases} \tag{9}$$
The 3D distance that the train moves between two successive frames is V/R, where R is the frame rate of the camera (25 fps). This must equal the intercept difference D_{L1L2} of lines L1 and L2 on the Z axis, computed as

$$D_{L_1L_2}=\left\{\frac{f\cos\varphi+(y_1+v)\sin\varphi}{[f\sin\varphi-(y_1+v)\cos\varphi]\cos\theta}-\frac{f\cos\varphi+y_1\sin\varphi}{[f\sin\varphi-y_1\cos\varphi]\cos\theta}\right\}H=V/R \tag{10}$$
From this relation, the image velocity at l1 can be calculated from the known train speed V as

$$v=\frac{(f\sin\varphi-y_1\cos\varphi)^2\cos\theta\cdot V}{fHR+(f\sin\varphi-y_1\cos\varphi)\cos\varphi\cos\theta\cdot V} \tag{11}$$
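Eq. (11) can be evaluated directly; as Fig. 9 later shows, v grows almost linearly for small V and saturates toward a fixed value as V increases (a minimal sketch; the function name and parameter units are our assumptions, with f and y1 in the same image units):

```python
import numpy as np

def image_velocity(V, f, H, R, y1, phi=0.0, theta=0.0):
    """Image velocity v of the just-sampling strip at image height y1
    (Eq. 11), for train speed V, focal length f, camera height H, and
    frame rate R. f and y1 must share the same image units."""
    a = f * np.sin(phi) - y1 * np.cos(phi)
    return (a**2 * np.cos(theta) * V) / (f * H * R + a * np.cos(phi) * np.cos(theta) * V)
```

As V grows, the expression tends to a/cos ϕ with a = f sin ϕ − y1 cos ϕ, which explains the saturation of the image velocity at high train speeds.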
The camera height H is a constant converted from the width D of the rail track, known both in 3D (a nationwide standard) and in the image, once the camera with focal length f is set. As shown in Fig. 8, the outer rectangle intersects the two rails at points m and n in the image, spanning l pixels. The plane of sight through line mn has an intersection line MN with the rail track in 3D space. It is not difficult to derive that the angle between this plane and the ground is ϕ + arctan(y1/f), and that line MN has angle θ with respect to the X direction (with which the rail sleepers are aligned). H is determined as

$$H=\frac{f\cdot D\,\sin\!\left(\varphi+\arctan(y_1/f)\right)}{l\cos\theta} \tag{12}$$

Fig. 8 Estimation of the camera height H from the rail width D
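A sketch of Eq. (12); for a level, forward-looking camera (ϕ = θ = 0) it reduces to approximately H ≈ D·y1/l, the familiar pinhole relation (the function name is ours):

```python
import math

def camera_height(f, D, l, y1, phi=0.0, theta=0.0):
    """Camera height H above the rails (Eq. 12), from the standard rail
    gauge D (meters), its image width l (pixels) measured at image
    height y1, and the focal length f (pixels)."""
    return f * D * math.sin(phi + math.atan(y1 / f)) / (l * math.cos(theta))
```
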
Substituting Eq. (12) into Eq. (11), the final result is obtained as

$$v=\frac{(f\sin\varphi-y_1\cos\varphi)^2\,l\cos^2\theta\cdot V}{f^2DR\,\sin\!\left(\varphi+\arctan(y_1/f)\right)+(f\sin\varphi-y_1\cos\varphi)\,l\cos^2\theta\cos\varphi\cdot V} \tag{13}$$
We call f, R, V, and D the apparatus parameters, obtained from the camera and the train; ϕ and θ are setting parameters calculated from the vanishing point; and y1 and l are parameters measured directly from the image. Most of these parameters are known and invariant to the train speed V. The relationship between the image velocity v and the train speed V can therefore be plotted as in Fig. 9: the image velocity converges to a fixed value as the train speed increases.
The inner rectangle is determined for sampling the pixel strips between the two rectangles, and its position can be located directly once we obtain the image velocity. As shown in Fig. 7, after the bottom line l2 of the inner rectangle is determined, the corners A2 and B2 of the inner rectangle are further determined on the radial lines from the vanishing point as in Fig. 5. The widths of the side strips Sl and Sr are thus determined from the corners, which guarantees the just-sampling on the rail-side infrastructure, including fence,
pole, and so on. Beyond the depths of the rail side, landscapes have overlapped-sampling [19], which causes stationary blur. However, representing landscapes outside the railway area is not our current focus. Although the railway space closer than the side planes is under-sampled, there are no obstacles to be inspected there in the box tube. The top strip, St, is also determined to scan the wires above the train after the side strips are located precisely.

Fig. 9 The relationship between the train speed (m/s) and the image velocity at the rail track (mm/frame) for a forward camera (θ, ϕ = 0)
The relation between v and the train speed V is obtained using a sample video with varied speed (Fig. 9). The resulting data are stored in a lookup table for the rectangle selection during video scanning.
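The lookup table can be sketched as a precomputed grid of Eq. (11) over train speeds, queried by nearest neighbor while scanning (names and the nearest-neighbor policy are our assumptions):

```python
import numpy as np

def build_velocity_lut(speeds, f, H, R, y1, phi=0.0, theta=0.0):
    """Precompute the image velocity v (Eq. 11) for a grid of train
    speeds so the sampling rectangle can be chosen by table lookup
    during video scanning instead of re-evaluating the formula."""
    a = f * np.sin(phi) - y1 * np.cos(phi)
    speeds = np.asarray(speeds, dtype=float)
    v = (a**2 * np.cos(theta) * speeds) / (f * H * R + a * np.cos(phi) * np.cos(theta) * speeds)
    return dict(zip(speeds.tolist(), v.tolist()))

def lookup_velocity(lut, V):
    """Nearest-neighbor lookup: video segments with a near-constant
    speed reuse a single entry (cf. the segmentation in Sect. 6.1)."""
    key = min(lut, key=lambda s: abs(s - V))
    return lut[key]
```
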
5 Route panorama-based modeling and rendering
5.1 Dealing with resolution and motion blur
The strip sampling of the route panorama needs to consider the scene resolution and possible motion blur, in addition to achieving just-sampling in the railway environment. This can be converted into the selection of the outer sampling rectangle.
1. If the corners of the outer rectangle are located on the radial lines from the vanishing point Q, horizontal scenes are sampled by horizontal strips and vertical scenes by vertical strips, which preserves shape to the maximum extent. However, for a possibly non-symmetrical setting of the frame with respect to Q (the train moving direction), an outer rectangle limited to the radial lines may not achieve a large sampling strip size, as shown in Fig. 5b, so its resolution becomes low. We therefore let the rectangle reach its largest size for the best resolution. As a shortcoming, some horizontal scenes in 3D space may be sampled by the vertical strip, or vertical scenes by the horizontal strip. Such improper sampling causes shape distortion in the route panorama.
2. If the train speed is high and the exposure time of the camera is long, motion blur appears at the positions with large motion in the frame. The marginal positions, capturing scenes more sideways, have the maximum optical flow. Although a smaller outer rectangle (a more forward direction) can reduce the motion blur, the resulting strip patches provide only limited resolution. Therefore, an optimal position or size of the sampling ring should be determined to guarantee sharp scenes according to the train speed.
The first issue above produces unfavorable results for a camera setting that observes a wide rail area (Fig. 5b). If we locate the outer rectangle at the dashed line in Fig. 5b, the generated route panorama suffers from a much lower resolution. Therefore, we add an adjustment, depicted as the solid line in Fig. 5b. This may introduce some structural distortion into the route panorama due to improper sampling, i.e., a horizontal line sampling vertical 3D features, or a vertical line sampling the horizontal ground. Under such circumstances, features with depth changes from the camera path appear as hyperbolas in the route panorama [18], i.e., linearity is not preserved in this parallel-perspective projection. However, such deformations do not cause visual discomfort in the rendered route panorama, because human understanding of real 3D space also relies on subjective priors as well as the spatial context of referential objects. It does affect the shape if the route panoramas are composed for display of the tube box in the forward direction.
We select a simple approach to improve this setting. If a horizontal region, such as the rail bed or ground without standing objects, is sampled with a vertical line, the lower part of the route panorama is warped through a non-linear transform to convert the hyperbolic structure back to a linear structure on the ground. This modification is not applied if such a region contains many standing objects; we would not let vertical objects lie down.
To obtain a good sampling position that preserves the sharpness of the scenes, we examine the motion in the image frame. As shown in Fig. 7, the sampling strip for the track and ground is located at y1; under the train speed V, the width of the just-sampling strip is v as in (11).

The sampled strip is further normalized to the standard length V/R in the 3D space; the scaling factor is then γ = V/(R·Ir·v), where γ is related to V and y1 as

$$\gamma=\frac{fHR-y_1V}{I_r\,y_1^2\,R}\qquad(\theta,\varphi=0) \tag{14}$$

where we assume θ, ϕ = 0 for simplified analysis, and Ir is the image resolution of a frame, representing the number of pixels per unit length on the frame image. γ represents the spatial resolution of the panorama, i.e., the real 3D sampling distance per pixel.
The motion blur is modeled roughly by convolving the image distribution with a rectangular pulse in the radial direction. The motion blur shifts the distribution within a short exposure time τ (e.g., 1/125 s, corresponding to 1/5 of the frame interval at a frame rate of 25 fr/s). The rectangular pulse with length ΔI = τ·Ir·v (e.g., for Ir·v = 5 pixels, the length is 1 pixel) is convolved with the image, which is denoted as

$$\Delta I=\frac{\tau\,I_r\,y_1^2\,V}{fHR-y_1V}\qquad(\theta,\varphi=0) \tag{15}$$
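Eqs. (14) and (15) quantify the trade-off guiding the choice of the outer rectangle: moving the strip outward (larger y1) improves the spatial resolution γ but lengthens the blur pulse ΔI. A sketch at θ = ϕ = 0 (names are ours; τ is expressed as a fraction of the frame interval):

```python
def spatial_resolution(V, f, H, R, y1, Ir):
    """gamma = V/(R*Ir*v) (Eq. 14): real 3D sampling distance per
    panorama pixel, at theta = phi = 0."""
    v = y1**2 * V / (f * H * R - y1 * V)  # Eq. (11) at theta = phi = 0
    return V / (R * Ir * v)

def blur_length(V, f, H, R, y1, Ir, tau):
    """Motion-blur pulse length Delta I = tau*Ir*v (Eq. 15), with the
    exposure tau as a fraction of the frame interval (e.g., 1/5 for a
    1/125 s exposure at 25 fr/s)."""
    return tau * Ir * y1**2 * V / (f * H * R - y1 * V)
```
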
5.2 Homography transform for tube box model of scenes
Virtual scene generation is an essential part of fast data browsing and interactive display. We implement virtual scene observation with speed control, and realize the perspective transformation by rendering the four panoramas on a tube box.
The pixels on the sampling patch are mapped onto the real box of the scene tunnel (Fig. 3). The sampled strips from each frame are transformed through a homography to obtain the patches on the route panoramas. Depending on how the installed camera is panned and tilted from the forward translation direction of the train/camera, the sampling rectangles are not parallel to the projections of 3D vertical features in the image (those projected lines converge to the vanishing point Qv in the image plane). The vertical features are thus slanted in the generated route panoramas, which therefore have to be skewed according to βl and βr. In the same way, we can skew the top and bottom route panoramas containing the rail and sky for distortion correction. This corresponds to mapping strips onto the real tube with the skewed box rings shown in Fig. 4. The rectified route panoramas thus form a tube along the path for scene tunnel visualization.
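The skew by βl or βr is a simple shear of the panorama; a nearest-neighbor sketch (the paper uses graphics-hardware texture mapping, so this per-pixel loop is only illustrative):

```python
import numpy as np

def skew_panorama(panorama, beta):
    """Shear a route panorama by angle beta so that the slanted
    projections of vertical features (converging toward Qv) become
    upright. Nearest-neighbor resampling; out-of-range pixels are 0."""
    h, w = panorama.shape[:2]
    out = np.zeros_like(panorama)
    for y in range(h):
        shift = int(round(np.tan(beta) * y))  # shear: shift grows with the row
        for x in range(w):
            src = x + shift
            if 0 <= src < w:
                out[y, x] = panorama[y, src]
    return out
```
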
The panorama is constructed during data acquisition and stored as a specific big-data file that encapsulates the image data as a bit stream, implemented in C++. The big-data file is accompanied by an indexing file for fast location retrieval. We can use the environment mapping method supported by general graphics hardware for texture mapping with the route panorama. A point of the panorama is transformed into coordinates in the projection space, and the transformed coordinates are then used to index the route panorama. In this way, every polygon of the frustum can receive the projection from the respective route panorama. As shown in Fig. 4, the Route Panorama-Based Rendering (RPBR) algorithm for constructing the virtual scene is as follows.
Algorithm: Route-Panorama-Based Rendering (RPBR)
Input: A railway video captured in the forward direction.
Output: Virtual railway scene.
1: for each frame m, m = 1 … N, from the video
2:   Extract four strips St_m, Sb_m, Sl_m, and Sr_m
3:   Copy and transform the four strips of each frame onto the four
     panoramas RPt, RPb, RPl, and RPr consecutively
4: end for
5: Apply the skew transformation to the route panoramas
6: Use four polygons to construct a tube-box scene model for
   receiving the projection from the route panoramas
7: Set a viewpoint inside the box model for scene observation
8: for each pixel P_i belonging to panorama RP_i
9:   Perform texture mapping to project the route panorama onto
     the tube model
10: end for
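Steps 1–4 of the loop above can be sketched as strip concatenation; a simplified version that takes frames as 2-D grids and assumes a fixed strip width (the paper derives the just-sampling width from the train speed), omitting the per-strip homography and skew correction:

```python
def build_route_panoramas(frames, strip=8):
    """Accumulate the four route panoramas from a forward-motion video
    by concatenating border strips frame after frame. `frames` is an
    iterable of 2-D grids (lists of rows); `strip` is an assumed fixed
    strip width in pixels."""
    rp_top, rp_bottom = [], []
    rp_left, rp_right = None, None
    for f in frames:
        h, w = len(f), len(f[0])
        rp_top.extend(row[:] for row in f[:strip])        # grows downward
        rp_bottom.extend(row[:] for row in f[h - strip:])
        lcols = [row[:strip] for row in f]                # grow rightward
        rcols = [row[w - strip:] for row in f]
        if rp_left is None:
            rp_left, rp_right = lcols, rcols
        else:
            for y in range(h):
                rp_left[y].extend(lcols[y])
                rp_right[y].extend(rcols[y])
    return rp_top, rp_bottom, rp_left, rp_right
```

The top/bottom panoramas grow along the row axis and the left/right panoramas along the column axis, matching the tube-box layout of the four polygons.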
The virtual scene can be rendered quickly based on the
panorama image, and it achieves a convincing effect with a real
sensation, since the route panorama has a small data volume
and a broad perspective extension. The key procedure of the
rendering process is acquiring the route panorama as the
projection source, under the condition that the traveled
distance can be obtained from the train and GPS.
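The indexing file that accompanies the panorama can then be queried by traveled distance; a hypothetical sketch, assuming the index stores the cumulative distance at which each panorama column was sampled:

```python
import bisect

def locate_column(index_dist_m, query_dist_m):
    """Map a traveled distance (from train odometry/GPS) to the nearest
    route-panorama column. `index_dist_m` is a sorted list where entry i
    is the cumulative distance (m) at which column i was sampled.
    Hypothetical layout of the indexing file, not the authors' format."""
    i = bisect.bisect_left(index_dist_m, query_dist_m)
    if i == 0:
        return 0
    if i == len(index_dist_m):
        return len(index_dist_m) - 1
    # pick the closer of the two neighboring columns
    if index_dist_m[i] - query_dist_m < query_dist_m - index_dist_m[i - 1]:
        return i
    return i - 1
```

A binary search keeps the lookup logarithmic, so jumping to any location along an hour-long route stays interactive.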
6 Experiments and discussion
6.1 Experimental data and preconditioning
This work aims at obtaining a complete archive of the route
scenes along a railway. The entire video was taken from
a patrol train moving in the forward direction with the headlight
of the train always on. As shown in Fig. 10, two examples of
forward-motion video are captured on a smooth path at a
relatively constant speed and an ordinary frame rate (25 fps).
A closed railway environment and a wide road-switch scene
are recorded at speeds of 150 and 50 km/h, respectively.
The statistics of the speed variation are shown in Fig. 11.
Obviously, the train keeps an even speed during most time
sections. Therefore, we pre-assign the video into several segments
according to the speed (see the vertical dashed lines in Fig. 11),
and calculate the just-sampling region only once for the majority
of sections with even speed. Only for the few sections with varying
speed do we perform the calculation for each frame. This
[Fig. 10 Video frames captured in the forward direction: (a) closed environment, (b) wide railway switch]
[Fig. 11 Speed variation of the train (speed in km/h vs. time in min): blue, closed environment; brown, wide road switch]
preconditioning eliminates redundant computation and
greatly improves the efficiency of panorama acquisition.
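The pre-assignment into even-speed segments can be sketched as follows; the tolerance value is an assumption, not taken from the paper:

```python
def segment_by_speed(speeds, tol=5.0):
    """Split a per-frame speed sequence (km/h) into segments within
    which the speed stays within `tol` km/h of the segment's first
    value, mirroring the pre-assignment of the video into even-speed
    sections. Returns (start, end) frame-index pairs, end exclusive."""
    segments = []
    start = 0
    for i in range(1, len(speeds)):
        if abs(speeds[i] - speeds[start]) > tol:
            segments.append((start, i))  # close the even-speed run
            start = i
    segments.append((start, len(speeds)))
    return segments
```

The just-sampling region is then computed once per segment, and per frame only inside the short segments where the speed varies.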
For most video segments with an even speed, another
important pre-process is to fix the location y1 of the outer
rectangle to obtain the best results. The selection of the outer
rectangle should balance the motion blur and the resolution of
the resulting route panorama. According to Eqs. (14) and
(15), their variations with the sampling location y1 can be
obtained directly once V is fixed.
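With the blur and resolution models of Eqs. (14) and (15) available as callables (not reproduced here), balancing the two criteria could look like the following hypothetical helper; the normalization and weight are assumptions, not the authors' procedure:

```python
def pick_sampling_location(y1_candidates, blur_fn, res_fn, w=0.5):
    """Choose the outer-rectangle location y1 that trades off motion
    blur against resolution loss. `blur_fn(y1)` and `res_fn(y1)` stand
    in for Eqs. (14) and (15); both are min-max normalized over the
    candidates before mixing with weight `w`."""
    blur = [blur_fn(y) for y in y1_candidates]
    res = [res_fn(y) for y in y1_candidates]

    def norm(v):
        lo, hi = min(v), max(v)
        span = (hi - lo) or 1.0  # avoid division by zero on flat curves
        return [(x - lo) / span for x in v]

    nb, nr = norm(blur), norm(res)
    costs = [w * b + (1 - w) * r for b, r in zip(nb, nr)]
    return y1_candidates[costs.index(min(costs))]
```

Sweeping `w` makes the blur/resolution compromise explicit instead of fixing y1 by eye.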
6.2 The results of panorama acquisition
As shown in Fig. 12, the panorama resolution and the level
of motion blur change in opposite directions as the sampling
location varies. The motion blur model generated by forward
motion is complex: each pixel involves a pixel shift of
different length and orientation. However, they all share the
[Fig. 12 (a) The pixel shift ΔI (mm) vs. the sampling location |y1| (mm); (b) the panorama resolution γ (mm/pixel) vs. the sampling location |y1| (mm)]
[Fig. 13 Four panoramas generated from the closed railway scene: (a) left, (b) right, (c) bottom, (d) top]
same blur component ΔI perpendicular to the sampling strip.
Hence, we can use ΔI as an effective reference for estimating
the motion blur.
Partial results of route panorama acquisition are displayed
in Figs. 13 and 14. Figure 13 is obtained from a low-quality
video (720 × 576) captured at a high speed (150 km/h).
Nevertheless, our method still generates a legible panorama
under such poor conditions. Taking the left-side panorama as
an example, we can clearly observe the fences and poles close
to the camera; no information is lost and the distortion is
small. More distant sights suffer from apparent distortion,
which is consistent with the previous analysis. The distortion
of distant scenery can be further reduced, enabling virtual
sightseeing from the train in the future, by lowering the
sampling rate, because distant scenes have lower image
velocities.
The results of Fig. 14 are obtained from a higher-quality
video (1,280 × 720) captured at a lower speed (50 km/h).
Obviously, these panoramas display well. The tracks,
poles, wires, fences, communication units, and bolts are all
visible for browsing. This makes the route panoramas a feasible
data source for automatic detection algorithms in railway
inspection.
Some interesting effects are observed in the route
panorama. First, the unevenness of the track shown at the end
of the image is due to the fact that the train, as well as the
[Fig. 14 Four panoramas generated from the road switch scene: (a) left panorama, (b) right panorama]
[Fig. 14 continued: (c) bottom panorama, (d) top panorama]
camera, experiences pitch shaking at a non-smooth
connection spot of the rail tracks. Such a location needs to be
examined further to ensure the safety of the rail. Second,
the route panoramas include rich scenes outside the railway
area, and these scenes vary significantly as the train travels a
long distance. The remaining challenge is structure extraction
under the various illuminations and background scenes outside
the track area, which has not yet been explored on the generated
scene.
[Fig. 15 The panorama display for different H: (a) close layer display, (b) far layer display]
6.3 The compression ratio of panorama to video
The two videos we use are both close to 1 h long, with sizes
of 3,024 and 1,054 MB, respectively. Compared with the
original videos in AVI compression format, the generated
panoramas in JPEG format are much smaller, down to one
eighth of the video size or less. This compact format with
sufficient information can be transmitted and released in
cyberspace for quick and convenient access. The detailed
comparison is shown in Table 1.
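The compression ratios in Table 1 follow directly from the file sizes; a quick check:

```python
# Sizes from Table 1 (MB): original AVI video vs. JPEG route panoramas.
videos = {"Video-1": (3024, 86), "Video-2": (1054, 127)}
ratios = {name: round(video_mb / pano_mb)
          for name, (video_mb, pano_mb) in videos.items()}
print(ratios)  # {'Video-1': 35, 'Video-2': 8}
```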
6.4 Stationary blur and multi-layer display
In Sect. 4.2 we explored the cause of stationary blur, which
is introduced by the various depth layers in the scene. The
phenomenon can also be observed in Eq. (11), where only the
surface at distance H from the camera center is guaranteed
to be just-sampled. To reduce the stationary blurring effect,
we can divide the image into several depth layers and perform
a multi-layer display. Taking Fig. 6 as a typical example, we
can divide the scene into three layers, i.e., close, middle, and
far layers, and just-sample each of them separately.
Figure 15 shows two left panoramas generated with different
H. We can observe that the stretched pole marked by the red
box in (a) displays properly in (b), while the fence that is proper
in (a) appears squeezed in (b). Therefore, the close-layer display
(a) is taken when we focus on the close scene, and vice versa.
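The per-layer just-sampling width can be approximated with a simple pinhole model; this side-view simplification stands in for the paper's forward-motion derivation, and all parameter values below are assumptions:

```python
def just_sampling_width(f_pix, speed_mps, fps, depth_m):
    """Per-frame just-sampling strip width (pixels) for a scene layer at
    depth `depth_m`: a point at depth H moves at roughly f*v/H pixels
    per second in the image, so the strip must advance f*v/(H*fps)
    pixels per frame to avoid stationary blur."""
    return f_pix * speed_mps / (depth_m * fps)

# Assumed focal length 1000 px, 150 km/h (41.7 m/s), 25 fps; one width
# per depth layer (close / middle / far):
widths = [just_sampling_width(1000, 41.7, 25, H) for H in (5.0, 15.0, 40.0)]
```

Closer layers demand wider strips, which is exactly why a single H leaves the other depth layers stationary-blurred or stretched.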
Table 1 The size comparison between video and panorama

Category   Resolution    Data rate (kbps)   Video size (MB)   Panorama size (MB)   Compression ratio
Video-1    720 × 576     7,545              3,024             86                   35:1
Video-2    1,280 × 720   2,602              1,054             127                  8:1
[Fig. 16 Panoramic scene rendered from the four generated panoramas: (a) virtual scene for the closed railway environment with left rotation, (b) virtual scene for the wide road switch with left rotation]
6.5 The results of panorama rendering
The rendering result is shown in Fig. 16. A virtual interactive
environment is constructed for virtual browsing, in which the
viewer can move back and forth freely and rotate left or right
through keyboard interaction. It is easy to examine the rail
with fast scrolling and a zoom-in function for long scenes.
This facilitates a coarse-to-fine investigation further down to
the video frames. In addition, an automatic method has been
developed to scan the railway instruments that appear at
regular intervals.
7 Conclusion
The panoramic visualization of the railway environment is
introduced for the first time in this work. We acquired the route
panorama from forward-motion video. The acquisition algorithm
extracts a just-sampling region according to the specific
structure of the railway and the train speed. Constraints
such as resolution, motion blur, and stationary blur have
been considered to generate a desirable panoramic image.
We also proposed an effective scene rendering method based
on route panoramas: as projection sources, the four route
panoramas are directly and seamlessly projected onto a
tube-shaped tunnel. The panoramic virtual scene successfully
archives the route scenes in a compact format suitable
for sharing and virtual browsing online.
Our future work will focus on three aspects. Image analysis
algorithms such as depth estimation and object extraction
will be considered for better panorama display. Scene
change detection is essential for the automatic comparison
of route panoramas generated on different days and periods.
We will further improve the panorama rendering for more
realistic train sightseeing by adding lights, blending,
shading, anti-aliasing, and so on.
Shengchun Wang was born in
1985. He received B.S. degree
from the School of Computer and
Information Technology, Beijing
Jiaotong University in 2008. He
is currently a Ph.D. Candidate
in Computer Application Technol-
ogy at School of Computer and
Information Technology, Beijing
Jiaotong University. His research
interests include scene representation for railway environments,
high-resolution reconstruction, and object detection.
Siwei Luo was born on Decem-
ber 23, 1943. He obtained his
Ph.D. degree in Computer Science
from Shinshu University, Japan, in
1984. He is currently a Profes-
sor and Doctoral Supervisor of the
School of Computer and Informa-
tion Technology, Beijing Jiaotong
University. His research interests
include neuro-computing, neural
networks, pattern recognition, and
parallel computing.
Yaping Huang was born in 1974.
She received her B.S., M.S. and
Ph.D. degree from Beijing Jiao-
tong University in 1995, 1998 and
2004, respectively. Since 2012, she
has been a professor in the institute
of computer and information tech-
nology at Beijing Jiaotong Univer-
sity. Her research interests include
computer vision, pattern recogni-
tion, and machine learning.
Jiang Yu Zheng received the
B.S. degree in Computer Science
from Fudan University, China, in
1983, and the M.S. and Ph.D.
degrees in Control Engineering
from Osaka University, Japan in
1987 and 1990, respectively. From
1990, he was with ATR Commu-
nication Systems Research Lab-
oratory as research associate. He
worked at Kyushu Institute of
Technology, Japan from 1993 to
2001 as an associate professor.
Currently he is a professor at the
Dept. of Computer and Informa-
tion Science, Indiana University Purdue University Indianapolis.
His current research interests include 3D measuring and modeling,
dynamic image processing and tracking, scene representation for indoor
and urban environments, digital museum, sensor network and combin-
ing vision with graphics and human interface.
Peng Dai received the BS, MS
and PhD degrees from the Depart-
ment of Control Science and Engi-
neering of the Harbin Institute
of Technology. He worked as an
associate researcher at the Infrastructure
Inspector Center of China
Academy of Railway Science. His
research interests include statisti-
cal machine learning, visual detec-
tion and its application to High-
speed railway.
Qiang Han received the BS and
MS degrees from the School
of Science of the Beijing Jiao-
tong University. He worked as an
assistant researcher at the Infrastructure
Inspector Center of China Academy
of Railway Science. He works in the
field of laser and photoelectric detection.