EXPOSING VIDEO INTER-FRAME FORGERY BASED
ON VELOCITY FIELD CONSISTENCY
Yuxing Wu, Xinghao Jiang*, Tanfeng Sun and Wan Wang
School of Electronic Information and Electrical Engineering,
Shanghai Jiao Tong University, Shanghai, China
{wuyuxing, xhjiang, tfsun, wangwan}@sjtu.edu.cn
ABSTRACT
In recent years, video forensics has become an important
issue, and video inter-frame forgery detection is a significant
branch of it. In this paper, a new algorithm based on the
consistency of the velocity field is proposed to detect video
inter-frame forgery (i.e., consecutive frame deletion and con-
secutive frame duplication). The generalized extreme
studentized deviate (ESD) test is applied to identify the for-
gery types and locate the manipulated positions in forged
videos. Experiments show the effectiveness of our algorithm.
Index Terms— Video forensics, inter-frame forgery de-
tection, velocity field, generalized extreme studentized
deviate
1. INTRODUCTION
Nowadays, surveillance camera systems have been widely
deployed to monitor illegal activities, and surveillance videos
are already accepted as judicial evidence in court. However,
with the development of advanced video editors, their
integrity cannot be guaranteed anymore. Therefore, how to
authenticate surveillance videos has become a significant issue.
So far, many video forensics techniques have been
studied [1]: [2]-[4] proposed to detect double compression,
[5]-[7] detected video forgery with sensor noise patterns,
and [8]-[10] exposed forgery based on the videos' content.
In the aspect of inter-frame forgery detection, Wang and
Farid [2] first exposed frame deletion or insertion by pre-
diction error. They discovered that frames moving from one
group of pictures (GOP) to another will have larger motion
estimation errors. However, their method would fail if a com-
plete GOP is deleted. Mondaini et al. [5] proposed to detect
frame insertion/duplication with the photo-response non-
uniformity noise (PRNU) fingerprinting technique. Chao et al.
[10] proposed to detect frame deletion and insertion through
optical flow. They found that inter-frame forgery operations
would cause discontinuity in the optical flow sequence.
* Corresponding author: Xinghao Jiang.
In this paper, we propose a new approach to detect sur-
veillance video inter-frame forgery based on the consistency
of the velocity field. This method is able to distinguish a tam-
pered video, identify the forgery type (i.e., consecutive
frame deletion or consecutive frame duplication) and locate
the manipulated positions in forged videos. Our algorithm
follows three steps. First, obtain the velocity field sequence
by applying block-based cross correlation. Then, calculate
the corresponding relative factor sequence from the velocity
field sequence. Finally, determine the authenticity, the
forgery type and the manipulated locations with the
generalized extreme studentized deviate (ESD) test.
2. VELOCITY FIELD IN VIDEO FORGERY DETECTION
The velocity field is a term borrowed from the Particle Image
Velocimetry (PIV) technique [11]. The key point of PIV is to
compare adjacent video frames and estimate the displace-
ments caused by their time separation. Any inter-frame
operation, such as frame deletion or duplication, is expected
to enlarge these displacements. In this section, we show how
to form the velocity field sequence and illustrate the traces
left in it by different forgery operations.
2.1. Velocity field sequence estimation
The velocity field computation is done with PIVlab [12]. Its PIV
algorithm is set to FFT window deformation with a one-pass
16×16 pixel interrogation window and a 75% overlap factor.
Equations (1) and (2) describe the computation process:

R(u,v) = F^{-1}{ F[I(i,j,t)]^* · F[I(i,j,t+1)] }    (1)

(u,v) = arg max Re{ R(u,v) }    (2)

where I(i,j,t) and I(i,j,t+1) are the interrogation windows
at location (i,j) in frames t and t+1 respectively, F and F^{-1}
are the 2-D Fourier transform and inverse Fourier transform
operators, * is the complex conjugate, and Re{·} obtains the
real part of its argument. According to these formulas, (u,v)
is regarded as the displacement (also called the velocity
vector) between the two interrogation windows. To be precise,
we denote (u,v) as (u(i,j,t), v(i,j,t)), the velocity vector
at location (i,j) of frame t.

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
978-1-4799-2893-4/14/$31.00 ©2014 IEEE

We can therefore define the velocity field intensity (VFI) as follows:
VFI_h(t) = Σ_{i,j} |u(i,j,t)|,   VFI_v(t) = Σ_{i,j} |v(i,j,t)|    (3)

where VFI_h(t) and VFI_v(t) indicate the horizontal and vertical
velocity field intensity respectively. We then denote
{VFI_h(t) | t ∈ [1, L−1]} and {VFI_v(t) | t ∈ [1, L−1]} as the horizon-
tal and vertical VFI sequences, where L is the number of
frames.
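The FFT cross-correlation of Eqs. (1) and (2) can be sketched with NumPy as follows. The function name and the synthetic window contents are ours, for illustration only; PIVlab's actual implementation adds window deformation and sub-pixel refinement on top of this core step.

```python
import numpy as np

def window_displacement(win_t, win_t1):
    """Estimate the displacement (u, v) between two interrogation
    windows via FFT cross-correlation, following Eqs. (1)-(2):
    R = F^-1( F(win_t)* . F(win_t1) ), then (u, v) = argmax Re(R)."""
    R = np.fft.ifft2(np.conj(np.fft.fft2(win_t)) * np.fft.fft2(win_t1))
    v, u = np.unravel_index(np.argmax(R.real), R.shape)
    # The correlation is circular, so wrap indices to signed displacements.
    h, w = R.shape
    if v > h // 2:
        v -= h
    if u > w // 2:
        u -= w
    return int(u), int(v)

# Toy check: shift a random 16x16 window by 3 rows and 2 columns.
rng = np.random.default_rng(0)
win = rng.random((16, 16))
shifted = np.roll(win, shift=(3, 2), axis=(0, 1))
print(window_displacement(win, shifted))  # -> (2, 3)
```

For real video frames the same routine would be applied to every overlapping 16×16 window pair of adjacent frames, yielding the dense (u, v) field.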
The max-sample technique is used to exclude frames with
extremely low VFI. A low VFI is probably caused by the
similarity of two neighboring frames, introduced by camera
coding error. Every three frames are sampled into one frame
with the maximum VFI, and the sampling process starts at
the position where the number of remaining frames is
divisible by 3. We then have the new VFI sequences

{SVFI_h(t) | t ∈ [1, T]},   {SVFI_v(t) | t ∈ [1, T]}

where SVFI(t) denotes the VFI sequence after max-sampling,
⌊·⌋ is the round-down (floor) function and T = ⌊(L−1)/3⌋.
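The max-sampling rule above can be sketched as follows (the helper name is ours; skipping the first `len(vfi) % 3` values implements the rule that sampling starts where the remaining count is divisible by 3):

```python
def max_sample(vfi, k=3):
    """Collapse a VFI sequence into groups of k samples, keeping each
    group's maximum. The first len(vfi) % k values are skipped so that
    the remaining length is divisible by k."""
    start = len(vfi) % k
    return [max(vfi[i:i + k]) for i in range(start, len(vfi), k)]

vfi = [0.1, 0.5, 0.2, 0.9, 0.3, 0.4, 0.8, 0.7]  # L - 1 = 8 samples
print(max_sample(vfi))  # skips the first 2 values -> [0.9, 0.8]
```

The output length matches T = ⌊(L−1)/3⌋ = ⌊8/3⌋ = 2 for this toy sequence.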
The consistency of the VFI sequences in both directions
will be destroyed if the video is manipulated by an inter-frame
forgery operation. Therefore, the relative factors RF_h and
RF_v are defined to reveal these changes:
RF_h(t) = SVFI_h(t) · |SVFI_h(t+1) − SVFI_h(t−1)| / (SVFI_h(t+1) + SVFI_h(t−1))    (4)

RF_v(t) = SVFI_v(t) · |SVFI_v(t+1) − SVFI_v(t−1)| / (SVFI_v(t+1) + SVFI_v(t−1))    (5)
In the relative factor sequences {RF_h(t) | t ∈ [2, T−1]} and
{RF_v(t) | t ∈ [2, T−1]}, the discontinuity peaks introduced by
forgery operations are obviously highlighted.
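The exact weighting in the paper's relative factor definition is hard to recover from the source, so the sketch below uses a simple neighbor-mean ratio as an assumed stand-in: a sample much larger than the average of its two neighbors yields a large factor, which is the property the detection step relies on.

```python
def relative_factors(svfi):
    """Illustrative neighbor-normalized relative factor: each SVFI
    sample is compared with the mean of its two neighbors, so an
    isolated jump becomes a pronounced peak. Defined for interior
    samples only, matching the paper's index range [2, T-1]."""
    eps = 1e-12  # guard against division by zero for flat sequences
    return [2.0 * svfi[t] / (svfi[t - 1] + svfi[t + 1] + eps)
            for t in range(1, len(svfi) - 1)]

svfi = [1.0, 1.1, 0.9, 5.0, 1.0, 1.2]  # spike at the 4th sample
rf = relative_factors(svfi)
print(rf.index(max(rf)))  # -> 2, i.e. the factor for the spiked sample
```

An original static-camera clip yields a roughly flat factor sequence, while a deletion point produces one such peak and a duplication produces two.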
2.2. Traces in relative factor sequence
In this paper, two types of forgery, consecutive frame dele-
tion and consecutive frame duplication, are considered.
Different forgery operations introduce different numbers of
discontinuity peaks in the relative factor sequence. Fig. 1
shows the relative factor sequences of a given video before
and after manipulation.
2.2.1. Original video
These videos come directly from surveillance cameras without
any modification, and there is no discontinuity peak in their
relative factor sequences.
Fig. 1. The horizontal (left) and vertical (right) relative factor se-
quences. (a) original video; (b) frame deletion video; (c) frame
duplication video.
Fig. 2. Four representative frames of a video. Over 600 consecutive
frames have been deleted between (a) and (d) to cover a suspicious
man walking out of the elevator. There are no visual differences
before and after the forgery process.
2.2.2. Consecutive frame deletion video
These videos are tampered with by deleting consecutive frames.
After the forgery process, two originally unrelated frames
become neighbors, generating a salient increase in the VFI
sequence. Therefore, one discontinuity peak will be observed
in the relative factor sequence.
2.2.3. Consecutive frame duplication video
These videos are modified by duplicating consecutive frames
from one time point to another. Hence, two discontinuity
peaks will be observed.
Again, note that we only consider videos recorded by static
surveillance cameras; that is, only inter-frame forgery
operations will introduce obvious discontinuity in the relative
factor sequence. In addition, we focus on detecting videos
with meaningful forgeries, which means no visual differences
can be perceived before and after the forgery process. Fig. 2
demonstrates an example of meaningful consecutive frame
deletion forgery.
3. VIDEO FORGERY IDENTIFICATION
The discontinuity peaks in the relative factor sequence are
regarded as evidence of video forgery. The generalized ESD
test is applied to extract the peaks and identify the forgery
type. The details of the identification algorithm are described
in this section.
3.1. Generalized ESD test
We find that the probability distribution of the relative factor
sequence is approximately normal. Hence, the generalized
ESD test [13] can be employed in our identification algorithm.
There are two important parameters in the test: the upper
bound on the number of outliers, r, and the significance
level, α. First compute R_1 from

R_i = max_i |x_i − x̄| / s    (6)
(6)
where
x
and
s
denote the mean and s tandard deviation of
the
n
samples respectively. Remove the obs ervation that
maximizes
/
i
x x s
and then re-compute the above statistic
with
1n
obs ervations . Repeat this proces s until
12
, ,..., i
R R R   
have all been computed. Finally pick the corre-
sponding
r
critical values
i
at the chosen confidence level
. The number of outliers is determined by finding the larg-
est
i
such that
ii
R
.
In order to determine the exact number of peaks in the
relative factor sequence, we fine-tune the critical values λ_i
by multiplying them by a coefficient ρ; the new definition is
as follows:

λ'_i = ρ · (n − i) t_{p, n−i−1} / sqrt( (n − i − 1 + t²_{p, n−i−1})(n − i + 1) )    (7)

p = 1 − α / [2(n − i + 1)]    (8)

where t_{p, n−i−1} is the p-th percentile of a t distribution with
n − i − 1 degrees of freedom.
Some fake peaks might be found in the relative factor
sequence. Compared with real forgery peaks, these peaks
have relatively low intensities and are probably introduced
by camera noise or video encoding. The fake peaks would be
flagged as outliers with the original λ_i, while the fine-tuning,
which slightly raises the critical values, helps to reject these
fake peaks and pick out the forgery peaks accurately.
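The test described above can be sketched as follows; this is a minimal implementation of the standard generalized ESD procedure with the scaled critical values of Eq. (7), where `rho` stands for the tuning coefficient (its value in the paper's experiments is not reproduced here).

```python
import numpy as np
from scipy import stats

def generalized_esd(x, r=2, alpha=0.05, rho=1.0):
    """Generalized ESD test: return the estimated number of outliers
    (at most r) in x, using critical values scaled by rho."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    idx = np.arange(n)
    R, lam = [], []
    for i in range(1, r + 1):
        mean, sd = x[idx].mean(), x[idx].std(ddof=1)
        dev = np.abs(x[idx] - mean) / sd
        j = np.argmax(dev)
        R.append(dev[j])
        idx = np.delete(idx, j)  # drop the most deviant observation
        # Scaled critical value, Eqs. (7)-(8).
        p = 1.0 - alpha / (2.0 * (n - i + 1))
        t = stats.t.ppf(p, n - i - 1)
        lam.append(rho * (n - i) * t
                   / np.sqrt((n - i - 1 + t**2) * (n - i + 1)))
    # Number of outliers = largest i such that R_i > lambda_i.
    hits = [i + 1 for i in range(r) if R[i] > lam[i]]
    return max(hits) if hits else 0

rng = np.random.default_rng(1)
seq = rng.normal(1.0, 0.05, 50)
seq[20] = 3.0  # one injected discontinuity peak
print(generalized_esd(seq, r=2))  # detects the injected peak
```

Setting `rho` slightly above 1 raises the thresholds, which is how the fine-tuning rejects low-intensity fake peaks while keeping genuine forgery peaks.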
3.2. Identification algorithm
According to the description in Section 2.2, there are at most
two discontinuity peaks in the relative factor sequence,
hence we set the upper bound on the number of outliers to
r = 2. Moreover, the generalized ESD test is carried out on
both the horizontal and vertical relative factor sequences,
which helps to improve the identification accuracy. Let N_h
and N_v denote the detected numbers of discontinuity peaks
in the horizontal and vertical sequences, respectively. The
flowchart of the identification algorithm is given in Fig. 3.
Fig. 3. Flowchart of the identification algorithm: the relative
factor sequences are generated from the video clip and submitted
to the generalized ESD test; N_h = N_v = 1 indicates deletion,
N_h = N_v = 2 indicates duplication, and otherwise the clip is
judged original.
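The decision logic of Fig. 3, as far as it can be recovered from the text, reduces to a small rule; how disagreement between N_h and N_v is handled is an assumption here (we fall back to "original"):

```python
def identify(n_h, n_v):
    """Map the peak counts of the horizontal and vertical relative
    factor sequences to a forgery type: one agreed peak -> deletion,
    two agreed peaks -> duplication, anything else -> original."""
    if n_h == 1 and n_v == 1:
        return "deletion"
    if n_h == 2 and n_v == 2:
        return "duplication"
    return "original"

print(identify(1, 1), identify(2, 2), identify(0, 0))
# -> deletion duplication original
```

Requiring agreement between the two directions is what lets the dual-sequence test suppress spurious peaks that appear in only one component.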
Finally, the tampered location range is determined from the
relative factor sequence generation process in Section 2. The
upper bound of the range is

R_u = 3(P − 1) + mod(L_VFI, 3)

where P is the location of the detected peak in the relative
factor sequence, mod is the modulo operation, and L_VFI is
the length of the corresponding VFI sequence. The tampered
range is then [R_u − 2, R_u].
Fig. 4. Four source videos with different scenes. (a) scene 1; (b)
scene 2; (c) scene 3; (d) scene 4.
4. EXPERIMENTS
4.1. Video database
To the best of our knowledge, there is no open database for
detecting video inter-frame forgery, so we invited volunteers
to build one. Source videos of four different scenes (see
Fig. 4) were downloaded from the TRECVID surveillance
event detection evaluation [14]. Each source video was split
into 10 video clips, and each clip contains about 3000 frames
at 720×576 resolution. The 40 video clips were then
delicately tampered with to generate 40 frame deletion videos
and 40 frame duplication videos (defined in Section 2.2).
Hence, there are in total 120 video clips in our final inter-frame
forgery detection database. Note that all the tampered video
clips were re-encoded in MPEG-2 with the same coding
standard and parameters as the source videos.
4.2. Results and analysis
The configuration of the identification algorithm is as follows:
the upper bound on the number of outliers is r = 2, the
significance level is α = 0.05, and the critical values are
scaled by the coefficient ρ introduced in Section 3.1.
4.2.1. Detection accuracy under random deletion
This experiment tests the sensitivity of our algorithm by
computing detection accuracies when frames are randomly
deleted. Table I shows the detection accuracies for randomly
deleting 1 frame, 3 consecutive frames and 5 consecutive
frames from the original videos. The results illustrate that
our algorithm achieves good accuracy when detecting frame
deletion forgery even with only a few frames removed.
4.2.2. Identification accuracy under meaningful forgery
The confusion matrices for the four scenes of video clips and
the overall accuracy are given in Table II. The relatively low
accuracies for frame duplication identification are due to the
large intensity gap between the two detected peaks: all
incorrectly identified duplication videos are classified as
frame deletion forgery. Nevertheless, the results demonstrate
the effectiveness of our algorithm, with overall accuracies of
90.0%, 85.0% and 80.0% for identifying original, frame
deletion and frame duplication videos respectively. If we only
consider whether a video is tampered or not, the overall
identification accuracy for tampered videos is 96.3%, with
10% false positives. We did not run comparison experiments
because no prior work was found on identifying consecutive
frame deletion and duplication forgeries.
4.2.3. Location accuracy under meaningful forgery
A location is considered incorrectly identified if one of the
detected peaks in either the horizontal or vertical VFI
sequence is not in the expected range described in Section 3.2.
The location accuracies for correctly identified forged videos
are given in Table III. All locations of the detected peaks in
forged videos are correctly identified, thanks to the statistics-
based generalized ESD algorithm.
4.3. Robustness against compression
The robustness against lossy compression is tested in this
experiment. Each video clip was re-compressed with the
ffmpeg software at different Qscales (a parameter that
controls video quality). The identification results with
Qscale = 1 (nearly lossless), 2 and 3 are shown in Table IV.
When re-compressing with Qscale = 2, the bit rate decreased
by 3% on average, and the accuracy is the same as at
Qscale = 1. When re-compressing with Qscale = 3, the bit
rate dropped considerably (by about 30%) and the duplication
identification accuracy slightly decreased. The reason is that
the intensity gap between the two detected peaks is enlarged
after re-compression, which makes the video more likely to
be identified as frame deletion forgery. Nevertheless, the
accuracies confirm the robustness of our algorithm to a
moderate degree of compression.
5. CONCLUSION
We have proposed a new algorithm to detect video inter-
frame forgery based on the consistency of the velocity field.
Consecutive frame deletion and frame duplication forgery
operations introduce discontinuity peaks into the relative
factor sequence, and the generalized ESD test is applied to
extract the peaks and identify the forgery type. Experiments
show the effectiveness of our algorithm.
ACKNOWLEDGMENT
The work of this paper is sponsored by the National Natural
Science Foundation of China (No. 61272439, 61272249), the
Specialized Research Fund for the Doctoral Program of High-
er Education (No. 20120073110053), and the Fund of the State
Key Laboratory of Software Engineering, Wuhan University
(No. SKLSE2012-09-12). It was also supported by the Project
of International Cooperation and Exchanges of the Shanghai
Committee of Science and Technology (No. 12510708500).
Table I. Detection accuracies for random deletion.

  Deleted frame number    1      3      5
  Accuracy                40%    65%    80%
Table III. Location accuracies under meaningful forgery.

  Forgery type    deletion         duplication
  Accuracy        100% (34/34)^a   100% (32/32)

  ^a (n/m) indicates n of m locations are correctly identified.
Table IV. Detection accuracies under different Qscales (%).

  Qscale         1      2      3
  original       90.0   90.0   90.0
  deletion       85.0   85.0   85.0
  duplication    80.0   80.0   62.5
Table II. Confusion matrix for each scene and the overall accuracy (%). "-" denotes 0.

          Scene 1             Scene 2             Scene 3             Scene 4             Overall
  Video   ori^a del^b dup^c   ori   del   dup     ori   del   dup     ori   del   dup     ori   del   dup
  ori     80.0  10.0  10.0    90.0  10.0  -       100   -     -       90.0  10.0  -       90.0  7.5   2.5
  del     20.0  70.0  10.0    -     90.0  10.0    10.0  90.0  -       -     90.0  10.0    7.5   85.0  7.5
  dup     -     30.0  70.0    -     20.0  80.0    -     30.0  70.0    -     -     100     -     20.0  80.0

  ^a Original video type. ^b Frame deletion video type. ^c Frame duplication video type.
REFERENCES
[1] Milani S., Fontani M., Bestagini P. et al., "An overview on
video forensics," APSIPA Transactions on Signal and Information
Processing, 2012.
[2] Wang W. H. and Farid H., "Exposing digital forgeries in video
by detecting double MPEG compression," in Proceedings of the 8th
ACM Workshop on Multimedia and Security, pp. 37-47, 2006.
[3] Chen W. and Shi Y. Q., "Detection of double MPEG video
compression using first digit statistics," in Digital Watermarking,
Springer Berlin Heidelberg, pp. 16-30, 2009.
[4] Wang W. H. and Farid H., "Exposing digital forgeries in video
by detecting double quantization," in Proceedings of the 11th ACM
Workshop on Multimedia and Security, pp. 39-47, 2009.
[5] Mondaini N., Caldelli R., Piva A. et al., "Detection of malevo-
lent changes in digital video for forensic applications," in
Proceedings of the Society of Photo-Optical Instrumentation Engi-
neers, vol. 6505, 2007.
[6] Hsu C. C., Hung T. Y., Lin C. W. et al., "Video forgery detection
using correlation of noise residue," IEEE 10th Workshop on Multi-
media Signal Processing, pp. 170-174, 2008.
[7] Kobayashi M., Okabe T., Sato Y., "Detecting forgery from
static-scene video based on inconsistency in noise level func-
tions," IEEE Transactions on Information Forensics and Security,
vol. 5, pp. 883-892, 2010.
[8] Wang W. H. and Farid H., "Exposing digital forgeries in video
by detecting duplication," in Proceedings of the 9th ACM Work-
shop on Multimedia and Security, pp. 35-42, 2007.
[9] Conotter V., O'Brien J. F., Farid H., "Exposing digital forgeries
in ballistic motion," IEEE Transactions on Information Forensics
and Security, vol. 7, pp. 283-296, 2012.
[10] Chao J., Jiang X. H. and Sun T. F., "A novel video inter-frame
forgery model detection scheme based on optical flow consisten-
cy," International Workshop on Digital Forensics and Watermarking,
pp. 267-281, 2012.
[11] Grant I., "Particle image velocimetry: A review," in Proceed-
ings of the Institution of Mechanical Engineers, vol. 211, pp. 55-76,
1997.
[12] Thielicke W., Stamhuis E. J., PIVlab - Time-Resolved Digital
Particle Image Velocimetry Tool for MATLAB (version 1.32), 2010.
[13] Iglewicz B., Hoaglin D. C., How to Detect and Handle Outliers,
Milwaukee (Wisconsin): ASQC Quality Press, vol. 16, 1993.
[14] TREC Video Retrieval Evaluation, http://trecvid.nist.gov/