Database watermarking, a technological protective measure: Perspective, security analysis and future directions

Vidhi Khanduja
Department of Computer Engineering, Netaji Subhas Institute of Technology, Delhi, India
vidhikhanduja9@gmail.com
Abstract
Digital databases dynamically generate a major proportion of Internet content. Databases are
created, stored and accessed digitally and transmitted through computer networks. This has increased the potential,
size and performance of databases exponentially. The need to protect digital databases therefore arises
from their increased vulnerability to copyright and piracy threats originating from the Internet. Both legal and
technological measures must be utilized in a synergetic manner to ensure an adequate level of protection.
Technological protection measures (TPMs) backed by legal anti-circumvention measures offer a cost-effective
solution to database protection. We review the current state of the art and analyse digital database protection
from a combined legal and technical perspective. Our work focuses on the security analysis of the work done so far
and provides readers with a detailed discussion of future directions in the domain of digital watermarking of databases.
Keywords: Information Security, Digital Watermarking, Right Protection, Tamper Detection.
1. INTRODUCTION
Databases are repertoires of knowledge garnered by the collective efforts of mankind through the ages and
across regions. Digital databases dynamically generate a major proportion of Internet content. Databases
are created, stored and accessed digitally and transmitted through computer networks. This has increased the
potential, size and performance of databases exponentially. Whether they are sold in pieces for data
mining applications, such as stock market data, consumer behaviour data, power consumption data and weather
data, or maintained in-house, such as product data by e-commerce sites and medical histories of patients by hospitals,
databases play a pivotal role in all aspects of society.
As end users demand more and more information to be available on the net either at low or no cost, database
developers are interested in generating revenues from creating databases as this involves intellectual and financial
inputs. The need to protect databases arises due to the increased vulnerability to copyright and piracy threats
originating from the Internet [1]. Developers have the responsibility to not only supply accurate data, but also
ensure its security against illegal copying, hacking or tampering. The digital watermarking of copyrighted works
augmented by legal means is fast emerging as an effective and efficient means to protect shared and outsourced
databases from infringement.
Section 2 introduces the process of digital watermarking in databases. In Section 3, we analyse
digital database protection from a combined legal and technical perspective. Section 4 surveys existing
work in the literature, Section 5 presents the security analysis of the various techniques and Section 6 discusses
future directions. Finally, Section 7 concludes the paper.
2. AN OVERVIEW OF DIGITAL DATABASE WATERMARKING
Digital watermarking is a viable and cost-effective technological method that protects digital documents such
as images, video and databases by marking them with a digital pattern. Watermarking algorithms for digital
databases invariably introduce small changes in the data being watermarked in order to insert the mark,
but without altering the database in any significant way. Watermarking does not completely prevent piracy.
However, it does provide a means to establish the true identity of the owner and deters attempts to plagiarize or
distort the data [2]. Fig. 1 illustrates the process of watermarking databases. Robust watermarking resolves
ownership issues, while fragile watermarking is used to verify integrity.
Fig. 1 Watermarking Relational Databases
Unlike encryption and hashing techniques, watermarking does not attempt to hide the data, but instead
embeds a kind of ownership proof in the data. Encryption and hashing protect the content by making
the information indecipherable to an attacker. In contrast, digital watermarking works on the principle of
modifying the content in such a manner that the usability of the data is retained. Ideally, an attacker or observer
has no way to tell that a watermark is present in the database. However, the watermark remains
inseparably within the content, providing a proof of ownership, a signature to detect tampering, or a means to
track people who may have obtained the content legally and are illegally redistributing it [3].
Many researchers have contributed significantly to the area of watermarking multimedia data such as image,
audio and video sources [3-6]. The technique used to conceal a watermark within databases differs significantly
from that used for multimedia data [2]. Compared with data in relational databases, multimedia data is voluminous
and has a large bandwidth in which to hide watermarks redundantly. Moreover, the relative spatial positioning of
different parts of multimedia data such as an image or video is not affected significantly when watermark bits are
inserted. Therefore, the quality of the image is largely retained.
Relational databases on the other hand, comprise concrete tuples and attributes, each tuple representing a
distinct entity. Firstly, we need to disperse the watermark bits across multiple tuples to achieve redundancy. There
is no particular ordering between these tuples, so even a subset of tuples can be used. Secondly, embedding of
watermark bits invariably introduces perturbations within a database. These perturbations may adversely affect
the usability of the database. Therefore, a database watermarking technique must ensure that the usability
constraints of the attributes stay within limits when watermark bits are inserted. Usability constraints are the
limitations imposed on each attribute. They are decided by the database owner or designer and depend upon its
specific application(s). For example, an attribute value may have to remain unique, or a classification range may
have to remain the same before and after the watermark is concealed [7].
[Fig. 1 panel labels: Watermark Insertion (1. Prepare watermark; 2. Watermark the database); On Web (database either outsourced or published online); Watermark Extraction (Robust: watermark acts as a proof of ownership; Fragile: watermark used to detect a tampered database)]
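To make the embedding constraints above concrete, the following is a minimal sketch of a keyed single-attribute, single-bit embedding, loosely in the spirit of the primary-key-dependent approach listed for [2] in Table 1. It is not the algorithm of any surveyed paper: the function names, the `id` primary-key column, the `gamma` tuple-selection parameter and the use of HMAC-SHA256 are all assumptions made for illustration. Restricting changes to the `num_lsb` least significant bits is a simple stand-in for a usability constraint.

```python
import hashlib
import hmac


def keyed_int(secret_key: bytes, label: str, primary_key: str) -> int:
    """Keyed hash of (label, primary key) as a non-negative integer."""
    msg = f"{label}|{primary_key}".encode("utf-8")
    return int.from_bytes(hmac.new(secret_key, msg, hashlib.sha256).digest(), "big")


def locate(secret_key, primary_key, gamma, markable, num_lsb):
    """Return (attribute, bit position, mark bit), or None if the tuple is skipped."""
    if keyed_int(secret_key, "tuple", primary_key) % gamma != 0:
        return None  # roughly one tuple in gamma is selected (assumed rule)
    attr = markable[keyed_int(secret_key, "attr", primary_key) % len(markable)]
    bit_pos = keyed_int(secret_key, "bit", primary_key) % num_lsb
    mark_bit = keyed_int(secret_key, "mark", primary_key) & 1
    return attr, bit_pos, mark_bit


def embed(rows, secret_key, gamma, markable, num_lsb):
    """Return marked copies of rows; values are assumed to be non-negative ints."""
    marked = []
    for row in rows:
        row = dict(row)  # leave the caller's data untouched
        target = locate(secret_key, str(row["id"]), gamma, markable, num_lsb)
        if target is not None:
            attr, bit_pos, mark_bit = target
            # Clear the chosen bit, then set it to the mark bit.
            row[attr] = (row[attr] & ~(1 << bit_pos)) | (mark_bit << bit_pos)
        marked.append(row)
    return marked


def detect(rows, secret_key, gamma, markable, num_lsb):
    """Return (matching bits, selected tuples) for a suspected database."""
    matches = total = 0
    for row in rows:
        target = locate(secret_key, str(row["id"]), gamma, markable, num_lsb)
        if target is not None:
            attr, bit_pos, mark_bit = target
            total += 1
            matches += ((row[attr] >> bit_pos) & 1) == mark_bit
    return matches, total
```

Separate hash labels are used for tuple, attribute, bit and mark selection so that the four choices are not derived from one value. With `num_lsb = 2`, no marked value moves by more than 3 from its original, which is the kind of bound a usability constraint would impose.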
3. DATABASE WATERMARKING AS A TECHNOLOGICAL PROTECTION
MEASURE WITH LEGAL PROTECTION
The foundation of modern society is intelligible information compiled in innumerable databases.
Government departments, corporations, multinational companies, information bureaus and research centers
produce databases of government records, medical and legal case records, web pages and collection of literary
and artistic works.
Article 1(2) of Directive 96/9/EC of the European Parliament defines a database as “a collection of
independent works, data or other materials arranged in a systematic or methodical way and individually
accessible by electronic or other means” [8]. According to Black’s Law Dictionary, a database is defined as
“a compilation of information arranged in a systematic way and offering a means of finding specific elements it
contains, often today by electronic means” [9]. The relevance of digital databases is explicitly highlighted in this
definition. The distinguishing feature of digital databases is that they can be created, downloaded, value added
and digitally re-transmitted with great flexibility and speed.
It requires significant efforts in terms of money, manpower and creative inputs to build high-quality
databases. They are thus Intellectual Property in their own right. Given their rich informational content and the
ready availability of advanced technologies to communicate and modify them with relative ease, it is imperative
to protect digital databases against potential misuse. Technological methods that are designed to protect digital
content of any form, such as text, images and databases, are called “Technological Protection Measures” (TPMs).
They include technologies that can control access to copyrighted digital content or prevent users from copying
such protected content. Watermarking of digital databases is one such TPM that has emerged as an effective means
to protect shared and outsourced databases from infringement.
The United States initially followed the “sweat of the brow” doctrine, which protected databases that did not
require much creative skill, such as catalogs and directories. In the landmark case of Feist Publications v. Rural
Telephone Service Co., the Supreme Court rejected this doctrine and required a modicum of creativity for any
database to qualify for copyright protection [10].
In Europe before 1996, there was no uniformity in laws regarding protection of databases in member states of
the EU. However, with the implementation of European legislation EC Directive 96/9/EC, the treatment of
copyrightable databases was harmonized [8]. More significantly, it introduced a new sui generis right to bring non-
copyrightable databases into the ambit of legal protection.
Focusing on the national scenario, India is a common law country. Databases that have been prepared with
cognizable creative effort in the systematic arrangement of facts are protected under copyright law [11]. However,
courts have often relied on the “sweat of the brow” doctrine [12], while some courts have categorically
rejected this doctrine and emphasized creativity in their adjudication [13]. India has taken laudable
steps to codify significant portions of its traditional knowledge base, such as Ayurveda, Unani and Siddha, in the
Traditional Knowledge Digital Library (TKDL) [14]. This initiative has succeeded in blocking
several attempts by individuals and groups to patent traditional Indian knowledge under false claims. The World
Intellectual Property Organization (WIPO) and India have ventured together to see how the TKDL model can be
emulated by other countries to stop the misappropriation of traditional knowledge [15].
In India, Section 65A of the Copyright Amendment Bill 2012 introduces the concept of a TPM as a measure
used to enforce restrictions on the use of copyrighted material [16]. Any person who circumvents a TPM with the
intention of infringing rights is subject to criminal and monetary liability.
Both legal and technological measures must be utilized in a synergetic manner to ensure an adequate level
of protection. TPMs backed by legal anti-circumvention measures offer a cost-effective solution to database
protection [16-19].
4. AN OUTLINE OF DATABASE WATERMARKING RESEARCH
We conducted an extensive literature survey in this domain. In this section we highlight only the major
points of the contributions in the literature on watermarking databases, as a few survey papers
already exist [20-23]. However, those surveys only discuss the techniques and classify them
accordingly. We identify the lacunae and advantages of the primary research in this domain. Additionally, our
work focuses on the security analysis of the work done so far and provides readers with a detailed
discussion of future directions. Fig. 2 and Fig. 3 illustrate the features of watermarking techniques. The
watermark is either a randomly selected binary stream, an object identifier (OID) selected by the owner, or based
on the owner’s biometric trait. The watermark can be embedded in all tuples (AT) or in a few selected tuples (MT).
Similarly, within a tuple, a single attribute (SA) or multiple attributes (MA) can be selected for embedding a
watermark bit. Finally, within an attribute, either a single bit (SB) or multiple bits (MB) are modified, depending
on the watermarking approach [21].
Fig. 2 Features of watermarking techniques
[Fig. 2 panel labels: FEATURES. Content of watermark: random (no information); related to owner/owner-specific information (OID); created from database (signature). Target: numeric; text; categorical.]
Fig. 3 Various Granularity levels
Tables 1 and 2 compare various existing watermarking techniques targeting numeric as
well as other attributes. Table 3 lists various fragile watermarking techniques. These techniques are able to
detect any tampering introduced in the dataset. The watermark acts as a signature and is prepared from the
dataset. It is either embedded, which introduces a distortion in the dataset, or saved with a trusted third party
(TTP), which makes the technique distortion-free. The granularity at which a modification can be localized varies
with the technique.
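The distortion-free variant described above can be sketched as follows. This is an illustrative assumption, not the method of any technique in Table 3: tuples are assigned to groups by a keyed hash of an assumed `id` primary key, one HMAC signature is computed per group, and the signatures are kept outside the database (for example with a trusted third party). The function names and the grouping rule are hypothetical.

```python
import hashlib
import hmac


def group_of(secret_key: bytes, primary_key: str, num_groups: int) -> int:
    """Assign a tuple to a group using a keyed hash of its primary key."""
    digest = hmac.new(secret_key, primary_key.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % num_groups


def group_signatures(rows, secret_key, num_groups):
    """Return {group index: hex signature}; the rows themselves are not changed."""
    groups = {}
    for row in rows:
        index = group_of(secret_key, str(row["id"]), num_groups)
        groups.setdefault(index, []).append(row)
    signatures = {}
    for index, members in groups.items():
        mac = hmac.new(secret_key, digestmod=hashlib.sha256)
        # Sort by primary key so that tuple order does not affect the signature.
        for row in sorted(members, key=lambda r: str(r["id"])):
            mac.update(repr(sorted(row.items())).encode("utf-8"))
        signatures[index] = mac.hexdigest()
    return signatures


def tampered_groups(rows, secret_key, num_groups, stored):
    """Return the sorted list of groups whose signature no longer matches."""
    current = group_signatures(rows, secret_key, num_groups)
    return sorted(g for g in set(stored) | set(current) if stored.get(g) != current.get(g))
```

In this sketch a modification is localized only to the group that contains it, and choosing more groups gives finer localization at the cost of more stored signatures.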
Based on our analysis and prior research, we found that most of the work is focused on traditional
relational databases. The existing techniques have served the basic purpose of technological protection of
digital content well. However, they need to be further improved by increasing the degree of robustness against
malicious attacks, minimizing the distortions produced by the process of watermarking itself and enhancing
the reliability with which proof of ownership is established in case of conflicts.
[Fig. 3 panel labels: Tuple (all; multiple but not all), Attribute (single; multiple), Bits (single; multiple)]
Table 1: Summary and comparison of robust watermarking techniques targeting numeric attributes.
| Proposed scheme | Features | Weakness |
|---|---|---|
| Agrawal et al. [2][24] | 1. Identified need of watermarking relational databases; 2. Single attribute and single bit (SASB) selection approach | 1. Single bit-watermark; 2. Primary key (P.K) dependent technique; 3. Not resilient to subset deletion, attribute re-order attack |
| Xinchun et al. [25] | 1. Weighted algorithm based attribute selection; 2. SASB approach | 1. Prone to bit attacks; 2. P.K dependent technique |
| Farfoura et al. [26] | 1. Reversible technique; 2. Use of time stamping protocol; 3. Single attribute and multiple bit (SAMB) selection | 1. P.K dependent technique; 2. Not resilient to attribute re-order attack; 3. Linear transformation attack not addressed |
| Zhou et al. [27] | 1. Random changes not visible; 2. Use of TTP; 3. SASB selection approach | 1. Not resilient to subset deletion attack; 2. Prone to bit attacks |
| Sun et al. [28] | 1. Multiple images are embedded; 2. SASB selection approach | 1. P.K dependent technique; 2. Recognizable common pattern/correlation in selection (same value used for tuple, attribute and bit selection) |
| Wang et al. [29] | 1. Use of owner’s voice as watermark; 2. SASB selection approach | 1. Biometrics is not systematically addressed; 2. Recognizable common pattern in selection (same parameter used for tuple, attribute and bit selection); 3. No technique outlined for handling false claims of ownership |
| Sion et al. [7] | 1. Partition statistics are used; 2. Usability constraints considered; 3. P.K independent; 4. SAMB selection | 1. Prone to synchronization errors as marker tuples are used; 2. Not resilient to subset deletion and alteration attack; 3. No clear systematic approach for data manipulation |
| Shehab et al. [30] | 1. Partitioning without marker tuples; 2. Optimization based technique; 3. Multiple attributes and multiple bits (MAMB) selection | 1. Computationally not efficient; 2. Linear transformation attack not addressed; 3. P.K dependent technique |
| Khanduja et al. [31] | 1. Optimization based technique; 2. Use of Bacterial Foraging algorithm; 3. MAMB selection | 1. Technique applicable to only numeric attributes |
| Iftikar et al. [32] | 1. Reversible watermarking technique; 2. Optimal watermark creation through Genetic Algorithm; 3. All tuples, MAMB selection | 1. Semi-blind technique; 2. Not resilient to attribute deletion attack |
Table 2: Summary and comparison of robust watermarking techniques targeting other data types.
| Proposed scheme | Target | Watermark | Features | Weakness |
|---|---|---|---|---|
| Odeh et al. [33] | Time | Image | 1. Large bit capacity for watermark embedding as SAMB selection; 2. No effect on usability of data | 1. Not resilient to subset deletion attack as marker tuples are used; 2. Not resilient to subset alteration attack |
| Al-Haj et al. [34] | Text (non-numeric) | Image | 1. Large bit capacity for watermark insertion; 2. Extra spaces embedded; 3. SAMB selection | 1. Not resilient to subset alteration attack as double spaces are added; 2. Security issues not addressed, e.g. secret key not used |
| Hanyurwimfura et al. [35] | Non-numeric | Random bit-stream | 1. Watermark bit is embedded in multiword attribute with the lowest Levenshtein distance; 2. Large bit capacity as MAMB selection | 1. No discussion on watermark; 2. P.K dependent technique; 3. Not so secure technique as it is easily detectable |
| Sion et al. [36-37] | Robust | Bit string | 1. Identified the need of watermarking categorical data; 2. Error correcting code is applied; 3. SAMB selection | 1. Introduces distortions in categorical data; 2. P.K dependent technique; 3. Less resilient against malicious attacks |
Table 3: Summary and comparison of fragile watermarking techniques.
| Proposed scheme | Watermark | Data type | Features | Weakness |
|---|---|---|---|---|
| Li et al. [38] | Signature | Categorical | 1. Distortion-free technique; 2. Localization up to group level | 1. Not resilient to tuple re-ordering attack; 2. P.K dependent technique |
| Guo et al. [39] | Signature | Numeric | 1. Partitioning based technique; 2. Localization up to tuple level | 1. Perturbations in LSBs not detectable; 2. P.K dependent technique |
| Khan et al. [40] | Signature | Categorical | Distortion-free technique | 1. P.K dependent technique; 2. Prone to attribute-value substitution attack |
| Khataeimaragheh et al. [41] | Signature | Numeric | 1. Localization of the tampered data; 2. Technique to recover data | 1. Fails to localize for two or more updates in different tuples/attributes; 2. Fails to recover when an attribute is deleted; 3. No focus on information recovery |
| L. Camara et al. [42] | Signature | Watermark not embedded in database | Distortion-free technique | 1. No discussion on watermark preparation; 2. P.K dependent technique |
| Kamel [43] | Secret number | Indirect embedding | Distortion-free technique based on reordering of tuples | 1. Not resilient to tuple alteration attack; 2. No discussion on watermark preparation |
5. SECURITY ANALYSIS
The surveyed techniques propose numerous processes for concealing watermarks. In most of them, the watermarks
are concealed within the database. We now analyse the security of such techniques in a generalized
setting. An analysis of various attacks is presented in [21]. In this work, we analyse an important aspect
of security by calculating the probability of finding the potential locations. Suppose an attacker, say Mallory,
tries to destroy the watermark in order to claim the database as hers. Under such circumstances, we analyse how
difficult it is for Mallory to find the positions of the embedded bits in order to alter the watermark. Another
important concern is the False Hit Rate. A watermarking model is considered more robust if its False Hit Rate is
lower. Section 5.1 analyses the security by calculating the probability of finding potential locations, and in
Section 5.2 we discuss the False Hit Rate for various watermarking techniques.
5.1. Probability to find Potential locations
We calculate the probability to find the potential locations based on the granularity levels of the various
embedding procedures illustrated in Fig. 3 and Tables 1 and 2. The watermark is converted to binary, and then one
or more bits of the watermark are embedded in the attribute value (also converted to binary form). Let the number
of bits in a watermark be $l$. The process requires a maximum of $l$ attributes to embed the complete watermark
once, considering a single bit concealed within an attribute. The positions where the watermark is to be concealed
are selected securely (i.e. using secret parameters). Let $\nu$ be the number of attributes in a database and
$\nu'$ the number of permissible attributes where the watermark can be concealed, such that $\nu' \le \nu$. $\nu'$
is decided by the owner of the database considering usability constraints. Let the attribute length be $\xi$ bits,
and let the number of permissible bits within $\xi$, where a watermark bit can be embedded without violating the
usability constraint of an attribute, be $\xi'$ such that $\xi' \le \xi$. Table 4 lists the other symbols used
herewith.
Table 4: Notations
| Symbol | Meaning / Explanation |
|---|---|
| $l$ | Length of watermark (in bits) |
| $\xi$ | Length of an attribute (in bits) |
| $\xi'$ | Number of permissible bits out of $\xi$ in an attribute where the watermark can be embedded |
| $\nu$ | Number of attributes in a database |
| $\nu'$ | Number of permissible attributes out of $\nu$ where the watermark can be embedded |
| $N$ | Number of tuples in a database |
| $a$ | Multiple attributes selected out of $\nu'$; attributes where embedding occurs, $a \le \nu'$ |
| $T$ | Multiple tuples out of $N$ where the watermark is embedded, $T \le N$ |
| $b$ | Multiple bits within a selected attribute to conceal watermark bits, $b \le \xi'$ |
Let us first calculate the probability to correctly select the target location within a single tuple.
The probability to correctly select the single watermarked attribute out of the $\nu'$ permissible attributes is
$$P_{SA} = \frac{1}{\nu'} \tag{1}$$
The probability to correctly select the bit position within a selected attribute, out of its $\xi'$ permissible bits, is
$$P_{SB} = \frac{1}{\xi'} \tag{2}$$
In certain techniques, multiple attributes are selected per tuple [44-45]. In such cases, the probability to
correctly select multiple attributes, say $a$ attributes out of $\nu'$, is
$$P_{MA} = \frac{1}{\binom{\nu'}{a}} \tag{3}$$
In certain techniques, multiple bit positions are selected within a selected attribute to embed multiple bits,
say $b$, of the watermark [44]. In such cases, the probability to correctly select the embedding locations out of
the $\xi'$ permissible bits is given by:
$$P_{MB} = \frac{1}{\binom{\xi'}{b}} \tag{4}$$
The total probability to correctly choose the single watermarked location within a single earmarked attribute of a
tuple is calculated using Eq. (1) and Eq. (2) as
$$P_{SASB} = P_{SA}\,P_{SB} = \frac{1}{\nu'\,\xi'} \tag{5}$$
Now, considering the entire database comprising $N$ tuples, the probability in Eq. (5) becomes:
$$P_{ATSASB} = \left(\frac{1}{\nu'\,\xi'}\right)^{N} \tag{6}$$
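As a worked check of Eq. (6), write $N$ for the number of tuples and $\nu'$, $\xi'$ for the numbers of permissible attributes and permissible bits (symbol names chosen here for illustration). For $N = 100$, $\nu' = 5$ and $\xi' = 4$:

```latex
\[
P_{ATSASB} = \left(\frac{1}{\nu'\,\xi'}\right)^{N}
           = \left(\frac{1}{5 \cdot 4}\right)^{100}
           = 10^{-100\,\log_{10} 20}
           \approx 10^{-130.10}
           \approx 7.9 \times 10^{-131}
\]
```

This agrees, up to rounding, with the entry 7.8E-131 reported in Table 6 for 100 tuples, and it shows why such probabilities are easier to handle on a logarithmic scale.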
This is the probability when all the tuples are used to embed the watermark [30, 34]. This yields more
potential locations for concealing watermark bits at the cost of more alterations in the database [21].
Now suppose the watermark is not embedded in all the tuples. Certain tuples, say $T$ out of the $N$ tuples in the
database, are selected for embedding the watermark using owner-selected secret parameters [2, 24-26, 29, 45]. We
now calculate the probability in each of the following cases:
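i. Case MTSASB: The probability to correctly select $T$ tuples out of $N$ (considering the single bit and
single attribute selection procedure, with $\nu'$ permissible attributes and $\xi'$ permissible bits) using
Eq. (1) and (2) is given by:
$$P_{MTSASB} = \frac{1}{\binom{N}{T}} \left(\frac{1}{\nu'} \cdot \frac{1}{\xi'}\right)^{T} = \frac{1}{\binom{N}{T}\,(\nu'\,\xi')^{T}} \tag{7}$$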
ii. Case MTSAMB: The probability to correctly select $T$ tuples out of $N$ (considering the multiple bits
and single attribute selection procedure, with $b$ of the $\xi'$ permissible bits marked in one of the $\nu'$
permissible attributes) using Eq. (1) and (4) is given by:
$$P_{MTSAMB} = \frac{1}{\binom{N}{T}} \left(\frac{1}{\nu'} \cdot \frac{1}{\binom{\xi'}{b}}\right)^{T} \tag{8}$$
iii. Case MTMAMB: The total probability to correctly select all the watermarked locations, considering
multiple bits per attribute ($b$ of $\xi'$) and multiple attribute selection ($a$ of $\nu'$) within the $T$
tuples selected out of $N$, is given by:
$$P_{MTMAMB} = \frac{1}{\binom{N}{T}} \left(\frac{1}{\binom{\nu'}{a}} \cdot \frac{1}{\binom{\xi'}{b}}\right)^{T} \tag{9}$$
The probability to detect potential locations in all four cases above, i.e. ATSASB, MTSASB, MTSAMB
and MTMAMB, is calculated and the values are plotted in Fig. 4. The graph shows that the probability that an
attacker without knowledge of the secret parameters finds the potential locations decreases as we move from
Eq. (7) for MTSASB to Eq. (8) for MTSAMB and then Eq. (9) for MTMAMB, and is lowest for Eq. (6) for ATSASB.
The simulation parameters are listed in Table 5.
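Table 5: Simulation values
| Simulation parameter | Value |
|---|---|
| Number of tuples, $N$ | 1000 |
| Permissible attributes, $\nu'$ | 5 (varied from 3 to 7) |
| Attributes marked per tuple, $a$ | 2 |
| Permissible bits, $\xi'$ | 4 (varied from 2 to 8) |
| Bits marked per attribute, $b$ | 2 |
| Marked tuples, $T$ | 50% of $N$ (varied from 30% to 80%) |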
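The four probabilities are far too small for floating-point arithmetic, so a reader who wants to reproduce the comparison can work with base-10 logarithms. The sketch below assumes that the Table 5 baseline means 1000 tuples, 5 permissible attributes, 4 permissible bits, 2 attributes and 2 bits marked, and 50% of tuples marked. The single-bit cases follow the tabulated values of Tables 6 and 7; modelling the multi-bit and multi-attribute cases with binomial coefficients is an assumption of this sketch, and all function and parameter names are illustrative.

```python
import math


def log10_comb(n: int, k: int) -> float:
    """log10 of the binomial coefficient C(n, k), computed via log-gamma."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)) / math.log(10)


def log10_probabilities(n_tuples, marked, perm_attrs, attrs, perm_bits, bits):
    """Return log10 of the location-guessing probability for each scheme."""
    tuple_choice = log10_comb(n_tuples, marked)  # choosing the marked tuples
    single = math.log10(perm_attrs * perm_bits)  # one attribute, one bit
    return {
        "ATSASB": -n_tuples * single,
        "MTSASB": -tuple_choice - marked * single,
        "MTSAMB": -tuple_choice
        - marked * (math.log10(perm_attrs) + log10_comb(perm_bits, bits)),
        "MTMAMB": -tuple_choice
        - marked * (log10_comb(perm_attrs, attrs) + log10_comb(perm_bits, bits)),
    }


baseline = log10_probabilities(
    n_tuples=1000, marked=500, perm_attrs=5, attrs=2, perm_bits=4, bits=2
)
for scheme, value in sorted(baseline.items(), key=lambda item: -item[1]):
    print(f"{scheme}: about 10^{value:.1f}")
```

Under these assumptions the exponents come out at roughly -950 (MTSASB), -1038 (MTSAMB), -1189 (MTMAMB) and -1301 (ATSASB), which reproduces the ordering described for Fig. 4.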
Fig. 4 Probability to select potential locations in case of ATSASB, MTSASB, MTSAMB and MTMAMB.
The probability to detect potential locations for single attribute, single bit selection in all tuples of the
database is calculated using Eq. (6). The variation in probability is recorded in Table 6 by varying the number of
tuples $N$, the number of permissible attributes $\nu'$ and the number of permissible bits $\xi'$. As the
number of tuples increases, the number of potential locations increases, thereby decreasing the probability.
Table 7 contains the probability in the case of MTSASB. The values are recorded at different values of the number
of marked tuples $T$.
Considering $T$ = 50% of $N$, we also plotted the variation in probability calculated using
Eqs. (6), (7), (8) and (9), shown in Fig. 5. It shows the decrease in the probability to detect potential
locations with the increase in $N$. An increase in the number of tuples increases the number of potential
locations; thus, the probability to detect them correctly further decreases in all four cases.
Table 6: Probability to detect potential locations for single attribute, single bit selection in all tuples of the
database by varying $N$, $\nu'$ and $\xi'$
| $P_{ATSASB}$ | $N$=100 | $N$=500 | $N$=1000 | $N$=1500 | $N$=5000 |
|---|---|---|---|---|---|
| ($\nu'$=7, $\xi'$=4) | 1.9E-145 | 2.6E-724 | 6.9E-1448 | 1.8E-2171 | 1.6E-7236 |
| ($\nu'$=5, $\xi'$=4) | 7.8E-131 | 3.0E-651 | 9.3E-1302 | 2.8E-1952 | 7.0E-6506 |
| ($\nu'$=3, $\xi'$=4) | 1.2E-108 | 2.5E-540 | 6.5E-1080 | 1.6E-1619 | 1.2E-5396 |
| ($\nu'$=5, $\xi'$=8) | 6.2E-161 | 9.3E-802 | 8.7E-1603 | 8.1E-2404 | 5.0E-8011 |
| ($\nu'$=5, $\xi'$=2) | 1.0E-100 | 1.0E-500 | 1.0E-1000 | 1.1E-1500 | 1.0E-5000 |
Table 7: Probability to detect potential locations for single attribute, single bit selection in selected tuples of
the database by varying $N$, $\nu'$, $\xi'$ and $T$: a) $T$=80%, b) $T$=50%, c) $T$=30% of $N$
a) Considering $T$=80% of $N$
| $P_{MTSASB}$, $T$=80% | $N$=100 | $N$=500 | $N$=1000 | $N$=1500 | $N$=5000 |
|---|---|---|---|---|---|
| ($\nu'$=7, $\xi'$=8) | 2.6E-161 | 2.2E-807 | 4.2E-1615 | 6.2E-2423 | 3.1E-8078 |
| ($\nu'$=5, $\xi'$=4) | 1.5E-125 | 1.6E-628 | 2.2E-1257 | 2.4E-1886 | 1.3E-6289 |
| ($\nu'$=5, $\xi'$=8) | 1.2E-149 | 6.2E-749 | 3.4E-1498 | 1.4E-2247 | 1.0E-7493 |
| ($\nu'$=5, $\xi'$=2) | 1.8E-101 | 4.1E-508 | 1.5E-1016 | 4.1E-1525 | 1.7E-5085 |
| ($\nu'$=3, $\xi'$=4) | 8.7E-108 | 8.8E-540 | 6.8E-1080 | 4.0E-1620 | 3.3E-5402 |
| ($\nu'$=7, $\xi'$=4) | 3.1E-137 | 5.7E-687 | 2.8E-1374 | 1.0E-2061 | 4.0E-6874 |
b) Considering $T$=50% of $N$
| $P_{MTSASB}$, $T$=50% | $N$=100 | $N$=500 | $N$=1000 | $N$=1500 | $N$=5000 |
|---|---|---|---|---|---|
| ($\nu'$=7, $\xi'$=8) | 3.8E-117 | 5.6E-587 | 2.9E-1174 | 1.0E-1761 | 2.2E-5874 |
| ($\nu'$=5, $\xi'$=4) | 8.8E-95 | 3.4E-475 | 1.1E-950 | 2.3E-1426 | 1.7E-4756 |
| ($\nu'$=5, $\xi'$=8) | 7.8E-110 | 1.9E-550 | 3.4E-1101 | 3.9E-1652 | 4.7E-5509 |
| ($\nu'$=5, $\xi'$=2) | 1.0E-79 | 6.2E-400 | 3.7E-800 | 1.3E-1200 | 6.6E-4004 |
| ($\nu'$=3, $\xi'$=4) | 1.0E-83 | 1.0E-419 | 9.5E-840 | 5.7E-1260 | 7.4E-4202 |
| ($\nu'$=7, $\xi'$=4) | 4.3E-102 | 1.0E-511 | 9.7E-1024 | 5.9E-1536 | 8.4E-5122 |
c) Considering $T$=30% of $N$
| $P_{MTSASB}$, $T$=30% | $N$=100 | $N$=500 | $N$=1000 | $N$=1500 | $N$=5000 |
|---|---|---|---|---|---|
| ($\nu'$=7, $\xi'$=8) | 1.2E-78 | 3.4E-394 | 6.4E-789 | 1.0E-1183 | 1.4E-3947 |
| ($\nu'$=5, $\xi'$=4) | 3.2E-65 | 4.1E-327 | 9.0E-655 | 1.8E-982 | 7.9E-3277 |
| ($\nu'$=5, $\xi'$=8) | 2.9E-74 | 2.8E-372 | 4.4E-745 | 6.2E-1118 | 2.2E-3728 |
| ($\nu'$=5, $\xi'$=2) | 3.4E-56 | 5.8E-282 | 1.8E-564 | 5.2E-847 | 2.7E-2825 |
| ($\nu'$=3, $\xi'$=4) | 1.4E-58 | 7.8E-294 | 3.2E-588 | 1.2E-882 | 4.6E-2944 |
| ($\nu'$=7, $\xi'$=4) | 1.3E-69 | 4.9E-349 | 1.3E-698 | 3.1E-1048 | 5.0E-3496 |
Fig. 5 Effect of total number of tuples on probability to detect watermarked locations
Similarly, the effect of the number of marked tuples, $T$, on the probability is plotted in Fig. 6. As $T$
increases, the number of potential locations increases, thereby decreasing the probability in all three cases.
However, MTMAMB is more dependent on $T$ as compared to MTSASB. The probability in the case of ATSASB does not
depend on $T$. It was also observed that this probability depends on the number of permissible attributes,
$\nu'$. Fig. 7 depicts the effect of $\nu'$ on the probability. The value of $\nu'$ is varied from 3 to 7 and the
probability is observed in each of the four cases. The probability in the case of MTMAMB shows the maximum
dependency on $\nu'$, as it shows the largest decrease in probability.
Next, the effect of the number of permissible bits, $\xi'$, on the probability is observed and plotted in Fig. 8.
The value of $\xi'$ is varied from 2 to 8 and the respective probability is recorded in each of the four cases.
The variation in probability is almost equal in the cases of MTSAMB and MTMAMB, showing equal dependency, which
can be verified from Eq. (8) and Eq. (9).
Fig. 6 Effect of total number of marked tuples on probability to detect watermarked locations
Fig. 7 Effect of total number of potential attributes on probability to detect watermarked locations
Fig. 8 Effect of total number of permissible bits within a selected attribute on probability to detect
watermarked locations
5.2. False Hit Rate
We now calculate the False Hit Rate to analyse the security. The False Hit Rate, $FHR$, is the probability of
extracting a valid watermark from a non-watermarked database. Our analysis is grounded upon the approach followed
in [2]. In [2], each watermark bit is embedded once. However, more recent work embeds a single watermark bit
repeatedly to make the system more robust against malicious attacks such as subset alteration, insertion and
deletion attacks. The final watermark bit is decided using majority voting among the multiple extracted bits. In
this work, we extend the calculation done in [2] to repetitive embedding of a watermark bit.
Let a watermark bit be embedded $\omega$ times in a database. Each bit extracted from a non-watermarked database
matches the original embedded watermark bit with probability $1/2$. The final watermark bit is decided
based on majority voting. We define $p_m$ as the probability of correctly extracting one watermark bit after
majority voting by sheer chance. This is equivalent to saying that $p_m$ is the probability that at least one more
than half of the $\omega$ bits can be detected from a non-watermarked relation by sheer chance. We now calculate
$p_m$ using the binomial distribution considering $\omega$ independent trials [46].
$$p_m = \sum_{k=\lfloor \omega/2 \rfloor + 1}^{\omega} b\left(k;\ \omega,\ \tfrac{1}{2}\right) \tag{10}$$
where $b(k; n, p)$ is the probability of obtaining exactly $k$ successes in $n$ Bernoulli trials with probability
$p$ for success and $q = 1 - p$ for failure, given by Eq. (11).
$$b(k; n, p) = \binom{n}{k}\, p^{k}\, q^{\,n-k} \tag{11}$$
We use $B(k; n, p)$, referred to as the cumulative binomial probability, to represent the probability of obtaining
at least $k$ successes in $n$ Bernoulli trials.
$$B(k; n, p) = \sum_{i=k}^{n} b(i; n, p) \tag{12}$$
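The binomial quantities above are straightforward to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, and it assumes that a non-watermarked database yields each extracted bit correctly with probability 1/2 and that a strict majority (at least one more than half) is required.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Probability of at least k successes in n Bernoulli(p) trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


def majority_match_probability(repeats: int) -> float:
    """Chance that a strict majority of `repeats` random bits match the mark bit."""
    return binomial_upper_tail(repeats // 2 + 1, repeats, 0.5)


for repeats in (1, 3, 4, 8, 20):
    print(repeats, round(majority_match_probability(repeats), 4))
```

For an odd number of repetitions the result is exactly 0.5 by symmetry. For an even number it is below 0.5, because ties do not count as a match, and it approaches 0.5 as the number of repetitions grows.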
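For a watermark of length $l$, the false hit rate $FHR$ is given by:
$$FHR = B(\tau l;\ l,\ p_m) \tag{13}$$
where $l$ is the length of the watermark in bits, $p_m$ is the probability of correctly extracting one watermark
bit by sheer chance after majority voting, and $\tau$ is the watermark threshold such that if $\tau l$ watermark
bits out of $l$ bits correctly match the original watermark, we can claim that the suspected database is ours. The
number of times each watermark bit is embedded, $\omega$, will vary for different embedding schemes as follows,
where $N$ is the number of tuples, $T$ the number of marked tuples, $a$ the number of attributes marked per tuple
and $b$ the number of bits marked per attribute.
In the case of ATSASB,
$$\omega = \frac{N}{l} \tag{14}$$
For MTSASB,
$$\omega = \frac{T}{l} \tag{15}$$
For MTSAMB,
$$\omega = \frac{T\,b}{l} \tag{16}$$
And in the case of MTMAMB,
$$\omega = \frac{T\,a\,b}{l} \tag{17}$$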
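The false hit rate calculation can be sketched end to end as follows. Several points are assumptions of this sketch rather than statements from the text: the number of repetitions is taken as the number of embedded bit slots divided by the watermark length (integer part), the threshold is rounded up to a whole number of bits, and the example values (watermark length 50, 1000 tuples, 500 marked tuples, 2 attributes and 2 bits per marked tuple) are hypothetical. All names are illustrative.

```python
from math import ceil, comb


def upper_tail(k: int, n: int, p: float) -> float:
    """Probability of at least k successes in n Bernoulli(p) trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def repetitions(scheme, wm_len, n_tuples, marked=0, attrs=1, bits=1):
    """Assumed number of times each watermark bit is embedded under a scheme."""
    slots = {
        "ATSASB": n_tuples,               # one bit in every tuple
        "MTSASB": marked,                 # one bit in each marked tuple
        "MTSAMB": marked * bits,          # several bits in one attribute
        "MTMAMB": marked * attrs * bits,  # several bits in several attributes
    }[scheme]
    return slots // wm_len


def false_hit_rate(wm_len: int, tau: float, omega: int) -> float:
    """Chance that at least tau * wm_len majority-voted bits match by accident."""
    p_vote = upper_tail(omega // 2 + 1, omega, 0.5)
    needed = ceil(tau * wm_len - 1e-9)  # small guard against float round-off
    return upper_tail(needed, wm_len, p_vote)


params = dict(wm_len=50, n_tuples=1000, marked=500, attrs=2, bits=2)
for scheme in ("ATSASB", "MTSASB", "MTSAMB", "MTMAMB"):
    omega = repetitions(scheme, **params)
    print(scheme, omega, false_hit_rate(params["wm_len"], 0.6, omega))
```

With these hypothetical values ATSASB and MTSAMB embed each bit the same number of times and therefore share one false hit rate, MTSASB gives the lowest rate and MTMAMB the highest, which is the same ordering as the reported values.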
We took   and plotted the graph by varying. Fig. 9 illustrates
that false hit rate is monotonically decreasing with increase in threshold. We observed the effect of on 
and plotted the values in same graph. It is revealed that  increases with increase in. Table 7 shows the
values obtained. is maximum in case of MTMAMB, resulting in maximum redundancy and thus  also gets
maximum value. The watermarking model is considered best , if it has minimum False Hit Rate. False Hit Rate
can be decreased by lowering
Table 7. False Hit Rate of each scheme at threshold 0.6
Scheme      False Hit Rate
ATSASB      0.0051
MTSASB      0.00108
MTSAMB      0.0051
MTMAMB      0.0149
Fig. 9 Effect of threshold on False Hit Rate for all four watermarking approaches.
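As an illustration of the false hit rate calculation in equations (10) to (13), the following Python sketch evaluates the chance that a majority-voted bit matches by accident and then the cumulative binomial tail over the whole watermark. The function names, the watermark length of 64 bits, the redundancy of 5 copies per bit and the rounding of the threshold count up to the next integer are assumptions made for illustration; they are not the settings used to produce Table 7 or Fig. 9.

```python
from math import ceil, comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli trials, as in equation (11)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(k, n, p):
    """Probability of at least k successes in n Bernoulli trials, as in equation (12)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def majority_bit_probability(redundancy, p=0.5):
    """Chance that more than half of the embedded copies of a bit match by accident."""
    return binom_tail(redundancy // 2 + 1, redundancy, p)


def false_hit_rate(length, threshold, redundancy):
    """Chance that at least ceil(threshold * length) watermark bits match by accident.

    Rounding the required number of matching bits up is an assumption of this sketch.
    """
    p_bit = majority_bit_probability(redundancy)
    return binom_tail(ceil(threshold * length), length, p_bit)


if __name__ == "__main__":
    # Hypothetical settings: 64-bit watermark, each bit embedded 5 times.
    for tau in (0.5, 0.6, 0.7, 0.8):
        print(tau, false_hit_rate(64, tau, redundancy=5))
```

Under these assumed settings, the loop prints a rate that falls as the threshold grows, because the cumulative tail can only shrink when more matching bits are required.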
6. FUTURE DIRECTIONS
Databases that populate the web need reliable technological measures for their protection. We have
identified four security aspects that need powerful combative measures for protection.
(i) Ownership Proof: To provide evidence in ownership disputes, the watermark is securely concealed within
the database. The concealed watermark should be robust to various malicious attacks.
(ii) Tamper Detection: To detect perturbations in the database, a watermark is concealed
within it. The embedded watermark should be fragile to various malicious attacks.
(iii) Information Recovery: The information contained within a database must be protected. The watermark
must contain the recovery information needed to restore information lost from a database.
(iv) Authentication: To prove the authenticity of the sender, the identity of the sender/owner is embedded as a
watermark within the database. This is achieved by embedding a biometric trait of the owner as the watermark.
Significant work has been done on digital watermarking of databases for the first two security
aspects, ownership proof and tamper detection [24-40]. We observed that statistical watermarking techniques
need to be improved for better applicability [7, 30, 31]. We have identified lacunae in existing techniques and
enumerated them in Tables 1, 2 and 3. We found that it is worth examining these techniques for the challenging
application of ownership proofs. This gives us the objective of increasing the reliability of existing robust
watermarking techniques.
There is ample opportunity to explore new digital watermarking techniques that cater to the requirements
of emerging applications. We found that a watermark can be fruitfully applied as an information carrier to address
the other two security issues, namely information recovery and reliable authentication of content providers.
Therefore, we venture to address these additional security aspects.
Some databases, such as those of meteorological departments, medicine, the military and transportation, contain
critical attributes. These high-risk databases need to be preserved. Moreover, the information in the critical
attributes is subject to various malicious attacks that threaten their ownership. This motivates the need to
develop a scheme that recovers lost information and also resolves ownership issues.
A fragile watermarking technique for recovering data is proposed by Khataeimaragheh et al. [41]. The
proposed scheme can detect and correct perturbations in relational databases (RDBs). However, the probability of
accurately rectifying errors reduces drastically once the number of errors exceeds two [41]. Another technique in
the literature deciphers data in terms of the information it represents [44]. It recovers the information
lost from altered as well as deleted data, irrespective of the number of errors that occurred in the database [44].
Moving in the same direction, we highlight the utility of a watermark as an information carrier to
address another security issue, namely authenticating the owner or content provider of a database. Scant attention
has been given to exploring the potential of biometrics for establishing ownership of digital databases. Wang
et al. suggested the use of the owner's voice to establish ownership [29]. This technique was improved in [45],
which proposes a robust technique that protects ownership rights with a high degree of accuracy. The
technique embeds watermarks in multiple attributes, thus enabling better resilience than
[29]. A relative evaluation approach is applied by comparing the scores of the voice samples of the contenders;
thus the owner is identified even if the extracted watermark is of degraded quality. The thrust lies in
investigating multiple biometrics to improve the reliability of ownership proof in noisy environments.
Further, as we move towards distributed, real-time and mobile applications, the time taken to embed a
watermark must be reduced. The challenge is to consider the protection of emerging web databases, such as
object-oriented and object-relational web databases, that have started gaining popularity and are outsourced or
shared on the web. Further, dynamic web databases such as NoSQL databases are being widely adopted for large-scale
data applications and social networking sites. Since traditional database systems were not designed or
optimized for such enormous volumes of data with complex data types and structures, the need for Not Only SQL
(NoSQL) databases increased. These databases may forgo some features of traditional relational databases
in favour of better horizontal scaling, which is a problem for RDBMSs, or of schema-less document-based
objects, which allow capturing complex structures such as data coming from several different sensors and
also allow faster access in some cases. They need urgent protection against piracy and integrity losses, as this
seems to have escaped the attention of researchers.
7. CONCLUSION
The area of digital watermarking is rife with challenges, and ample research is still ongoing. We
emphasized granularity level and thus categorized watermarking techniques into four types: i) ATSASB, ii)
MTSASB, iii) MTSAMB and iv) MTMAMB. We have analysed the security of watermarking techniques
with respect to the ability to locate watermarked positions within a database without knowledge of the secret
parameters. We have theoretically analysed its dependency on four parameters. An increase in
the number of tuples increases the number of potential locations; thus the probability of detecting them
correctly decreases further in all four cases. For a secure system, this probability should be as low as possible.
We further calculated the false hit rate of the watermarking techniques based on their categories. The greater the
redundancy, the higher the false hit rate and the less robust the approach.
Our research shows multiple directions that are worth addressing in future endeavours.
1. There is a need to design suitable watermarking techniques targeting emerging web databases to provide
protection against various security aspects.
2. Future work can focus on exploiting watermarking for extracting, and thus regenerating, the
lost/perturbed information concealed within a database.
3. The use of watermarking can be extended to address additional security issues such as authentication and
information recovery.
Given these promising research directions, the domain of watermarking can be further enriched with a thrust
on database security.
References
[1] Gupta V, (2000) Legal Protection of Databases, Malaysian J. of Library & Information Science, 5 (2), 19-
29.
[2] Agrawal R, Haas PJ, Kiernan J (2003) Watermarking relational data: framework, algorithms and analysis,
The VLDB J., 12(2), 157-169.
[3] Cox IJ, Miller ML, Bloom JA (2002) Digital Watermarking, Morgan Kaufmann Publ., by Academic Press.
[4] Chan C, Cheng L (2004) Hiding Data in Images by simple LSB substitution, Pattern recognition, 37(3),
469-474. doi: 10.1016/j.patcog.2003.08.007.
[5] Lie W, Chang L (2006) Robust and high quality time domain audio watermarking based on low frequency
amplitude modification, IEEE Transactions on Multimedia, 8(1), 46-59. doi: 10.1109/TMM.2005.861292.
[6] Chan PW, Lyu MR (2003) A DWT based Digital Video watermarking scheme with error correcting code,
In proceedings of Inf. & comm. secur., LNCS, 2836, 202-213.
[7] Sion R, Atallah M, Prabhakar S (2004) Rights Protection for Relational Data, IEEE Transactions on Knowl.
& Data Eng. 16(12), 1509-1525.
[8] Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection
of databases Official Journal L 077, 27/03/1996 pp: 20-28.
Available at: http://eurlex.europa.eu/LexUriServ/LexUriServ.do?Uri=CELEX:31996L009: EN:HTML.
[9] Black’s Law Dictionary (2009) 9th Edition, Database, © Thomson Reuters.
[10] United States Copyright Office, “Legal protection for databases”, August 1997. Available at:
http://www.copyright.gov/reports/dbase.html.
[11] Maggon H. Legal Protection of Databases: An Indian Perspective, 11, Journal of INTELL. PROP.
RIGHTS, 140, pp: 140-144, 2006, http://nopr.niscair.res.in/bitstream
/123456789/3559/1/JIPR%2011(2)%20140-144.pdf, http://papers.ssrn.com/sol3/papers. cfm?abstract_id
=1398303.
[12] Saksena, Hailshree, Doctrine of Sweat of the Brow (May 3, 2009). Available at SSRN:
http://ssrn.com/abstract=1398303 or http://dx.doi.org/10.2139/ssrn.1398303.
[13] Sinha S, Bench S, Sikri A. Eastern Book Company And Ors. vs D.B. Modak & Ors. & Mr. Navin J. on 27
September, 2002, http://www.indiankanoon.org/doc/377266/.
[14] Traditional Knowledge Digital Library, Collaborative project of CSIR & Dept. of AYUSH. Available at:
http://www.tkdl.res.in/tkdl/langdefault/common/Home.asp?
[15] WIPO News and Events, WIPO and India Partner to Protect Traditional Knowledge from
Misappropriation, PR/2011/682, Geneva/Delhi, 22.03.2011, Available
at: http://www.wipo.int/pressroom/en/articles/2011/article_0008.html
[16] Indian Copyright act amendment (2012) www.wipo.int/edocs/lexdocs/laws/en/in/ in066een.pdf.
[17] WIPO, WIPO Copyright Treaty, Adopted in Geneva on Dec 20, 1996. Available at:
http://www.wipo.int/treaties/en/ip/wct/trtdocs_wo033.html
[18] Digital Millennium Copyright Act, To amend title 17, United States Code, to implement the World
Intellectual Property Organization Copyright Treaty and Performances and Phonograms Treaty, and for
other purposes, Public Law 105-304, Oct-28, 1998.
[19] Australian Government ComLaw, Copyright amendment act-2006, C2006A00158, Available at:
http://www.comlaw.gov.au/Details/C2006A00158
[20] Halder R, Pal S, Cortesi A (2010) Watermarking techniques for relational databases: Survey, classification
and comparison, J. of Universal Comp. Science, 16(21), 3164-3190.
[21] Khanduja V, Chakraverty S, Verma OP (2016) Ownership and Tamper detection of Relational Data:
Framework, Techniques and Security Analysis, published as the chapter in the book titled: Embodying
Intelligence in Multimedia Data Hiding at Science gate Publishing, pp:21-36,
DOI: 10.15579/gcsr.vol5.ch2.
[22] Mehta BB, Aswar HD (2014) Watermarking for security in database: A review, In Proceedings of IEEE
IT in Business, Industry & Government (CSIBIG), pp: 1-6.
[23] Xie MR, Wu CC, Shen JJ, Hwang MS (2016) A Survey of Data Distortion Watermarking Relational
Databases, International Journal of Network Security, vol. 18(6),1022-1033.
[24] Agrawal R, Kiernan J (2002) Watermarking Relational Databases, In proceedings of 28th VLDB
conference, China, 155-166.
[25] Xinchun C, Xiaolin Q, Gang S (2007) A Weighted Algorithm for Watermarking Relational Databases,
Wuhan University J. of Natural Sciences, 12(1), 79-82.
[26] Farfoura ME, Horng SJ, Lai JL, Run RS, Chen RJ, Khan MK (2012) A blind reversible method for
watermarking relational databases based on a time-stamping protocol, Expert Systems with Applications,
39(3), 3185-3196.
[27] Zhou X, Huang M, Peng Z (2007) An additive-attack-proof watermarking mechanism for databases’
copyrights protection using image, In Proceedings of ACM symposium on applied computing: 254-258.
[28] Sun J, Cao Z, Hu Z (2008) Multiple watermarking relational databases using image, In Proceedings of
IEEE Multimedia and Inf. Technology :373-376.
[29] Wang H, Cui X, Cao Z (2008) A Speech Based Algorithm for Watermarking Relational Databases, In
Proceedings of Int. Symposium on Inf. Processing, 603-606.
[30] Shehab M, Bertino E, Ghafoor A (2008) Watermarking Relational Databases Using Optimization-Based
Techniques, IEEE Transactions on Knowl. & Data Eng., 20(1): 116-129.
[31] Khanduja V, Verma OP, Chakraverty S (2015) Watermarking Relational Databases using Bacterial
Foraging Algorithm, Multimedia tools & Applications, Springer, 74(3): 813-839, DOI: 10.1007/s11042-
013-1700-9.
[32] Iftikhar S, Kamran M, Anwar Z (2015) RRW-A robust and reversible watermarking technique for
relational data, IEEE Transactions on Knowledge and Data Engineering, 27(4): 1132-1145.
[33] Odeh A, Al-Haj A (2008) Watermarking Relational Database Systems, In Proceedings of the Applications
of Digital Inf. & Web Tech. :270-274.
[34] Al-Haj Ali, Odeh A (2008) Robust and blind watermarking of relational database systems, J. of Computer
Science, 4(12),1024-1029.
[35] Hanyurwimfura D, Liu Y, Liu Z (2010) Text format based relational database watermarking for non-
numeric data, In Proceedings of IEEE ICCDA: 312-316.
[36] Sion R (2004) Proving ownership over categorical data, In Proceedings of ICDE: 584-595.
[37] Sion R, Atallah M, Prabhakar S (2005) Rights protection for categorical data, IEEE Transactions on Knowl.
& Data Eng., 17: 912-926.
[38] Li Y, Guo H, Jajodia S (2004) Tamper detection and localization for categorical data using fragile
watermarks, In Proceedings of ACM workshop on Digital Rights Management:73-82.
[39] Guo H, Li Y, Liu A, Jajodia S (2006) Fragile watermarking scheme for detecting malicious modifications of
database, Elsevier, Information Sciences, 176(10): 1350-1378.
[40] Khan A, Husain SA (2013) A fragile zero watermarking scheme to detect and characterize malicious
modifications in database relations, The Scientific World J., Article ID 796726: 1-16.
[41] Khataeimaragheh H, Rashidi H (2010) A Novel Watermarking Scheme for Detecting and Recovering
Distortions in Database Tables, Int. J. of Database Management Systems, 2(3): 1-11.
[42] Camara L, Li J, Li R, Xie W (2014) Distortion-Free Watermarking Approach for Relational Database
Integrity Checking, Hindawi Publishing Corporation, Mathematical Probl. in Eng.:1-10 DOI:
http://dx.doi.org/10.1155/2014/697165.
[43] Bhattacharya S, Cortesi A (2009) A distortion free watermark framework for relational databases, In
Proceedings of Software & Data Technologies, 2: 229-234.
[44] Khanduja V, Chakraverty S, Verma OP (2016) Enabling Information Recovery with Ownership using
Robust Multiple Watermarks, J. of Inf. Secur. & Applications, Elsevier, 29: 80-92. DOI:
10.1016/j.jisa.2016.03.005.
[45] Khanduja V, Chakraverty S, Verma OP, Singh N (2014) A Scheme for Robust Biometric Watermarking in
Web Databases for Ownership Proof with Identification, In Proceedings of Active Media Technology,
LNCS 8610, 212-215.
[46] Schneier B. (2008) Applied Cryptography, protocols, algorithms and source code in C, 2nd edn. Wiley-
India.
... While demand for the use of databases is growing, pirated copying has become a severe threat to such databases due to the low cost of copying and the high values of the target databases. In that context, the security of published or shared data has become a great concern for data owners as the creation of database involves intellectual and financial effort [1]. ...
... Most of the distortion-embedding watermarking schemes are irreversible. at is, they do not allow original data recovering from the watermarked relation [1][2][3][4]. erefore, although some distortion-free schemes exist in the literature [1][2][3][4][5][6][7][8][9], there is a need to develop a new generation of watermarking techniques that could efficiently address the abovementioned issues. To achieve that, reversible or lossless digital watermarking has been extended from multimedia assets to relational data to allow original data recovery at watermark detection. ...
... at is, they do not allow original data recovering from the watermarked relation [1][2][3][4]. erefore, although some distortion-free schemes exist in the literature [1][2][3][4][5][6][7][8][9], there is a need to develop a new generation of watermarking techniques that could efficiently address the abovementioned issues. To achieve that, reversible or lossless digital watermarking has been extended from multimedia assets to relational data to allow original data recovery at watermark detection. ...
Article
Full-text available
The protection of database systems content using digital watermarking is nowadays an emerging research direction in information security. In the literature, many solutions have been proposed either for copyright protection and ownership proofing or integrity checking and tamper localization. Nevertheless, most of them are distortion embedding based as they introduce permanent errors into the cover data during the encoding process, which inevitably affect data quality and usability. Since such distortions are not tolerated in many applications, including banking, medical, and military data, reversible watermarking, primarily designed for multimedia content, has been extended to relational databases. In this article, we propose a novel prediction-error expansion based on reversible watermarking strategy, which not only detects and localizes malicious modifications but also recovers back the original data at watermark detection. The effectiveness of the proposed method is proved through rigorous theoretical analysis and detailed experiments.
... It gives proof of ownership in the data so one can not trust the data if watermarks are tampered with [72]. According to the type of document to be watermarked, watermarking techniques may be classified into four categories: Text watermarking, Image watermarking, Audio watermarking, Video watermarking. ...
... Robust watermark: The watermarks are embedded such that they are robust against various attacks and can be easily recovered back at receiver end. They act as proof of ownership in case of any dispute among parties [72]. Fragile watermark: The watermarks are embedded such that even with slightest change in data, they get tampered and thus are used to provide integrity of data [73]. ...
Article
Full-text available
Despite the global telemedicine market’s significant growth from USD 27.04 Billion in 2019 to an anticipated USD 171.81 Billion by 2026, challenges such as secure data transfer, interoperability,data storage, integrity, cost, and complexity persist. In this manuscript, a structured and systematic review of techniques used to implement secure telemedicine is carried out.The main aim of this manuscript is to acquire knowledge about the latest technologies or researches which are used in the domain of telemedicine to make it more robust and secure. To the best of our knowledge, bifurcation based on security goals is carried out for the first time in the domain of telemedicine. We systematically classify and summarize existing research on techniques for achieving five key security goals: Authentication, Data Confidentiality, Authorization, Data Integrity, and Data Storage. Each category is further subdivided based on the applied techniques. Research gaps have been identified in the current state of the art and future steps to secure medical data and to achieve data integrity by following a pursuit of Cloud Storage and Blockchain network have been suggested.This comprehensive survey seeks to thoroughly investigate many scientific concepts, state-of- the-art, and innovative methodologies and implementations.
... A number of survey papers [3]- [11] already exist in the literature, which provides a comprehensive summary of different techniques and their comparison. Authors in [3] elaborated the features of the relational databases, application of digital watermarking, attack analysis of the then existing distortion-based and distortion-free watermarking techniques. ...
... A recent survey on multimedia and database watermarking is reported in [7] where, in addition to different multimedia artifacts, a comparative summary of only nine existing database watermarking techniques is presented. Other significant works related to the survey of relational database watermarking include [8]- [11]. ...
Article
Full-text available
Digital watermarking is considered one of the most promising techniques to verify the authenticity and integrity of digital data. It is used for a wide range of applications, e.g., copyright protection, tamper detection, traitor tracing, maintaining the integrity of data, etc. In the past two decades, a wide range of algorithms for relational database watermarking has been proposed. Even though a number of surveys exist in the literature, they are unable to provide insightful guidance to choose the right watermarking technique for a given application. In this paper, we provide an exhaustive empirical study and thorough comparative analysis of various relational database watermarking techniques in the literature. Our work is different from the existing survey papers as we consider both distortion-based and distortion-free techniques along with a rigorous experimental analysis demonstrating a detailed comparison on robustness, data usability, and computational cost with considerable empirical evidence.
... Khanduja [35] in this work has focused on security analysis of the work done in the field of database watermarking techniques. Here author has categorized the watermarking systems into four kinds, (i) ATSASB (all tuples, single attribute and single bit), (ii) MTSASB (multiple tuples, single attribute and single bit), (iii) MT-SAMB (multiple tuples, single attribute, and multiple bits) and (iv) MTMAMB (multiple tuples, multiple attributes, and multiple bits. ...
... Multi-attribute techniques, a common limitation is that often they define a fixed number of attributes for embedding the marks [68]. Large volume and redundant data is the major challenge for database watermarking [35]. Difference expansion based watermarking techniques is one of the major watermarking technique for database and is unable to increase the capacity of the relation without distortion tolerance. ...
Article
Full-text available
In today’s digital era, it is very easy to copy, manipulate and distribute multimedia data over an open channel. Copyright protection, content authentication, identity theft, and ownership identification have become challenging issues for content owners/distributors. Off late data hiding methods have gained prominence in areas such as medical/healthcare, e-voting systems, military, communication, remote education, media file archiving, insurance companies, etc. Digital watermarking is one of the burning research areas to address these issues. In this survey, we present various aspects of watermarking. In addition, various classification of watermarking is presented. Here various state-of-the-art of multimedia and database watermarking is discussed. With this survey, researchers will be able to implement efficient watermarking techniques for the security of multimedia and database.
... SUMMARY AND COMPARISON OF FRAGILE WATERMARKING TECHNIQUES[23]. ...
... Also, sometimes they are mixed with linen, with a percentage not exceeding 25% to reduce the production cost (Luj an-Ornelas et al., 2018;Griffin et al., 2018). Usually, the substrates are coated by ink layers of different features and colors and one or two threads are inserted into them -nylon bar and magnet bar for visual inspection and protection, respectively (Hardwick et al., 2001;Khanduja, 2017). Although more than thirty countries in the world are using polymer instead of cotton for production of the substrate because of economic considerations, to increase the element of safety and recyclability, to reduce printing cost by~20%, also because polymer banknotes show higher peel strength, may yield greater durability and ensure an increased lifetime in intensive use during their circulation (Downham et al., 2018;Varenberg et al., 2014), but the rest of the countries are still using cotton in the banks because of political and economic considerations, for example, the need to introduce some adjustments to the production line (https://www.louisenthal.com). ...
Article
End-of-life cotton banknotes (ELCBs) is lignocellulosic waste rich in cellulose and usually disposed of by combustion or incineration. In order to change these useless policies and integrate ELCBs into the circular economy strategy, this research aims to convert ELCBs into high value-added energy products that could be used to achieve self-sufficiency from energy in some sectors of the banknote industry. Pyrolysis was used to achieve this goal and the experiments were performed on the basis of the concept of Technology Readiness Levels (fundamentals and pilot level). The fundamental pyrolysis was performed in nitrogen in the different heating scope of 5–30 °C/min using Differential thermal analysis/Thermogravimetric analysis/3D-Fourier-Transform Infrared spectroscopy in order to determine thermal decomposition, chemical decomposition, and pyrolysis reaction kinetics of ELCBs. The kinetic parameters were estimated using model-free methods, including Kissinger-Akahira-Sunose (KAS), Flynn-Wall-Ozawa (FWO), and Friedman method. The pilot pyrolysis experiments were implemented in a mini pyrolysis power plant built especially for this purpose under the conditions that achieve maximum activation energy (25 °C/min) up to 500, 600, and 700 °C. The built plant had a capacity of 250 g and consisted of three integrated units: conversion pyrolysis reactor, gas collection and purification, and gas monitoring. XRD, Gas chromatography-mass spectrometry, and SEM-EDS were used to analyze and examine the feedstock and obtained energy products. The fundamental results showed that the maximum thermal decomposition of ELCBs is located in the range 383–410 °C with mass loss 70 wt% and maximum activation energy 250 kJ/mol at 25 °C/min, while the pilot results under the optimum conditions showed that the suggested strategy can be generated pyrolysis product yield; 40% of bio-oil, 44% of bio-gases, and 17.8% of char from ELCBs with conversion rate 82.2%.
Chapter
Earlier days, data is basic in numerous fields like medication, scientific applications, where databases are used beneficially for information sharing. Be that as it may, the databases face the danger of being pilfered, taken or mishandled, which could end in huge amounts of security dangers concerning proprietorship rights, information tinkering and protection insurance. Watermarking is used to install possession rights on social databases being shared. Various strategies that support reversible watermarking strategies are proposed as of past due to display the privileges of holders close by getting better particular information. Most progressive techniques change the essential information to an immense degree, bringing about information quality degeneration, and can't accomplish great harmony between vigor against horrendous assaults and information securing. A stable and reversible database watermarking technique, Genetic Algorithm and Histogram Shifting Watermarking (GAHSW), is proposed for numerical social information. This is used to choose the first-rate riddle key for social occasion database, wherein the watermarking will be added with balanced winding and space. The histogram of the envisioning botch is acclimated to embed the watermark with better strength. Primer outcomes uncover the viability of GAHSW and show that it outperforms the advanced techniques as far as heartiness against resentful attack and defending of information quality.KeywordsReversible watermarkingData partitioningRelational numerical databaseHistogram shiftingTuples selection
Article
In this paper, we propose an efficient distortion-free watermarking of large-scale data sets in various formats by exploiting the power of parallel and distributed computing environment. In particular, we adapt MapReduce, Pig and Hive paradigms for the data in CSV, XML and JSON formats by identifying key computational steps involved in the sequential watermarking algorithms. Following this, we design a middleware which allows watermark generation and verification (under any computing paradigm of user’s choice) of large-scale data sets (in any suitable format of user’s interest) and their conversion without affecting the watermark. The experimental evaluation on large-scale benchmark data sets shows a significant reduction of watermark generation and verification times. Interestingly, in case of XML and JSON formats, Pig and Hive outperform the MapReduce paradigm, whereas MapReduce shows better performance in case of CSV format. To the best of our knowledge, this is the first proposal towards large-scale data sets watermarking, considering popular distributed computing paradigms and data formats.
Article
Full-text available
While data is used in cooperative milieus for information extraction; Thus, it is vulnerable to security threats concerning ownership rights and data abusing. Due to unauthorized access to the data that may alter the originality, it results in significant losses of the organization. The relational databases which are free on-hand are used by research society for mining new information regarding their research works. These databases are vulnerable to security issues. The reliability of the data source must be authenticated before using it for any application purpose. Thus, to check the ownership and reliability of data, watermarking is applied to the data. Watermarking is used for the protection of the possession rights of shared Relational Data and for providing the solution for manipulating and tampering of data.
Article
Full-text available
Nowadays, internet is becoming a suitable way of accessing the databases. Such data are exposed to various types of attack with the aim to confuse the ownership proofing or the content protection. In this paper, we propose a new approach based on fragile zero watermarking for the authentication of numeric relational data. Contrary to some previous databases watermarking techniques which cause some distortions in the original database and may not preserve the data usability constraints, our approach simply seeks to generate the watermark from the original database. First, the adopted method partitions the database relation into independent square matrix groups. Then, group-based watermarks are securely generated and registered in a trusted third party. The integrity verification is performed by computing the determinant and the diagonal’s minor for each group. As a result, tampering can be localized up to attribute group level. Theoretical and experimental results demonstrate that the proposed technique is resilient against tuples insertion, tuples deletion, and attributes values modification attacks. Furthermore, comparison with recent related effort shows that our scheme performs better in detecting multifaceted attacks.
Article
Full-text available
Advancement in information technology is playing an increasing role in the use of information systems comprising relational databases. These databases are used effectively in collaborative environments for information extraction; consequently, they are vulnerable to security threats concerning ownership rights and data tampering. Watermarking is advocated to enforce ownership rights over shared relational data and for providing a means for tackling data tampering. When ownership rights are enforced using watermarking, the underlying data undergoes certain modifications; as a result of which, the data quality gets compromised. Reversible watermarking is employed to ensure data quality along-with data recovery. However, such techniques are usually not robust against malicious attacks and do not provide any mechanism to selectively watermark a particular attribute by taking into account its role in knowledge discovery. Therefore, reversible watermarking is required that ensures; (i) watermark encoding and decoding by accounting for the role of all the features in knowledge discovery; and, (ii) original data recovery in the presence of active malicious attacks. In this paper, a robust and semi-blind reversible watermarking (RRW) technique for numerical relational data has been proposed that addresses the above objectives. Experimental studies prove the effectiveness of RRW against malicious attacks and show that the proposed technique outperforms existing ones.
Article
Watermarking relational databases is a technique that can provide ownership protection and tamper proofing for relational databases. Although it has been under development for over ten years, it is still not popular. To attract more people to study this technique, we introduce it in detail in this paper. The main contributions of this paper include: 1) To the best of our knowledge, this is the first paper that specifically surveys data distortion watermarking of relational databases; 2) We define a new requirement analysis table for data distortion watermarking of relational databases and use it to analyze important and recent research in this area; 3) We explain background knowledge of watermarking relational databases, such as types of attacks, requirements, and basic techniques.
Article
With the increasing use of databases, there is abundant opportunity to investigate new watermarking techniques that cater to the requirements of emerging applications. A major challenge that needs to be tackled is the recovery of crucial information that may be lost accidentally or due to malicious attacks on a database that represents an asset and needs protection. In this paper, we elucidate a scheme for robust watermarking with multiple watermarks that resolves the twin issues of ownership and recovery of information in case of data loss. To resolve ownership conflicts, one watermark is prepared securely and then embedded into secretly selected positions of the database. The other watermark encapsulates granular information on user-specified crucial attributes in such a manner that the perturbed or lost data can be regenerated conveniently later. Theoretical analysis proves that the probability of identifying target locations, the false hit rate and the false miss rate are negligible. We have experimentally verified that the proposed technique is robust enough to extract the watermark accurately even after 100% tuple addition or alteration and after 98% tuple deletion. Experiments on information recovery reveal that successful regeneration of tampered or lost data improves as the number of candidate attributes for embedding the watermark increases.
Article
Computer-based databases have added significant value to information products and services, and have enabled fast access to information. The growing role of databases in information access has brought to the fore questions about the legal rights of the owners and users of databases. The paper examines current developments in the legal protection of databases. The developments in the European Union (EU) and the USA show a significant departure from the existing practices in many countries. The salient aspects of these developments, the relevant provisions of the international agreements, the proposed WIPO draft database treaty, the legal implications of the protection of databases within the context of the promotion and progress of science, and the role of the library and information science profession are also discussed.
Conference Paper
Digital multimedia watermarking technology was suggested in the last decade to embed copyright information in digital objects such as images, audio and video. However, the increasing use of relational database systems in many real-life applications has created the need for database watermarking systems to protect databases. As a result, watermarking relational database systems is now emerging as a research area that deals with the legal issue of copyright protection of database systems. The evolution of watermarking for relational databases started with the first method, proposed in 2002 by Agrawal and Kiernan. Since then, many methods have been proposed and implemented by many researchers. In this paper, we trace the evolution of database watermarking for database security.
Article
The main aspect of database protection is to prove the ownership of data, that is, to establish who is the originator of the data. It is of particular importance in the case of electronic data, as data sets are often modified and copied without proper citation or acknowledgement of the originating data set. We present a novel method for watermarking relational databases for identification and proof of ownership based on the secure embedding of blind and multi-bit watermarks using the Bacterial Foraging Algorithm (BFA). Feasibility of BFA implementation is shown in the framed watermarking databases application. The identification of the owner is made cryptographically secure and used as the embedded watermark. An improved hash partitioning approach that is independent of the primary key of the database is used to secure the ordering of the tuples. The strength of BFA is explored to make the technique robust, secure and imperceptible. BFA is implemented to give nearly global optimal values bounded by data usability constraints and thus makes the database fragile to any attack. The parameters of BFA are tuned to reduce the execution time. BFA is shown experimentally to be a better solution than the Genetic Algorithm (GA). The proposed technique is shown experimentally to be resilient against malicious attacks.
Conference Paper
The digital information revolution and the increase in outsourced data in many organizations have caused significant changes in global society, where digital data are available everywhere free of cost. Proving ownership rights over outsourced relational databases is a crucial issue in today's Internet-based application environments and in many content distribution applications. In the past few years, a large number of techniques have been proposed for rights protection of numeric data. In this paper, a new relational database watermarking method for non-numeric multi-word data is proposed. A mark is embedded by horizontally shifting the location of a word within a selected attribute of selected tuples; a word is either displaced to the right or left unmoved, depending on the watermark bit. The location where the mark is to be inserted is determined by the Levenshtein distance between two successive words within an attribute. Our method is effective, as it is robust against different forms of malicious attacks, and it is blind, as it does not require the original database in order to extract the embedded watermark.
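The word-shifting idea in this abstract can be sketched as follows. The abstract does not state how the Levenshtein distance selects the location or how a word is physically shifted, so this sketch assumes two rules purely for illustration: the word to shift is the second word of the successive pair with the largest distance, and a shift to the right is realized as one extra space before that word. The function names are hypothetical.

```python
import re
from typing import List, Optional


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def embedding_position(words: List[str]) -> int:
    """Index of the word to shift.

    Assumed rule: pick the successive pair with the largest Levenshtein
    distance (first such pair on ties) and shift its second word.
    """
    distances = [levenshtein(words[i - 1], words[i])
                 for i in range(1, len(words))]
    return 1 + distances.index(max(distances))


def embed_bit(value: str, bit: int) -> str:
    """Bit 1 shifts the chosen word right by one space; bit 0 leaves it unmoved."""
    words = value.split()
    if len(words) < 2:
        return value  # nothing to shift in a single-word value
    pos = embedding_position(words)
    gaps = [" "] * (len(words) - 1)
    if bit == 1:
        gaps[pos - 1] = "  "
    out = words[0]
    for gap, word in zip(gaps, words[1:]):
        out += gap + word
    return out


def extract_bit(value: str) -> Optional[int]:
    """Blind extraction: only the (possibly marked) value itself is needed."""
    words = value.split()
    if len(words) < 2:
        return None
    pos = embedding_position(words)
    # Splitting with a capturing group keeps the gaps: word, gap, word, ...
    parts = re.split(r"( +)", value.strip())
    return 1 if len(parts[2 * pos - 1]) > 1 else 0
```

Since the words themselves are unchanged by the shift, the same location is recomputed at extraction time without access to the original value, which is what makes the sketch blind. Any whitespace normalization applied to the attribute would erase a mark embedded this way.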