International Journal of Applied Engineering Research
ISSN 0973-4562 Volume 10, Number 5 (2015) pp. 13383-13394
© Research India Publications
http://www.ripublication.com
A Review on Security Aspects of Data Storage in
Cloud Computing
K Ruth Ramya
Asst. Professor, Department of CSE, KL University, India.
T. Sasidhar
Asst. Professor, Department of CSE, KL University, India.
D Naga Malleswari
Asst. Professor, Department of CSE, KL University, India.
M.T.V.S. Rahul
(11003094), Student, CSE, KL University, India
ABSTRACT
Cloud computing is the technology that is transforming present-day computation and storage. Many top-rated MNCs and organizations provide cloud services to their clients. As users' data is outsourced to a centralized third-party server, the data owner no longer possesses control over his data, so the security of the data outsourced to the cloud is the major parameter to be taken into consideration. Many researchers have proposed techniques, models and schemes that provide security and check the security of the data, which can be done either by the data owner himself or by a third-party auditor. As the user may not always be in a position to check the integrity of his/her data, this can be done by a third party who checks the integrity of the client's data by challenging the public cloud server as per the warrant, i.e. the constraints that the client imposes on the third party. Many technologies have been proposed to give the client confidence in the integrity of the data. In this paper, we examine these methodologies, their merits and demerits, and which technology best provides security in the present scenario, as data storage has become a major concern in the modern day.
Key Words: Data Storage, MAC, Third Party Auditor, Proof of Retrievability
INTRODUCTION
Cloud computing has been envisioned as the next-generation architecture: an underlying platform for delivering resources and services. Cloud computing, or in simpler shorthand just "the cloud", focuses on maximizing the effectiveness of shared resources. Cloud resources are usually not only shared by multiple users but also dynamically reallocated on demand, which makes it possible to allocate resources to users as needed. The services provided by the cloud are Software-as-a-Service, Platform-as-a-Service and Infrastructure-as-a-Service [11].
Cloud computing possesses many characteristics such as on-demand service, rapid elasticity, hardware and maintenance [14], measured service and up-to-date information. The major concern in the cloud is the security of the data stored in it [12]. There are a number of threats that an intruder may pose to the data, so the main concern nowadays is to maintain the confidentiality, integrity, availability and privacy of data [7]. In this paper, we concentrate mainly on the various technological frameworks proposed by various researchers, the drawbacks of each protocol, and how those drawbacks are overcome by the existing protocols.
2. DATA SECURITY CHECKING BY THE CLIENT
The data outsourced to the server by the owner faces many threats from intruders, so it has to be checked frequently to verify that it is secure [1]. The confidentiality and integrity of the data can be checked either by the client directly or through a third-party auditor who checks as per the norms of the client. In this section, we discuss the various techniques used by the client to check the integrity of the data [9].
2.1. MESSAGE AUTHENTICATION CODE
A message authentication code (MAC) is a short piece of information used to authenticate a message sent by the user to the receiver and to provide data integrity and authenticity assurances on the message. Integrity detects accidental and intentional message changes, while authenticity confirms that the message is the same as it was at its origin.
A MAC algorithm accepts a secret key and a variable-length message as input, and outputs a MAC (also called a tag). The MAC value protects both a message's data integrity and its authenticity by allowing verifiers or clients (who also hold the secret key) to detect any changes to the message content.
The user basically starts by locally maintaining a small number of MACs for the data files that are to be outsourced to the cloud server. If the data user wants to check the integrity of the data stored in the cloud server, he first retrieves the data, then calculates the message authentication code of the received data, and compares both MACs. If they are equal, the data is secure; otherwise, it is not. MACs cannot be employed on large data sets, as retrieving the data and checking it against the respective MACs would take a huge amount of time. Hence, in order to overcome this problem, a hash tree can be used.
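As a minimal sketch of this check (assuming HMAC-SHA256 as the MAC algorithm; the key and file contents here are hypothetical), the owner keeps the tag locally and recomputes it over the retrieved data:

```python
import hashlib
import hmac

def make_tag(key: bytes, data: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the file contents."""
    return hmac.new(key, data, hashlib.sha256).digest()

def verify(key: bytes, retrieved: bytes, stored_tag: bytes) -> bool:
    """Recompute the MAC over the retrieved data; compare in constant time."""
    return hmac.compare_digest(make_tag(key, retrieved), stored_tag)

key = b"owner-secret-key"               # hypothetical owner key
original = b"file contents to outsource"
tag = make_tag(key, original)           # kept locally by the data owner

assert verify(key, original, tag)             # unmodified file passes
assert not verify(key, original + b"x", tag)  # any tampering is detected
```

Note that the whole file must be retrieved before the tag can be recomputed, which is exactly the cost that makes plain MACs impractical for large data sets.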
2.2. HASH TREE
Hashing is the transformation of a string of characters into a usually shorter fixed-length value or key that represents the original string.
The ideal cryptographic hash function has four main properties: 1) the one-way property; 2) weak-collision resistance; 3) it is infeasible to modify a message without changing the hash; 4) it is infeasible to find two different messages with the same hash. A hash tree is a tree in which the leaves contain hashes of pieces of information. To check the validity of the data, the data manager needs to store only the root hash of the tree. Even so, this alone does not guarantee the accuracy of the outsourced information.
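The root-hash idea can be sketched as follows, assuming SHA-256 for all hashes and duplicating the last node on odd-sized levels (one common convention, not the only one). The owner stores only the 32-byte root:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks):
    """Build the hash tree bottom-up; the owner keeps only this root."""
    level = [h(b) for b in blocks]                  # leaves: hashes of blocks
    while len(level) > 1:
        if len(level) % 2 == 1:                     # duplicate last on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]  # parents hash child pairs
    return level[0]

blocks = [b"block0", b"block1", b"block2", b"block3"]
root = merkle_root(blocks)              # 32-byte value stored by the owner

# Recomputing over unchanged blocks matches the stored root; changing any
# single block changes the root, so corruption is detectable.
assert merkle_root(blocks) == root
assert merkle_root([b"blockX", b"block1", b"block2", b"block3"]) != root
```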
In this section, we have discussed various techniques that will be used by the
client to check the security of the data that is being outsourced. In the next section, we
will discuss the various techniques that will be used by the third party auditor [8] to
check the integrity and confidentiality of the outsourced data.
3. THIRD PARTY AUDITOR
In cloud computing, the user outsources his data to a cloud server, which is centralized storage. The user then no longer has physical possession of, or control over, the outsourced data. So, the major problems or challenges faced by the client are data integrity, data confidentiality and data availability. The user has to periodically check the integrity, confidentiality and authenticity of the data, which is a hugely time-consuming task. Therefore, to prevent such issues, the user employs a third-party member called the third-party auditor (TPA) [2][6], who acts as an inspector, is fair, and is agreed upon by both the user and the cloud service provider.
To solve the security problems of data outsourcing, various auditing protocols were proposed by researchers to ensure the correctness of the outsourced data. Auditing protocols are basically divided into self-auditing protocols and public auditing protocols, according to the type of auditor. Because of the limiting constraints of the self-auditing protocol, i.e. the user's limited computing and communication capability, public auditing protocols were developed. It is of great importance to enable public auditing so that users can turn to a third-party auditor (TPA) who has the expertise and capabilities, and do not have to audit the outsourced data themselves. Based on the role of the verifier, auditing schemes are classified into private auditability and public auditability.
Although schemes with private auditability can provide higher efficiency, public auditability is widely used because anyone (not only the client) can challenge the cloud service provider to check the integrity of the data. To carry out public auditing, no private information is required. Therefore, it seems more reliable to verify the data using public auditability, which is expected to play a vital role in the economies of scale of cloud computing.
4. PROOF OF RETRIEVABILITY
Users outsource their data onto a centralized cloud server where they no longer
possess control over the data. The data owner can check the integrity of the data by
using message authentication codes if the file is small or hash tree if the file is large in
size [15]. This is done directly by the data owner. In some cases, the data owner wants to check the data by retrieving it, or wants some of his/her data back. To check the integrity of the data together with possession and retrievability, Juels et al. proposed a scheme called the "proof of retrievability" (POR) model, in which spot-checking and error-correcting codes are used to ensure both "possession" and "retrievability" of the data files on the cloud server. Special blocks called "sentinels" are randomly embedded into the data file F for detection purposes, and F is further encrypted to protect the positions of these special blocks. However, the number of queries a client can perform is fixed, and the introduction of these pre-computed "sentinels" prevents the realization of dynamic data updates. In addition, public auditability is not supported in their scheme.
4.1. Abhishek Mohta and R.Sahu Algorithm
This algorithm uses encryption and a message digest to guarantee data integrity. Encryption ensures that the data is not leaked in transit, while the message digest gives information about the client who sent the data, i.e. it checks the authenticity of the client. The algorithm was designed for data modification, record insertion and record deletion. The insertion and modification operations work effectively, but for data deletion we cannot identify the person who deleted the record.
4.2. Keyed Hash Function
A keyed hash hk(F) is a technique used as part of the Proof of Retrievability (POR) scheme. The verifier (either the client or the TPA) computes the cryptographic hash of the data file F in advance using hk(F) before sending F to cloud storage, and stores this hash together with the secret key K in his/her system for future verification. To check the integrity of the file F, the verifier sends the secret key K to the cloud service provider and asks the CSP to compute and return hk(F). By storing multiple hash values under different keys, the verifier can check the integrity of F multiple times. The main limitation of this scheme is that it requires high resource costs to execute. The verifier needs to store as many keys as the number of checks he/she wants to execute, along with the hash value of the data file F under each key. Computing the hash of even a moderately big/large data file can be computationally oppressive for some clients, and storing the keys and their hash values adds further overhead for the client.
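A sketch of this challenge flow, with HMAC-SHA256 standing in for hk(F) and hypothetical file contents, makes the per-check key storage visible:

```python
import hashlib
import hmac
import os

def keyed_hash(key: bytes, file_bytes: bytes) -> bytes:
    """hk(F): HMAC-SHA256 stands in for the keyed hash."""
    return hmac.new(key, file_bytes, hashlib.sha256).digest()

F = b"the outsourced data file F"       # hypothetical file contents

# Before outsourcing, the verifier precomputes one (key, hash) pair per
# planned check; this per-check storage is exactly the overhead the
# scheme is criticized for.
precomputed = []
for _ in range(3):
    k = os.urandom(16)
    precomputed.append((k, keyed_hash(k, F)))

# One check: release a key to the CSP, which must recompute hk(F) over
# the file it actually holds. A key can be used only once, since the CSP
# learns hk(F) for it.
key, expected = precomputed.pop()
honest_answer = keyed_hash(key, F)           # CSP still holds F intact
assert honest_answer == expected

tampered_answer = keyed_hash(key, F + b"!")  # a CSP that modified F fails
assert tampered_answer != expected
```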
4.3. Ari Juels and Burton S. Kaliski Jr Scheme
The proof of retrievability scheme proposed by Juels and Kaliski is designed for large data files with the help of sentinels. In this scheme, a single key can be used irrespective of the lifetime of the file or the number of files, quite the opposite of the keyed-hash approach, in which many keys are used for the same file. They use special sentinel blocks, which are hidden among the other blocks in the data file F. In the initial phase, the verifier randomly embeds these sentinels among the data blocks. To check the integrity of the data file F, the verifier challenges the prover (the cloud service provider) during the verification phase by specifying the positions of a collection of sentinels and asking the prover to return the corresponding sentinel values.
If the prover has modified or deleted a significant part of F, then with high probability it will also have suppressed a number of sentinels, making it unlikely to respond correctly to the verifier. To keep the sentinels indistinguishable from the data blocks, the whole modified file is encrypted and stored in the cloud; the encryption ensures that the sentinels cannot be told apart from the other file blocks. This scheme is therefore best suited for storing encrypted files.
This approach is not well suited to large files, as the scheme may not give good results when used with them. Consequently, the scheme is disadvantageous to small clients who have only limited computational power. It also incurs storage overhead on the server, partly because of the newly embedded sentinels and partly because of the error-correcting codes that are embedded within the data file. Furthermore, clients need to store all the sentinels themselves, which becomes a storage overhead for thin clients.
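The embedding and challenge steps above can be sketched as follows. The encryption of the file, which hides the sentinel positions in the real scheme, is omitted here, and the block sizes and counts are hypothetical:

```python
import os
import random

def embed_sentinels(blocks, num_sentinels, rng):
    """Place random sentinel blocks at secret positions among the data
    blocks. (A real deployment then encrypts the whole file so sentinels
    are indistinguishable from data; encryption is omitted here.)"""
    total = len(blocks) + num_sentinels
    positions = set(rng.sample(range(total), num_sentinels))
    stored, sentinels = [], {}
    data = iter(blocks)
    for i in range(total):
        if i in positions:
            sentinels[i] = os.urandom(16)   # verifier keeps {position: value}
            stored.append(sentinels[i])
        else:
            stored.append(next(data))
    return stored, sentinels

rng = random.Random(7)
blocks = [bytes([b]) * 16 for b in range(8)]
stored, sentinels = embed_sentinels(blocks, 3, rng)  # file sent to the server

# Challenge: the verifier names sentinel positions and compares the
# prover's answers against the values it kept locally.
honest = all(stored[pos] == val for pos, val in sentinels.items())
assert honest

# A server that tampers with blocks cannot tell sentinels from data, so
# tampering at a sentinel position fails the next challenge.
corrupted = list(stored)
tampered_pos = next(iter(sentinels))
corrupted[tampered_pos] = b"deleted"
assert any(corrupted[p] != v for p, v in sentinels.items())
```

Because each challenge reveals some sentinel positions, the number of checks is bounded by the number of embedded sentinels, matching the fixed-query limitation noted above.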
5. PRIVACY PRESERVING PUBLIC AUDITING PROTOCOL
Wang et al. proposed a public auditing protocol in 2010, called the "Wang et al. privacy-preserving public auditing protocol". In this protocol, F is defined as the data file to be outsourced, consisting of 'n' data blocks [3].
The protocol is a collection of four polynomial-time algorithms, namely KeyGen, TagBlock, GenProof and CheckProof.
Security attacks on the protocol proposed by Wang et al.:
The cloud server that stores the data owned by the client might be malicious: it might not keep the data, might delete data owned by cloud users, and may even hide data corruptions to maintain its reputation. Secondly, there may be an outside attacker who can intercept or eavesdrop on the cloud user's data that is sent to the cloud server.
1) Data modification tag forging attack
2) Data loss auditing pass attack
3) Data interception and modification attack
4) Data eavesdropping and forgery
These attack schemes clearly show that the public auditing protocol proposed by Wang et al. is vulnerable to various forgeries using known-message attacks from a malicious cloud server or even an outside attacker. Even though the cloud server has passed the TPA's audit, the authenticity and integrity of the outsourced data are not guaranteed, as the above-mentioned attacks remain drawbacks of the privacy-preserving public auditing protocol.
6. PROVABLE DATA POSSESSION
PDP is a lightweight probabilistic model for remote data integrity checking. The limitation of this protocol is that it cannot be applied to dynamic data. To overcome this weakness, Ateniese et al. proposed a dynamic PDP security model and designed the corresponding schemes from a symmetric cryptography algorithm [4][5]. Although their algorithm supports most dynamic data operations, it does not support the data insert operation.
PDP protocols can be classified into two categories: private PDP and public PDP. In the response-checking phase of private PDP, some private information is needed; on the contrary, no private information is needed in the response checking of public PDP. Private PDP is necessary in cases where users themselves want to check the integrity of the data. To carry out the auditing protocol with the help of a third party, a protocol named proxy provable data possession (PPDP) was developed, based on private PDP. The PPDP protocol is a collection of six polynomial-time algorithms, namely SetUp, TagGen, SignVerify, CheckTag, GenProof and CheckProof [10].
1. SetUp is a probabilistic key generation algorithm that sets up the protocol. It takes a security parameter as input and returns the corresponding private/public key pair.
2. TagGen is an algorithm run by the client to generate the verification metadata. It takes as input the client's private key, the PCS's public key, the proxy's public key and a file block m, and returns the checking tag.
3. SignVerify is an algorithm run by the proxy to validate the warrant-certificate pair from the client. If the pair is valid, it outputs "success" and accepts the pair given by the client; otherwise, it outputs "failure" and rejects the pair.
4. CheckTag is an algorithm run by the PCS to check whether a block-tag pair is valid or not.
5. GenProof is an algorithm run by the PCS to generate the possession proof. It takes as input the public keys of the client, the PCS and the proxy, its own private key and a set of data blocks of the data file F, and returns a possession proof for those blocks of F as output.
6. CheckProof is an algorithm run by the proxy to validate the data possession proof. It takes as input the public keys of the three network entities, its own private key and the data possession proof of the PCS. It returns "success" if the possession proof is as expected by the proxy; otherwise, it outputs "failure".
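The six algorithms above can be sketched as the following simplified flow. This is not the actual pairing-based construction: a single shared MAC key stands in for the private/public key pairs purely to show how messages move between the client, the proxy and the PCS, and all names and values are hypothetical:

```python
import hashlib
import hmac
import os

def setup() -> bytes:
    """SetUp: generate the key material (a single MAC key in this sketch)."""
    return os.urandom(16)

def sign_verify(key: bytes, warrant: bytes, cert: bytes) -> bool:
    """SignVerify: the proxy validates the client's warrant-certificate pair."""
    expected = hmac.new(key, warrant, hashlib.sha256).digest()
    return hmac.compare_digest(expected, cert)

def tag_gen(key: bytes, index: int, block: bytes) -> bytes:
    """TagGen: the client produces a checking tag for one file block."""
    return hmac.new(key, index.to_bytes(4, "big") + block,
                    hashlib.sha256).digest()

def check_tag(key: bytes, index: int, block: bytes, tag: bytes) -> bool:
    """CheckTag: the PCS validates a block-tag pair on upload."""
    return hmac.compare_digest(tag_gen(key, index, block), tag)

def gen_proof(store, tags, challenged):
    """GenProof: the PCS returns the challenged blocks with their tags."""
    return [(i, store[i], tags[i]) for i in challenged]

def check_proof(key: bytes, proof) -> bool:
    """CheckProof: the proxy recomputes each tag and verifies the proof."""
    return all(hmac.compare_digest(tag_gen(key, i, blk), tag)
               for i, blk, tag in proof)

key = setup()
warrant = b"proxy may audit blocks 0-4"     # constraints imposed by the client
cert = hmac.new(key, warrant, hashlib.sha256).digest()
assert sign_verify(key, warrant, cert)      # proxy accepts the warrant

store = {i: os.urandom(32) for i in range(5)}   # file blocks held by the PCS
tags = {i: tag_gen(key, i, b) for i, b in store.items()}
assert all(check_tag(key, i, store[i], tags[i]) for i in store)

proof = gen_proof(store, tags, challenged=[1, 3])
assert check_proof(key, proof)              # possession proof verifies
```

Note the limitation discussed in the conclusion is visible here: only the challenged blocks (1 and 3) are actually verified by this run.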
1) Message Authentication Code
Description: Data owners initially maintain a small number of MACs locally for the data files that are to be outsourced. The data owner verifies integrity by recalculating the MAC of the received data file when he/she retrieves the data and comparing it to the locally precomputed value.
Drawbacks: MACs cannot be implemented on large data blocks.

2) Hash Tree
Description: A hash tree can be employed for large data files; its leaves contain hashes of data blocks and its internal nodes contain hashes of their children.
Drawbacks: There is no guarantee about the integrity and confidentiality of the outsourced data.

3) Abhishek Mohta and R. Sahu Algorithm
Description: Ensures data integrity and dynamic data operations, using encryption and a message digest to ensure data integrity.
Drawbacks: There is no way to know who has deleted the data, as there is no indexing.

4) Keyed Hash Algorithm
Description: The verifier pre-computes the cryptographic hash of F using hk(F) before archiving the data file F in cloud storage, and stores this hash as well as the secret key K. The verifier then releases the secret key K to the cloud archive to check the integrity of the file F and asks it to compute and return the value of hk(F).
Drawbacks: There is a huge resource and storage overhead during implementation.

5) Ari Juels and Burton S. Kaliski Jr Scheme
Description: A single key can be used irrespective of the size of the file or the number of files, unlike the keyed-hash approach in which many keys are used. Special sentinel blocks are hidden among the other blocks in the data file F.
Drawbacks: It becomes computationally cumbersome to encrypt the data file, especially when the data to be encrypted is large, as the scheme involves encrypting the data file embedded with sentinels.

6) Shah et al. Public Auditing Protocol
Description: A protocol that can be used when the data block that is ready to be transferred to the server is encrypted.
Drawbacks: Can only be used on encrypted data.

7) Ateniese et al. Public Auditing Protocol
Description: A remote data integrity probabilistic checking model; regrettably, it cannot be applied to dynamic data. A dynamic PDP security model was later proposed, with corresponding schemes designed from a symmetric cryptography algorithm.
Drawbacks: It cannot perform the insertion operation.

8) Wang et al. Privacy-Preserving Public Auditing Protocol
Description: A protocol with four steps: KeyGen, TagBlock, GenProof and CheckProof.
Drawbacks: The data modification tag forging attack, data loss auditing pass attack, data interception and modification attack, and data eavesdropping and forgery are the various attacks.

9) Proxy Provable Data Possession
Description: A protocol with six steps, namely SetUp, TagGen, SignVerify, CheckTag, GenProof and CheckProof.
Drawbacks: The protocol secures the data that is retrieved, but there is no guarantee whether the remaining outsourced data is secure or not.
7. CONCLUSION
Cloud computing has become the most relied-upon technology for users and organizations across the world today. As users do not have control over their data after outsourcing it to the cloud, the major concern for all of them is how secure their data is. The data should be confidential and authenticated, and its integrity should be maintained. Many methodologies have been developed and proposed by researchers. All of them provide security for the data, but only up to a certain extent; no algorithm provides one hundred percent security for data in the cloud. The major methodologies in use at present are depicted in this paper. Finally, "Proxy Provable Data Possession", an algorithm that uses six probabilistic algorithms to provide data security, is the best among the methodologies used today. This algorithm is not performed by the client; instead, it is carried out by the proxy.
Though this algorithm is much more efficient than the others, it still has a limitation: only the data that is challenged and checked is said to be secure, with no guarantee for the rest of the outsourced data. Therefore, a mechanism called "Pairing-Based Provable Data Possession" is being developed using elliptic curve cryptography, which is said to provide maximum security for the data.
8. REFERENCES
[1]. Zhifeng Xiao and Yang Xiao, "Security and Privacy in Cloud Computing", IEEE Communications Surveys & Tutorials, Vol. 15, No. 2, Second Quarter 2013.
[2]. Mehul A. Shah, Mary Baker, Jeffrey C. Mogul, Ram Swaminathan, "Auditing to Keep Online Storage Services Honest", HP Labs, Palo Alto.
[3]. Xu Chun-Xiang, He Xiao-Hu, Daniel Abraha, "Cryptanalysis of auditing protocol proposed by Wang et al. for data storage security in Cloud Computing".
[4]. Yongjun Ren, Jiang Xu, Jin Wang and Jeong-Uk Kim, "Designated-Verifier Provable Data Possession in Public Cloud Storage", International Journal of Security and Its Applications, Vol. 7, No. 6, 2013.
[5]. Qian Wang, Cong Wang, "Enabling Public Auditability and Data Dynamics for Storage Security in Cloud Computing", the US National Science Foundation, 2007.
[6]. Bhavana Makhija, Indrajit Rajput, and Vinitkumar Gupta, "Enhanced Data Security in Cloud Computing with Third Party Auditor", International Journal of Advanced Research in Computer Science and Software Engineering, 3(2), February 2013.
[7]. Peter Brudenall, Bridget Treacy and Purdey Castle Tradi, "Outsourcing to the cloud: data security and privacy risks", Financier Worldwide, March 2010 issue.
[8]. Abhishek Mohta, Ravi Kant Sahu, Lalit Kumar Awasthi, "Robust Data Security for Cloud while using Third Party Auditor", International Journal of Advanced Research in Computer Science and Software Engineering, Volume 2, Issue 2, February 2012.
[9]. Pradnyesh Bhisikar and Amit Sahu, "Security in Data Storage and Transmission in Cloud Computing", International Journal of Advanced Research in Computer Science and Software Engineering, Volume 3, Issue 3, March 2013.
[10]. Jin Li, Kui Ren, and Wenjing Lou, "Enabling Public Verifiability and Data Dynamics for Storage Security in Cloud Computing", the US National Science Foundation, 2007.
[11]. John Harauz, Lori M. Kaufman and Bruce Potter, "Data Security in the World of Cloud Computing", IEEE Computer and Reliability Societies, July/August 2009, 1540-7993/09.
[12]. Vijay Varadharajan and Udaya Tupakula, "Security as a Service Model for Cloud Environment", IEEE Transactions on Network and Service Management, Vol. 11, No. 1, March 2014.
[13]. Honggang Wang, Shaoen Wu, Min Chen and Wei Wang, "Security Protection between Users and the Mobile Media Cloud", IEEE Communications Magazine, March 2014.
[14]. Bernd Grobauer, Tobias Walloschek, and Elmar Stocker, "Understanding Cloud Computing Vulnerabilities", IEEE Computer and Reliability Societies, 1540-7993/11, March/April 2011.
[15]. Deepanchakaravarthi Purushothaman and Sunitha Abburu, "An Approach for Data Storage Security in Cloud Computing", IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 2, No. 1, March 2012, ISSN: 1694-0814.