Yugala: Blockchain based Encrypted Cloud Storage for IoT Data
Sarada Prasad Gochhayat, Eranga Bandara, Sachin Shetty, Peter Foytik
Virginia Modeling Analysis and Simulation Center
Old Dominion University
Norfolk, VA, USA
{sgochhay, cmedawer, sshetty, pfoytik}@odu.edu
Abstract—Cloud storage enables users to upload, store and download their data. However, with the rapid growth of the Internet of Things (IoT) and edge devices such as cameras, the cloud must manage not only user-generated data but also device-generated data. This increase in the volume of outsourced data has raised two important issues in big data management for cloud storage servers, namely optimal data storage and data security. To address the former, cloud storage employs a deduplication technique to avoid storing duplicate data; to address the latter, it uses encryption. However, these two mechanisms conflict with each other. Convergent encryption addresses both issues simultaneously. Nevertheless, existing approaches rely on proxy servers and place all trust in a third party, which is vulnerable to a single point of failure.
In this paper, we propose a novel lightweight decentralized encrypted cloud storage architecture, called Yugala, which maintains file confidentiality, removes centralized data deduplication and strengthens file integrity by using blockchain. In particular, we discuss two approaches for file confidentiality with data deduplication: one uses double hashing and the other symmetric encryption. To demonstrate efficiency and usability, we implemented the proposed architecture and evaluated it against the high transaction throughput requirements of IoT data management in cloud storage.
Index Terms—Blockchain, Cloud computing, IoT, Security
I. INTRODUCTION
Cloud storage is one of the most popular cloud services: it enables users to store tremendous amounts of data in the cloud, possibly exceeding their own storage space. The dependency on cloud storage increases further with the growth of the Internet of Things (IoT) and the data IoT generates. IoT has become an essential element in the evolution of connected products and related services. Alongside IoT, the era of fog computing also produces a huge amount of data. Hence, the adoption of IoT and fog computing is going to create more and more data in cloud storage, much of which will be redundant.
In order to avoid redundant data storage and improve the utilization of storage, cloud storage employs data deduplication. Data deduplication is a process in which the cloud storage eliminates multiple copies of a file by storing only a single copy [1], [2]. Hence, it improves both storage and bandwidth utilization. It further enhances the storage capacity by storing single copies of blocks at block level [3], [4]. In particular, at file level, the cloud storage avoids storing multiple copies of already existing files: when a client C wants to upload a file F to cloud server CS, CS checks whether it already has F. If CS finds that it has F, it prevents the client from uploading the file again. At block level, CS performs the deduplication check on the individual blocks of the file. Additionally, to save upload bandwidth, the user uploads a unique tag of the file instead of the actual file to check the availability of the file [5], [6]. Fig. 1 shows the deduplication process in detail. Unfortunately, CS is unable to perform deduplication when users encrypt their files.
Fig. 1. Deduplication in cloud storage
Normally, a security-conscious user encrypts his or her files before outsourcing them to CS. Since encryption makes it difficult to distinguish the encryption of a plaintext from a randomly generated value, encryption conflicts with the deduplication mechanism [7]. For example, when users A and B encrypt a plaintext file PF using two different keys KA and KB, they generate two indistinguishable ciphertext files, namely AEF and BEF. So, when CS runs the deduplication process, it cannot verify the existence of the file, and the cloud storage cannot save storage space through deduplication. Convergent Encryption (CE) allows deduplication over encrypted files. In convergent encryption, a user first generates the hash of the file and then uses it as the key to encrypt the file. In this way, the file is encrypted and protected. Nevertheless, the hash of the file, a.k.a. the tag, which the user uses to encrypt the data, is also used by the cloud to perform the deduplication check. Hence, CS could use the tag to decrypt the encrypted information. In order to still combine data deduplication with convergent encryption, the existing works in the literature require additional servers [8]. Another issue is that an adversarial CS may deny that a user ever uploaded a file. For instance, a user initially
uploads a file to CS, but at some later instant CS simply discards the file from storage to save space; when the user returns, CS denies that the user had ever uploaded the file. This problem is addressed by proofs of retrievability [9], in which CS proves that it indeed holds the file uploaded by the user. Nevertheless, the solutions to the above-mentioned issues have to be lightweight to support the high load generated on cloud storage, as this load grows with the number of IoT nodes.
Motivated by the above-mentioned challenges, in this work we propose a novel lightweight decentralized encrypted cloud storage architecture, called Yugala, which removes centralized data deduplication, uses a modified convergent encryption for file confidentiality, and provides file integrity by using blockchain [10]. In our first approach, the user encrypts the data using convergent encryption to ensure file confidentiality and uses the double hash of the data as the tag for the deduplication check. In the second approach, we use symmetric encryption together with a piece of secret information to generate the tag, which enables different organizations to use the cloud storage while still protecting their individual data.
In summary, the contributions of the paper are:
• We propose a new decentralized encrypted cloud storage architecture, called Yugala, which uses blockchain and a modified convergent encryption. Yugala uses the Mystiko Blockchain [11] to provide high throughput while still supporting data deduplication.
• Yugala decouples the relationship between the tag and the encrypted file by, first, using the hash of the data to encrypt the file (i.e., with the help of convergent encryption) and, second, storing the double hash of the file on the blockchain so that any user can query the availability of the encrypted data in the cloud. Data deduplication is handled with Mystiko Blockchain based smart contracts.
• We also propose a modified version of the first approach, which uses symmetric encryption and secret information to generate the tag; this variant suits settings where different organizations need to delineate their data and tags on the same blockchain based encrypted cloud storage.
• As a proof of concept, we implemented the architecture and showed that the proposed system fits the IoT environment well in terms of transaction throughput.
The rest of the paper is organized as follows: Section II discusses related work along with the system and threat model; Section III presents the Yugala architecture; Section IV discusses the implementation of Yugala in detail by describing each component; and the results and conclusion are discussed in Sections V and VI, respectively.
II. RELATED WORKS
In this section, we discuss related work on encrypted cloud storage, focusing on convergent encryption and blockchain. Then, we discuss the problem statement by presenting the system model and threat model in an adversarial cloud scenario.
Fig. 2. Convergent encryption
A. Convergent encryption
Convergent encryption is a method of encrypting a file using a key that is deterministically derived from the file itself. In particular, a user first generates a hash of the file and uses the hash output as the key to encrypt the file. Fig. 2 shows the process of convergent encryption.
Formally, convergent encryption consists of three algorithms (KeyGen, Encrypt, and Decrypt), as follows:
• KeyGenCE(F) → Key: this algorithm takes a file F as input and outputs a secret key Key, thus mapping the file to a convergent key.
• EncryptCE(F, Key) → CT: this is a symmetric encryption algorithm, which takes the plaintext file F and the key Key as inputs and outputs the ciphertext CT.
• DecryptCE(CT, Key) → F: this is the decryption algorithm, which takes the ciphertext CT and the secret key Key as inputs and returns the original plaintext file F.
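To make the three algorithms concrete, the following is a minimal Go sketch of convergent encryption. The paper does not fix the primitives, so SHA-256 as the hash and AES-GCM as the symmetric cipher are assumptions made here for illustration, and the function names mirror the definitions above rather than any published Yugala-SDK API.

```go
package convergent

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/sha256"
	"errors"
)

// KeyGenCE derives the convergent key deterministically from the file content.
func KeyGenCE(file []byte) []byte {
	h := sha256.Sum256(file)
	return h[:] // 32-byte key, usable directly as an AES-256 key
}

// EncryptCE encrypts the file under its convergent key using AES-GCM.
// A fixed zero nonce is used so that identical files always yield identical
// ciphertexts, which is what makes deduplication over ciphertexts possible;
// this is safe here only because each key is used for a single file.
func EncryptCE(file, key []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize()) // deterministic nonce
	return gcm.Seal(nil, nonce, file, nil), nil
}

// DecryptCE recovers the plaintext file given the ciphertext and convergent key.
func DecryptCE(ct, key []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	pt, err := gcm.Open(nil, nonce, ct, nil)
	if err != nil {
		return nil, errors.New("decryption failed: wrong key or corrupted ciphertext")
	}
	return pt, nil
}
```

Because the key is a deterministic function of the file, two users encrypting the same file independently produce the same ciphertext, which is exactly what the cloud-side deduplication relies on.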
B. Blockchain
Blockchain is a form of distributed storage system that stores a chronological sequence of transactions in a tamper-evident manner. In a blockchain, each node holds the same, immutable order of data. Since a blockchain is distributed storage, it uses a consensus algorithm to maintain consistency of the data among the nodes. Due to its decentralized and immutable nature, blockchain has become a promising technology for untrusted peer-to-peer networks [12]. Currently, there are various blockchain platforms on the market; Bitcoin [13], Ethereum [14], BigchainDB [15] and Hyperledger [16] are some examples. Some of these blockchains, such as Bitcoin, are mostly used for electronic currencies, while Ethereum and Hyperledger go beyond cryptocurrency to support different kinds of transaction storage models related to other forms of business or e-commerce activity. However, these platforms lack high throughput, and improving throughput is an immediate concern of the research community.
For the Yugala implementation, we have incorporated the Mystiko Blockchain [11] platform, which is mainly targeted at big data operations. It is built with the Apache Cassandra distributed storage [17] and Apache Kafka [18]. The main reasons for using Mystiko Blockchain for Yugala are its high transaction throughput and scalability. Since the Mystiko storage model supports large data payloads, we were able to store encrypted file payloads on the blockchain. To support concurrent transaction execution, Mystiko uses a Scala functional programming [19] and Akka Actor [20] based Aplos smart contract service. All smart contracts on the Yugala platform are written with the Aplos service of the Mystiko Blockchain.
Fig. 3. Yugala architecture overview
C. System Model and Threat Model
System Model: In the proposed setting, the cloud storage system is deployed under a multi-user model consisting of a cloud storage server and a group of users (as shown in Fig. 1). The cloud storage offers storage of encrypted files. The users encrypt their files in such a way that only users holding the same file can verify its existence, while the cloud storage system cannot.
Threat Model: We assume that the cloud storage is semi-honest and does not collude with the users. CS may detect the existence of the encrypted file a user is inquiring about, but does not learn the content of the encrypted file. Additionally, CS can simply remove a file that a user has uploaded to the cloud. Accordingly, the two security goals the proposed architecture provides with respect to this threat model are file confidentiality and file integrity. First, it achieves file confidentiality because all users encrypt their files before uploading, so CS cannot read the file content. Second, since the existence of the file is recorded on the blockchain, CS cannot remove the file on its own (the information in the blockchain is distributed and replicated across multiple servers), providing system integrity and reliability. The novelty of the proposed approach is that the encryption is simplified by a careful combination of convergent encryption and the random puzzle method.
III. YUGALA ARCHITECTURE
In this section, we first discuss the Yugala architecture by introducing its main actors. Then, we explain the two proposed approaches for file confidentiality with data deduplication: one using a double hash and the other using symmetric encryption.
Fig. 4. Double hash approach
Fig. 5. Deduplication using double hash (User 1 and User 2 interacting with the cloud storage): 1.a/2.a: each user sends the tag t2 = H(t1), where t1 = H(data); 1.b: Negative (file not yet stored), so 1.c: User 1 uploads DES_t1(Data); 2.b: Positive (file already stored), so 2.c: no data transmission from User 2.
Architecture Overview: There are three main actors in the architecture: clients, the blockchain and the cloud storage. Consider a scenario where two clients (client1 and client2) hold the same file (Fig. 3). First, client1 uploads the tag to the blockchain. The blockchain checks whether a file already exists with the given tag; if not, it is a new file and is saved in the storage. Later, client2 uploads the same file with its tag to the Yugala cloud; the blockchain finds that a file already exists with the given tag and does not save the new file content. All the duplication-checking functions are handled with smart contracts on the Mystiko Blockchain; the logic such a contract enforces is sketched below, after which we discuss the two proposed approaches for file confidentiality with data deduplication.
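The actual deduplication contract in Yugala is an Aplos smart actor written in Scala on Mystiko; the Go sketch below only illustrates the rule such a contract has to enforce, namely a tag-indexed registry with a store-if-absent policy. The type names and the in-memory map are illustrative assumptions, not the Mystiko asset storage.

```go
package dedup

import "sync"

// TagRegistry captures the deduplication rule enforced by the smart contract:
// a file payload is stored only if no asset with the same tag exists yet.
type TagRegistry struct {
	mu     sync.Mutex
	assets map[string][]byte // tag (hex) -> encrypted file payload
}

func NewTagRegistry() *TagRegistry {
	return &TagRegistry{assets: make(map[string][]byte)}
}

// Exists answers the client's tag query (the Positive/Negative response in Figs. 5 and 7).
func (r *TagRegistry) Exists(tag string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	_, ok := r.assets[tag]
	return ok
}

// StoreIfAbsent saves a new encrypted payload under its tag and reports
// whether the payload was actually stored (false means it was a duplicate).
func (r *TagRegistry) StoreIfAbsent(tag string, ciphertext []byte) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.assets[tag]; ok {
		return false // duplicate: keep the single existing copy
	}
	r.assets[tag] = ciphertext
	return true
}
```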
A. Double-hash approach
The steps of the double-hash approach are shown in Fig. 4. In order to demonstrate the multi-user scenario, we consider two users here; the same procedure holds for any number of users. The process is described in detail below, and Fig. 5 shows the data flow diagram of the double-hash based secure data deduplication.
1) In the beginning, when a user wants to upload a file, he first calculates the hash of the file he wants to send. Then, he uses this hash output as input to a second hash, whose output he takes as the tag. Subsequently, he sends the tag to the cloud server (see 1.a and 2.a in Fig. 5).
2) The cloud storage sends Negative (or Positive) to the user, if it does not have the file (or if it does) (see 1.b and 2.b in Fig. 5).
3) If the user receives a positive response from the cloud, i.e., the encrypted file associated with the tag is already available at the cloud, the user does nothing (see 2.c in Fig. 5).
4) Otherwise, the user encrypts the file using the first hash output as the key and uploads it (see 1.c in Fig. 5).
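A minimal Go sketch of this client-side flow is given below, reusing KeyGenCE and EncryptCE from the convergent-encryption sketch in Section II-A (so AES-GCM stands in for the DES shown in the figures). The checkTag and upload callbacks stand in for calls to the Yugala API and are assumptions for illustration, not the real SDK interface.

```go
package convergent

import "crypto/sha256"

// UploadDoubleHash implements steps 1-4 above: t1 = H(data) is the encryption
// key, t2 = H(t1) is the tag sent for the deduplication check, and the
// encrypted payload is transmitted only on a negative response.
func UploadDoubleHash(
	data []byte,
	checkTag func(tag []byte) (bool, error),
	upload func(tag, ciphertext []byte) error,
) error {
	t1 := KeyGenCE(data)    // t1 = H(data), also the encryption key
	t2 := sha256.Sum256(t1) // t2 = H(t1), the tag

	exists, err := checkTag(t2[:]) // steps 1.a / 2.a: only the tag is sent
	if err != nil {
		return err
	}
	if exists {
		return nil // step 2.c: positive response, no data transmission
	}
	ct, err := EncryptCE(data, t1) // step 1.c: encrypt with t1 and upload
	if err != nil {
		return err
	}
	return upload(t2[:], ct)
}
```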
B. Symmetric encryption approach
Fig. 6. Symmetric encryption based approach
The steps of the symmetric encryption based approach are shown in Fig. 6. Here as well, to demonstrate the multi-user scenario, we consider two users; the same procedure holds for any number of users. The process is described in detail below, and Fig. 7 shows the data flow diagram of the modified secure data deduplication.
1) In the beginning, when a user wants to upload a file, he first calculates the hash of the file he wants to send. Then, he uses this hash output as a key to encrypt a numeric value, for example '1', and takes the result as the tag. Subsequently, he sends the tag to the cloud server (see 1.a and 2.a in Fig. 7).
2) The cloud storage sends Negative (or Positive) to the user, if it does not have the file (or if it does) (see 1.b and 2.b in Fig. 7).
3) If the user receives a positive response from the cloud, i.e., the encrypted file associated with the tag is already available at the cloud, the user does nothing (see 2.c in Fig. 7).
4) Otherwise, the user encrypts the file using the hash output as the key and uploads it (see 1.c in Fig. 7).
Fig. 7. Deduplication using symmetric encryption (User 1 and User 2 interacting with the cloud storage): 1.a/2.a: each user sends the tag t2 = DES_t1(1), where t1 = H(data); 1.b: Negative, so 1.c: User 1 uploads DES_t1(Data); 2.b: Positive, so 2.c: no data transmission from User 2.
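The tag derivation of this approach can be sketched in Go as follows. The figures use DES with t1 as the key; to keep the sketch runnable with a full 32-byte hash as the key, AES-GCM with a fixed nonce is substituted here, which is an assumption rather than the paper's exact construction. Passing the constant "1" gives the basic scheme, while passing an organization-wide secret gives the variant in which different organizations' tags never collide.

```go
package convergent

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/sha256"
)

// TagFromSecret derives the deduplication tag by encrypting a short shared
// value under t1 = H(data). The derivation is deterministic, so identical
// files (with the same shared value) always produce the same tag.
func TagFromSecret(data, value []byte) ([]byte, error) {
	t1 := sha256.Sum256(data) // t1 = H(data), also the file-encryption key
	block, err := aes.NewCipher(t1[:])
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize()) // fixed nonce keeps the tag deterministic
	return gcm.Seal(nil, nonce, value, nil), nil
}
```

Usage is, for example, `tag, err := TagFromSecret(fileBytes, []byte("1"))` for the basic scheme, or `TagFromSecret(fileBytes, orgSecret)` for the per-organization variant.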
IV. IMPLEMENTATION
As a proof of concept we have implemented Yugala. The Yugala cloud service is built on top of the Mystiko Blockchain, which is highly scalable. Since the Mystiko blockchain mainly targets big data, we were able to support IoT requirements with the Yugala cloud service. Here, we first discuss the underlying Mystiko Blockchain and then move on to the different components of Yugala and their interaction with Mystiko.
A. Mystiko
Mystiko is a highly scalable blockchain system targeted at big data. It utilizes the Apache Cassandra [17] distributed database, with Paxos consensus [21] as the underlying consensus mechanism. Mystiko uses Apache Kafka and Akka Streams [22] to handle back-pressure operations on big data. To facilitate full-text search on blockchain data, Mystiko utilizes the Apache Lucene [23] based Elasticsearch API [24]. The Mystiko Blockchain is built as a microservices-based [25] distributed system (see Fig. 8). These microservices can be deployed with Docker and Kubernetes in a highly scalable environment. Mystiko addresses three main performance bottlenecks of existing blockchain platforms, namely the order-execute architecture, full-node data replication and imperative-style smart contracts.
To address the issues of the traditional order-execute blockchain architecture [16], Mystiko provides a Redis cache based validate-execute-group architecture. This architecture validates and executes transactions as soon as a client submits them to the network; the client does not need to wait until a block has been created to commit the transaction. This new architecture gives the Mystiko Blockchain its high scalability and high transaction throughput.
Fig. 8. Mystiko blockchain architecture
All blocks, transactions and asset information are stored in Cassandra database tables in Mystiko. Since Cassandra is used as the asset storage, Mystiko can store large data payloads with the assets. In Mystiko, every blockchain peer comes with a Cassandra node; these nodes are connected with one another in a ring cluster architecture. After a transaction is executed, the state update in a peer is distributed and replicated using sharding with Cassandra's Paxos consensus algorithm. With this approach, Mystiko avoids the full-node replication issue of traditional blockchains.
As an alternative to the imperative-style smart contracts of existing blockchains, Mystiko introduces the Aplos smart contract platform. This platform is built using the Scala functional programming language [19] and Akka actors [20], which come with built-in concurrency control. With Akka actors and functional-programming-based concurrency control, the Aplos platform enables concurrent transaction execution. All transactions in Mystiko are executed using actors. Clients submit transactions to the Mystiko Aplos service via Apache Kafka [18]. A Mystiko transaction comes with an actor name and a message. The actor consumes the message; based on the message, it performs a write/update/query on the ledger and returns the response to the client.
B. Components in Yugala
The Yugala cloud storage is intended to support big data operations. Fig. 9 illustrates the Yugala cloud implementation architecture with the Mystiko Blockchain.
1) Mystiko storage service: The storage service is the Apache Cassandra based storage service of the Mystiko Blockchain. In Yugala, all file payloads and their respective tags are saved in the Mystiko storage service as blockchain assets. The Mystiko Blockchain uses sharding-based data replication instead of full-node replication, so when files are saved in Mystiko storage they are replicated to other blockchain nodes according to the sharding scheme.
Fig. 9. Yugala architecture
2) Mystiko Aplos service: As mentioned previously, the Mystiko Blockchain comes with a Scala functional programming based smart contract platform, introduced as Aplos. These smart contracts are written with Akka actors, so the platform is also referred to as the Aplos smart actor platform. All the business logic of a blockchain application is written on the Aplos service as smart actors. In Yugala, the data deduplication and encryption logic is written on the Aplos service as a smart actor.
3) Mystiko Kafka: Since the Mystiko Blockchain mainly targets big data, it uses Akka Streams and Kafka to handle back-pressure operations. In Yugala, we use this mechanism to handle back-pressure as well. All file upload transactions are received by Kafka; when a transaction arrives, it is consumed by the Aplos service, which then executes the corresponding smart actor and takes the necessary action.
4) Yugala API: The Yugala API is the front-end REST API to the Mystiko Blockchain in the Yugala platform. It exposes an HTTP API which accepts encrypted file payloads and tags. When it receives a request with a tag and file payload, it publishes this information to Apache Kafka; Kafka then delivers it to the Aplos service, which handles the further actions.
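The Go sketch below shows the shape of such a front-end handler: it accepts a tag and an encrypted payload over HTTP and hands them to a publisher that forwards them to Kafka. The /upload route, the JSON field names, the topic name and the Publisher interface are assumptions for illustration only; the actual Yugala API and its Kafka topic layout are not specified here.

```go
package api

import (
	"encoding/json"
	"net/http"
)

// Publisher abstracts the Kafka producer that feeds the Aplos smart actors.
type Publisher interface {
	Publish(topic string, message []byte) error
}

// uploadRequest is the assumed wire format: a hex tag plus the encrypted file.
type uploadRequest struct {
	Tag     string `json:"tag"`
	Payload []byte `json:"payload"` // base64-encoded by encoding/json
}

// UploadHandler receives an upload request and forwards it to Kafka; the
// deduplication decision itself is taken later by the smart actor.
func UploadHandler(pub Publisher) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var req uploadRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		msg, _ := json.Marshal(req)
		if err := pub.Publish("yugala-uploads", msg); err != nil {
			http.Error(w, "publish failed", http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusAccepted) // committed once the smart actor executes it
	}
}
```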
5) Yugala client: Clients upload encrypted files with their respective tags to the Yugala cloud. Clients need to generate keys/tags and encrypt the files based on these keys. We provide the Yugala-SDK to handle key/tag generation and encryption. A client invokes the Yugala SDK function with the file payload, and the SDK function returns the key, tag and encrypted payload. All key/tag generation and file encryption is handled by the SDK. Finally, the client uploads the encrypted file with its tag to the Yugala cloud via the Yugala API.
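End to end, a client using such an SDK might look like the Go sketch below: the convergent key, double-hash tag and ciphertext are produced locally and then posted to the Yugala API. The /upload endpoint, the JSON fields, the file name and the reuse of EncryptCE from the Section II-A sketch are all assumptions for illustration, not the published Yugala-SDK interface.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"net/http"
	"os"
)

func main() {
	// File produced by an IoT device (file name is illustrative).
	data, err := os.ReadFile("sensor-image.jpg")
	if err != nil {
		panic(err)
	}

	// SDK-side work: key/tag generation and encryption (Sections II-A, III-A).
	key := sha256.Sum256(data)         // convergent key t1 = H(data)
	tag := sha256.Sum256(key[:])       // double-hash tag t2 = H(t1)
	ct, err := EncryptCE(data, key[:]) // EncryptCE from the Section II-A sketch
	if err != nil {
		panic(err)
	}

	// Upload the tag and encrypted payload to the (assumed) Yugala API endpoint.
	body, _ := json.Marshal(map[string]interface{}{
		"tag":     hex.EncodeToString(tag[:]),
		"payload": ct,
	})
	resp, err := http.Post("http://yugala.example/upload", "application/json",
		bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// The convergent key stays with the client; it is needed later to decrypt
	// the file after downloading it from the Yugala cloud.
}
```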
V. RESULTS
We have carried out a performance evaluation of the Yugala cloud service. We deployed a multi-node Mystiko Blockchain cluster and the Yugala services on AWS 2xlarge instances (16 GB RAM and 8 CPUs) and obtained the results. The evaluation uses the Yugala double-hashing based system that we built on top of the Mystiko Blockchain. Results were obtained in the following five areas: transaction throughput, transaction scalability, search performance, double hashing performance, and encryption performance in the Yugala client.
Fig. 10. Number of concurrent transactions executed in the Mystiko Blockchain
A. Transaction throughput of Yugala cloud
For the transaction throughput, we recorded the number of transactions that can be executed by each Mystiko Blockchain peer in the Yugala cloud. We flooded each blockchain peer with transactions and recorded the number of executed transactions. As shown in Figure 10, each peer in the blockchain has a consistent transaction throughput. There are three main reasons for the high transaction throughput: the underlying Apache Cassandra database in Mystiko, the validate-execute-group architecture of Mystiko, and the functional programming based smart contracts.
B. Transaction scalability of Yugala cloud
In order to observe the transaction scalability, we recorded the number of executed transactions per second against the number of blockchain peers in the Yugala network. We flooded concurrent transactions into each blockchain peer and recorded the number of executed transactions. As shown in Figure 11, when we increase the number of blockchain peers, the transaction throughput increases linearly. The main reason for this linear scalability is the Cassandra storage on which the Mystiko Blockchain is based. Cassandra follows a masterless ring architecture, so all nodes in the cluster can handle writes; adding a node to the cluster therefore increases the transaction throughput roughly linearly.
C. Search performance of Yugala
Next, we evaluated the search performance of the Mystiko Blockchain based Yugala cloud service. For this test we bootstrapped the Mystiko Blockchain with different transaction sets and measured the time taken to search for a record. As shown in Figure 12, we were able to find a record in a set of two million transactions within 4 milliseconds. The Mystiko Blockchain uses Cassandra as its storage platform and Apache Lucene index-based Elasticsearch as its indexing tool. This Elasticsearch and Cassandra based storage yields very fast search in the Yugala cloud storage.
Fig. 11. The scalability of the Mystiko Blockchain
Fig. 12. Time taken to search the data from Mystiko Blockchain
D. Double hashing performance
The Yugala platform uses double-hashing based key generation. For this evaluation we recorded the number of double hashes that can be performed per second in the Golang [26] based Yugala client SDK. We ran SHA-1, SHA-2 and MD5 hashing against image files of 50 kB, 100 kB, 200 kB, 500 kB, 1000 kB and 2000 kB. Figure 13 shows how the double hashing performance varies with different file sizes and hashing algorithms.
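A measurement of this kind can be reproduced with a loop like the Go sketch below, which counts how many double hashes of a fixed payload complete in one second for MD5, SHA-1 and SHA-256 (taken here as a stand-in for the SHA-2 family). The exact harness used for Fig. 13 is not published, so this is only an illustrative setup.

```go
package main

import (
	"crypto/md5"
	"crypto/sha1"
	"crypto/sha256"
	"fmt"
	"hash"
	"time"
)

// doubleHashesPerSecond counts how many H(H(data)) operations finish in one second.
func doubleHashesPerSecond(newHash func() hash.Hash, data []byte) int {
	count := 0
	deadline := time.Now().Add(time.Second)
	for time.Now().Before(deadline) {
		h1 := newHash()
		h1.Write(data)
		t1 := h1.Sum(nil) // t1 = H(data)

		h2 := newHash()
		h2.Write(t1)
		h2.Sum(nil) // t2 = H(t1), the tag
		count++
	}
	return count
}

func main() {
	payload := make([]byte, 100*1024) // e.g. a 100 kB image payload
	algos := map[string]func() hash.Hash{
		"MD5":     md5.New,
		"SHA-1":   sha1.New,
		"SHA-256": sha256.New,
	}
	for name, newHash := range algos {
		fmt.Printf("%s: %d double hashes/s\n", name, doubleHashesPerSecond(newHash, payload))
	}
}
```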
E. Encryption performance
Finally, we evaluated the encryption performance of the Yugala client SDK. For this evaluation, we used the Golang based Yugala client SDK, which uses AES symmetric encryption in GCM mode. We ran the SDK's encryption function against different file sizes and recorded the number of encryptions that can be performed within a second. Figure 14 shows how the encryption throughput varies with different file sizes.
Fig. 13. Double hash performance of Yugala client SDK
Fig. 14. Encryption performance of Yugala client SDK
VI. CONCLUSION
In this paper, we presented a new encrypted cloud storage architecture called Yugala, which provides file confidentiality and integrity by using a modified convergent encryption and blockchain. In particular, we discussed two approaches for data confidentiality: one uses double hashing and the other symmetric encryption.
As a proof of concept, we implemented Yugala. The Yugala cloud service is built on top of the Mystiko Blockchain, which is a highly scalable blockchain. Since the Mystiko blockchain mainly targets big data, we can support IoT requirements with the Yugala cloud service. The results section demonstrates the performance of the architecture, highlighting the high transaction throughput requirements of the IoT setting with respect to the size of the uploaded files and the load on the blockchain.
REFERENCES
[1] C.-M. Yu, S. P. Gochhayat, M. Conti, and C.-S. Lu, “Privacy aware data
deduplication for side channel in cloud storage,” IEEE Transactions on
Cloud Computing, 2018.
[2] D. Geer, “Reducing the storage burden via data deduplication,” Computer, vol. 41, no. 12, pp. 15–17, 2008.
[3] H. Shin, D. Koo, Y. Shin, and J. Hur, “Privacy-preserving and updatable
block-level data deduplication in cloud storage services,” in 2018 IEEE
11th International Conference on Cloud Computing (CLOUD). IEEE,
2018, pp. 392–400.
[4] H. Jannati, E. Ardeshir-Larijani, and B. Bahrak, “Privacy in cross-user
data deduplication,” Mobile Networks and Applications, pp. 1–13, 2018.
[5] Y. Zhou, D. Feng, W. Xia, M. Fu, and Y. Xiao, “Darm: A deduplication-
aware redundancy management approach for reliable-enhanced storage
systems,” in International Conference on Algorithms and Architectures
for Parallel Processing. Springer, 2018, pp. 445–461.
[6] Z. Pooranian, K.-C. Chen, C.-M. Yu, and M. Conti, “Rare: Defeating side channels based on data-deduplication in cloud storage,” in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE, 2018, pp. 444–449.
[7] M. Liu, C. Yang, Q. Jiang, X. Chen, J. Ma, and J. Ren, “Updatable block-
level deduplication with dynamic ownership management on encrypted
data,” in 2018 IEEE International Conference on Communications
(ICC). IEEE, 2018, pp. 1–7.
[8] J. Fan, C. Guan, K. Ren, and C. Qiao, “Middlebox-based packet-
level redundancy elimination over encrypted network traffic,” IEEE/ACM
Transactions on Networking (TON), vol. 26, no. 4, pp. 1742–1753, 2018.
[9] K. D. Bowers, A. Juels, and A. Oprea, “Proofs of retrievability: Theory
and implementation,” in Proceedings of the 2009 ACM workshop on
Cloud computing security. ACM, 2009, pp. 43–54.
[10] F. Casino, T. K. Dasaklis, and C. Patsakis, “A systematic literature review
of blockchain-based applications: current status, classification and open
issues,” Telematics and Informatics, 2018.
[11] E. Bandara, W. K. Ng, K. De Zoysa, N. Fernando, S. Tharaka, P. Maurakirinathan, and N. Jayasuriya, “Mystiko: blockchain meets big data,” in 2018 IEEE International Conference on Big Data (Big Data). IEEE, 2018, pp. 3024–3032.
[12] D. Tosh, S. Shetty, P. Foytik, C. Kamhoua, and L. Njilla, “Cloudpos:
A proof-of-stake consensus design for blockchain integrated cloud,”
in 2018 IEEE 11th International Conference on Cloud Computing
(CLOUD). IEEE, 2018, pp. 302–309.
[13] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2008.
[14] V. Buterin et al., “A next-generation smart contract and decentralized
application platform,” white paper, 2014.
[15] T. McConaghy, R. Marques, A. Müller, D. De Jonghe, T. McConaghy, G. McMullen, R. Henderson, S. Bellemare, and A. Granzotto, “Bigchaindb: a scalable blockchain database,” white paper, BigChainDB, 2016.
[16] E. Androulaki, A. Barger, V. Bortnikov, C. Cachin, K. Christidis,
A. De Caro, D. Enyeart, C. Ferris, G. Laventman, Y. Manevich et al.,
“Hyperledger fabric: a distributed operating system for permissioned
blockchains,” in Proceedings of the Thirteenth EuroSys Conference.
ACM, 2018, p. 30.
[17] A. Lakshman and P. Malik, “Cassandra: a decentralized structured
storage system,” ACM SIGOPS Operating Systems Review, vol. 44, no. 2,
pp. 35–40, 2010.
[18] J. Kreps, N. Narkhede, J. Rao et al., “Kafka: A distributed messaging
system for log processing,” in Proceedings of the NetDB, 2011, pp. 1–7.
[19] “The scala programming language.” [Online]. Available:
https://www.scala-lang.org/
[20] “Akka documentation.” [Online]. Available:
https://doc.akka.io/docs/akka/2.5/actors.html
[21] L. Lamport, “The part-time parliament,” ACM Transactions on Computer
Systems (TOCS), vol. 16, no. 2, pp. 133–169, 1998.
[22] “Akka documentation.” [Online]. Available:
https://doc.akka.io/docs/akka/2.5/stream/
[23] “Welcome to apache lucene.” [Online]. Available:
http://lucene.apache.org/
[24] “Elastic stack and product documentation — elastic.” [Online].
Available: https://www.elastic.co/guide/index.html
[25] “Microservices pattern: Microservice architecture pattern.” [Online].
Available: http://microservices.io/patterns/microservices.html
[26] “The go programming language.” [Online]. Available: https://golang.org/