Received: November 18, 2017
International Journal of Intelligent Engineering and Systems, Vol.11, No.2, 2018 DOI: 10.22266/ijies2018.0430.18
Popularity (Hit Rate) Based Replica Creation for Enhancing the Availability
in Cloud Storage
S. Annal Ezhil Selvi1* R. Anbuselvi1
1Department of Computer Science, Bishop Heber College,
Trichy, Tamilnadu, 620017, India
* Corresponding author’s Email: ezhilabel.bhc@gamil.com
Abstract: In cloud computing, replication management is widely adopted in cloud storage applications. To provide availability and reliability, the replication system copies each file and stores the copies on different servers. However, this leads to complications such as high memory consumption, high storage cost, and more complex file access in recent cloud storage applications. In the existing technique, the File Accessing Frequency based Ranking (FAFR) algorithm and the Dynamically Reduced Replica for Rarely Accessed files (DRRRA) algorithm work jointly to identify rarely accessed files, retain their replicas on two servers, and delete the remaining replicas. Serving a large number of requests with only 2 or 3 replicas, however, remains a difficult problem. Thus, this paper proposes a Dynamic Replica Creation for Availability Enhanced Storage (DRCAES) algorithm that works jointly with the FAFR algorithm to predict the most frequently accessed files and automatically replicate them to other servers based on server memory. The aim of the proposed approach is to enhance availability and thereby reduce the request-response delay time. The proposed approach thus optimizes the number of replicas, the occupied space, and the cost.
Keywords: Cloud storage, Replication, Reduce replica, Dynamic replica, File popularity, File accessing frequency.
1. Introduction
Microsoft Azure, Amazon, and Google Cloud Storage (GCS), as leading cloud service providers, offer different types of storage (e.g., sequences of files) at different prices for data storage services. Storing and accessing files on redundant storage are difficult issues. Each cloud service provider also supplies and monitors commands to retrieve, store, and delete data through network services, which impose in-network and out-network delay and cost on an application [1]. For leading cloud service providers, in-network traffic is free, while out-network traffic (the network cost for accessing) is charged and may differ between providers. Transferring or replicating data from one server to another therefore shows significant price differences among providers. This diversity plays an essential role in optimizing data management request-response time and delay in cloud environments [2]. The proposed technique aims at optimizing this request-response time and delay, which consist of residential cost (i.e., storage) and potential migration cost (i.e., network cost).
Many cloud applications are moving towards distributed, interconnected network environments. In such environments, data storage and all computational cloud resources are distributed across different and widespread locations based on ranking.
A cloud server stores data that may be needed by a huge number of users requiring access to large data volumes. For example, consider a set of documents, images, or videos that needs to be read and accessed by many users spread worldwide in a distributed way. Access to vast data volumes by a huge number of users can be very time consuming, and as the size of the system increases, providing such a data service becomes more difficult, since users suffer from long delays in data access.
Replica management for cloud storage is a young research field, and dynamic replication policies are still rare. This paper focuses on the replica creation approach of early distributed file systems and, combined with the characteristics of cloud storage, designs a dynamic replication strategy based on prediction. It creates a replica according to the file access history, so that users can access a nearby copy. This moves the cloud system toward an optimal state through ranking: the number of replicas is minimized, access efficiency is maximized, and network lifetime is increased.
A fixed number of replicas for every file is insufficient: it cannot give fast reads for hot files, while it wastes resources storing replicas of cold files. Random selection of the replica destination requires keeping all data centers active to ensure data availability, which wastes power. Because random selection does not consider bandwidth or request-handling capacity, network congestion can occur due to capacity restrictions on some links, and servers may become overloaded by data requests.
In this environment, data replication is essential so that users can retrieve the most requested data from storage residing in a nearby server. The replication is based on server memory [3]. The performance of distributed networks is crucially affected by the replication strategy used. The vast majority of known replication strategies determine the replicas by computing a simple ranking based on the number of requests for each file on each cloud server. The "most accessed" files, the ones with the highest ranking value, are selected for replication to other servers based on memory.
With this ranking technique it is quite likely that files with high recent demand will be requested again on the cloud server, with even higher hit rates. Since the computation is quite simple, the strategy mainly focuses on the problem of selecting the most suitable files for storing the replicas based on memory [4].
The two main drawbacks of previously proposed strategies are related to the two techniques implemented in this work: the first is the optimization process and the second is the high-prediction ranking algorithm. The schemes presented in the literature do not take into account the change that might occur in users' interest in certain files. Instead, they mostly rely on one or more factors that describe the file itself, such as the file size, the number of requests for a file, or the contents of a file. The user request-response time and the files are analysed by the optimization process [5].
In the second process, user requests are analysed to determine which files receive the most hits on each server. The existing replication decision algorithm satisfies the required request-response time, but replicating the most-hit files remains complicated [5]. The hit ratio is calculated and monitored by the high-prediction ranking algorithm, which identifies the most-hit files; these are then replicated to other servers based on server memory. The replication process implemented on a server automatically reduces its delay and request-response time. With this approach, the cloud storage system acts as a Role-Based Intelligent System (RBIS), so that users get efficient data storage in the cloud computing environment.
Some of the roles of the DRCAES approach are listed below:
- Dynamically predict rarely accessed and most frequently accessed files using the FAFR model.
- Through DRRRA, dynamically reduce the number of replicas of rarely accessed files, so that cost and occupied space are minimized. For replica reduction, it removes the replica from the DC with the minimum available space among the DCs where that file exists (removal).
- In DRCAES, dynamically create and place a new replica for the most accessed file if all replicas of the file are accessed with equal frequency; otherwise the file is not replicated. The new replica is placed in the data center that has the most available space and does not already hold that file.
- Balanced storage is retained during removal and new replica placement, because the approach analyses all aspects of the storage system, such as available space, SLA, and the accessing frequency of all existing replicas.
Existing work is discussed in Section 2, and the overall background is described in Section 3. Section 4 presents the proposed work, which improves the overall request time and reduces delay using the high-prediction ranking algorithm. Section 5 discusses the results, and Section 6 concludes the paper.
2. Related works
Addressing the data replication issue effectively requires a closer look at the most common services and applications deployed on storage clouds to provide services to other parties. Such applications are usually implemented as multi-tier applications running on distributed software systems, and when a user issues a request to a multi-tier application, the overall response time can be very high. Storage strategies so far consider the number of requests as the main hit measure for computing the popularity of each file [6].
Hit analysis exposes the limitations of current research on data replication in cloud servers: the studies are either theoretical investigations without realistic considerations or heuristics-based implementations without a provable performance guarantee. The work most directly related to this replication work treats data replication and request-response on cloud servers as a static optimization problem over user access [7]. The authors show that this problem is NP-hard, meaning that no polynomial-time algorithm provides an exact solution, and they consider only static data replication for the sake of tractable analysis. The limitation of the static approach is that the replication cannot adjust to dynamically shifting user access patterns. Additionally, their centralized integer programming technique cannot easily be implemented in a distributed cloud server.
For request-response and resource sharing, an auction protocol has been used to make the replication choice and to trigger long-term optimization based on file access patterns, leading to utility-based replication strategies on cloud servers [8, 9]. That line of work addresses data replication for availability in the face of unreliable nodes, which differs from the optimization pursued here.
Random selection of the replica destination neglects server heterogeneity (i.e., different data centers vary in data-request-handling and network capacities). In production clusters of a search-engine application, writes caused by replica creation account for almost half of all cross-rack traffic. While the network within clusters is frequently underutilized, some links become traffic bottlenecks because of imbalanced network usage [10].
For the multi-facility cloud resource allocation problem, solutions that are amenable to parallel implementation are of main interest, for several reasons. First, cloud resource allocation is inherently a large convex resource optimization problem, with millions of variables or more, and a centralized cloud resource allocation solver is extremely inefficient at solving such large-scale cloud storage problems [11].
3. Background process
Cloud services, such as search engines, education portals, parallel applications, and social networking, are often deployed on a geographically spread infrastructure, i.e., data centers placed in different regions, for better performance and reliability. A usual question is then how to direct the workload from users across the set of geo-distributed data centers in order to achieve a desired trade-off between performance and delay, since power prices exhibit a significant degree of geographical diversity [12]. This question has attracted much attention recently and is usually referred to as geographical cloud server load balancing.
Resource-scheduling-based data replication focuses on scheduling embarrassingly parallel workloads, which are composed of sets of independent tasks with very little or no data synchronization. A huge number of applications fit this type of resource sharing on cloud storage; examples include distributed relational database queries, search engine queries, BLAST searches, data processing, and image processing applications such as ray tracing. To execute an embarrassingly parallel workload, each of its tasks is placed on a physical server and executed independently. The completion time of such a resource request is the completion time of its last finished task, i.e., the makespan of that set of requests [13].
The conventional data caching/replication problem has been considered extensively in the context of the Web, distributed cloud databases, and multimedia systems. What differs from Web caching is that disk memory and I/O bandwidth are the main concerns in multimedia storage systems. A number of algorithms have been proposed to attain high acceptance rates and resource utilization by balancing the use of different request-response resources. Unlike Web search engine and multimedia data, database contents are accessed by both read and write operations, which matters for the optimization and ranking processes [14].
It is assumed that files frequently accessed in the past will be accessed more than others in the future; this is called high-prediction ranking based on temporal locality. With the property of sequential locality, the most popular data file is determined by analysing the number of accesses to the data files from users. After discovering the most popular file, we trace the client that produces the most requests for it, and a new replica is placed there. Therefore, the application has to collect a history of records about end-to-end data transfers in order to decide which file should be replicated [15].
4. Proposed framework and algorithm
Dynamic replication of data is a significant issue, since access frequencies to individual data items are likely to change in most cloud server environments. The aim is to make the replication strategy adapt rapidly and accurately to changes and to achieve an optimal ranking process for long-term performance.
In [12], the authors establish that, to take advantage of cached data, it is sometimes necessary to process individual queries using "suboptimal" plans in order to reach high system performance. There, data replication is triggered by changes in request rates on the cloud server.
4.1 Dynamic replica creation for availability enhanced storage framework
The proposed contribution uses an optimization-based high-prediction ranking algorithm. The ranking algorithm (FAFR) gives better performance in request-response time and delay. 'N' users submit requests and access files on the cloud server; the cloud server performs user identification and request queuing to determine which files are requested. Fig. 1 shows the architecture of the optimization-based ranking, from input request to response. The user submits a request, and the cloud analyses which files the user requests. The server views the files by ranking (based on popularity). For example, if the file java.doc is the one most often requested and downloaded by users, java.doc is replicated to another server chosen based on server memory.
We design data-sharing-aware optimization algorithms for solving the resource request problem. Before presenting the algorithms, we establish a few definitions and assumptions concerning the cloud servers. The data centers managed by the cloud service provider are in one of two states: active (available) or replicating. An active cloud server is powered on and is currently considered for resource allocation by the algorithms. A replicating server is one whose most frequently requested files are considered for replication to other servers, based on memory allocation, by the algorithms. We denote by Fi the set of most frequently accessed files and by Si the set of data centers. All the cloud data centers host files, and the most frequently accessed files are replicated by the ranking-based technique.
Figure.1 Proposed architecture
Data centers are configured and located in different geo-locations, and files are stored in those configured data centers. The previous work of this research proposed the Dynamically Reduced Replica for Rarely Accessed file (DRRRA) algorithm [15], which reduces the replicas of unpopular (least accessed) files identified by the ranking algorithm (File Accessing Frequency based Ranking) [14]. The ranking algorithm predicts files based on their popularity. This paper, in contrast, focuses on the most frequently accessed file, i.e., the most popular file: if user demand for a file increases, that file is replicated dynamically.
4.2 Mathematical model for replica creation and
placement
The following notation and equations are used in the prediction of frequently accessed files and in the replica-increasing process.
- $m$: number of uploaded files.
- $n$: number of Data Centers (DCs).
- $DC_i$: the $i$th DC, $i = 1,\dots,n$.
- $F_j$: the $j$th file, $j = 1,\dots,m$.
- $RAF_{i,j}$: Replica Accessing Frequency (RAF, the hit rate) of file $F_j$ in $DC_i$, collected in the following $n \times m$ matrix:
$$RAF = \begin{bmatrix} RAF_{1,1} & RAF_{1,2} & \cdots & RAF_{1,m} \\ RAF_{2,1} & RAF_{2,2} & \cdots & RAF_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ RAF_{n,1} & RAF_{n,2} & \cdots & RAF_{n,m} \end{bmatrix}$$
- $FAF_j$: File Accessing Frequency of file $F_j$, computed using Eq. (1) as
$$FAF_j = \sum_{i=1}^{n} RAF_{i,j}, \quad j = 1,\dots,m. \tag{1}$$
- $TFL_{i,j}$: Total File List (TFL) entry for file $F_j$ in $DC_i$, $i = 1,\dots,n$, $j = 1,\dots,m$.
- $AFL_{i,j}$: Allocated File List (AFL), the TFL entries with $RAF_{i,j} > 0$.
- $NAFL_{i,j}$: Non-Allocated File List, the TFL entries with $RAF_{i,j} = 0$.
- $TFLDC_i$: Total File List of data center $DC_i$.
- $AFLDC_i$: Allotted File List of data center $DC_i$, i.e., the files of $TFLDC_i$ with $RAF_{i,j} > 0$.
- $DCC_i$: Data Center Capacity (DCC) of $DC_i$, $i = 1,\dots,n$.
- $FS_j$: size of file $F_j$, $j = 1,\dots,m$.
- $OS_i$: Occupied Space of $DC_i$, calculated using Eq. (2) as
$$OS_i = \sum_{j \in AFLDC_i} FS_j, \quad i = 1,\dots,n. \tag{2}$$
- $AS_i$: Available Space of $DC_i$, calculated using Eq. (3) as
$$AS_i = DCC_i - OS_i, \quad i = 1,\dots,n. \tag{3}$$
- $FA\_Th$: threshold value for frequently accessed files.
- $Th\_LL$: lower limit of the threshold range.
- $Th\_UL$: upper limit of the threshold range.
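To make the notation concrete, here is a minimal Python sketch (our illustration, not code from the paper) that evaluates Eqs. (1)-(3) on the RAF matrix, file sizes, and 5 GB capacities that appear later in Table 1; the variable names are ours.

```python
# Minimal sketch of Eqs. (1)-(3), using the Table 1 data.
# Rows = files 1..5, columns = DC1..DC7; values are hit rates (RAF).
RAF = [
    [0, 14, 0, 20, 0, 0, 15],  # file 1: Array_Java
    [0,  0, 1,  0, 0, 1,  0],  # file 2: Tree_ds
    [0,  8, 0, 10, 6, 0,  0],  # file 3: Img_001
    [3,  0, 0,  2, 0, 0,  0],  # file 4: CS_C
    [0, 13, 0,  0, 0, 0, 19],  # file 5: HelloEnglish
]
FS = [0.07, 0.11, 0.31, 0.55, 1.0]   # file sizes in GB
DCC = [5.0] * 7                      # each DC has 5 GB capacity

# Eq. (1): FAF_j is the sum of file j's hit rates over all DCs.
FAF = [sum(row) for row in RAF]

# Eq. (2): OS_i sums the sizes of files allocated (RAF > 0) in DC_i.
OS = [sum(FS[j] for j in range(5) if RAF[j][i] > 0) for i in range(7)]

# Eq. (3): AS_i is the remaining capacity of DC_i.
AS = [DCC[i] - OS[i] for i in range(7)]

print(FAF)                          # [49, 2, 24, 5, 32]
print([round(o, 2) for o in OS])    # [0.55, 1.38, 0.11, 0.93, 0.31, 0.11, 1.07]
print([round(a, 2) for a in AS])
```

The resulting FAF values (49, 2, 24, 5, 32) match the FAF column of Table 2, and the OS and AS values match Tables 1 and 2 once rounded to one decimal place.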
In the prediction process, two levels of prediction are performed to find the files that genuinely need more replicas to meet the availability-enhancement requirements.
In the first-level prediction, the most frequently accessed files are identified using Eq. (4), based on the FAF values of the FAFR model: a file $F_j$ is selected if
$$FAF_j \ge FA\_Th. \tag{4}$$
The second-level prediction is performed on the result of the first level. From the set of most frequently accessed files, it finds the files that are really required to provide seamless availability, using Eq. (5) based on the RAF values of the FAFR model: a new replica of $F_j$ is created and placed in the DC with the maximum available space among the DCs not holding the file ($NAFL_j$), provided the hit rate of every existing replica (the DCs in $AFL_j$) lies within the threshold range:
$$DC_{target} = \underset{i \in NAFL_j}{\arg\max}\, AS_i \quad \text{if } Th\_LL \le RAF_{i,j} \le Th\_UL \;\; \forall i \in AFL_j. \tag{5}$$
To be exact: the frequency of each replica is checked, and if the replicas are equally accessed the file is replicated; otherwise it is not. The replica placement is then based on the available space of the data centers: the dynamically created replica is placed in the data center that has the most available space and does not already hold that file.
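As a compact illustration of the two-level prediction (our own sketch, with names we chose), the checks of Eqs. (4) and (5) with the validation thresholds used later (FA_Th = 15, Th_LL = 10, Th_UL = 20) can be written as:

```python
FA_TH, TH_LL, TH_UL = 15, 10, 20   # validation values used in Section 4.4

def needs_replica(j, RAF, FAF):
    """Two-level prediction: Eq. (4) on FAF_j, then the Eq. (5) range
    check on the hit rate of every existing replica of file j."""
    if FAF[j] < FA_TH:                               # level 1, Eq. (4)
        return False
    hits = [h for h in RAF[j] if h > 0]              # RAFs of existing replicas
    return all(TH_LL <= h <= TH_UL for h in hits)    # level 2, Eq. (5)
```

On the Table 2 data, files 1 and 5 pass both levels (hit rates 14/20/15 and 13/19 all lie within [10, 20]), while file 3 fails because its hit rates 8, 10, and 6 are not all in range.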
4.3 File accessing frequency based ranking
(FAFR) algorithm
FAFR is a ranking algorithm that was proposed in [14]; in this paper it is given a new name along with some refinement. $RAF_{i,j}$ is the Replica Accessing Frequency (hit rate). Initially, $RAF_{i,j}$ is 0; when the file is uploaded, the value becomes 1, and whenever the file is accessed, the value is incremented. Finally, the summed value is calculated and taken as the File Accessing Frequency.
Input: F_j, DC_i, M(), k = 0, RAF_{i,j} = 0, q
Output: Rank {q's result set}
1. Run in the data centers whenever the user interface is triggered
2. If file upload:
3.    RAF_{i,j} = 1
4. If file access:
5.    RAF_{i,j} = RAF_{i,j} + 1
6. For each F_j do
7.    For each DC_i do
8.       k = k + RAF_{i,j}
9. End
10. Rank.insert(all FAF values)
11. If Rank = q
12.    Return Rank
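The following is a minimal Python sketch of this bookkeeping (our illustration; the class and method names are ours): counters start at 0, an upload sets a counter to 1, each access increments it, and the per-file sum is the FAF used for ranking.

```python
from collections import defaultdict

class FAFR:
    """Hit-rate bookkeeping sketch for the FAFR ranking model."""
    def __init__(self):
        self.raf = defaultdict(int)     # (dc, file) -> RAF hit counter

    def on_upload(self, dc, file):
        self.raf[(dc, file)] = 1        # upload sets the counter to 1

    def on_access(self, dc, file):
        self.raf[(dc, file)] += 1       # every access increments the counter

    def faf(self, file):
        """Eq. (1): sum the file's RAF over all DCs."""
        return sum(v for (dc, f), v in self.raf.items() if f == file)

    def rank(self, files, q=None):
        """Files ordered by FAF, most frequently accessed first; top q if given."""
        ordered = sorted(files, key=self.faf, reverse=True)
        return ordered[:q] if q else ordered
```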
Figure.2 Working principle of DRRRA
Fig. 2 shows the working principle of Dynamically Reduced Replica for Rarely Accessed files (DRRRA), which is described in [15]. FAFR and DRRRA work jointly: they predict rarely accessed files and then reduce their replicas to a 2-replica strategy. Here the minimum is 2 replicas, to assure reliability, and the maximum is a 3-replica strategy.
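For comparison, here is a sketch of the DRRRA-style reduction step (our illustration; DRRRA itself is specified in [15]): while a rarely accessed file holds more copies than the 2-replica floor, the copy in the DC with the minimum available space is removed.

```python
def reduce_replicas(file, holders, AS, min_replicas=2):
    """DRRRA-style reduction sketch: while a rarely accessed file has more
    than min_replicas copies, drop the copy in the DC with minimum AS."""
    holders = list(holders)              # DCs currently holding the file
    removed = []
    while len(holders) > min_replicas:
        victim = min(holders, key=lambda dc: AS[dc])
        holders.remove(victim)
        removed.append(victim)
    return holders, removed
```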
4.4 Dynamic replica creation for availability
enhanced storage (DRCAES) algorithm
The following DRCAES algorithm dynamically finds the most frequently accessed files using FAFR [14]. Then, based on the FAFR result, DRCAES dynamically creates a new replica and places it on the DC that has the maximum available space (AS) and does not already hold that file's replica.
Input: F_i, DC_i, NAL_k, RAF_{i,j}, FA_Th, Th_LL, Th_UL, DC, Replica
Output: Dynamically increased replica.
1. Set values for FA_Th, Th_LL, and Th_UL
2. If user access:
3.    FAF ← getFAF()
4.    If FAF >= FA_Th then
5.       F_i ← getHighRankFiles()
6.       DC_i ← getHighRankFilesDC()
7.       NAL_k ← get_allNonAllocatedList()
8.       For each F_i do
9.          For each DC_j do
10.            RAF_{i,j} ← getRAF(F_i, DC_j)
11.            If (RAF_{i,j} >= Th_LL and RAF_{i,j} <= Th_UL) then
12.               Replica = copyof(F_i)
13.               DC = getDCid(max(AS_i ← getAvailableSpace(NAL_k)))
14.               REPLICATE(DC, Replica)
15.         End
16.      End
In the DRCAES algorithm, step (3) obtains the most frequently accessed files. Every user access is monitored, and when the FAF reaches the FA_Th value, the second-level prediction is performed on the RAF values: every RAF must satisfy the range specified in step (11). At the end of this check, the files in the result set are those required to increase their number of replicas; this is done based on the SLA specification and the available space of the DCs, as defined in steps (12)-(13).
Step (11) verifies whether the file needs to be replicated or not: Th_LL is the lower limit and Th_UL the upper limit of the threshold range, and together they decide whether to replicate. Steps (12)-(14) then execute according to step (11)'s decision.
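Putting the steps together, a compact Python sketch of one DRCAES pass (our illustration, not the authors' implementation; it reuses the list layout of the earlier sketches):

```python
def drcaes_step(files, RAF, FS, AS, FAF, fa_th=15, th_ll=10, th_ul=20):
    """One DRCAES pass: for every hot file whose replicas are all accessed
    within [th_ll, th_ul], place one new replica in the non-holding DC
    with maximum available space, then update the metadata."""
    for j in files:
        if FAF[j] < fa_th:                                   # Eq. (4)
            continue
        hits = [h for h in RAF[j] if h > 0]
        if not all(th_ll <= h <= th_ul for h in hits):       # Eq. (5) range check
            continue
        candidates = [i for i in range(len(AS)) if RAF[j][i] == 0]
        if not candidates:
            continue                                         # file already everywhere
        target = max(candidates, key=lambda i: AS[i])        # max-AS placement
        RAF[j][target] = 1                                   # new replica starts at 1
        AS[target] -= FS[j]                                  # its size is now occupied
```

Run on the Table 2 data, this sketch replicates file 1 into DC3 and file 5 into DC6, reproducing the placements described in the worked examples below.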
Example:
Worked examples of the DRCAES algorithm are presented in Tables 1 to 6 and discussed below one by one. After this discussion, the reader can easily follow the concept of our approach.
The prediction of frequently accessed files based on FAF is represented in Table 1. The FA_Th value used for validation is 15: when a file's FAF reaches 15, the file comes under consideration. For example, using Eq. (4), files 1, 3, and 5 are under consideration. Table 1 shows seven data centers, each configured with 5 GB of memory and located in a different geo-location, and five different files stored among the 7 DCs.
Next, these files are verified by the second-level prediction process, i.e., the prediction of highly needed files based on RAF, which is explained in Table 2. The second-level prediction is done using Eq. (5), which checks the frequency of each individual replica; if all replicas are accessed equally, the file is replicated in one more data center.
In Eq. (5), two threshold values are involved. The first, Th_LL, is set to 10 for validation; the second, Th_UL, is set to 20. File 3's replicas reside in DC2, DC4, and DC5, with RAF values of 8, 10, and 6, respectively. These are not all within the range [Th_LL, Th_UL], so this file need not be replicated; the other two files, 1 and 5, need to be replicated because they satisfy the range condition.
Table 3 depicts the replica placement process described in Eq. (5). File 1's replicas reside in DC2, DC4, and DC7, with RAF values of 14, 20, and 15, respectively. Replica placement is based on the available space (AS) of the data centers that do not hold a replica of the file. Here, DC1, DC3, DC5, and DC6 do not hold a replica of file 1, and their AS values are 4.5, 4.9, 4.7, and 4.8, respectively.
DC3 has the maximum AS among these DCs, so the replica is placed in DC3; the resulting changes in the metadata are shown in Table 4.
Table 1. Prediction of frequently accessed files based on FAF (each DC has a 5 GB capacity; the DC columns give the Replica Accessing Frequency RAF_{i,j}, i.e., the hit rate)

S.No | File Name    | File Type | NR | File Size (GB) | DC1 | DC2 | DC3 | DC4 | DC5 | DC6 | DC7
1    | Array_Java   | docx      | 3  | 0.07           | 0   | 14  | 0   | 20  | 0   | 0   | 15
2    | Tree_ds      | pdf       | 2  | 0.11           | 0   | 0   | 1   | 0   | 0   | 1   | 0
3    | Img_001      | jpeg      | 3  | 0.31           | 0   | 8   | 0   | 10  | 6   | 0   | 0
4    | CS_C         | mp3       | 2  | 0.55           | 3   | 0   | 0   | 2   | 0   | 0   | 0
5    | HelloEnglish | mp4       | 2  | 1              | 0   | 13  | 0   | 0   | 0   | 0   | 19
OS (Occupied Space in GB)                             | 0.5 | 1.4 | 0.1 | 0.9 | 0.3 | 0.1 | 1.0
AS (Available Space in GB)                            | 4.5 | 3.6 | 4.9 | 4.1 | 4.7 | 4.9 | 4.0
Table 2. Prediction of highly needed files based on RAF (each DC has a 5 GB capacity; the DC columns give RAF_{i,j})

S.No | File Name    | File Type | NR | File Size (GB) | DC1 | DC2 | DC3 | DC4 | DC5 | DC6 | DC7 | FAF
1    | Array_Java   | docx      | 3  | 0.07           | 0   | 14  | 0   | 20  | 0   | 0   | 15  | 49
2    | Tree_ds      | pdf       | 2  | 0.11           | 0   | 0   | 1   | 0   | 0   | 1   | 0   | 2
3    | Img_001      | jpeg      | 3  | 0.31           | 0   | 8   | 0   | 10  | 6   | 0   | 0   | 24
4    | CS_C         | mp3       | 2  | 0.55           | 3   | 0   | 0   | 2   | 0   | 0   | 0   | 5
5    | HelloEnglish | mp4       | 2  | 1              | 0   | 13  | 0   | 0   | 0   | 0   | 19  | 32
OS (Occupied Space in GB)                             | 0.5 | 1.4 | 0.1 | 0.9 | 0.3 | 0.1 | 1.0 |
AS (Available Space in GB)                            | 4.5 | 3.6 | 4.9 | 4.1 | 4.7 | 4.9 | 4.0 |
Table 3. Ex. 1: RAF based replica creation (before placement)

S.No | File Name    | File Type | NR | File Size (GB) | DC1 | DC2 | DC3 | DC4 | DC5 | DC6 | DC7 | FAF
1    | Array_Java   | docx      | 3  | 0.07           | 0   | 14  | 0   | 20  | 0   | 0   | 15  | 49
2    | Tree_ds      | pdf       | 2  | 0.11           | 0   | 0   | 1   | 0   | 0   | 1   | 0   | 2
3    | Img_001      | jpeg      | 3  | 0.31           | 0   | 8   | 0   | 10  | 6   | 0   | 0   | 24
4    | CS_C         | mp3       | 2  | 0.55           | 3   | 0   | 0   | 2   | 0   | 0   | 0   | 5
5    | HelloEnglish | mp4       | 2  | 1              | 0   | 13  | 0   | 0   | 0   | 0   | 19  | 32
OS (Occupied Space in GB)                             | 0.5 | 1.4 | 0.1 | 0.9 | 0.3 | 0.1 | 1.0 |
AS (Available Space in GB)                            | 4.5 | 3.6 | 4.9 | 4.1 | 4.7 | 4.8 | 4.0 |
Table 4. Ex. 1: RAF based replica creation (after placement)

S.No | File Name    | File Type | NR | File Size (GB) | DC1 | DC2 | DC3 | DC4 | DC5 | DC6 | DC7 | FAF
1    | Array_Java   | docx      | 4  | 0.07           | 0   | 14  | 1   | 21  | 0   | 0   | 15  | 49+1
2    | Tree_ds      | pdf       | 2  | 0.11           | 0   | 0   | 1   | 0   | 0   | 1   | 0   | 2
3    | Img_001      | jpeg      | 3  | 0.31           | 0   | 8   | 0   | 10  | 6   | 0   | 0   | 24
4    | CS_C         | mp3       | 2  | 0.55           | 3   | 0   | 0   | 2   | 0   | 0   | 0   | 5
5    | HelloEnglish | mp4       | 2  | 1              | 0   | 13  | 0   | 0   | 0   | 0   | 19  | 32
OS (Occupied Space in GB)                             | 0.5 | 1.4 | 0.4 | 0.9 | 0.3 | 0.1 | 1.0 |
AS (Available Space in GB)                            | 4.5 | 3.6 | 4.6 | 4.1 | 4.7 | 4.9 | 4.0 |
Table 5. Ex. 2: RAF based replica creation (before placement)

S.No | File Name    | File Type | NR | File Size (GB) | DC1 | DC2 | DC3 | DC4 | DC5 | DC6 | DC7 | FAF
1    | Array_Java   | docx      | 4  | 0.07           | 0   | 14  | 1   | 20  | 0   | 0   | 15  | 49+1
2    | Tree_ds      | pdf       | 2  | 0.11           | 0   | 0   | 1   | 0   | 0   | 1   | 0   | 2
3    | Img_001      | jpeg      | 3  | 0.31           | 0   | 8   | 0   | 10  | 6   | 0   | 0   | 24
4    | CS_C         | mp3       | 2  | 0.55           | 3   | 0   | 0   | 2   | 0   | 0   | 0   | 5
5    | HelloEnglish | mp4       | 2  | 1              | 0   | 13  | 0   | 0   | 0   | 0   | 19  | 32
OS (Occupied Space in GB)                             | 0.5 | 1.4 | 0.4 | 0.9 | 0.3 | 0.1 | 1.0 |
AS (Available Space in GB)                            | 4.5 | 3.6 | 4.6 | 4.1 | 4.7 | 4.9 | 4.0 |
Table 6. Ex. 2: RAF based replica creation (after placement)

S.No | File Name    | File Type | NR | File Size (GB) | DC1 | DC2 | DC3 | DC4 | DC5 | DC6 | DC7 | FAF
1    | Array_Java   | docx      | 4  | 0.07           | 0   | 14  | 1   | 20  | 0   | 0   | 15  | 49+1
2    | Tree_ds      | pdf       | 2  | 0.11           | 0   | 0   | 1   | 0   | 0   | 1   | 0   | 2
3    | Img_001      | jpeg      | 3  | 0.31           | 0   | 8   | 0   | 10  | 6   | 0   | 0   | 24
4    | CS_C         | mp3       | 2  | 0.55           | 3   | 0   | 0   | 2   | 0   | 0   | 0   | 5
5    | HelloEnglish | mp4       | 3  | 1              | 0   | 13  | 0   | 0   | 0   | 1   | 19  | 32+1
OS (Occupied Space in GB)                             | 0.5 | 1.4 | 0.4 | 0.9 | 0.3 | 1.1 | 1.0 |
AS (Available Space in GB)                             | 4.5 | 3.6 | 4.6 | 4.1 | 4.7 | 3.9 | 4.0 |
Table 7. Comparison of existing PRC [3], proposed DRRRA [15], and proposed DRCAES (test file of size FS = 0.77 GB; OS = FS × NR, in GB; Cost = OS × 0.00067, in $; RR_TD in s). Existing PRC is the conventional 3-replica strategy (minimum replica 1, maximum replica 3); proposed DRRRA dynamically reduces the replicas of rarely accessed files (2-replica); proposed DRCAES dynamically creates replicas based on user need.

S.No | FAF   | PRC: NR | OS   | Cost      | RR_TD | DRRRA: NR | OS   | Cost      | RR_TD | DRCAES: NR | OS   | Cost      | RR_TD
1    | >10   | 3       | 2.31 | 0.0015477 | 3     | 2         | 1.54 | 0.0010318 | 3     | 2          | 1.54 | 0.0010318 | 3
2    | 15-40 | 3       | 2.31 | 0.0015477 | 3.5   | 3         | 2.31 | 0.0015477 | 3.5   | 3          | 2.31 | 0.0015477 | 3.5
3    | 41-60 | 3       | 2.31 | 0.0015477 | 4.2   | 3         | 2.31 | 0.0015477 | 4.2   | 4          | 3.08 | 0.0020636 | 3.7
4    | 61-80 | 3       | 2.31 | 0.0015477 | 7     | 3         | 2.31 | 0.0015477 | 7     | 5          | 3.85 | 0.0025795 | 3.9
Table 5 shows the replica placement process for file 5. File 5's replicas reside in DC2 and DC7, with RAF values of 13 and 19, respectively. The noteworthy point about this file is that its replicas were previously reduced by the DRRRA approach (the 2-replica strategy described in [15]); the demand for this file has since increased, so it is going to be replicated again. Here, DC1, DC3, DC4, DC5, and DC6 do not hold a replica of file 5, and their AS values are 4.5, 4.6, 4.1, 4.7, and 4.9, respectively. DC6 has the maximum AS, so the new replica is stored in DC6, as shown in Table 6, where the file has been replicated based on server memory after the ranking process.
5. Results and discussion
The changes in the parameters involved in this research due to the DRCAES approach are presented in Table 7. Four parameters and their value calculations are shown in this table: NR (number of replicas), OS (occupied space), Cost, and RR_TD (request-response time delay).
Finally, comparing the proposed algorithms with the existing algorithm, DRCAES gives better performance in all aspects, namely the number of replicas, occupied space (OS), cost, and request-response time delay (RR_TD), as a function of the File Accessing Frequency (FAF) for a file of size 0.77 GB, as shown in Table 7.
A change in the NR value is reflected in the OS, Cost, and RR_TD parameter values. In the existing system, the NR value is decided based on a disk-failure-rate benchmark; the conventional strategy uses NR = 3. As an example, the table represents a file with file size (FS) 0.77 GB uploaded and accessed in different scenarios. In the proposed system, NR is decided based on the DRRRA and DRCAES approaches.
The occupied space (OS) is calculated as follows:
Occupied Space (OS) = File Size (FS) × Number of Replicas (NR)
The cost is calculated as follows:
Cost = Occupied Space (OS) × 0.00067
(0.00067 is the amount incurred for 1 GB per day, adopted from the Google Drive cost plan; it is used here only for testing purposes.)
The RR_TD values are obtained using the MATLAB tool.
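As a quick arithmetic check (our own sketch), the OS and Cost columns of Table 7 follow directly from these two formulas for the 0.77 GB test file:

```python
FS = 0.77                      # file size in GB used in Table 7
RATE = 0.00067                 # $/GB per day, the test value from the paper

def storage_cost(nr, fs=FS, rate=RATE):
    """OS = FS * NR (GB); Cost = OS * rate ($ per day)."""
    os_gb = fs * nr
    return os_gb, os_gb * rate

for nr in (2, 3, 4, 5):
    os_gb, cost = storage_cost(nr)
    print(nr, round(os_gb, 2), round(cost, 7))
# -> 2: 1.54 GB, $0.0010318   3: 2.31 GB, $0.0015477
#    4: 3.08 GB, $0.0020636   5: 3.85 GB, $0.0025795
```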
These values are calculated for different File Accessing Frequency (FAF) ranges, and the table values are also presented in graphical form.
Figure.3 FAF vs. No. of replicas
Figure.4 FAF vs. Occupied space
Figure.5 FAF vs. Cost
Figure.6 FAF vs. Request-response time delay
Fig. 3 shows the comparison of the number of replicas (NR) across different File Accessing Frequency (FAF) ranges. From the graph we can clearly see that NR is fixed in the existing PRC, is either 2 or 3 in the DRRRA approach, and varies with FAF in the DRCAES approach.
Fig. 4 presents the comparison of Occupied Space (OS) across the different FAF ranges; the graph makes clear how OS reflects the changes in NR. The OS is likewise fixed in the existing PRC because of the fixed NR, minimized for rarely accessed files in the DRRRA approach, and optimized based on FAF in the DRCAES approach.
Fig. 5 presents the comparison of cost in dollars ($) across the different FAF ranges; the graph shows how the cost changes with the occupied space (OS). The cost is also constant in the existing PRC because of the fixed OS, minimized for rarely accessed files in the DRRRA approach, and optimized based on FAF in the DRCAES approach.
Fig. 6 shows the Request-Response Time Delay (RR_TD), in seconds, across the different FAF ranges. The plotted values were obtained with the MATLAB tool, and RR_TD is determined by the value of NR. RR_TD is worst in the existing PRC and DRRRA approaches for high FAF ranges because of the fixed NR, but it is reduced in the DRCAES approach because NR is optimized based on FAF.
Similarly, Table 8 presents the comparison of the parameters number of replicas (NR), occupied space (OS), cost, and request-response time delay (RR_TD), along with the reliability and availability concerns, for the proposed DRCAES against DRRRA and the existing PRC algorithm. The optimized cost is obtained without affecting the previously assured reliability percentage. From the table, we can see that DRCAES provides efficient data storage in a cloud computing environment while addressing reliability and availability concerns in a cost-effective manner.
The benefits of the DRCAES approach stated in Section 1 are listed below; all of them have been demonstrated:
- Rarely accessed files and the most frequently accessed files are predicted dynamically using the FAFR model.
- Through DRRRA, the number of replicas of rarely accessed files is dynamically reduced, so that cost and occupied space are minimized. For replica reduction, the replica in the DC with the minimum available space among the DCs holding that file is removed.
- In DRCAES, a new replica is dynamically created and placed for the most accessed file if each of its replicas is accessed with equal frequency; otherwise the file is not replicated. The new replica is placed in the data center that has the most available space and does not already hold that file.
- Balanced storage is retained during removal and new replica placement, because the approach analyses all aspects of the storage system, such as available space, SLA, and the accessing frequency of all existing replicas.
Table 8. Comparison of parameters for existing and proposed algorithms

Algorithm        | Number of Replicas                               | Occupied Space                       | Cost                                 | Reliability                                                              | Availability   | Request-Response Time Delay
Existing PRC [3] | 1, 2, or 3, decided based on disk failure rate   | Minimized based on replicas          | Minimized based on replicas          | 1-replica: no reliability; 2-replica: 95% assured; 3-replica: 99%        | Not considered | Increased for more requests
Proposed DRRRA   | 2 or 3, decided based on FAF                     | Minimized for rarely accessed files  | Minimized for rarely accessed files  | 2-replica: 95% assured [3]                                               | Not considered | Increased for more requests
Proposed DRCAES  | 2-replica minimum; maximum decided based on SLA  | Optimized                            | Optimized                            | 2-replica: 95% assured [3]                                               | Enhanced       | Decreased
6. Conclusion
To minimize the request-response time and delay of data placement for time-varying workload applications, users must optimally exploit the differences between the storage and network services offered by multiple cloud service providers. The previous work of this research dynamically predicts rarely accessed files with the help of the FAFR algorithm and, using the DRRRA algorithm, reduces the number of replicas of such a file if it satisfies the time limit. Similarly, the proposed DRCAES algorithm dynamically predicts the most frequently accessed files with the help of the FAFR algorithm; it then creates a new replica for such a file and finds the data center that has the most available space and does not already hold the file. In this way, the work optimizes occupied space, cost, and server performance, increases the server's service delivery speed, and decreases the request-response time delay. Ultimately, the proposed DRCAES provides efficient data storage at an optimized cost, without affecting the reliability and availability concerns of the cloud, by optimizing the number of replicas based on user need and the SLA. The proposed algorithm achieves better results than the existing algorithms. In future work, replica management during disasters could be considered, maintaining reliability and availability with a minimum number of replicas.
References
[1] R. Han, M.M. Ghanem, L. Guo, Y. Guo, and M. Osmond, "Enabling cost-aware and adaptive elasticity of multi-tier cloud applications", Future Generation Computer Systems, Vol. 32, No. 1, pp. 82-98, 2014.
[2] M. Du and F. Li, "ATOM: Efficient Tracking, Monitoring, and Orchestration of Cloud Resources", IEEE Transactions on Parallel & Distributed Systems, Vol. 28, No. 8, pp. 2172-2189, 2017.
[3] W. Li, Y. Yang, and D. Yuan, "Ensuring Cloud Data Reliability with Minimum Replication by Proactive Replica Checking", IEEE Transactions on Computers, Vol. 65, No. 5, pp. 1494-1506, 2016.
[4] S. Souravlas and A. Sifaleras, "Binary-Tree Based Estimation of File Requests for Efficient Data Replication", IEEE Transactions on Parallel & Distributed Systems, Vol. 28, No. 7, pp. 1839-1852, 2017.
[5] R. Han, S. Huang, Z. Wang, and J. Zhan, "CLAP: Component-Level Approximate Processing for Low Tail Latency and High Result Accuracy in Cloud Online Services", IEEE Transactions on Parallel & Distributed Systems, Vol. 28, No. 8, pp. 2190-2203, 2017.
[6] U. Tos, R. Mokadem, A. Hameurlain, T. Ayav, and S. Bora, "A Performance and Profit Oriented Data Replication Strategy for Cloud Systems", In: Proc. of IEEE Conferences on Ubiquitous Intelligence & Computing, Toulouse, France, pp. 780-787, 2016.
[7] D.T. Nukarapu, B. Tang, L. Wang, and S. Lu, "Data Replication in Data Intensive Scientific Applications with Performance Guarantee", IEEE Transactions on Parallel and Distributed Systems, Vol. 22, No. 8, pp. 1299-1306, 2011.
[8] A. Kumar, R. Tandon, and T.C. Clancy, "On the Latency and Energy Efficiency of Distributed Storage Systems", IEEE Transactions on Cloud Computing, Vol. 5, No. 2, pp. 221-233, 2017.
[9] M. Hadji, "Scalable and Cost-Efficient Algorithms for Reliable and Distributed Cloud Storage", Springer International Publishing Switzerland, Vol. 581, No. 1, pp. 3-12, 2016.
[10] S.Q. Long, Y.L. Zhao, and W. Chen, "MORM: A Multi-objective Optimized Replication Management strategy for cloud storage cluster", Journal of Systems Architecture, Vol. 60, No. 1, pp. 234-244, 2014.
[11] S. Rampersaud and D. Grosu, "Sharing-Aware Online Virtual Machine Packing in Heterogeneous Resource Clouds", IEEE Transactions on Parallel & Distributed Systems, Vol. 28, No. 7, pp. 2046-2059, 2017.
[12] Y. Lin and H. Shen, "EAFR: An Energy-Efficient Adaptive File Replication System in Data-Intensive Clusters", IEEE Transactions on Parallel and Distributed Systems, Vol. 28, pp. 1017-1030, 2017.
[13] L. Shi, Z. Zhang, and T. Robertazzi, "Energy-Aware Scheduling of Embarrassingly Parallel Jobs and Resource Allocation in Cloud", IEEE Transactions on Parallel & Distributed Systems, Vol. 28, No. 8, pp. 1607-1620, 2017.
[14] S.A.E. Selvi and R. Anbuselvi, "Ranking Algorithm Based on File's Accessing Frequency for Cloud Storage System", International Journal of Advanced Research Trends in Engineering and Technology, Vol. 4, No. 9, pp. 29-33, 2017.
[15] S.A.E. Selvi and R. Anbuselvi, "Optimizing the Storage Space and Cost with Reliability Assurance by Replica Reduction on Cloud Storage System", International Journal of Advanced Research in Computer Science, Vol. 8, No. 8, pp. 327-332, 2017.
... Software frameworks for cloud computing control cloud resources and deliver scalable, fault-tolerant computing services with globally consistent and hardware-independent user interfaces. The cloud service provider assumes responsibility for handling infrastructure-related problems [18]. ...
Article
Full-text available
In today's digitalized and globalized scenario, everyone has moved to cloud computing for storing their information on cloud storage to access their data from anywhere at any time. The most significant feature of cloud storage is its high availability and reliability then it has the capability of reducing management factors as well as incurred lower storage cost compared with some other storing methods, it is most suitable for a high volume of data storage. In order to meet the requirements of high availability and reliability, the system adopts a replication system concept. In replicating systems, the objects are replicated many times, with each copy residing in a different geographical location. Though it is beneficial to the users, it leads to some issues like security, integrity, consistency and hidden storage and maintenance cost, etc. Therefore, it is exposed to a few threats to the Cloud Storage System (CSS) user and the provider as well. So, this research seeks to explore the mechanisms to rectify the above-mentioned issues. Thus, the predecessor of the research work has proposed an algorithm named as 2-Replica Placing (2RP) algorithm which is used to reduce the storage cost, maintenance cost; and maintenance overheads as well as increase the available storage spaces for the providers by placing the data files on two locations based on Geo-Distance. But it fails to address the recovery mechanism when a natural disaster happens because providing reliability with less than 2 replicas is a challenging task for the providers. Thus, the research proposed Geo-distance based 2-Replica Maintaining (2RM) algorithm which is used to consider that issue for ensuring reliability forever even during natural disasters
... On the behavior of water resources in datasets there is a wealth of information. [26] Emphasizes two dimensions of water qualitydissolved oxygen concentration in water, and water Can be fished -but for other activities Also reports the results. Oxygen concentration, because this is a common universal of water quality in research is measurement. ...
Chapter
Full-text available
Water quality for a specific purpose, usually of drinking or swimming Depending on the suitability, its chemical, of water including physical and biological properties Describes the condition. The quality of water is its of water based on quality of use Refers to chemical, physical and biological properties. Usually through water purification of standards against which attainable conformity can be assessed it is often used to refer to the set. Some of the contaminants in our water are alkaline Intestinal disease, reproductive problems and Health including neurological disorders can lead to problems. Infants, young children, pregnant women, the elderly and the weak especially those with compromised immune systems May be in danger of getting sick. Values closer to 150 mg/L are generally better from an aesthetic point of view. Water below 150 mg/L for soft water and above 200 mg/L Values in are also considered hard water. Sources: In soil and rock material from primarily dissolved carbonate Minerals. Disturbances such as fire, wind blowing, or debris flows can affect stream temperature, turbidity, and other water quality parameters. Geography, geography and climate affect Water quality. Air and water quality and Hawaii for the overall natural environment Number one in the nation. This Massachusetts is second in the division is in place, followed by North There are Dakota, Virginia and Florida. Better for air and water quality Learn more about the states.
... The DEMATEL method is a powerful method gathering team knowledge to build a structured model and visualizing the causal relationship of subsystems. But crisp values The ambiguity of the real world Is adequate reflection [29]. DEMATEL explores the interdependence between equity The amount of investment factors and factors and ANP to assess their dependencies Integrates. ...
Chapter
Full-text available
Forensic medical examination is Searching for injuries and police to be used as evidence at trial Taking samples and A lawsuit following Sexual Assault Forensic Medicine The purpose of the test is to thevictim's Health needs assessment Treatment of any injuries When consolidating and prosecuting Gather resources for potential use trial (US Department of Justice, 2004, pp. 30-2). Because the body is the scene of the crime, testimony is time sensitive and may only last until the victim has showered, washed, and/or urinated. (1) Examination by a health care provider is recommended even if no injuries are seen. As a result of the attack, (2) the victim did not want to collect evidence, or (3) the attack was not recent. In these cases, the victim may have injuries that are not obvious or serious or have associated health concernsForensics in the criminal justice system Science is an important part. Forensic scientists at crime scenes and sources from elsewhere study commonly murder in major criminal cases including impeachment Guilt or innocence Forensic Science as a Factor in Determining was used. Forensic Medicine or Forensics Pathology is a branch of forensic science that helps in finding the results of crime evidence. Visualize by applying clinical facts to the situation. Course subjects help individuals become proficient in identifying the cause and time of death and other details about the deceasedDEMATEL (Decision Making Trial and Evaluation Laboratory) They are divided into analysis using the Forensic Medical Examination Assessment of the victim, Informed consent, Medical, and gynecological history, Assault history, and Physical top-to-toe examination It is the interaction between the factors Visualized and assesses dependent relationships Through the structural model Also deals with identifying important.Assessment of the victim, Informed consent, Medical, and gynecological history, Assault history, Physical top-to-toe examinationthe assessment of the victim got the first rank whereas the Informed consent number is having the lowest rank.Forensic Medical Examination. Assessment of the victim is got the first rank whereas the Armed consent number is having the Lowest Rank
Article
Full-text available
Total loss of quality in products or processes Reduction is the objective of robust design. Strong Design is an effective approach that aims to simultaneously decrease product costs and enhance quality while also significantly reducing development time. Strength is defined as a skill Raw material, operating conditions, process equipment, environmental conditions, and human Expected variation in factors Tolerant manufacturing process. Robust design is the design of products, devices, and manufacturing equipment so that their performance and functionality are insensitive to multiple variations, such as manufacturing and assembly tolerances, ambient use conditions, or degradation over time. Therefore, there is not strong design sensitivity-meaning that variation in the product will have minimal influence. In essence, robust design means minimizing the impact of variation on a product. One or more due to unforeseen circumstances Input variables or assumptions are rigorous although modified, their output and predictions are accurate A model is considered robust if. The alternatives being considered are related to specific aircraft features: the aerodynamic characteristic (C1), maximum takeoff weight (C2), armament (C3), and avionics (C4). The evaluation options are Ao, F-16, Su-35, and Mig-35. Based on the evaluation results, Ao obtained the top rank, while Mig-35 received the lowest rank.The value of the dataset for Robust Design of Aero Engine in Weighted product method shows that it results in Ao and top ranking
Chapter
Full-text available
The detection and interpretation of the presence of drugs and other potentially hazardous substances in bodily fluids and tissues fall under the multidisciplinary subject of forensic toxicology. 2020, Toxicology Fifth Edition, posted. Applied clinical science is the focus of forensic medicine, often known as forensic pathology. facts to the situation to determine the results of evidence at a crime scene. Course subjects help individuals become proficient in identifying the cause and time of death and other details about the deceased. Forensic medicine is one of the most popular professions, especially in India. There are endless opportunities in this industry due to the unlimited number of crimes taking place all over the world. The average salary for freshers is between 3 lakhs to 6 lakh per annum. Professionals with 5 years or more experience can expect a salary of 6-12 lakhs per annum. Due to their significant prison impact, particularly in civil and criminal proceedings, forensic toxicology and forensic medicine are distinct among all other medical specialties. Metabolomics, the most recent of the "omics sciences," has been proven to be one of the most potent methods for tracking changes in forensic domains thanks to new excessive-throughput technology adapted from chemistry and physics. Forensic medication and toxicology are taught within the final two years of scientific college students' research, in addition to taught to third-yr college students in the School of Pharmacy. Finally, the branch integrates all studies and submit-graduate publications inside the diverse fields of forensic remedy and toxicology.
Chapter
Full-text available
s: The primary source of water in locations without access to surface water is groundwater. In the past, groundwater served as the primary resource for all needs in the Morapur area of Tamil Nadu Dharmapuri district. According to reports, the area has a serious fluorosis problem since the groundwater contains too much fluoride. The area is made up of charnockite, epidotic hornblende gneiss, and ultramafic rocks that date back to the Achaean period. Numerous tectonic upheavals in the region led to the development of quartz/feldspathic veins and heavily mineralized vertical joints. There are two hydrological systems in the area, namely the worn and cracked aquifer water table. 149 groundwater samples were taken before and after the monsoon to better understand the variables influencing high fluoride concentration in groundwater. According to analysis's findings, 35% of groundwater samples contained fluoride concentrations greater than 1.5 bps (permissible limit). The findings show that deep aquifers and high fluoride levels have an impact on both aquifers. Groundwater in the area interacts with biotic and hornblende minerals to release calcium, magnesium, and fluoride. And according to acid-alkaline indices, sodium ions replace calcium ions due to reverse ion exchange, resulting in a high concentration of sodium with a high concentration of fluoride. The federal government has taken measures to make fluoride-free food and water from distant water sources available. To increase groundwater quality, specific water management techniques are crucial. Tamil Nadu, Dharmapuri District, Domestic and Water quality for irrigation purposes to assess water quality survey has been carried out. PH, TDS, TH, Calcium, Magnesium, Chloride, Sulphate. This paper notes that increasing levels of water pollution, the resulting billion-dollar utility and with control schemes, it provides a way to measure and evaluate the quality of given water body Development of water quality codes is necessary. The data output of current water monitoring stations is huge and Dimensional reporting units are different and not integrated in a straightforward algebraic way, even by scientifically trained users have few means of integrating the data to provide; water quality. That quality is locally better than hook and line to be broadly defined, Because of the importance of downstream streams less emphasized in that context. The stream is never fishable However, it is an integral part of the watershed; Protection is essential if downstream streams are to remain fishable and swimmable. The Clean Water Act's biological integrity mandate without considering local streams separately, It depends on the overview of the entire hydrological system at the water table level. Agricultural waste, applied fertilizers, soil leach ate, urban waste, Cattle excreta and sewage Sources of poor water quality. Some models have hardness and due to magnesium concentration are highly saline not suitable for irrigation purposes. In general, ground water farming activities of Dharmapuri district, anthropogenic activities, ion exchange and contaminated by weather.
Chapter
Full-text available
Water quality for a specific purpose, usually of drinking or swimming Depending on the suitability, its chemical, of water including physical and biological properties describes the condition. The quality of water is its of water based on quality of use Refers to chemical, physical and biological properties. Research significance: Usually through water purification of standards against which attainable conformity can be assessed it is often used to refer to the set. Some of the contaminants in our water are alkaline Intestinal disease, reproductive problems and Health including neurological disorders can lead to problems. Alternative: sulphate, chloride, magnesium, calcium, PH. sulphate, chloride, magnesium, calcium, PH Result: The result it is seen that PH is got the first rank where as is the calcium having the lowest rank. Conclusion: The value of the dataset for Water Quality in Test and evaluate decision making the lab shows that it results in PH and top ranking.
Chapter
Full-text available
Solid State Drives (SSD) are single drive. The performance is greater than that produced by disks allow Their low latency and parallelism Possibilities for operations, operating system I/O Read data at speeds that break interfaces They also allow writing. In addition, their performance Characteristics in existing grading systems Reveal gaps. An SSD is more Efficiency, high activity rates and more to operate with minimum latency under demanding conditions We prove that it can. High efficiency SSDs with Future High-Performance Computing (HPC) Dramatic parallel I/O performance for systems This suggests that the method can be improved. our Instead of traditional hard drives in public laboratories Using solid-state drives, our Public perception of services How technology has had an immediate effect This article also explains. Solid State Drives (SSDs) are more popular now Dense and compact are NAND flash Due to the growth in the cost of memories widely is used.
Article
Full-text available
Content Distribution Networks have attracted a great deal of attention in recent years in cloud computing. Replica placement problems (RPPs), as one of the key technologies in Content Distribution Networks, have been widely studied. Internet services are hosted by multiple geographically distributed datacenters. As the use of cloud storage expands, improving resource management and reducing storage space and cost are complicated issues in data replication. To reduce data replication, this paper proposes a Reliability Assurance (RA) algorithm. RA reduces file replication on the server and improves server reliability. The RA algorithm identifies replication among rarely accessed files; these replicated files occupy excessive space on the cloud server. A low-ranking prediction algorithm identifies them based on file access patterns. Reducing data replication on the cloud server makes it possible to optimize storage space and cost. The RA and ranking algorithms are used in combination to reduce data replication and storage space.
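The abstract leaves the RA algorithm's mechanics unspecified; as a rough illustration of the idea, the sketch below ranks files by access count and prunes replicas of the lowest-ranked files down to a fixed floor. Every name and threshold here (prune_replicas, the 20% cutoff, the two-replica floor) is an assumption, not the paper's method.

```python
# Hypothetical sketch of an RA-style replica reduction pass.
# Assumes each file record carries an access count and a list of
# servers currently holding a replica; none of this is from the paper.

def prune_replicas(files, rank_cutoff=0.2, min_replicas=2):
    """Keep only `min_replicas` copies of the lowest-ranked files.

    files: dict mapping file_id -> {"accesses": int, "servers": [str]}
    rank_cutoff: fraction of files (by lowest access count) treated
                 as rarely accessed.
    Returns a dict of file_id -> servers whose replica can be deleted.
    """
    ranked = sorted(files, key=lambda f: files[f]["accesses"])
    n_rare = int(len(ranked) * rank_cutoff)
    deletions = {}
    for file_id in ranked[:n_rare]:
        servers = files[file_id]["servers"]
        if len(servers) > min_replicas:
            # Drop replicas beyond the floor; which servers to keep
            # would be a placement decision in a real system.
            deletions[file_id] = servers[min_replicas:]
    return deletions

if __name__ == "__main__":
    catalog = {
        "f1": {"accesses": 3,   "servers": ["s1", "s2", "s3", "s4"]},
        "f2": {"accesses": 950, "servers": ["s1", "s3", "s5"]},
        "f3": {"accesses": 12,  "servers": ["s2", "s4", "s5"]},
    }
    print(prune_replicas(catalog, rank_cutoff=0.5))  # {'f1': ['s3', 's4']}
```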
Article
Full-text available
The number of cloud storage users has grown substantially in recent times. The reason is that cloud storage minimizes the maintenance burden and costs less than other storage methods. It provides high availability and reliability and is well suited to high volumes of data. To provide high availability and reliability, such systems introduce redundancy. In a replicated system, cloud storage services are hosted by multiple geographically distributed data centres. However, file replication poses some risk to users of the cloud storage system, and offering efficient data storage remains a big challenge for providers. As the use of cloud storage expands, improving resource management so as to respond to user requests in the shortest time, under geographical constraints, is of prime importance to both cloud service providers and users. Data replication improves data availability and reduces the overall access time of files, but at the same time it occupies more storage space and increases storage cost. To rectify these problems, the popularity of each file must be identified. This paper therefore proposes a new ranking algorithm that lists the most frequently and least frequently accessed files. In future work, replicas of the least accessed files will be reduced and replicas of the most accessed files increased, based on their SLA.
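One way such a popularity ranking could be computed from an access log is sketched below; the recency weighting is an assumption added for illustration, since the abstract only states that files are ranked by access frequency.

```python
# Illustrative popularity ranking from a file-access log.
# The exponential recency weighting is an assumption added here;
# the paper only states that files are ranked by access frequency.
import math
import time
from collections import defaultdict

def rank_files(access_log, half_life_s=86_400.0, now=None):
    """access_log: iterable of (file_id, unix_timestamp) events.
    Recent accesses count more, so ranks track current popularity.
    Returns file ids sorted from most to least popular."""
    now = time.time() if now is None else now
    decay = math.log(2) / half_life_s
    score = defaultdict(float)
    for file_id, ts in access_log:
        score[file_id] += math.exp(-decay * (now - ts))
    return sorted(score, key=score.get, reverse=True)

if __name__ == "__main__":
    t = 1_700_000_000
    log = [("a", t - 10), ("a", t - 20),
           ("b", t - 90_000), ("b", t - 95_000), ("b", t - 99_000)]
    print(rank_files(log, now=t))  # "b" has more hits, but older ones
```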
Article
Full-text available
Recently, data replication has received considerable attention in the field of grid computing. The main goal of data replication algorithms is to optimize data access performance by replicating the most popular files. When a file does not exist in the node where it was requested, it has to be transferred from another node, causing delays in the completion of file requests. The general idea behind data replication is to keep track of the most popular files requested in the grid and create copies of them in selected nodes. In this way, more file requests can be completed over a period of time and average job execution time is reduced. In this paper, we introduce an algorithm that estimates the potential of the files located in each node of the grid, using a binary tree structure. The file scope and the file type are also taken into account. By the potential of a file, we mean its increasing or decreasing demand over a period of time. The file scope generally refers to the extent of the group of users who are interested, or potentially interested, in a file. The file types are divided into read-intensive and write-intensive. Our scheme mainly promotes high-potential files for replication, based on the temporal locality principle. The simulation results indicate that the proposed scheme offers better data access performance in terms of hit ratio and average job execution time, compared to other state-of-the-art strategies.
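The paper's binary tree structure is not described in the abstract, but the notion of potential (rising or falling demand over a period of time) can be illustrated with a simple two-window trend estimate; file_potential and replication_candidates are hypothetical names for this sketch.

```python
# Sketch of the "file potential" notion: demand trend across two
# consecutive observation windows. The binary-tree bookkeeping from
# the paper is not reproduced here; this is a simplified stand-in.

def file_potential(requests, window_s, now):
    """requests: list of unix timestamps of requests for one file.
    Returns >0 if demand is rising, <0 if falling (recent window
    minus previous window)."""
    recent = sum(1 for t in requests if now - window_s <= t < now)
    earlier = sum(1 for t in requests
                  if now - 2 * window_s <= t < now - window_s)
    return recent - earlier

def replication_candidates(catalog, window_s, now, top_k=3):
    """Promote the highest-potential files for replication,
    echoing the temporal-locality principle in the abstract."""
    scored = sorted(catalog.items(),
                    key=lambda kv: file_potential(kv[1], window_s, now),
                    reverse=True)
    return [file_id for file_id, _ in scored[:top_k]]

if __name__ == "__main__":
    now = 1_700_000_000
    catalog = {
        "rising":  [now - 30, now - 40, now - 50, now - 4_000],
        "falling": [now - 4_000, now - 4_100, now - 4_200, now - 60],
    }
    print(replication_candidates(catalog, window_s=3_600, now=now, top_k=1))
```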
Article
Full-text available
One of the key problems that cloud providers need to efficiently solve when offering on-demand virtual machine (VM) instances to a large number of users is the VM Packing problem, a variant of Bin Packing. The VM Packing problem requires determining the assignment of user requested VM instances to physical servers such that the number of physical servers is minimized. In this paper, we consider a more general variant of the VM Packing problem, called the Sharing-Aware VM Packing problem, that has the same objective as the standard VM Packing problem, but allows the VM instances collocated on the same physical server to share memory pages, thus reducing the amount of cloud resources required to satisfy the users’ demand. Our main contributions consist of designing several online algorithms for solving the Sharing-Aware VM Packing problem, and performing an extensive set of experiments to compare their performance against that of several existing sharing-oblivious online algorithms. For small problem instances, we also compare the performance of the proposed online algorithms against the optimal solution obtained by solving the offline variant of the Sharing-Aware VM Packing problem (i.e., the version of the problem that assumes that the set of VM requests are known a priori). The experimental results show that our proposed sharing-aware online algorithms activate a smaller average number of physical servers relative to the sharing-oblivious algorithms, directly reduce the amount of required memory, and thus, require fewer physical servers to instantiate the VM instances requested by users.
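As a rough illustration of the problem setting (not the authors' online algorithms), a sharing-aware first-fit heuristic would charge each VM only for the pages not already resident on a candidate server:

```python
# Sharing-aware first-fit sketch: a VM's effective memory demand on a
# server is only its pages not already resident there. This is an
# illustration of the problem setting, not the paper's algorithms.

def sharing_aware_first_fit(vm_pages, capacity):
    """vm_pages: list of sets of page ids, one set per VM request.
    capacity: per-server page capacity. Returns a list of servers,
    each a dict with its resident pages and assigned VMs."""
    servers = []
    for vm_id, pages in enumerate(vm_pages):
        placed = False
        for srv in servers:
            new_pages = pages - srv["pages"]          # pages it must add
            if len(srv["pages"]) + len(new_pages) <= capacity:
                srv["pages"] |= new_pages
                srv["vms"].append(vm_id)
                placed = True
                break
        if not placed:
            servers.append({"pages": set(pages), "vms": [vm_id]})
    return servers

if __name__ == "__main__":
    vms = [{1, 2, 3}, {2, 3, 4}, {7, 8, 9, 10}]
    packed = sharing_aware_first_fit(vms, capacity=5)
    print(len(packed), "servers")  # 2: the first two VMs share pages 2, 3
```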
Article
The emergence of the Infrastructure as a Service framework brings new opportunities, accompanied by new challenges in auto scaling, resource allocation, and security. A fundamental challenge underpinning these problems is the continuous tracking and monitoring of resource usage in the system. In this paper, we present ATOM, an efficient and effective framework to automatically track, monitor, and orchestrate resource usage in an Infrastructure as a Service (IaaS) system that is widely used in cloud infrastructure. We use a novel tracking method to continuously track important system usage metrics with low overhead, and develop a Principal Component Analysis (PCA) based approach to continuously monitor and automatically find anomalies based on the approximated tracking results. We show how to dynamically set the tracking threshold based on the detection results and, further, how to adjust the tracking algorithm to ensure its optimality under dynamic workloads. We demonstrate the extensibility of ATOM through virtual machine (VM) clustering. Lastly, when potential anomalies are identified, we use introspection tools to perform memory forensics on VMs, guided by the analyzed results from tracking and monitoring, to identify malicious behavior inside a VM. We evaluate the performance of our framework in an open source IaaS system.
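PCA-based anomaly detection over tracked usage metrics can be sketched generically as a reconstruction-error detector; the code below is a schematic stand-in, not ATOM's implementation.

```python
# Generic PCA reconstruction-error anomaly detector over resource
# usage vectors (CPU, memory, I/O, ...). ATOM's actual tracking and
# thresholding logic is more involved; this is a schematic stand-in.
import numpy as np

def fit_pca(X, k):
    """X: (n_samples, n_metrics) of normal-period usage. Returns the
    mean and the top-k principal directions."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k]                      # components: (k, n_metrics)

def anomaly_score(x, mu, components):
    """Residual norm after projecting onto the principal subspace;
    large residuals indicate usage patterns unseen in training."""
    centered = x - mu
    projected = components.T @ (components @ centered)
    return float(np.linalg.norm(centered - projected))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normal = rng.normal(size=(500, 4)) @ np.diag([3.0, 1.0, 0.1, 0.1])
    mu, comps = fit_pca(normal, k=2)
    print(anomaly_score(normal[0], mu, comps))                    # small
    print(anomaly_score(np.array([0, 0, 9.0, 9.0]), mu, comps))   # large
```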
Article
In data-intensive clusters, a large number of files are stored, processed, and transferred simultaneously. To increase data availability, some file systems create and store three replicas of each file in randomly selected servers across different racks. However, they neglect file heterogeneity and server heterogeneity, which can be leveraged to further enhance data availability and file system efficiency. Because files have heterogeneous popularities, a rigid number of three replicas may not provide immediate response to an excessive number of read requests for hot files, while wasting resources (including energy) on replicas of cold files that receive few read requests. Also, since servers are heterogeneous in network bandwidth, hardware configuration, and capacity (i.e., the maximal number of service requests that can be supported simultaneously), it is crucial to select replica servers that ensure low replication delay and request-response delay. In this paper, we propose an Energy-Efficient Adaptive File Replication System (EAFR) that incorporates three components. It is adaptive to time-varying file popularities to achieve a good tradeoff between data availability and efficiency: higher popularity of a file leads to more replicas, and vice versa. Also, to achieve energy efficiency, servers are classified into hot servers and cold servers with different energy consumption, and cold files are stored in cold servers. EAFR then selects a server with sufficient capacity (including network bandwidth) to hold a replica. To further improve the performance of EAFR, we propose a dynamic transmission-rate adjustment strategy to prevent potential incast congestion when replicating a file to a server, a network-aware data node selection strategy to reduce file read latency, and a load-aware replica maintenance strategy to quickly create file replicas after replica node failures. Experimental results on a real-world cluster show the effectiveness of EAFR and the proposed strategies in reducing file read latency, replication time, and power consumption in large clusters.
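The core adaptive rule (more replicas for hotter files, fewer for colder ones, with cold files placed on cold servers) might look roughly like the following; the read-rate thresholds and replica bounds are invented for illustration.

```python
# Sketch of EAFR's adaptive replication idea: replica count tracks
# popularity, and cold files go to low-power "cold" servers. The
# thresholds below are invented for illustration.

def target_replicas(reads_per_hour, lo=10, hi=200, r_min=1, r_max=5):
    """Scale the replica count between r_min and r_max with read rate."""
    if reads_per_hour <= lo:
        return r_min
    if reads_per_hour >= hi:
        return r_max
    span = (reads_per_hour - lo) / (hi - lo)
    return r_min + round(span * (r_max - r_min))

def tier_for(reads_per_hour, lo=10):
    """Cold files live on energy-saving cold servers."""
    return "cold" if reads_per_hour <= lo else "hot"

if __name__ == "__main__":
    for rate in (2, 50, 500):
        print(rate, target_replicas(rate), tier_for(rate))
```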
Chapter
This paper focuses on jointly minimizing data storage and networking costs in a distributed cloud storage environment. We present two new efficient algorithms that place encrypted data chunks and enhance data availability while guaranteeing a minimum cost of storage and communication at the same time. The proposed underlying solutions, based on a linear programming approach, lead to an exact formulation with convergence times feasible for small and medium network sizes. A new polynomial-time algorithm is presented and shown to scale to much larger network sizes. Performance assessment results, using simulations, show the scalability and cost-efficiency of the proposed distributed cloud storage solutions.
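A toy LP relaxation of the chunk-placement problem (minimize placement cost, place every chunk fully, respect node capacity) can be written with scipy; the cost model and sizes below are illustrative, and the paper's exact formulation and polynomial-time algorithm are not reproduced.

```python
# Toy LP relaxation of the chunk-placement problem described in the
# abstract: minimize storage + network cost, place every chunk, and
# respect node capacity. Costs and sizes are invented for illustration.
import numpy as np
from scipy.optimize import linprog

def place_chunks(cost, sizes, caps):
    """cost: (n_chunks, n_nodes) per-unit placement cost.
    sizes: chunk sizes; caps: node capacities.
    Variables x[c, n] in [0, 1] give the (fractional) placement."""
    n_c, n_n = cost.shape
    c = cost.flatten()                      # objective coefficients
    # Equality: each chunk is fully placed across nodes.
    a_eq = np.zeros((n_c, n_c * n_n))
    for i in range(n_c):
        a_eq[i, i * n_n:(i + 1) * n_n] = 1.0
    b_eq = np.ones(n_c)
    # Inequality: per-node storage capacity.
    a_ub = np.zeros((n_n, n_c * n_n))
    for j in range(n_n):
        for i in range(n_c):
            a_ub[j, i * n_n + j] = sizes[i]
    res = linprog(c, A_ub=a_ub, b_ub=caps, A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, 1))
    return res.x.reshape(n_c, n_n), res.fun

if __name__ == "__main__":
    cost = np.array([[1.0, 2.0], [2.0, 1.0]])
    x, total = place_chunks(cost, sizes=np.array([3.0, 3.0]),
                            caps=np.array([4.0, 4.0]))
    print(np.round(x, 2), total)  # each chunk lands on its cheap node
```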