Table 6. Workload model specification for a social event calendar application.

Source publication
Article
Full-text available
We present techniques for the characterization, modeling and generation of workloads for cloud computing applications. Methods for capturing the workloads of cloud computing applications in two different models, a benchmark-application model and a workload model, are described. We give the design and implementation of a synthetic workload generator that accepts t...

Context in source publication

Context 1
... Model Specification: The workload model specifications are formulated as an XML document that is input to the GT-CSWL code generator. Table 6 shows the specification of the workload model for a social event calendar application. The workload model contains specifications of the distributions for workload model attributes such as think time, inter-session interval and session length. ...
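The paper's exact schema is not reproduced here; the following minimal sketch illustrates what such an XML specification and a parser for it could look like. Element and attribute names are assumptions for illustration, not the actual GT-CSWL schema.

```python
# Minimal sketch of a workload-model XML spec and its parsing.
# Element/attribute names are illustrative; the GT-CSWL schema may differ.
import xml.etree.ElementTree as ET

SPEC = """
<workloadmodel application="social-event-calendar">
  <attribute name="thinktime" distribution="exponential" mean="5.0"/>
  <attribute name="intersession" distribution="exponential" mean="60.0"/>
  <attribute name="sessionlength" distribution="normal" mean="12" stddev="3"/>
</workloadmodel>
"""

def parse_spec(xml_text):
    """Return {attribute name: distribution parameters} from the spec."""
    root = ET.fromstring(xml_text)
    model = {}
    for attr in root.findall("attribute"):
        params = dict(attr.attrib)
        name = params.pop("name")
        model[name] = params
    return model

if __name__ == "__main__":
    print(parse_spec(SPEC))
```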

Citations

... However, it is limited to DSPS applications that are centrally hosted in the cloud. In addition, ref. [29] targets cloud-based applications. Hence, while standard holistic benchmarks exist in other related areas such as big data [5,7,[30][31][32], databases [3,4], RDF stream processing [33,34], the same cannot be said for IoT. ...
Article
Full-text available
With the increasing growth of IoT applications in various sectors (e.g., manufacturing, healthcare, etc.), we are witnessing a rising demand for IoT middleware platforms that host such IoT applications. Hence, there arises a need for new methods to assess the performance of IoT middleware platforms hosting IoT applications. While there are well-established methods for performance analysis and testing of databases, and some for the Big Data domain, such methods still lack support for IoT due to the complexity and heterogeneity of IoT applications and their data. To overcome these limitations, in this paper we present a novel situation-aware IoT data generation framework, namely SA-IoTDG. Given that a majority of IoT applications are event or situation driven, we leverage a situation-based approach in SA-IoTDG for generating situation-specific data relevant to the requirements of the IoT applications. SA-IoTDG includes a situation description system, a SysML model to capture IoT application requirements and a novel Markov chain-based approach that supports transitions of the generated IoT data based on the corresponding situations. The proposed framework will be beneficial for both researchers and IoT application developers to generate IoT data for their applications and enable them to perform initial testing before actual deployment. We demonstrate the proposed framework using a real-world example from IoT traffic monitoring. We conduct experimental evaluations to validate the ability of SA-IoTDG to generate IoT data similar to real-world data, as well as to enable performance evaluations of IoT applications deployed on different IoT middleware platforms using the generated data. Experimental results present some promising outcomes that validate the efficacy of SA-IoTDG. Lessons learnt from the experiments conclude the paper.
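As a rough illustration of the Markov-chain idea, the sketch below generates situation-tagged sensor readings from a two-state transition matrix. The states, transition probabilities and value ranges are invented for the example; SA-IoTDG's actual model is far richer.

```python
# Sketch of situation-driven data generation with a Markov chain.
# States, probabilities and sensor ranges are invented for illustration.
import random

TRANSITIONS = {                      # P(next situation | current situation)
    "free_flow": {"free_flow": 0.8, "congested": 0.2},
    "congested": {"free_flow": 0.3, "congested": 0.7},
}
SPEED_RANGE = {                      # per-situation value ranges (km/h)
    "free_flow": (60, 100),
    "congested": (5, 30),
}

def generate(n, state="free_flow", seed=42):
    rng = random.Random(seed)
    readings = []
    for _ in range(n):
        low, high = SPEED_RANGE[state]
        readings.append({"situation": state,
                         "speed_kmh": round(rng.uniform(low, high), 1)})
        # draw the next situation from the current row of the chain
        state = rng.choices(list(TRANSITIONS[state]),
                            weights=TRANSITIONS[state].values())[0]
    return readings

if __name__ == "__main__":
    for r in generate(5):
        print(r)
```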
... This problem has a direct impact on any predictive autoscaling system not tuned to the expected workload pattern; for example, a system adjusted to receive a periodic workload will underperform if it changes to an unpredictable workload. Creating a sufficiently generic predictive autoscaling system capable of using any workload on any cloud service is a very complex task [9] due to the limited capabilities of this type of system caused by this ambiguity. ...
Article
Full-text available
In recent years cloud computing has established itself as the computing paradigm that supports most distributed systems, which are essential in mobile communications, such as publish-subscribe (pub/sub) systems or complex event processing (CEP). The cornerstone of cloud computing is elasticity, and today’s autoscaling systems leverage that property by making scaling decisions based on estimates of future workload to satisfy service level agreements (SLAs). However, these autoscaling systems are not generic enough, as the workload definition is application-based. On the other hand, the workload prediction needs to be mapped in terms of SLA parameters, which introduces a double prediction problem. This work presents an empirical study on the relationship between different types of workloads in the literature and their relationship in terms of SLA parameters in the context of mobile communications. In addition, more than 30 prediction models have been trained using different techniques (time series analysis, regression, random forests) to test which ones offer better prediction results of the SLA parameters based on the type of workload and the prediction horizon. Finally, a series of conclusions on the predictive models to be used as a first step towards an autonomous decision system are presented.
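As a minimal illustration of one family of predictors evaluated in such studies, the sketch below fits a sliding-window linear trend to recent workload samples and extrapolates one step ahead. The window size, horizon and data are arbitrary assumptions, not one of the paper's trained models.

```python
# Sliding-window linear-trend forecast of a workload signal.
import numpy as np

def forecast(history, window=12, horizon=1):
    """Predict the workload `horizon` steps ahead from the last `window` samples."""
    y = np.asarray(history[-window:], dtype=float)
    x = np.arange(len(y))
    slope, intercept = np.polyfit(x, y, 1)   # least-squares linear fit
    return slope * (len(y) - 1 + horizon) + intercept

if __name__ == "__main__":
    reqs_per_min = [100, 110, 120, 135, 150, 160, 175, 190, 200, 215, 230, 240]
    print(f"next-minute estimate: {forecast(reqs_per_min):.0f} req/min")
```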
... Choice of data-centre will be made according to the value of config_max. Further, the synthetic workload generator [3] is employed for generating the workload. Multi-tier benchmark applications generated using this generator are deployed across several nodes in the cloud. ...
Article
Full-text available
Cloud computing data centres now carry out a considerable share of computing operations and account for enormous energy consumption, which increases with computing capacity. Reducing operating costs and energy consumption is beneficial both economically and environmentally. Previous work on data-centre energy optimization only involved scheduling jobs between servers based on thermal profiles or workload parameters. Dynamic power management, by shutting down idle data-centre components, was also considered in many models to reduce energy consumption. Further work focused on the energy consumed by the communication fabric. The proposed work focuses on minimizing energy consumption at both the computing servers and the communicating devices. A parameter named config is defined to capture the configuration of the system in its current state. This parameter assists the existing Dynamic Voltage and Frequency Scaling (DVFS) scheme in assigning tasks to virtual machines so as to minimize energy consumption at the computing servers. Moreover, the work extends Data-centre Energy-efficient Network-aware Scheduling (DENS) with a peer-to-peer load balancer to reduce the energy consumed by networking components. The proposed scheduling algorithm for the cloud data centre reduces energy consumption at both the server and the communication-fabric level. Based on the number of energy-consumption samples, a 95% confidence level is achieved. The energy consumed by the proposed P2BED-C model is 1610.22 Wh, while the existing FCFS and Round Robin approaches consumed 1684.32 Wh and 1678.35 Wh, respectively. The results show a considerable improvement in server power utilization, resulting in greater power savings.
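The sketch below conveys the flavour of DVFS-aware energy minimisation with a toy greedy placement; the cubic power model and all numbers are illustrative assumptions, not the P2BED-C algorithm itself.

```python
# Toy greedy scheduler: each task goes to the server whose estimated
# increase in power draw is smallest. Power model and numbers are invented.

SERVERS = [
    {"name": "s1", "idle_w": 100.0, "dyn_w": 200.0, "load": 0.2},
    {"name": "s2", "idle_w": 100.0, "dyn_w": 200.0, "load": 0.7},
]

def power(server, load):
    # DVFS-style assumption: dynamic power grows roughly cubically with
    # the frequency needed to serve the load.
    return server["idle_w"] + server["dyn_w"] * load ** 3

def place(task_util):
    best = min(SERVERS,
               key=lambda s: power(s, s["load"] + task_util) - power(s, s["load"]))
    best["load"] += task_util
    return best["name"]

if __name__ == "__main__":
    for util in (0.1, 0.3, 0.2):
        print("task ->", place(util))
```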
... Secondly, we performed tests to determine the average number of requests that each class of our application servers can handle without violating the SLA. We created workloads using the workload model proposed by Bahga et al. (2011). ...
Conference Paper
Web-based business applications commonly experience user request spikes called flash crowds. Flash crowds in web applications might result in resource failure and/or performance degradation. To alleviate these challenges, this class of applications would benefit from a targeted load balancer and deployment architecture in a multi-cloud environment. We propose a decentralised system that effectively distributes the workload of three-tier web-based business applications using geographical dynamic load balancing to minimise performance degradation and improve response time. Our approach improves a dynamic load distribution algorithm that utilises five carefully selected server metrics to determine the capacity of a server before distributing requests. Our first experiments compared our algorithm with multi-cloud benchmarks. Secondly, we experimentally evaluated our solution on a multi-cloud test-bed comprising one private cloud and two public clouds. Our experimental evaluation imitated flash crowds by sending varying requests using a standard exponential benchmark, and simulated resource failure by shutting down virtual machines in some of our chosen data centres. We then carefully measured the response times of these various scenarios. Our experimental results showed that our solution improved application performance by 6.7% during resource-failure periods, and by 4.08% and 20.05% during flash-crowd situations when compared to the Admission Control and Request Queuing benchmarks, respectively.
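A minimal sketch of capacity-based routing of this kind is given below; the five metrics and their weights are assumptions for illustration, not the metrics selected in the paper.

```python
# Score each server from a few normalised load metrics and route the
# request to the highest-scoring (least loaded) one.

WEIGHTS = {"cpu": 0.3, "memory": 0.2, "connections": 0.2,
           "response_time": 0.2, "bandwidth": 0.1}

def capacity_score(metrics):
    """Metrics normalised to [0, 1], where higher means more loaded."""
    return 1.0 - sum(WEIGHTS[k] * metrics[k] for k in WEIGHTS)

def route(servers):
    return max(servers, key=lambda s: capacity_score(s["metrics"]))["name"]

if __name__ == "__main__":
    servers = [
        {"name": "private-dc", "metrics": {"cpu": 0.9, "memory": 0.6,
         "connections": 0.8, "response_time": 0.7, "bandwidth": 0.5}},
        {"name": "public-dc1", "metrics": {"cpu": 0.3, "memory": 0.4,
         "connections": 0.2, "response_time": 0.3, "bandwidth": 0.4}},
    ]
    print("route to:", route(servers))
```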
... These factors make it difficult to design such models and generators fitting different workload types and attributes. In the current state of the art, effort is instead deployed to design specialized workload modeling techniques focusing mainly on specific user profiles and application types [1,[3][4][5][6]. ...
... Since job execution times are not considered, this evaluation incurs a "drift" in the rate of resource usage, since the system keeps processing jobs stacked in its queue while new jobs arrive. Dynamic workload evaluation such as in [1,6] overcomes this issue by adding a probabilistic and/or distribution-based (e.g., normal distribution, exponential distribution) approach to job arrivals and execution times, which provides a more accurate representation of the impact of workload types and attributes on system resources (see the sketch after these excerpts). ...
... This type of optimization, on the other hand, focuses on long-term analysis of predictable workloads and scheduled tasks, rather than on quick bursts in demand for a specific application in a nonpredictable manner. The performance evaluation of Web and cloud applications ( [1,5,6]) is another popular domain worthy of consideration. Among other things, this domain usually involves the evaluation of user behavior, which is less prevalent in other domains. ...
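The sketch referenced above shows how exponentially distributed inter-arrival and execution times yield such a probabilistic workload; the rates are arbitrary illustration values.

```python
# Exponential inter-arrival times (a Poisson process) and exponential
# service times for a stream of synthetic jobs.
import random

def jobs(n, arrival_rate=2.0, service_rate=1.5, seed=7):
    """Yield (arrival_time, execution_time) for n synthetic jobs."""
    rng = random.Random(seed)
    t = 0.0
    for _ in range(n):
        t += rng.expovariate(arrival_rate)      # gap to the next arrival
        yield t, rng.expovariate(service_rate)  # job execution time

if __name__ == "__main__":
    for arrival, exec_time in jobs(5):
        print(f"arrival={arrival:6.2f}s  exec={exec_time:5.2f}s")
```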
Article
Full-text available
Workload models are typically built based on user and application behavior in a system, limiting them to specific domains. Undoubtedly, such a practice creates a dilemma in a cloud computing (cloud) environment, where a wide range of heterogeneous applications are running and many users have access to these resources. The workload model in such an infrastructure must adapt to the evolution of the system configuration parameters, such as job load fluctuation. The aim of this work is to propose an approach that generates generic workload models that (1) are independent of user behavior and of the applications running in the system, and can fit any workload domain and type, (2) model the sharp workload variations that are most likely to appear in cloud environments, and (3) fit observed data with a high degree of fidelity within a short execution time. We propose two approaches for workload estimation: the first combines the Hull-White model with a Genetic Algorithm (GA), while the second combines Support Vector Regression (SVR) with a Kalman filter. Thorough experiments are conducted on real CPU and throughput datasets from virtualized IP Multimedia Subsystem (IMS), Web and cloud environments to study the efficiency of both propositions. The results show a higher accuracy for the Hull-White-GA approach with marginal overhead over the SVR-Kalman-Filter combination.
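As a rough sketch of the second combination, the code below smooths noisy CPU samples with a 1-D Kalman filter and fits an SVR on the smoothed series to predict the next value. The hyperparameters and noise variances are illustrative assumptions, not the paper's tuned values.

```python
# SVR on Kalman-smoothed samples: a toy version of an SVR + Kalman-filter
# workload estimator. All parameters are illustrative assumptions.
import numpy as np
from sklearn.svm import SVR

def kalman_smooth(samples, q=1e-3, r=0.5):
    """Constant-level model: x_k = x_{k-1} + w, z_k = x_k + v."""
    x, p, out = samples[0], 1.0, []
    for z in samples:
        p += q                      # predict
        k = p / (p + r)             # Kalman gain
        x += k * (z - x)            # update with measurement z
        p *= (1 - k)
        out.append(x)
    return np.array(out)

def predict_next(samples, lag=4):
    smoothed = kalman_smooth(samples)
    X = np.array([smoothed[i:i + lag] for i in range(len(smoothed) - lag)])
    y = smoothed[lag:]
    model = SVR(kernel="rbf", C=10.0).fit(X, y)
    return model.predict(smoothed[-lag:].reshape(1, -1))[0]

if __name__ == "__main__":
    cpu = [20, 22, 25, 24, 30, 35, 33, 40, 46, 50, 55, 53, 60, 66]
    print(f"next CPU sample estimate: {predict_next(cpu):.1f}%")
```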
... Time between two sequential requests of an emulated client. A thorough discussion and experiments are presented in [43]. ...
Article
Full-text available
We present robust dynamic resource allocation mechanisms to allocate application resources meeting Service Level Objectives (SLOs) agreed between cloud providers and customers. Two filter-based robust controllers are proposed: an $\mathcal{H}_\infty$ filter and a Maximum Correntropy Criterion Kalman filter (MCC-KF). The controllers are self-adaptive, with process noise variances and covariances calculated using previous measurements within a time window. In the allocation process, a bounded client mean response time (mRT) is maintained. Both controllers are deployed and evaluated on an experimental testbed hosting the RUBiS (Rice University Bidding System) auction benchmark web site. The proposed controllers offer improved performance under abrupt workload changes, shown via rigorous comparison with the current state of the art. On our experimental setup, the Single-Input-Single-Output (SISO) controllers can operate on the same server where the resource allocation is performed, while the Multi-Input-Multi-Output (MIMO) controllers run on a separate server where all the data are collected for decision making. SISO controllers make decisions that do not depend on other system states (servers), whereas MIMO controllers are characterized by increased communication overhead and potential delays. While SISO controllers offer improved performance over MIMO ones, the latter enable a more informed decision-making framework for the resource allocation problem of multi-tier applications.
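The self-adaptive idea, estimating noise statistics from a sliding time window of measurements, can be sketched with a toy 1-D filter as below; this is an assumption-laden illustration, not the paper's $\mathcal{H}_\infty$ or MCC-KF controller.

```python
# Estimate measurement-noise variance from a sliding window of recent
# response-time samples and feed it into a simple 1-D Kalman update.
from collections import deque
import statistics

class AdaptiveFilter:
    def __init__(self, window=10, process_var=1e-3):
        self.window = deque(maxlen=window)
        self.q = process_var
        self.x = None        # state estimate (mean response time)
        self.p = 1.0         # estimate variance

    def update(self, z):
        self.window.append(z)
        r = statistics.pvariance(self.window) if len(self.window) > 1 else 1.0
        if self.x is None:
            self.x = z
            return self.x
        self.p += self.q
        k = self.p / (self.p + r)   # gain shrinks when the window is noisy
        self.x += k * (z - self.x)
        self.p *= (1 - k)
        return self.x

if __name__ == "__main__":
    f = AdaptiveFilter()
    for z in [120, 118, 125, 400, 130, 128, 122]:   # ms, with one outlier
        print(f"measured {z:3d} ms -> estimate {f.update(z):6.1f} ms")
```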
... It supports multiple generation strategies and can be extended via user-defined requests. In [16], the authors propose a framework to analyze and extract the characteristics of multi-tenant cloud platforms; they then build some basic workload elements and finally generate the required workload using a specification language. In [17] the authors develop an extension to the cloud simulation program CloudSim. ...
... Specifically, in Cloud computing, some relevant approaches are based on task resource consumption patterns (Mishra et al. 2010) and the usage of storage systems (Aggarwal, Phadke, and Bhandarkar 2010). The analysis of behaviour patterns and derived models has been discussed in (Bahga and Madisetti 2011; Chen et al. 2010; Smith and Sommerville 2011). Yang et al. (2012) present the principal component analysis (PCA) technique used to retrieve relations between configuration, resource usage and performance in Cloud computing. ...
Article
This paper discusses the design of a Digital Twin (DT) demonstrator for Smart Manufacturing, following an open source approach for implementation. Open source technology can comprise software, hardware and hybrid solutions that nowadays drive Smart Manufacturing. The major potential of open source technology in Smart Manufacturing lies in enabling interoperability and in reducing the capital costs of designing and implementing new manufacturing solutions. After presenting our motivation to adopt an open source approach for the design of a DT demonstrator, we identify the major implementation requirements of Smart Cyber Physical Systems (CPSs) and DTs. A conceptualisation of the core components of a DT demonstrator is provided and three technology building blocks for the realisation of a DT have been identified. These technology building blocks include components for the management of data, models and services. From the conceptual model of the DT demonstrator, we derived a high-level micro-services architecture and provided a case study infrastructure for the implementation of the DT demonstrator based on available open source technologies. The paper closes with research questions to be addressed in the future.
... Google has demonstrated that an energy saving of 40% can be achieved by implementing machine learning to manage its data center [22]. Many researchers investigating data center management utilize synthetic data to represent workloads for their simulated data centers [23] and for predicting cloud computing resources [24]. This allows the data centers to be simulated in a wider range of scenarios than would otherwise be possible. ...
... It is hoped that the strengths of OTF synthetic data generation can benefit researchers in a broad range of domains and disciplines. There are multiple problems that OTF can have a positive impact, including generating: stock market data [14], patient data [26], electrical load data [20], cloud data [23], etc. These are just a small sample of the problems that the OTF framework is applicable to. ...
Preprint
Full-text available
Collecting, analyzing and gaining insight from large volumes of data is now the norm in an ever-increasing number of industries. Data analytics techniques, such as machine learning, are powerful tools used to analyze these large volumes of data. Synthetic data sets are routinely relied upon to train and develop such data analytics methods for several reasons: to generate larger data sets than are available, to generate diverse data sets, to preserve anonymity in data sets with sensitive information, etc. Processing, transmitting and storing data are key issues faced when handling large data sets. This paper presents an "On the fly" framework for generating big synthetic data sets, suitable for these data analytics methods, that is both computationally efficient and applicable to a diverse set of problems. An example application of the proposed framework is presented along with a mathematical analysis of its computational efficiency, demonstrating its effectiveness.
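A minimal sketch of the on-the-fly idea, under the assumption of a trivial record schema, is a generator that produces records only as they are consumed, so an arbitrarily large synthetic data set never has to be stored or transmitted in full.

```python
# Stream synthetic records lazily instead of materializing the data set.
import random

def synthetic_records(seed=0):
    rng = random.Random(seed)        # fixed seed => reproducible stream
    i = 0
    while True:
        yield {"id": i, "value": rng.gauss(0.0, 1.0)}
        i += 1

if __name__ == "__main__":
    stream = synthetic_records()
    for record in (next(stream) for _ in range(3)):
        print(record)
```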