Figure 1 - uploaded by Verena Kantere
Services of Table 1 in the Pin × Pout space

Source publication
Article
Full-text available
As we move from a Web of data to a Web of services, enhancing the capabilities of the current Web search engines with effective and efficient techniques for Web services retrieval and selection becomes an important issue. Traditionally, the relevance of a Web service advertisement to a service request is determined by computing an overall score th...

Contexts in source publication

Context 1
... the example of Table 1. Let $s^{m_i}_{R,S} = (s_i.P_{in}, s_i.P_{out})$ denote the match vector under criterion $f_{m_i}$ for the input and output parameters of service S. Figure 1 draws the degrees of match $s_i$ as an instance in the $P_{in} \times P_{out}$ space for all services and criteria. For example, $a_1$ corresponds to the degrees of match of service A under $f_{m_1}$ and, hence, has coordinates (0.96, 0.92). ...
Context 2
... (Cont'd). Consider object C with instances c1, c2 and c3 shown in Figure 1. Instance c1 is dominated by a1, a2 and a3, whereas it dominates b1, b3, d1, d2 and d3. ...
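The dominance relation used in this excerpt can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual definition in which larger degrees of match are better and an instance dominates another if it is at least as good in every dimension and strictly better in at least one. The coordinates of a1 come from Context 1; the coordinates of c1 are hypothetical, since the excerpts do not give them.

```python
from typing import Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if u is >= v in every dimension and > v in at least one.

    Assumes larger degrees of match are better.
    """
    at_least_as_good = all(x >= y for x, y in zip(u, v))
    strictly_better = any(x > y for x, y in zip(u, v))
    return at_least_as_good and strictly_better


a1 = (0.96, 0.92)  # (Pin, Pout) of a1, as stated in Context 1
c1 = (0.80, 0.70)  # hypothetical coordinates, for illustration only

print(dominates(a1, c1))  # True
print(dominates(c1, a1))  # False
```

Two instances that each win in a different dimension are incomparable: neither dominates the other.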
Context 3
... the current object U, the algorithm first searches for objects that fully dominate it. For example, in the case of the data set of Figure 1, with a single dominance check between bmax and amin, we can conclude that all instances b1, b2 and b3 are dominated by a1, a2 and a3. According to property 2, only objects with F(vmin) > F(umax) need to be checked. ...
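The single-check pruning described here can be illustrated as follows. This is a sketch under stated assumptions, not the paper's algorithm: it assumes that the min and max of an object are the per-dimension minimum and maximum over its instances, and that larger values are better. Only a1 = (0.96, 0.92) is taken from Context 1; every other coordinate is hypothetical. The filter on F from property 2 is not modelled, because the excerpt does not define F.

```python
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """u is >= v in every dimension and > v in at least one."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))


def corner_min(instances: Sequence[Point]) -> Point:
    """Per-dimension minimum over all instances of an object."""
    return tuple(min(coords) for coords in zip(*instances))


def corner_max(instances: Sequence[Point]) -> Point:
    """Per-dimension maximum over all instances of an object."""
    return tuple(max(coords) for coords in zip(*instances))


def fully_dominates(v: Sequence[Point], u: Sequence[Point]) -> bool:
    """One check: if vmin dominates umax, every instance of v dominates every instance of u."""
    return dominates(corner_min(v), corner_max(u))


# a1 is from Context 1; all other coordinates are made up for illustration.
A = [(0.96, 0.92), (0.90, 0.94), (0.93, 0.88)]
B = [(0.60, 0.50), (0.55, 0.62), (0.48, 0.58)]

print(fully_dominates(A, B))  # True
```

The check is sufficient but not necessary: when it fails, individual instances of one object may still dominate instances of the other, which is why a per-instance step follows.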
Context 4
... step searches for individual instances v that dominate U. For example, in Figure 1, a dominance check between dmax (which coincides with d1) and c1 shows that all instances d1, d2, and d3 are dominated by c1. As before, only instances with F(v) > F(umax) are considered. ...
Context 5
... it is dominated, the score of u is again increased by 1/M, and the threshold is checked. In Figure 1, this is the case with d3 and bmin. ...
Context 6
... in fact, is a known problem faced by the skyline computation approaches as well. As the dimensionality increases, it becomes increasingly more difficult to find instances dominating other instances; hence, many unnecessary dominance checks are performed. A possible work-around is to group together related service parameters so as to decrease the dimensionality of the match objects. ...
[Figure 10: Effect of corr under low (left) and high (right) variance var]
Context 7
... possible work-around is to group together related service parameters so as to decrease the dimensionality of the match objects. For the same reasons, a similar effect is observed in Figure 10. For correlated data sets, where many successful dominance checks occur, the computational cost for all methods drops close to zero. ...
Context 8
... an application favors more accurate results, then TKM seems an excellent solution. If the time factor acts as the driving decision point, then TKDD should be favored, since it provides high quality results (see Table 2) almost instantly (see Figures 9 and 10). ...

Similar publications

Article
Full-text available
In recent years, many studies on computational linguistics have employed the Web as source for research. Specifically, the distribution of textual data in the Web is used to drive linguistic analyses in tasks such as information extraction, knowledge acquisition or natural language processing. For these purposes, commercial Web search engines are c...
Article
Full-text available
The rapid change of computers from isolated machines to networks and the need of people to exchange information led us to the World Wide Web (WWW). Nowadays many people spend many hours on the WWW searching for information on every aspect of life. This increase of information on the WWW also increases the difficulty of finding and accessing the inform...
Article
Full-text available
This paper describes a method for spoken document retrieval using Web document expansion. This technique improves document retrieval performance by expanding the target spoken documents using Web data. In this research, two types of indexes are built. One is made from transcriptions of the spoken documents; the other is made from Web documents that...
Thesis
Full-text available
The Web is comprised of a vast quantity of text. Modern search engines struggle to index it independent of the structure of queries and type of Web data, and commonly use indexing based on the Web's graph structure to identify high-quality relevant pages. However, despite the apparent widespread use of these algorithms, Web indexing based on human feed...
Research
Full-text available
Nowadays, many users use web search engines to find and gather information. Users face an increasing number of semi-structured information sources. The issue of correlating, integrating and presenting related information to users becomes important. When a user uses a search engine such as Yahoo or Google to seek specific information, the...

Citations

... In this paper, we consider a useful and important spatial query, top-k dominating query [27,43], over a large-scale uncertain database in a distributed environment, which has many real-life applications such as multi-criteria decision making [38], coal mine surveillance [27], and so on. Figure 1a shows an example of a top-k dominating query over 2-dimensional (certain) data points {o, p, r, s, t, u, v, w, x, y, z}. ...
Article
Full-text available
In many real-world applications such as business planning and sensor data monitoring, one important, yet challenging, task is to rank objects (e.g., products, documents, or spatial objects) based on their ranking scores and efficiently return those objects with the highest scores. In practice, due to the unreliability of data sources, many real-world objects often contain noises and are thus imprecise and uncertain. In this paper, we study the problem of probabilistic top-k dominating (PTD) query on such large-scale uncertain data in a distributed environment, which retrieves k uncertain objects from distributed uncertain databases (on multiple distributed servers), having the largest ranking scores with high confidences. In order to efficiently tackle the distributed PTD problem, we propose a MapReduce framework for processing distributed PTD queries over distributed uncertain databases. In this MapReduce framework, we design effective pruning strategies to filter out false alarms in the distributed setting, propose cost-model-based index distribution mechanisms over servers, and develop efficient distributed PTD query processing algorithms. Extensive experiments have demonstrated the efficiency and effectiveness of our proposed distributed PTD approaches on both real and synthetic data sets through various experimental settings.
... (1) To improve the efficiency of the proposed solution, we first employ the Skyline method [23]. The Skyline method makes it possible to eliminate the dominated services and select only the dominant and pertinent services according to their QoS performances, regardless of users' requirements. ...
... Step 2: Apply Skyline method to reduce search space. The Skyline [23] method is a basic MCDM solution that extracts the subclass of dominant services and eliminates the dominated ones regardless of any user's requirements. This is because the optimal solution is necessarily within the dominant services [30]. ...
... Definition 5 [23]. Given a set of functionally similar ...
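The skyline filtering step mentioned in these excerpts can be sketched as a simple quadratic nested-loop scan. This is an illustrative sketch, not the cited method [23]: it assumes every QoS attribute has been normalized so that larger is better, and the service names and values below are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """u is >= v on every attribute and > v on at least one (larger is better)."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))


def skyline(services: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Return the names of services that no other service dominates."""
    return [
        name
        for name, qos in services.items()
        if not any(dominates(other, qos) for other in services.values())
    ]


# Hypothetical services with two normalized QoS attributes.
services = {
    "s1": (0.9, 0.4),
    "s2": (0.6, 0.8),
    "s3": (0.5, 0.3),
    "s4": (0.7, 0.7),
}

print(skyline(services))  # ['s1', 's2', 's4']
```

Here s3 is removed because s1 is at least as good on both attributes; the remaining three services are mutually incomparable, so all of them stay in the reduced search space.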
Article
Full-text available
Cloud Computing has become a reliable solution for outsourcing business data and operation with its cost-effective and resource-efficient services. A key part of the success of the cloud is the multi-tenancy architecture, where a single instance of a service can be shared between a large number of users, also known as tenants. Service selection for multiple tenants presents a real challenge that has not been properly addressed in the literature so far. Most of the existing cloud services selection approaches are designed for a single user, and hence are inefficient when applied to the case of a large group of users with different, and often conflicting, requirements. In this paper, we propose a multi-tenant cloud services evaluation framework, whereby service selection is carried out per group of tenants that can belong to different service classes, rather than per single user. We formulate the cloud services selection for multi-tenants as a complex multi-attribute large-group decision-making (CMALGDM) problem. The Skyline method is initially applied to reduce the search space by eliminating the dominated services regardless of tenants' requirements. Tenants are clustered based on their profiles characterized by different personal, service, and environmental features. Each tenant is assigned a weight to reflect its importance in the decision-making. The weight of a tenant is determined locally based on its closeness to the group decision and globally by combining its local weight with its corresponding cluster weight to reflect its total contribution to the overall decision-making. The final ranking of the alternatives is guided by a dynamic consensus process to reach a final solution with the highest level of agreement. The proposed framework supports multiple types of information, including deterministic data, interval numbers, and fuzzy numbers, to realistically represent the heterogeneity and uncertainty of security information.
... In addition, state-of-the-art specific methods for web services matching or recommendations [8,9] exist, which exploit further task-specific techniques compared with the abovementioned methodologies. Most of them are knowledge-based but cannot be applied in this use case. ...
Chapter
This paper presents a methodology that combines latent factor models with graph-based models. The proposed recommendation system identifies a recommended item as a node of a graph. More specifically, the topology of the graph and the paths between the nodes are considered as critical features regarding the associations between them. Furthermore, in the current approach, these structural features are considered as feedback. These structural features are extracted from a pool of several application graphs which are afterwards generalized into a unified matrix of proximities. The main reason for the use of this structural feedback is to generate recommendations and discover unobserved relations using matrix factorization techniques. The approach is tested on a data set that consists of cloud-native microservices graphs.
... Sometimes, prioritising between QoS attributes is considered based on user preferences using the MCDM (multi-criteria decision-making) approaches [27,36]. Skyline is another approach that has been used for web service selection based on QoS attributes [37,38]. ...
Article
Full-text available
Quality of service (QoS)-based web service selection has been studied in the service computing community for some time. However, characteristics of the input dataset that is going to be processed by the web service are not usually considered in the selection process, even though they might have impact on QoS values of the service, e.g. latency on processing a bigger dataset is higher than that on a smaller dataset, one service takes longer time to process a certain dataset than another service. To address this issue, in this work, we take into consideration the dataset features in the QoS-based service recommendation process and we focus on data mining services because their QoS values could be highly dependent on dataset features. We propose two approaches for data mining service recommendations and compare their performances. In the first approach, we use a meta-learning algorithm to incorporate dataset features in the recommendation process and study the use of different machine learning algorithms (both classification models and regression models) as meta-learners in recommending data mining services for the given dataset. We also investigate the impact of the number of dataset features on the performance of the meta-learners. In the second approach, we propose a novel technique of using factor analysis for web service recommendation. We use decomposition technique to identify latent features of the input dataset and then recommend services by exploiting these latent variables. Our proposed approach of web service recommendation based on latent features was shown to be a more robust model with an accuracy of 85% compared to meta-feature-based recommendation.
... Skoutas, Sacharidis, Simitsis, Kantere, and Sellis (2009) proposed the concept of QoS. It considers the security as part of QoS requirements. ...
Article
Full-text available
Various map-centered web services facilitate citizens' lives. Web-map applications have existed for many years. Due to simplification and improvement of technologies supporting WebGIS, map-based services are becoming more popular and important. Data quality assurance for such services is a significant challenge. Since many such applications intensively use open data, approaches focused on open solutions are required. This work proposes a data-quality concept, which is based on intrinsic and comparable approaches. OpenStreetMap (OSM) allows intrinsic data evaluation. Moreover, it is used as a reference dataset for quality assessment of public-sector-information Open Data layers. Equidistant point (EDP)-based statistics make it possible to filter out low-quality Open Data features. A data-type model carries out the inventory of OSM data. The comparison of raster web-map tile file sizes and calculation of a simplified data quality indicator make it possible to specify acceptable data quality levels. Embeddable instances of quality assurance web services incorporate data features with acceptable quality. This work provides all required software and data for the deployment of such services under liberal licenses. Concrete instructions allow users to adopt the proposed solutions for their platforms. Some generic use cases illustrate the advantages of the introduced shared web services.
... Recently, preference queries, which retrieve only preferable data objects from a multidimensional dataset, have been receiving significant research attention in the database community. These types of queries provide a wide range of multi-criteria decision making applications with many benefits, for example, multimedia retrieval [32], web search [29], market analysis [38], and e-commerce [38]. Two of the most widely used preference queries are top-k and skyline queries. ...
Article
Full-text available
Preference query processing is important for a wide range of applications involving distributed databases, such as network monitoring, web-based systems, and market analysis. In such applications, data objects are generated frequently and massively, which presents an important and challenging problem of continuous query processing over distributed data stream environments. A top-k dominating query, which has been receiving much research attention recently, returns the k data objects that dominate the highest number of data objects in a given dataset, and due to its dominance-based ranking function, we can easily obtain superior data objects. An emerging requirement in distributed stream environments is an efficient technique for continuously monitoring top-k dominating data objects. Despite this, no study has addressed this problem. In this paper, therefore, we address the problem of continuous top-k dominating query processing over distributed data stream environments. We present two algorithms that monitor the exact top-k dominating data and efficiently eliminate unqualified data objects for the result, which reduces both communication and computation costs. In addition to these algorithms, we present an approximate algorithm that further reduces both communication and computation costs. Extensive experiments on both synthetic and real data have demonstrated the efficiency and scalability of our algorithms.
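The definition of a top-k dominating query given in this abstract (the k objects that dominate the highest number of other objects) can be illustrated with a centralized brute-force sketch. It is not one of the distributed or continuous algorithms the abstract refers to. It assumes that larger values are better in every dimension, breaks ties by name, and uses hypothetical object names and coordinates.

```python
from typing import Dict, List, Sequence, Tuple


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """u is >= v in every dimension and > v in at least one (larger is better)."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))


def dominance_scores(objects: Dict[str, Tuple[float, ...]]) -> Dict[str, int]:
    """For each object, count how many objects in the dataset it dominates."""
    return {
        name: sum(dominates(p, q) for q in objects.values())
        for name, p in objects.items()
    }


def top_k_dominating(objects: Dict[str, Tuple[float, ...]], k: int) -> List[str]:
    """Return the k object names with the highest dominance score (ties broken by name)."""
    scores = dominance_scores(objects)
    return sorted(scores, key=lambda name: (-scores[name], name))[:k]


# Hypothetical 2-dimensional data points.
objects = {
    "o1": (0.9, 0.9),
    "o2": (0.8, 0.5),
    "o3": (0.4, 0.7),
    "o4": (0.3, 0.3),
    "o5": (0.6, 0.2),
}

print(dominance_scores(objects))     # {'o1': 4, 'o2': 2, 'o3': 1, 'o4': 0, 'o5': 0}
print(top_k_dominating(objects, 2))  # ['o1', 'o2']
```

Unlike a skyline, the result size is fixed by k, and an object that is itself dominated (o2 here) can still rank highly if it dominates many others.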
... However, considering the wealth of Web services, a fundamental issue, namely the integration of Web services still remains. Despite the research progress in Web service representation [85,10,58,58,101] and discovery [40,100,24]; integration is mainly hindered by the lack of explicit schemas for the output of Web service operations. Failing to fully integrate Web services at the output level leaves the landscape largely disintegrated, with the applications being developed only for specific services. ...
... In [100], the authors proposed an approach for ranking the top-k most relevant Web services according to the user's needs under multi-criteria matching. ...
... Furthermore, we compute views with binding patterns that map the entire call result to the target knowledge base. A fundamental difference is that the works in [40,24,100] either assume existence of already formal descriptions of the Web services in the form of ontologies or WSDL, or lack experimental evaluation of the proposed approaches. Finally, some of the proposed approaches, e.g. ...
Thesis
Full-text available
The goal of my thesis is the automatic integration of new Web services into a knowledge base. For each method of a Web service, a view is computed automatically. The view is represented as a query over the knowledge base. The algorithm we propose also computes an XSLT transformation function associated with the method, which can transform call results into a fragment that conforms to the schema of the knowledge base. The novelty of our approach is that the alignment relies only on the alignment of instances. It does not depend on the names of concepts or on the constraints defined by the schema. This makes it particularly relevant for the Web services currently published on the Web, because these services use the REST protocol. This protocol does not allow schemas to be published. In addition, JSON seems to be establishing itself as the standard for representing the results of service calls. Unlike XML, JSON does not use named nodes. Traditional alignment algorithms are therefore deprived of the concept names on which they rely.
... Paper [21] ranks web services under multi-criteria matching. It targets accurate web service selection and assigns a dominance score to each advertised web service. ...
... One of the methods for skyline computation is BNL (Block Nested Loops) [10]. Many techniques for obtaining the top-k dominant skylines [12,13] based on users' preferences and the dominance relationship have been proposed. Index- and bitmap-based algorithms [14] for returning skyline results progressively have also been proposed. ...
Article
Full-text available
Background/Objectives: Energy conservation in a Wireless Sensor Network is essential to extend its lifetime. A sensor node consumes more energy for communication than for data gathering or data processing. Data aggregation minimizes the data size for communication. Methods/Statistical Analysis: The Comb Needle model is available in the literature to perform data aggregation for grid networks (regular deployment). We extended the Basic Comb Needle Model to randomly deployed sensor networks. The simple random network with the Comb Needle Model is compared with the simple random network without the Comb Needle Model. The theoretical analysis and simulation study show that the Extended Comb Needle Model performs better data aggregation. Findings: When we apply the Proposed Model in a random network, the communication cost, overhead, and energy consumption are significantly reduced. The simulation results for the proposed Extended Comb Needle Model show that the energy consumption and overall communication costs are substantially reduced. The simulation comparison is done for a simple random network with and without the Comb Needle Model in terms of communication cost, energy consumption, delay, packet loss, packet delivery ratio, and throughput. We found that the communication cost is decreased from 82% to 58%. The average energy consumption is decreased from 80% to 40%. Delay is decreased from 76% to 20%. Packet loss is decreased from 67% to 12%. Packet Delivery Ratio is increased from 82% to 87%, and throughput is increased from 70% to 90%. Application/Improvements: The Proposed Model optimizes WSN performance in terms of better packet delivery ratio, improved throughput, minimized energy consumption and reduced delay. Simulation results as well as theoretical analysis affirm the same.
... One of the methods for skyline computation is BNL (Block Nested Loops) [10]. Many techniques for obtaining the top-k dominant skylines [12,13] based on users' preferences and the dominance relationship have been proposed. Index- and bitmap-based algorithms [14] for returning skyline results progressively have also been proposed. ...
Conference Paper
Full-text available
An exponential increase in the number of web services over the last few years increases the importance of the service selection task for choosing the best among a group of web services with similar functionalities. Most web service selection approaches take the service provider's perspective and are based on non-functional properties such as performance, reliability, etc. (such attributes are known as Quality of Service or QoS). But it has been observed that in any selection decision, users' feedback (known as Quality of Experience or QoE) plays a crucial role. In this paper, an integrated model is proposed, based on both QoE and QoS, where the best service selection is made based not only on the current QoS values of the services but also on users' past experience of using them. Further, a case study is provided and the results are analyzed by including users' ratings as a QoE factor along with QoS parameters. The results show that the proposed approach, augmented by users' feedback, improves the quality of selection.