Figure: Reasoning times for different memory sizes.

Source publication
Article
Reasoning is a vital ability for semantic web applications since they aim to understand and interpret the data on the World Wide Web. However, reasoning of large data sets is one of the challenges facing semantic web applications. In this paper, we present new approaches for scalable Resource Description Framework Schema (RDFS) reasoning. Our RDFS...

Contexts in source publication

Context 1
... more clearly state the examined problem, we have performed several tests using one of the selected, currently available reasoning technologies. Table 1 summarizes the results of these tests (performed on a computer with an Intel Xeon E5504 CPU @ 2.00 GHz, 4 MB cache, and up to 12 GB of memory) using the Jena [16] RDFS Reasoner in "RDFS SIMPLE" mode, which considers only rules with two triples in their bodies. In the tests, the SwetoDBLP data set is used and divided into 16 parts to obtain data sets of different sizes. ...
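For illustration, a minimal sketch of such a timing test with Jena's RDFS simple reasoner is given below. It assumes the package layout of recent Apache Jena releases (the paper's Jena version may differ), and the input file name is a placeholder, not the actual SwetoDBLP part files.

```java
// Minimal sketch: time the computation of the RDFS-simple closure of one data part.
// Assumptions: recent Apache Jena packages; "swetodblp_part.rdf" is a placeholder file.
import org.apache.jena.rdf.model.InfModel;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.reasoner.Reasoner;
import org.apache.jena.reasoner.ReasonerRegistry;

public class RdfsSimpleTiming {
    public static void main(String[] args) {
        // Load one part of the data set into an in-memory model.
        Model data = ModelFactory.createDefaultModel();
        data.read("file:swetodblp_part.rdf");

        // "RDFS simple" reasoner: restricted to the RDFS rules with
        // two triples in their bodies.
        Reasoner reasoner = ReasonerRegistry.getRDFSSimpleReasoner();

        long start = System.currentTimeMillis();
        InfModel inf = ModelFactory.createInfModel(reasoner, data);
        inf.prepare();                       // trigger inference up front
        long closureSize = inf.size();       // triples after inference
        long elapsed = System.currentTimeMillis() - start;

        System.out.printf("input=%d triples, closure=%d triples, time=%d ms%n",
                data.size(), closureSize, elapsed);
    }
}
```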
Context 2
... seen in Table 1, the reasoning capability for large data sets increases with the available memory size. This is because the reasoner tries to create a graph of the whole data set in memory in order to carry out reasoning on that graph. ...
Context 3
... 1 shows the terms and corresponding data partitions created by Algorithm 1 and Algorithm 2. The partitions colored gray indicate data partitions whose corresponding schema partitions are empty; these can be eliminated from the reasoning process, since the RDFS entailment rules do not contain a rule with two data triples in its body. For this example, it is possible to obtain the full closure (for the rules given in Table 1) by processing only around 55% of the original data set. Meanwhile, 45% of the data set, around 6M triples, can be eliminated from the reasoning process, which dramatically reduces the computation load. ...
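The elimination criterion described here can be sketched as follows. This is an illustrative reconstruction, not the paper's Algorithm 1 or 2; the partition maps, keys ("terms"), and the Triple type are hypothetical placeholders.

```java
// Illustrative sketch of the elimination step: keep only data partitions whose
// corresponding schema partition is non-empty. All names are hypothetical.
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionFilter {
    record Triple(String s, String p, String o) { }

    // RDFS entailment has no rule with two data triples in its body, so a data
    // triple can only fire a rule together with a schema triple. Data partitions
    // paired with an empty schema partition therefore contribute nothing.
    static Map<String, List<Triple>> partitionsToReason(
            Map<String, List<Triple>> dataPartitions,
            Map<String, List<Triple>> schemaPartitions) {
        Map<String, List<Triple>> kept = new HashMap<>();
        for (Map.Entry<String, List<Triple>> e : dataPartitions.entrySet()) {
            List<Triple> schema = schemaPartitions.get(e.getKey());
            if (schema != null && !schema.isEmpty()) {
                kept.put(e.getKey(), e.getValue());   // partition must be reasoned over
            }
            // otherwise the partition is eliminated from the reasoning process
        }
        return kept;
    }
}
```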
Context 4
... 7 and 8 are examined together, it is seen that two-level partitioning takes less time, although it produces more repetition. This is mainly due to the nonlinear increase in reasoning time with the increasing amount of data, which can also be observed in the values in Table 1. Since two-level partitioning works on smaller partitions, it produces more repetition but takes less time. ...
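A small worked example shows why splitting can pay off despite the extra repetition; the growth exponent 1.5 below is an assumption for illustration, not a value reported in the paper.

```latex
% Illustration only: the exponent 1.5 is an assumed superlinear growth rate.
T(n) = c\,n^{1.5}
\quad\Longrightarrow\quad
2\,T\!\left(\tfrac{n}{2}\right) = 2c\left(\tfrac{n}{2}\right)^{1.5}
= \tfrac{1}{\sqrt{2}}\,c\,n^{1.5} \approx 0.71\,T(n)
```

Under such superlinear growth, the duplicated triples introduced by finer partitioning are more than offset by the cheaper per-partition reasoning.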
Context 5
... Table 9, the table created in the problem definition (Table 1) is updated with the test results of the hybrid approach for different memory sizes. The values are comparable since the same computer and data sets were used to create both tables. ...
Context 6
... curve for reasoning times with 1 GB of memory using the hybrid approach. In Figure 8, the methods are compared in terms of speedup, and Table 10 gives the exact speedup values. The results indicate that the hybrid approach achieves a higher speedup than the two-level partitioning method. ...
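For reference, speedup is read here as the ratio of the baseline reasoning time to the time of the evaluated method; the exact baseline used for Table 10 is the one defined in the paper.

```latex
% Assumed reading of "speedup"; baseline choice follows the paper's problem definition.
\mathrm{speedup} = \frac{T_{\mathrm{baseline}}}{T_{\mathrm{method}}}
```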

Citations

... Mixed integer linear programming is developed for solving the hybrid stochastic-deterministic unit commitment problem [31]. A hybrid approach for reasoning over large semantic web data is proposed [32]. A hybrid method is proposed for improving the performance of the stochastic and deterministic optimal power flow problem [33]. ...
Article
Deterministic methods are used to find optimal solutions for many engineering and scientific problems. The filled function method, a deterministic method, avoids being trapped in local minima through its ability to bypass energy barriers. To achieve this, the basin regions of the filled function must be located, and the filled function must be constructed in those regions. However, the classical search strategies used to find the basin regions do not yield effective results. In this study, a new stochastic search approach is presented as a faster and more efficient alternative to the classical filled function search strategy. An unconstrained global optimization method based on clustering and parabolic approximation (GOBC-PA) is used as the stochastic method to accelerate the L-type filled function, a deterministic method. The developed method has been tested against the classical filled function on 11 benchmark functions. The results show that the stochastic search approach outperforms the classical approach in terms of mean error, standard deviation, and elapsed time. These results indicate that combining deterministic and stochastic methods can be more successful at finding the global minimum than the classical deterministic method alone. Keywords: stochastic search; GOBC-PA; L-type filled function; global optimization.
... This approach avoids using popular terms such as rdfs:type for DHT partitioning if the term is not used in the reasoning process. Subsequently, they implemented a forward reasoning approach [63] and evaluated it using up to 14M triples. However, no details are provided on loading the data onto the cluster or on the storage implications of larger datasets. ...
... As noted earlier, BigQuery supports 64-bit signed integers; the minimum number that can be stored is −2^63 ... can encode approximately 2.14 billion unique URI parts. ...
Thesis
The Semantic Web was proposed as an extension of the traditional Web to give Web data context and meaning by using the Resource Description Framework (RDF) data model. The recent growth in the adoption of RDF, in addition to the massive growth of RDF data, has led numerous efforts to focus on the challenges of processing this data. To this end, many approaches have focused on vertical scalability by utilising powerful hardware, or on horizontal scalability utilising always-on physical computer clusters or peer-to-peer networks. However, these approaches utilise fixed and high-specification computer clusters that require considerable upfront and ongoing investment to deal with the data growth. In recent years cloud computing has seen wide adoption due to its unique elasticity and utility billing features. This thesis addresses some of the issues related to the processing of large RDF datasets by utilising cloud computing. Initially, the thesis reviews the background literature of related distributed RDF processing work and issues, in particular distributed rule-based reasoning and dictionary encoding, followed by a review of the cloud computing paradigm and related literature. Then, in order to fully utilise features that are specific to cloud computing, such as elasticity, the thesis designs and fully implements a Cloud-based Task Execution framework (CloudEx), a generic framework for efficiently distributing and executing tasks on cloud environments. Subsequently, some of the large-scale RDF processing issues are addressed by using the CloudEx framework to develop algorithms for processing RDF using cloud computing. These algorithms perform efficient dictionary encoding and forward reasoning using cloud-based columnar databases. The algorithms are collectively implemented as an Elastic Cost Aware Reasoning Framework (ECARF), a cloud-based RDF triple store. This thesis presents original results and findings that advance the state of the art of distributed cloud-based RDF processing and forward reasoning.