Figure 1 - uploaded by Fabio Machado Porto
Three-Level Architecture for Linked Data Integration.


Source publication
Article
Full-text available
In this paper, we present a three-level mediator-based framework for linked data integration. In our approach, the mediated schema is represented by a domain ontology, which provides a conceptual representation of the application. Each relevant data source is described by a source ontology, published on the Web according to the Linked Data principl...

Contexts in source publication

Context 1
... this section, we discuss the three-level architecture for linked data integration, which is depicted in Figure 1. ...
Context 2
... relevant data source is described by a source ontology, published on the Web according to the Linked Data principles, thereby becoming part of the Web of linked data. These source ontologies are depicted in the Web of Linked Data layer in Figure 1. The local source schemas are accessed via wrappers, like those introduced in [3], which export the local data into OWL. ...
Context 3
... Translation. The query is parsed, that is, transformed into a query tree representing the structure of the query, as illustrated in Figure 10. Each node of this tree is labeled with a datatype or object variable. ...
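The query-tree representation described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the class, variable names, and the example query (book titles with publishers) are assumptions for illustration:

```python
# Minimal sketch (not the paper's implementation): a parsed SPARQL-like
# query represented as a tree whose nodes are labeled with variables.
class QueryNode:
    def __init__(self, var, predicate=None):
        self.var = var              # datatype or object variable label
        self.predicate = predicate  # property leading to this node
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

# Hypothetical query: titles of books, following the publisher link.
root = QueryNode("?b")                         # object variable: the book
root.add(QueryNode("?t", "s:title"))           # datatype variable: title
pub = root.add(QueryNode("?pub", "s:hasPub"))  # object variable: publisher
pub.add(QueryNode("?c", "s:country"))          # datatype variable: country

def variables(node):
    """Collect all variable labels in the tree (pre-order)."""
    out = [node.var]
    for c in node.children:
        out.extend(variables(c))
    return out
```

Each node carries exactly one variable, so walking the tree recovers the query's variable structure.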
Context 4
... Reformulation. a. First, the query Q is reformulated in terms of queries over the application ontologies (Figure 11). b. ...
Context 5
... The query tree in Figure 11 consists of four sub-queries Q1, Q2, Q3 and Q4, each of which aims at extracting data from a single application ontology. The generated query tree represents the SPARQL query illustrated in Figure 12. c. ...
Context 7
... generated query tree represents the SPARQL query illustrated in Figure 12. c. Then the sub-queries are reformulated, based on the local mappings in Figure 7, in terms of a query over the data source schema (Figure 13). The reformulation is unambiguous and therefore straightforward. ...
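Because each term has exactly one image under the local mappings, the reformulation step amounts to term-by-term substitution. A small sketch, with hypothetical mapping entries standing in for the mappings of Figure 7:

```python
# Sketch of an unambiguous mapping-based reformulation (illustrative names):
# each application-ontology term maps to exactly one data-source term, so
# rewriting a sub-query is a straightforward term-by-term substitution.
LOCAL_MAPPINGS = {            # hypothetical local mappings (cf. Figure 7)
    "ap:Book":   "src:book_table",
    "ap:title":  "src:title_col",
    "ap:hasPub": "src:publisher_fk",
}

def reformulate(triple_patterns, mappings):
    """Rewrite each (subject, predicate, object) pattern over the source schema."""
    return [(s, mappings.get(p, p), o) for (s, p, o) in triple_patterns]

q1 = [("?b", "ap:title", "?t"), ("?b", "ap:hasPub", "?pub")]
q1_source = reformulate(q1, LOCAL_MAPPINGS)
```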
Context 8
... reformulation is unambiguous and therefore straightforward. Figure 14(a) shows a graphical and intuitive representation of the execution plan for the queries shown in Figure 13. ...
Context 10
... that queries Q2' and Q4' are similar. Hence, the execution plan is rearranged, as shown in Figure 14(b). ...
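When two sub-queries are syntactically equivalent, the plan can be rearranged so the shared sub-query is evaluated once and its result reused. A sketch of this idea (the detection and data are illustrative, not the paper's optimizer):

```python
# Sketch: if two sub-queries are syntactically equal (like Q2' and Q4'),
# the plan can be rearranged to evaluate the shared sub-query once.
def dedup_plan(subqueries):
    """Return unique sub-queries plus a map from original index to shared slot."""
    seen, order, slot = {}, [], {}
    for i, q in enumerate(subqueries):
        key = tuple(sorted(q))      # canonical form of the pattern list
        if key not in seen:
            seen[key] = len(order)
            order.append(q)
        slot[i] = seen[key]
    return order, slot

q2 = [("?p", "pp:country", "'USA'")]
q4 = [("?p", "pp:country", "'USA'")]   # similar to Q2'
unique, slots = dedup_plan([q2, q4])
```

Both original positions now point at the same shared sub-query slot.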
Context 11
... Scan phase, which performs bind pattern projections on join properties, with duplicate elimination. In our example, Q2' (Figure 15) projects the name (?n) and publisher URI-link (?pub) of publishers located in 'USA'. (2) Set bind processing phase. ...
Context 12
... constraint (SPARQL FILTER expression) is used to specify which data must be joined. Note that sub-queries Q1' and Q2' are rewritten by adding the bind patterns as filters into sub-queries Q1" and Q2" (see Figure 16). ...
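The two phases above can be sketched as follows. This is a simplified illustration with hypothetical bindings; the real engine emits SPARQL, whereas here strings stand in for the rewritten sub-queries:

```python
# Sketch of the two phases described above (illustrative data and names):
# (1) scan one side, projecting the join property with duplicate elimination;
# (2) rewrite the other sub-query, adding the bound values as a SPARQL FILTER.
def scan_phase(rows, join_var):
    """Project the join variable's values, eliminating duplicates."""
    return sorted({row[join_var] for row in rows})

def add_bind_filter(sparql_body, join_var, values):
    """Append a FILTER restricting join_var to the values seen in the scan."""
    alts = " || ".join(f"{join_var} = <{v}>" for v in values)
    return sparql_body + f" FILTER ({alts})"

rows = [{"?n": "ACM", "?pub": "ex:P1"}, {"?n": "IEEE", "?pub": "ex:P1"}]
bound = scan_phase(rows, "?pub")
q1_bound = add_bind_filter("?b s:hasPub ?pub .", "?pub", bound)
```

Duplicate elimination in the scan keeps the FILTER expression small before it is shipped to the other source.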
Context 13
... 6.1: Referring to our case study, consider the source ontology instances in Figure 17. Figure 18 shows the application ontologies instances obtained by applying the mappings in Figure 7. Figure 19 shows the instance of the domain ontology Sales obtained by applying the mediated mappings in Figure 8. ...
Context 16
... the query in Figure 9. Referring to the instance of the domain ontology in Figure 19, the query returns only one title, "Fundamentals of Database Systems". But the correct answer for the query is "Fundamentals of Database Systems, A Semantic Web Primer" because, based on the virtual same-as property, we can infer that P1 and P2 are the same entity (recall that we say that the same-as property is virtual because it is derived by applying a rule, as discussed in Section 3). Figure 12 shows the query obtained from unfolding the query Q over the application ontologies on the basis of the mediated mappings in Figure 8. Referring to the data graph in Figure 18, the query returns only one title, "Fundamentals of Database Systems". ...
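The effect of the virtual same-as inference can be sketched as follows. Here a union-find closure over same-as pairs stands in for the rule-based derivation, and the entities and titles are illustrative, echoing the P1/P2 example above:

```python
# Sketch of why the same-as inference matters: answers attached to P1 and P2
# collapse into one entity once virtual same-as links are taken into account.
def sameas_closure(pairs):
    """Union-find over same-as pairs; returns a canonical-representative map."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for a, b in pairs:
        parent[find(a)] = find(b)
    return {x: find(x) for x in parent}

titles = {"P1": ["Fundamentals of Database Systems"],
          "P2": ["A Semantic Web Primer"]}
rep = sameas_closure([("P1", "P2")])   # virtual same-as link between P1, P2
merged = {}
for ent, ts in titles.items():
    merged.setdefault(rep[ent], []).extend(ts)
```

Without the closure the query sees two entities and returns one title each; with it, both titles attach to the single merged entity.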
Context 19
... the correct answer for the query is "Fundamentals of Database Systems, A Semantic Web Primer" because, based on the virtual same-as property, we can infer that P1 and P2 are the same entity (recall that we say that the same-as property is virtual because it is derived by applying a rule, as discussed in Section 3). Figure 12 shows the query obtained from unfolding the query Q over the application ontologies on the basis of the mediated mappings in Figure 8. Referring to the data graph in Figure 18, the query returns only one title, "Fundamentals of Database Systems". Note that the query in Figure 19 provides all correct answers, "Fundamentals of Database Systems, A Semantic Web Primer". Figure 20 shows the query reformulation algorithm. Briefly, the main steps of the algorithm are: ...
Context 20
... 6.2. Consider the query tree (QT) in Figure 10. The primary concept is s:Book, which occurs in the vocabularies of the Amazon and the eBay application ontologies. ...
Context 21
... our example, property "s:hasPub" is an inter-ontology link (ap:book, pp:publ, ap:hasPub). Figure 21 summarizes the main steps for rewriting property "s:hasPub": (1) create the SQT shown in Figure 22; (2) delete all descendants of node ?pub from the primary query tree; (3) rewrite the SQT with the publisher's (pub) namespace. In Figure 21, the join variable ?pub of the ap:hasPub property is used in the SQT tree to implement the URI join between both queries. ...
Context 22
... 21 summarizes the main steps for rewriting property "s:hasPub": (1) create the SQT shown in Figure 22; (2) delete all descendants of node ?pub from the primary query tree; (3) rewrite the SQT with the publisher's (pub) namespace. In Figure 21, the join variable ?pub of the ap:hasPub property is used in the SQT tree to implement the URI join between both queries. c) If p is not in the vocabulary V (lines 16-20), then for each virtual same-as inter-ontology link (V:cr, Vi:cr, PJ) in V, create a secondary query tree SQT which is a copy of the sub-graph containing node N and all the descendants of the properties not in V. Next, rewrite SQT with Vi. ...
Context 23
... result of REWRITE_NODE (N, V) is a reformulated query tree (RQT) whose root is a join node with the following children: the primary query tree (the query tree rewritten with the vocabulary V) and all generated secondary query trees. Figure 21 shows the reformulated query tree for the Amazon vocabulary. The join variable "?pub" of the am:hasPub property is used in the secondary query tree to implement the join operation. ...
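The splitting step at the heart of REWRITE_NODE can be sketched as follows. This is a deliberately simplified version under stated assumptions: patterns are flat triples rather than trees, vocabulary membership is tested by namespace prefix, and the join variable is fixed to ?pub as in the example:

```python
# Sketch (hypothetical, simplified) of the rewrite step: properties outside
# vocabulary V stay out of the primary tree and are split off into secondary
# query trees joined on the shared join variable, mirroring the RQT above.
def rewrite_node(patterns, vocab, namespace):
    """Split triple patterns into a primary tree (properties in vocab)
    and a secondary tree for out-of-vocabulary properties."""
    primary, secondary = [], []
    for s, p, o in patterns:
        if p.split(":")[0] in vocab:
            primary.append((s, p, o))
        else:
            # rewrite the foreign property into the target namespace
            secondary.append((s, f"{namespace}:{p.split(':')[1]}", o))
    return {"join": "?pub", "primary": primary, "secondary": secondary}

patterns = [("?b", "am:title", "?t"), ("?b", "s:hasPub", "?pub")]
rqt = rewrite_node(patterns, {"am"}, "pub")
```

The returned structure corresponds to the RQT: a join node over the primary tree and the generated secondary tree, joined on ?pub.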

Similar publications

Conference Paper
Full-text available
Recently, processing of queries on linked data has gained attention. We identify and systematically discuss three main strategies: a bottom-up strategy that discovers new sources during query processing by following links between sources, a top-down strategy that relies on complete knowledge about the sources to select and process relevant sources,...
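The bottom-up strategy mentioned above can be sketched as a traversal that discovers new sources by following links. Here an in-memory link graph with hypothetical source names stands in for dereferenceable URIs on the Web:

```python
# Sketch of the bottom-up strategy: starting from seed sources, new sources
# are discovered during query processing by following links between them
# (an in-memory link graph stands in for dereferenceable URIs).
LINKS = {                       # source -> sources it links to (illustrative)
    "src:A": ["src:B"],
    "src:B": ["src:C", "src:A"],
    "src:C": [],
}

def discover(seeds, links):
    """Breadth-first traversal of inter-source links."""
    seen, frontier = set(seeds), list(seeds)
    while frontier:
        src = frontier.pop(0)
        for nxt in links.get(src, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

found = discover(["src:A"], LINKS)
```

The top-down strategy would instead consult a complete source description up front and select only the relevant subset.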

Citations

... For example, verifying software in avionics made up 40% of the development costs in 2001 [4]. The amount of software testing has been increasing [5] because of technological developments for integrating avionics subsystems [6]. The current rising trend is shown in Figure 1 [7]. ...
... The need to replace these manual efforts with an automated approach depends on how often test cases need to be exchanged between suppliers. As shown in the introduction, aircraft design is becoming more and more complicated [6], and the test cases in the test processes are also becoming more complex and, therefore, prone to errors. Thus, the impact of a potential test automation improvement is very high in practice. ...
Article
Full-text available
Heterogeneous test processes with respect to test script languages are an integral part of the development process of mechatronic systems carried out in supply chains. Until now, test cases have not been exchangeable between test processes because interoperability is not given. The developed approach enables the source-to-source compiling of test cases between test script languages. With this, the interoperability of test cases is achieved, and seamless integration within the supply chain is possible. The developed approach uses transcompilers as a baseline. In doing so, an interoperability model for test cases is presented. Based on the interoperability model, a source-to-source compiling for test cases is shown. The outcome is a prototype that handles test script languages that differ with respect to type safety and applied programming paradigms. The approach ensures that test cases remain understandable and usable for test reports. The evaluation confirms the translation capabilities as well as the readability of the generated test case for the high-lift scenario from aviation. The interoperability of test cases within the supply chain enables the formalisation of procedural test knowledge to be used in a broad range of future scenarios, such as test automation, digital twins and predictive maintenance.
... In this section, we describe a four-level ontology-based architecture that facilitates the creation of data views from multiple data sources published as LD. Data interpretation between these levels is ensured by an ontological representation based on mappings, according to a mediated approach, extended from (Wiederhold, 1992), LDIF (Schultz et al., 2012) and Vidal et al. (2011). Each level of the architecture illustrated in Figure 1 is described as follows: ...
Article
Purpose – The purpose of this paper is to present a four-level architecture that aims at integrating, publishing and retrieving ecological data making use of linked data (LD). It allows scientists to explore taxonomical, spatial and temporal ecological information, access trophic chain relations between species and complement this information with other data sets published on the Web of data. The development of ecological information repositories is a crucial step to organize and catalog natural reserves. However, they present some challenges regarding their effectiveness to provide a shared and global view of biodiversity data, such as data heterogeneity, lack of metadata standardization and data interoperability. LD rose as an interesting technology to solve some of these challenges. Design/methodology/approach – Ecological data, which is produced and collected from different media resources, is stored in distinct relational databases and published as RDF triples, using a relational-to-Resource Description Framework mapping language. An application ontology reflects a global view of these datasets and shares the same vocabulary with them. Scientists specify their data views by selecting their objects of interest in a friendly way. A data view is internally represented as an algebraic scientific workflow that applies data transformation operations to integrate data sources. Findings – Despite years of investment, data integration continues to offer scientists challenges in obtaining consolidated data views of a large number of heterogeneous scientific data sources. The semantic integration approach presented in this paper simplifies this process both in terms of mappings and query answering through data views. Social implications – This work provides knowledge about the Guanabara Bay ecosystem and serves as a source of answers about the anthropic and climatic impacts on the bay ecosystem.
Additionally, this work will enable evaluating the adequacy of actions being taken to clean up Guanabara Bay with regard to the marine ecology. Originality/value – Mapping complexity is traded for the process of generating the exported ontology. The approach reduces the problem of integration to that of mappings between homogeneous ontologies. As a byproduct, data views are easily rewritten into queries over data sources. The architecture is general and, although applied to the ecological context, it can be extended to other domains.
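The relational-to-RDF publication step described above can be sketched as a per-row mapping. The table, column names, properties, and URI scheme below are all illustrative assumptions, not the paper's actual mapping language:

```python
# Sketch of relational-to-RDF publication (illustrative mapping, not the
# paper's mapping language): each row becomes a set of triples sharing the
# same subject URI.
MAPPING = {                       # column -> RDF property (hypothetical)
    "species": "eco:speciesName",
    "site":    "eco:collectionSite",
}

def row_to_triples(table, pk, row, mapping):
    """Map one relational row to RDF triples with a row-derived subject URI."""
    subject = f"ex:{table}/{row[pk]}"
    return [(subject, prop, row[col])
            for col, prop in mapping.items() if col in row]

row = {"id": 7, "species": "Mugil liza", "site": "Guanabara Bay"}
triples = row_to_triples("observation", "id", row, MAPPING)
```

Because the published triples share the application ontology's vocabulary, data views over them need no further schema translation.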
... Finding semantic relationships between two given entities is also discussed in the context of ontology matching [9,21,22]. In the case described here, hub ontologies could also be used to infer missing relationships into another ontology. ...
Chapter
Full-text available
Connectivity and relatedness of Web resources are two concepts that define to what extent different parts are connected or related to one another. Measuring connectivity and relatedness between Web resources is a growing field of research, often the starting point of recommender systems. Although relatedness is liable to subjective interpretations, connectivity is not. Given the Semantic Web’s ability of linking Web resources, connectivity can be measured by exploiting the links between entities. Further, these connections can be exploited to uncover relationships between Web resources. This chapter describes the application and expansion of a relationship assessment methodology from social network theory to measure the connectivity between documents. The connectivity measures are used to identify connected and related Web resources. The approach is able to expose relations that traditional text-based approaches fail to identify. The proposed approaches are validated and assessed through an evaluation on a real-world dataset, where results show that the proposed techniques outperform state of the art approaches. Finally, a Web-based application called Cite4Me that uses the proposed approach is presented.
... • The Ontology Services Layer: A major source of knowledge for mediation comes from the ontology layer (Rivero et al. 2011b; Vidal et al. 2011). The MetaMed Ontology services target two basically different application requirements. ...
Article
Full-text available
Research and development activities relating to the grid have generally focused on applications where data is stored in files. However, many scientific and commercial applications are highly dependent on Information Servers (ISs) for storage and organization of their data. A data-information system that supports operations on multiple information servers in a grid environment is referred to as an interoperable grid system. Different perceptions by end-users of interoperable systems in a grid environment may lead to different reasons for integrating data. Even the same user might want to integrate the same distributed data in various ways to suit different needs, roles or tasks. Therefore multiple mediator views are needed to support this diversity. This paper describes our approach to supporting semantic interoperability in a heterogeneous multi-information server grid environment. It is based on using Integration Operators for generating multiple semantically rich RDF/OWL-based user defined mediator views above the grid participating ISs. These views support different perceptions of the distributed and heterogeneous data available. A set of grid services are developed for the implementation of the mediator views.
... Finding semantic relationships between two given entities is also discussed in the context of ontology matching [9,20,21]. In our case, hub ontologies could also be used to infer missing relationships into another ontology. ...
Conference Paper
Full-text available
Connectivity and relatedness of Web resources are two concepts that define to what extent different parts are connected or related to one another. Measuring connectivity and relatedness between Web resources is a growing field of research, often the starting point of recommender systems. Although relatedness is liable to subjective interpretations, connectivity is not. Given the Semantic Web's ability of linking Web resources, connectivity can be measured by exploiting the links between entities. Further, these connections can be exploited to uncover relationships between Web resources. In this paper, we apply and expand a relationship assessment methodology from social network theory to measure the connectivity between documents. The connectivity measures are used to identify connected and related Web resources. Our approach is able to expose relations that traditional text-based approaches fail to identify. We validate and assess our proposed approaches through an evaluation on a real world dataset, where results show that the proposed techniques outperform state of the art approaches.
... To the best of our knowledge, at least six approaches exist that adopt various techniques to overcome this limitation. For instance, the approach described by Vidal et al. [36] uses explicit rule definitions to map elements of the domain-specific ontology to the central one. However, due to the formalism used for the rule specifications, the application of entailment regimes is not considered in its entirety. ...
Conference Paper
Full-text available
In recent years the core of the semantic web has evolved into a conceptual layer built by a set of ontologies mapped onto data distributed in numerous data sources, interlinked, interpreted and processed in terms of semantics. One of the central issues in this context became the federated querying of such linked data. This paper presents the federated query engine ELITE, which facilitates a complete and transparent integration and querying of distributed autonomous data sources. To achieve this aim, a combination of existing approaches for Ontology-based Data Access (OBDA) and federated query processing on Linked Open Data (LOD) is applied. Consolidating technologies like entailment regimes, the DL-Lite formalism, query rewriting, mapping relational data to RDF and an improved implementation of R-Tree based indexing contributes to the unique features of this federation engine. ELITE thereby enables the integration of various kinds of data sources, such as relational databases or triple stores, simplicity of query design, guaranteed completeness of query results and highly efficient query processing. The federation engine has been developed and evaluated in the domain of carbon reduction in urban planning.
... This approach meets our assumption that the closer two objects are, the higher the proximity between them. Finding semantic associations between two given objects is also discussed in the context of ontology matching [6, 20, 23]. In our case, hub ontologies could also be used to infer missing relationships into another ontology. ...
Conference Paper
Full-text available
The richness of the (Semantic) Web lies in its ability to link related resources as well as data across the Web. However, while relations within particular datasets are often well defined, links between disparate datasets and corpora of Web resources are rare. The increasingly widespread use of cross-domain reference datasets, such as Freebase and DBpedia for annotating and enriching datasets as well as document corpora, opens up opportunities to exploit their inherent semantics to uncover semantic relationships between disparate resources. In this paper, we present an approach to uncover relationships between disparate entities by analyzing the graphs of used reference datasets. We adapt a relationship assessment methodology from social network theory to measure the connectivity between entities in reference datasets and exploit these measures to identify correlated Web resources. Finally, we present an evaluation of our approach using the publicly available datasets Bibsonomy and USAToday.
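A connectivity measure in the spirit described above can be sketched as set overlap over the reference entities linked from each resource. The Jaccard coefficient and the example annotations are illustrative assumptions, not the authors' exact measure:

```python
# Sketch of a connectivity measure: two Web resources are considered
# connected in proportion to the reference-dataset entities they share
# (Jaccard overlap; illustrative data).
def connectivity(entities_a, entities_b):
    """Jaccard similarity over the sets of linked reference entities."""
    a, b = set(entities_a), set(entities_b)
    return len(a & b) / len(a | b) if a | b else 0.0

doc1 = ["dbpedia:Semantic_Web", "dbpedia:RDF", "dbpedia:SPARQL"]
doc2 = ["dbpedia:RDF", "dbpedia:SPARQL", "dbpedia:OWL"]
score = connectivity(doc1, doc2)
```

Such a graph-based score can relate documents that share no surface vocabulary, which is where text-based approaches fail.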
... The data requirements of the LIDMS are specified through an integration view, which is specified by a triple <P, O, Q>, where: (i) P is a list of input parameters that will ultimately be used to filter the output results; (ii) O is an ontology that describes the returned result; (iii) Q is a parameterized SPARQL query defined over the ontology. Step 2 – Federated Query Plan Generation: the goal of this step is to generate the execution plan for the LIDMS federated query, which will later be used in an execution environment capable of processing the query efficiently. The plan is generated automatically from the parameterized SPARQL query over the OD, using the federated query plan generation process proposed in [Vidal et al. 2011]. In this process, a query over the OD is rewritten and optimized in terms of the OFs, through the use of RDF links between the data sources and of the mappings between the OD and the OFs. ...
Conference Paper
Full-text available
Semantic Web technologies such as the RDF model, URIs and the SPARQL query language can reduce the complexity of data integration by making use of correctly established and described links between sources. However, the difficulty of formulating distributed queries has been an obstacle to exploiting the potential of these technologies, owing to the autonomy, distribution and heterogeneous vocabularies of the data sources. This scenario demands efficient mechanisms for data integration over Linked Data. Linked Data Mashups allow users to execute queries and integrate structured and linked data on the Web. This work proposes a Linked Data Mashup architecture based on the use of Linked Data Mashup Services (LIDMS). A module for the efficient execution of federated query plans over Linked Data was developed and is a component of the proposed architecture. Experiments showed the execution module to be more efficient than other existing strategies. In addition, a Web environment for executing LIDMS was also defined and implemented as a contribution of this work.
... FedX (SCHWARTE et al., 2011a, 2011b) is a mediator that extends the Sesame framework with a federation layer enabling efficient query processing over distributed Linked Data sources. (VIDAL et al., 2011) present a three-level mediator-based framework for data integration over Linked Data. Challenges related to the efficiency of federated queries, and an approach for optimizing such queries based on dynamic programming, were addressed by (GÖRLITZ; STAAB, 2011). ...
... The Drug instances of the OAs that represent the same drug are interlinked through the owl:sameAs or dmed:genericDrug property. The mapping rules from the OAs to the OD are presented in Figure 4.6. The mappings are defined using the rule-based mapping formalism presented in (VIDAL et al., 2011). This formalism makes it possible to define virtual classes or properties, which appear in the head of each rule. dbp-ont: ...
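The effect of such a rule, with a virtual property in its head derived from patterns over the application ontologies, can be sketched as follows. The predicates and instances are illustrative; a single-atom rule body stands in for the full formalism:

```python
# Sketch of a rule-based mapping in the spirit of the formalism mentioned
# above: a virtual property in the rule head is derived from triples matching
# the rule body (names here are illustrative).
def apply_rule(triples, body_pred, head_pred):
    """Derive head_pred triples from every body_pred triple (a trivial rule)."""
    return [(s, head_pred, o) for (s, p, o) in triples if p == body_pred]

source = [("ex:d1", "dbp-ont:genericDrug", "ex:d2"),
          ("ex:d1", "rdfs:label", "Aspirin")]
derived = apply_rule(source, "dbp-ont:genericDrug", "owl:sameAs")
```

The derived triples are virtual: they are materialized only by applying the rule, never stored in the sources.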
Thesis
Full-text available
Semantic Web technologies such as the RDF model, URIs and the SPARQL query language can reduce the complexity of data integration by making use of correctly established and described links between sources. However, the difficulty of formulating distributed queries has been an obstacle to exploiting the potential of these technologies, owing to the autonomy, distribution and heterogeneous vocabularies of the data sources. This scenario demands efficient mechanisms for data integration over Linked Data. Linked Data Mashups allow users to execute queries and integrate structured and linked data on the Web. This work proposes two Linked Data Mashup architectures: one based on the use of mediators and the other based on the use of Linked Data Mashup Services (LIDMS). A module for the efficient execution of federated query plans over Linked Data was developed and is a component common to both proposed architectures. The feasibility of the execution module was demonstrated through experiments. In addition, a Web environment for executing LIDMS was also defined and implemented as a contribution of this work.
... Let F be a set of predicates and V be a set of predicate variables. In the case of heterogeneous mappings, Skolem functions are used to express the semantic relationships between two ontologies, for example, when information is represented as a class in one ontology and as an object property in the other ontology [15]. ...
Conference Paper
Full-text available
The article considers the main methods of data integration for a variety of storage and knowledge representation systems, namely the integration of ontologies, linked data and smart spaces. The main attention is paid to the integration of smart spaces using a special software component, a mediator agent. The paper describes a formal model of the mediator agent and integration procedures based on description logic. The architecture of the mediator agent is designed for the smart spaces integration scenario based on mapping rules.