Anastasia Dimou

KU Leuven · Department of Computer Science

Professor

About

110 Publications · 14,566 Reads · 1,389 Citations

Publications (110)
Article
Full-text available
Background In healthcare, collaboration between different caregivers is increasing, especially considering the shift to homecare. To provide optimal patient care, efficient coordination of data and workflows between these different stakeholders is required. To achieve this, data should be exposed in a machine-interpretable, reusable...
Article
Full-text available
RDF-star has been proposed as an extension of RDF to make statements about statements. Libraries and graph stores have started adopting RDF-star, but the generation of RDF-star data remains largely unexplored. To allow generating RDF-star from heterogeneous data, RML-star was proposed as an extension of RML. However, no system has been developed so...
Chapter
Full-text available
The Relational to RDF Mapping Language (R2RML) became a W3C Recommendation a decade ago. Despite its wide adoption, its potential applicability beyond relational databases was swiftly explored. As a result, several extensions and new mapping languages were proposed to tackle the limitations that surfaced as R2RML was applied in real-world use cases...
Chapter
Full-text available
Knowledge Graphs (KGs) are a powerful mechanism to structure and organize data on the Web. RDF KGs are usually constructed by declaring a set of mapping rules, specified according to the grammar of a mapping language (e.g., RML), that relates the input data sources to a domain vocabulary. However, the verbosity and (manual) definition of these rule...
Preprint
Full-text available
Stream-reasoning query languages such as CQELS and C-SPARQL enable query answering over RDF streams. Unfortunately, there currently is a lack of efficient RDF stream generators to feed RDF stream reasoners. State-of-the-art RDF stream generators are limited with regard to the velocity and volume of streaming data they can handle. To efficiently gen...
Chapter
Stream-reasoning query languages such as CQELS and C-SPARQL enable query answering over RDF streams. Unfortunately, there currently is a lack of efficient RDF stream generators to feed RDF stream reasoners. State-of-the-art RDF stream generators are limited with regard to the velocity and volume of streaming data they can handle. To efficiently gen...
Article
More and more data in various formats are integrated into knowledge graphs. However, there is no overview of existing approaches for generating knowledge graphs from heterogeneous (semi-)structured data, making it difficult to select the right one for a certain use case. To support better decision making, we study the existing approaches for genera...
Chapter
This document summarizes the workshops and tutorials of the 19th European Semantic Web Conference. This edition accepted 10 workshops on different topics revolving around knowledge graphs, such as natural language processing, industrial use of knowledge graphs, biomedical data, etc. Moreover, 2 tutorials were accepted which included knowledge graph...
Conference Paper
Full-text available
Nowadays, Knowledge Graphs (KGs) are among the most powerful mechanisms to represent knowledge and integrate data from multiple domains. However, most of the available data sources are still described in heterogeneous data structures, schemas, and formats. The conversion of these sources into the desired KG requires manual and time-consuming tasks...
Preprint
Full-text available
In constraint languages for RDF graphs, such as ShEx and SHACL, constraints on nodes and their properties in RDF graphs are known as "shapes". Schemas in these languages list the various shapes that certain targeted nodes must satisfy for the graph to conform to the schema. Using SHACL, we propose in this paper a novel use of shapes, by which a set...
Article
Full-text available
The quality of knowledge graphs can be assessed by a validation against specified constraints, typically use-case specific and modeled by human users in a manual fashion. Visualizations can improve the modeling process as they are specifically designed for human information processing, possibly leading to more accurate constraints, and in turn high...
Chapter
Full-text available
Social media as infrastructure for public discourse provide valuable information that needs to be preserved. Several tools for social media harvesting exist, but only fragmented workflows can be formed with different combinations of such tools. On top of that, not only social media data but also preservation-related metadata standards are heterogeneou...
Poster
Full-text available
RDF-star was recently proposed as a convenient representation to annotate statements in RDF with metadata by introducing the so-called RDF-star triples, bridging the gap between RDF and property graphs. However, even though there are many solutions to generate RDF graphs, there is no systematic approach so far to generate RDF-star graphs from heter...
Chapter
Full-text available
Digital applications typically describe their privacy policy in lengthy and vague documents (called PrPs), but these are rarely read by users, who remain unaware of privacy risks associated with the use of these digital applications. Users therefore need to become more aware of digital applications’ policies and, thus, more confident about their choice...
Book
Full-text available
This book constitutes the proceedings of the satellite events held at the 18th Extended Semantic Web Conference, ESWC 2021, in June 2021. The conference was held online, due to the COVID-19 pandemic. During ESWC 2021, the following six workshops took place: 1) the Second International Workshop on Deep Learning meets Ontologies and Natural Language...
Chapter
Full-text available
Constructing a knowledge graph with mapping languages, such as RML or SPARQL-Generate, allows seamlessly integrating heterogeneous data through access-specific definitions for, e.g., databases or files. However, such mapping languages have limited support for describing Web APIs and no support for describing data with varying velocities, as need...
Article
Full-text available
Anomalies and faults can be detected, and their causes verified, using both data-driven and knowledge-driven techniques. Data-driven techniques can adapt their internal functioning based on the raw input data but fail to explain the manifestation of any detection. Knowledge-driven techniques inherently deliver the cause of the faults that were dete...
Article
Full-text available
The correct functioning of Semantic Web applications requires that given RDF graphs adhere to an expected shape. This shape depends on the RDF graph and the application’s supported entailments of that graph. During validation, RDF graphs are assessed against sets of constraints, and the violations found help refine the RDF graphs. However, existing v...
Chapter
In this chapter, an overview of the state of the art on knowledge graph generation is provided, with a focus on the two prevalent mapping languages: the W3C-recommended R2RML and its generalisation RML. We look in detail at their differences and explain how knowledge graphs, in the form of RDF graphs, can be generated with each one of the two mapp...
Chapter
Full-text available
Rewarding people is common in several contexts, such as human resource management and crowdsourcing applications. However, designing a reward strategy is not straightforward, as it requires considering different parameters. These parameters include, for example, the management of rewarding tasks and the identification of critical features, such as the type of re...
Chapter
At the end of 2019, Chinese authorities alerted the World Health Organization (WHO) of the outbreak of a new strain of the coronavirus, called SARS-CoV-2, which struck humanity with an unprecedented disaster a few months later. In response to this pandemic, a publicly available dataset was released on Kaggle which contained information on over 63,000...
Chapter
Full-text available
A key source of revenue for the media and entertainment domain is ad targeting: serving advertisements to a select set of visitors based on various captured visitor traits. Compared to global media companies such as Google and Facebook that aggregate data from various sources (and the privacy concerns these aggregations bring), local companies onl...
Chapter
Full-text available
This chapter introduces how Knowledge Graphs are generated. The goal is to gain an overview of different approaches that were proposed and find out more details about the current prevalent ones. After reading this chapter, the reader should have an understanding of the different solutions available to generate Knowledge Graphs and should be able to...
Poster
Full-text available
Rewarding people is common in several contexts, such as human resource management and crowdsourcing applications. However, designing a reward strategy is not straightforward, as it requires considering different parameters. These parameters include, for example, the management of rewarding tasks and the identification of critical features, such as the type of re...
Preprint
In this paper, an overview of the state of the art on knowledge graph generation is provided, with a focus on the two prevalent mapping languages: the W3C-recommended R2RML and its generalisation RML. We look in detail at their differences and explain how knowledge graphs, in the form of RDF graphs, can be generated with each one of the two mappin...
Chapter
Nowadays, a website is used to disseminate information about an event (e.g., location, dates, time). In the academic world, it is common to develop a website for an event, such as workshops or conferences. Aligning with the “Web of data”, its dissemination should also happen by publishing the information of the event as a knowledge graph, e.g., via...
Article
Full-text available
Functions are essential building blocks of information retrieval and information management. However, efforts implementing these functions are fragmented: one function has multiple implementations, each within a specific development context. This inhibits reuse: metadata of functions and associated implementations need to be found across various search i...
Chapter
Functions are essential building blocks of any (computer) information system. However, development efforts to implement these functions are fragmented: a function has multiple implementations, each within a specific development context. Manual effort is needed handling various search interfaces and access methods to find the desired function, its m...
Conference Paper
Within ontology engineering, concepts are modeled as classes and relationships, and restrictions as axioms. Reusing ontologies requires assessing whether existing ontologies are suited to an application scenario. Different scenarios not only influence concept modeling, but also the use of different restriction types, such as subclass relationships or di...
Conference Paper
Full-text available
To unlock the value of increasingly available data in high volumes, we need flexible ways to integrate data across different sources. While semantic integration can be provided through RDF generation, current generators insufficiently scale in terms of volume. Generators are limited by memory constraints. Therefore, we developed the RMLStreamer, a...
Article
Full-text available
Knowledge graphs, which contain annotated descriptions of entities and their interrelations, are often generated using rules that apply semantic annotations to certain data sources. (Re)using ontology terms without adhering to the axioms defined by their ontologies results in inconsistencies in these graphs, affecting their quality. Methods and too...
Conference Paper
Decentralised data solutions bring their own sets of capabilities, requirements and issues not necessarily present in centralised solutions. In order to compare the properties of different approaches or tools for management of decentralised data, it is important to have a common evaluation framework. We present a set of dimensions relevant to data...
Chapter
Full-text available
Knowledge graphs are often generated using rules that apply semantic annotations to data sources. Software tools then execute these rules and generate or virtualize the corresponding RDF-based knowledge graph. RML is an extension of the W3C-recommended R2RML language, extending support from relational databases to other data sources, such as data i...
Conference Paper
Full-text available
The process of extracting, structuring, and organizing knowledge requires processing large and originally heterogeneous data sources. Offering existing data as Linked Data increases its shareability, extensibility, and reusability. However, using Linked Data as a means to represent knowledge can be easier said than done. In this tutorial, we elabo...
Conference Paper
Full-text available
Assessing upfront the causes and effects of failures is an important aspect of system manufacturing. Nowadays, these analyses are performed by a large number of experts. To enable semantic unification and easy operationalization of these risk analyses, this paper demonstrates an approach to automatically map the captured information into an ontolog...
Conference Paper
Full-text available
Sensors, inside internet-connected devices, analyse the environment and monitor possible unwanted behaviour or the malfunctioning of the system. Current risk analysis tools, such as Fault Tree Analysis (FTA) and Failure Mode and Effect Analysis (FMEA), provide prior information on these faults together with expert-driven insights into the system. Many peo...
Chapter
Enriching scholarly data with metadata enhances the publications’ meaning. Unfortunately, different publishers of overlapping or complementary scholarly data neglect general-purpose solutions for metadata and instead use their own ad-hoc solutions. This leads to duplicate efforts and entails non-negligible implementation and maintenance costs. In t...
Chapter
Full-text available
Linked Data is often generated based on a set of declarative rules using languages such as R2RML and RML. These languages are built with machine-processability in mind. It is thus not always straightforward for users to define or understand rules written in these languages, preventing them from applying the desired annotations to the data sources....
Chapter
Data management increasingly demands transparency with respect to data processing. Various stakeholders need information tailored to their needs, e.g. data management plans (DMP) for funding agencies or privacy policies for the public. DMPs and privacy policies are just two examples of documents describing aspects of data processing. Dedicated tool...
Article
Full-text available
Visual tools are implemented to help users in defining how to generate Linked Data from raw data. This is possible thanks to mapping languages which enable detaching mapping rules from the implementation that executes them. However, no thorough research has been conducted so far on how to visualize such mapping rules, especially if they become larg...
Chapter
Evaluating federated Linked Data queries requires consulting multiple sources on the Web. Before a client can execute queries, it must discover data sources, and determine which ones are relevant. Federated query execution research focuses on the actual execution, while data source discovery is often marginally discussed, even though it has a strong...
Conference Paper
DBpedia data is largely generated by extracting and parsing the wikitext from the infoboxes of Wikipedia. This generation process is handled by the DBpedia Extraction Framework (DBpedia EF). This framework currently consists of data transformations, a series of custom hard-coded steps which parse the wikitext, and schema transformations, which mo...
Conference Paper
DBpedia EF, the generation framework behind one of the Linked Open Data cloud’s central interlinking hubs, has limitations with regard to the quality, coverage and sustainability of the generated dataset. DBpedia can be further improved on both the schema and the data level. Errors and inconsistencies can be addressed by amending (i) the DBpedia EF; (ii) the d...
Conference Paper
The success of the Semantic Web highly depends on its ingredients. If we want to fully realize the vision of a machine-readable Web, it is crucial that Linked Data are actually useful for machines consuming them. Against this background, it is not surprising that (Linked) Data validation is an ongoing research topic in the community. However, most approa...
Conference Paper
Full-text available
Ontology-Based Data Access systems provide access to non-RDF data using ontologies. These systems require mappings between the non-RDF data and ontologies to facilitate this access. Manually defining such mappings can become a costly process when dealing with large and complex data sources, and/or multiple data sources at the same time. This result...
Conference Paper
Full-text available
The process of extracting, structuring, and organizing knowledge from one or multiple data sources and preparing it for the Semantic Web requires a dedicated class of systems. They enable processing large and originally heterogeneous data sources and capturing new knowledge. Offering existing data as Linked Data increases its shareability, extensib...
Article
Full-text available
While most challenges organized so far in the Semantic Web domain are focused on comparing tools with respect to different criteria such as their features and competencies, or exploiting semantically enriched data, the Semantic Web Evaluation Challenges series, co-located with the ESWC Semantic Web Conference, aims to compare them based on their ou...
Article
Full-text available
While most challenges organized so far in the Semantic Web domain are focused on comparing tools with respect to different criteria such as their features and competencies, or exploiting semantically enriched data, the Semantic Web Evaluation Challenges series, co-located with the ESWC Semantic Web Conference, aims to compare them based on their ou...
Article
Full-text available
While most challenges organized so far in the Semantic Web domain are focused on comparing tools with respect to different criteria such as their features and competencies, or exploiting semantically enriched data, the Semantic Web Evaluation Challenges series, co-located with the ESWC Semantic Web Conference, aims to compare them based on their ou...
Conference Paper
Full-text available
Generating Linked Data based on existing data sources requires the modeling of their information structure. This modeling needs the identification of potential entities, their attributes and the relationships between them and among entities. For databases this identification is not required, because a data schema is always available. However, for o...
Conference Paper
Data has been made reusable and machine-interpretable by publishing it as Linked Data. However, automatic processing of Linked Data is not yet fully achieved, as manual effort is still needed to integrate existing tools and libraries within a certain technology stack. To enable automatic processing, we propose exposing functions and methods as Linked...
Conference Paper
Linked Data generation and publication remain challenging and complicated, in particular for data owners who are not Semantic Web experts or tech-savvy. The situation deteriorates when data from multiple heterogeneous sources, accessed via different interfaces, is integrated, and the Linked Data generation is a long-lasting activity repeated period...
Conference Paper
The root of schema violations for RDF data generated from (semi-)structured data often derives from the mappings, which are repeatedly applied and specify how an RDF dataset is generated. The DBpedia dataset, which derives from Wikipedia infoboxes, is no exception. To mitigate the violations, we proposed in previous work to validate the mappings which...
Article
Evaluating federated Linked Data queries requires consulting multiple sources on the Web. Before a client can execute queries, it must discover data sources, and determine which ones are relevant. Federated query execution research focuses on the actual execution, while data source discovery is often marginally discussed—even though it has a strong...
Conference Paper
Nowadays, the Web has become one of the main sources of biodiversity information. An increasing number of biodiversity research institutions add new specimens and their related information to their biological collections and make this information available on the Web. However, mechanisms which are currently available provide insufficient provenance...
Conference Paper
The Semantic Publishing Challenge aims to involve participants in extracting data from heterogeneous sources on scholarly publications, and produce Linked Data which can be exploited by the community itself. The 2014 edition was the first attempt to organize a challenge to enable the assessment of the quality of scientific output. The 2015 edition...
Conference Paper
Full-text available
Linked Data is in many cases generated from (semi-)structured data. This generation is supported by several tools, a number of which use a mapping language to facilitate the Linked Data generation. However, knowledge of this language and the other technologies used is required to use the tools, limiting their adoption by non-Semantic Web experts. We d...
Conference Paper
Applications built on top of the Semantic Web are emerging as a novel solution in different areas, such as decision making and route planning. However, to connect results of these solutions – i.e., the semantically annotated data – with real-world applications, this semantic data needs to be connected to actionable events. A lot of work has been do...
Conference Paper
Full-text available
Although several tools have been implemented to generate Linked Data from raw data, users still need to be aware of the underlying technologies and Linked Data principles to use them. Mapping languages make it possible to detach the mapping definitions from the implementation that executes them. However, no thorough research has been conducted on how to faci...
Conference Paper
The objective of the Semantic Publishing (SemPub) challenge series is to bootstrap a value chain for scientific data to enable services, such as assessing the quality of scientific output with respect to novel metrics. The key idea was to involve participants in extracting data from heterogeneous resources and producing datasets on scholarly public...
Conference Paper
Provenance and other metadata are essential for determining ownership and trust. Nevertheless, no systematic approaches have so far been introduced in the Linked Data publishing workflow to capture them. Defining such metadata has remained independent of the RDF data generation and publishing. In most cases, metadata is manually defined by the data publishe...
Conference Paper
Handling Big Data is often hampered by integration challenges, especially when speed, efficiency and accuracy are critical. In the Semantic Web, integration is achieved by aligning multiple representations of the same entities appearing within distributed heterogeneous sources, and by annotating data using the same vocabularies. However, mapping different...
Conference Paper
In order to assess the trustworthiness of information on social media, a consumer needs to understand where this information comes from, and which processes were involved in its creation. The entities, agents and activities involved in the creation of a piece of information are referred to as its provenance, which was standardized by W3C PROV. Howe...
Conference Paper
RDF dataset quality assessment is currently performed primarily after data is published. However, there is neither a systematic way to incorporate its results into the dataset nor to incorporate the assessment into the publishing workflow. Adjustments are applied manually, but rarely. Nevertheless, the root of the violations which often derive from the mappings...
Conference Paper
Assessing the trustworthiness of a dataset is of crucial importance on the Web of Data and depends on different factors. In the case of Linked Data derived from (semi-)structured data, the trustworthiness of a dataset can be assessed partly through their mappings. The accuracy with which schema(s) are combined and applied to semantically annotate d...
Conference Paper
Full-text available
Modeling domain knowledge as Linked Data is not straightforward for data publishers, because they are domain experts and not Semantic Web specialists. Most approaches that map data to its RDF representation still require users to have knowledge of the underlying implementations, as the mapping definitions have so far remained tied to their execution....
Conference Paper
Obtaining Linked Data by modeling domain-level knowledge derived from input data is not straightforward for data publishers, especially if they are not Semantic Web experts. Developing user interfaces that support domain experts to semantically annotate their data became feasible, as the mapping rules were abstracted from their execution. However,...
Conference Paper
RDF dataset quality assessment is currently performed primarily after data is published. Incorporating its results, by applying corresponding adjustments to the dataset, happens manually and occurs rarely. In the case of (semi-)structured data (e.g., CSV, XML), the root of the violations often derives from the mappings that specify how the RDF data...
Book
In order to assess the trustworthiness of information on social media, a consumer needs to understand where this information comes from, and which processes were involved in its creation. The entities, agents and activities involved in the creation of a piece of information are referred to as its provenance, which was standardized by W3C PROV. Howe...
Conference Paper
The RDF data model allows the description of domain-level knowledge that is understandable by both humans and machines. RDF data can be derived from different source formats and diverse access points, ranging from databases or files in CSV format to data retrieved from Web APIs in JSON, Web Services in XML or any other speciality formats. To this e...
Article
Full-text available
The Semantic Publishing Challenge series aims at investigating novel approaches for improving scholarly publishing using Linked Data technology. In 2014 we had bootstrapped this effort with a focus on extracting information from non-semantic publications - computer science workshop proceedings volumes and their papers - to assess their quality. The...
Conference Paper
Full-text available
In this paper, we present our solution for the first task of the second edition of the Semantic Publishing Challenge. The task requires extracting and semantically annotating information regarding CEUR-WS workshops, their chairs and conference affiliations, as well as their papers and their authors, from a set of HTML-encoded workshop proceedings v...
Conference Paper
Full-text available
The Semantic Publishing Challenge series aims at investigating novel approaches for improving scholarly publishing using Linked Data technology. In 2014 we had bootstrapped this effort with a focus on extracting information from non-semantic publications – computer science workshop proceedings volumes and their papers – to assess their quality. The...
Conference Paper
Full-text available
The various ways of interacting with social media, web collaboration tools, co-authorship and citation networks for scientific and research purposes remain distinct. In this paper, we propose a solution to align such information. We particularly developed an exploratory visualization of research networks. The result is a scholar centered, multi-per...
Article
Full-text available
The Open Access movement and research management can take a new turn if research information is published as Linked Open Data. With Linked Open Data, the management of research information within and across institutions can be facilitated, the quality of the available data can be improved, and their availability to the publi...
Article
A proposed technique quantifies the semantic interoperability of open government datasets with three metrics calculated using a set of statements that indicate for each pair of identifiers in the system whether or not they represent the same concept.
Conference Paper
Semantically annotating and interlinking Open Data results in Linked Open Data which concisely and unambiguously describes a knowledge domain. However, the uptake of Linked Data depends on its usefulness to non-Semantic Web experts. Failing to support data consumers in understanding the added value of Linked Data and possible exploitation opportun...
Conference Paper
Digital learning content is becoming more and more commonplace. However, the creation of this learning content is usually done in specialized publishing platforms that may lock in the content. This prevents the learning content from being discovered automatically, and hinders its uptake. A more general approach for authoring discoverable learning...
Conference Paper
Web resources can be linked directly to their provenance, as specified in W3C PROV-AQ. On its own, this solution places all responsibility on the resource's publisher, who hopefully maintains and publishes provenance information. In reality, however, most publishers lack incentives to publish the resources' provenance, even if the authors would...
Conference Paper
Full-text available
To inform citizens when they can use government services, governments publish the services' opening hours on their website. If opening hours were published in a machine-interpretable manner, software agents could answer queries about when it is possible to contact a certain service. We introduce an ontology for describing opening h...
Conference Paper
Incorporating structured data in the Linked Data cloud is still complicated, despite the numerous existing tools. In particular, hierarchically structured data (e.g., JSON) are underrepresented, due to their processing complexity. A uniform mapping formalization for data in different formats, which would enable reuse and exchange between tools and ap...
Conference Paper
Despite the significant number of existing tools, incorporating data into the Linked Open Data cloud remains complicated, hence discouraging data owners from publishing their data as Linked Data. Unlocking the semantics of published data, even if they are not provided by the data owners, can contribute to surpassing the barriers posed by the low availabili...
Conference Paper
The missing feedback loop is considered the reason for broken Data Cycles in current Linked Open Data ecosystems. Read-Write platforms are proposed, but they are restricted to capturing modifications after the data is released as Linked Data. Triggering a new iteration, though, results in losing the data consumers' modifications, as a new version of t...
Conference Paper
As the Web evolves into an integrated and interlinked knowledge space thanks to the growing amount of published Linked Open Data, the need emerges for solutions that enable scholars to discover, explore and analyse the underlying research data. Scholars, typically non-expert technology users, lack in-depth understanding of the underlying s...
Conference Paper
Despite the significant number of existing tools, incorporating data from multiple sources and different formats into the Linked Open Data cloud remains complicated. No mapping formalisation exists to define how to map such heterogeneous sources into RDF in an integrated and interoperable fashion. This paper introduces the RML mapping language, a g...
Conference Paper
In recent years, the concept of machine-interpretable annotations -- for example using RDFa -- has been gaining support in the Web community. Websites are increasingly adding these annotations to their content, in order to increase their discoverability and visibility to external agents and services. This paper highlights two problems with current...
