Overview of knowledge graph usage in drug discovery.

Source publication
Article
Indication expansion aims to find new indications for existing targets in order to accelerate the process of bringing a new drug for a disease to market. The rapid increase in data types and data sources for computational drug discovery has fostered the use of semantic knowledge graphs (KGs) for indication expansion through target-centric appr...

Contexts in source publication

Context 1
... this section, we review the approaches that have resorted to the use of KGs for drug discovery regardless of whether the purpose was drug-disease, gene-disease, or drug-drug interaction prediction. Table 1 shows a comprehensive overview of the reviewed methods. ...
Context 2
... 3 shows that each hop-based method predicts tens of thousands of gene-disease links, a number of associations that is unlikely to be validated by experimental means. Therefore, we assessed the performance of the hop-based approaches for early retrieval by looking at metrics for the top-100 predictions (see Supplementary Table S1). In particular, Precision@100 shows that one-hop with RNA tissue and one-hop without tissue constraints are the best approaches for early recognition. ...
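The early-retrieval metric mentioned in Context 2 can be illustrated with a minimal sketch. It assumes predictions are already sorted by descending score and that a set of known gene-disease links is available; the function name, the toy gene and disease identifiers, and the choice to divide by k even when fewer than k predictions exist are assumptions made for illustration, not details taken from the source publication.

```python
def precision_at_k(ranked_predictions, known_positives, k=100):
    """Fraction of the top-k ranked predictions that are known true links."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_predictions[:k]
    hits = sum(1 for pair in top_k if pair in known_positives)
    # Assumption: divide by k (not by len(top_k)), a common convention.
    return hits / k


# Hypothetical toy data: (gene, disease) pairs sorted by descending score.
ranked = [("G1", "D1"), ("G2", "D1"), ("G3", "D2"), ("G4", "D2")]
truth = {("G1", "D1"), ("G3", "D2")}

print(precision_at_k(ranked, truth, k=2))
```

With k=100, as in the context above, the same function would score only the 100 highest-ranked links, which is the portion of the output most likely to be followed up experimentally.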

Citations

... The current state-of-the-art approaches in KG-based intelligent drug discovery primarily conform to a standard pipeline (as depicted in Figure 5-(b), PAIRENTITY), which entails the acquisition of valid embeddings of nodes and edges in constructed KGs, followed by the prediction of missing edges, corresponding to novel applications [65,122,123,124,117,125,126,127]. This methodology has been demonstrated in various studies, for instance, Zitnik et al. [65] construct a KG of protein-protein interactions, drug-protein target interactions, and polypharmacy side effects, represented as drug-drug interactions, where each side effect is represented as an edge of a distinct type. ...
Preprint
The integration of Artificial Intelligence (AI) into the field of drug discovery has been a growing area of interdisciplinary scientific research. However, conventional AI models are heavily limited in handling complex biomedical structures (such as 2D or 3D protein and molecule structures) and providing interpretations for outputs, which hinders their practical application. As of late, Graph Machine Learning (GML) has gained considerable attention for its exceptional ability to model graph-structured biomedical data and investigate their properties and functional relationships. Despite extensive efforts, GML methods still suffer from several deficiencies, such as the limited ability to handle supervision sparsity and provide interpretability in learning and inference processes, and their ineffectiveness in utilising relevant domain knowledge. In response, recent studies have proposed integrating external biomedical knowledge into the GML pipeline to realise more precise and interpretable drug discovery with limited training instances. However, a systematic definition for this burgeoning research direction is yet to be established. This survey presents a comprehensive overview of long-standing drug discovery principles, provides the foundational concepts and cutting-edge techniques for graph-structured data and knowledge databases, and formally summarises Knowledge-augmented Graph Machine Learning (KaGML) for drug discovery. A thorough review of related KaGML works, collected following a carefully designed search methodology, is organised into four categories following a newly defined taxonomy. To facilitate research in this rapidly emerging field, we also share collected practical resources that are valuable for intelligent drug discovery and provide an in-depth discussion of the potential avenues for future advancements.
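The citing context above describes a pipeline that first obtains embeddings for nodes and edges and then predicts missing edges. The sketch below illustrates only the scoring and ranking step, using a TransE-style distance as a stand-in; the source does not say which scoring function the cited works use, and the entity names, relation name, and hand-set 2-D vectors are hypothetical. A real pipeline would learn the embeddings from the constructed KG.

```python
import math

# Hypothetical, hand-set 2-D embeddings; a real pipeline would learn these.
entity_emb = {
    "drugA": (0.0, 0.0),
    "proteinX": (1.0, 0.0),
    "proteinY": (0.0, 3.0),
}
relation_emb = {"targets": (1.0, 0.0)}


def transe_score(head, relation, tail):
    """Negative Euclidean distance between (head + relation) and tail.

    Higher (closer to zero) means the triple is considered more plausible.
    """
    h, r, t = entity_emb[head], relation_emb[relation], entity_emb[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation, candidates):
    """Sort candidate tail entities from most to least plausible."""
    return sorted(
        candidates,
        key=lambda tail: transe_score(head, relation, tail),
        reverse=True,
    )


print(rank_tails("drugA", "targets", ["proteinY", "proteinX"]))
```

Candidate edges that rank highly but are absent from the KG are the "missing edges" that such a pipeline would propose as novel applications.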
... diseases, targets and drugs) and relations (e.g. disease-target pairs) [30][31][32]. Text-mining analytics and knowledge graph visualization should be integrated seamlessly so that they work best in drug-target discovery applications. ...
Article
Target discovery and identification processes are driven by the increasing amount of biomedical data. The vast numbers of unstructured texts of biomedical publications provide a rich source of knowledge for drug target discovery research and demand the development of specific algorithms or tools to facilitate finding disease genes and proteins. Text mining is a method that can automatically mine helpful information related to drug target discovery from massive biomedical literature. However, there is a substantial lag between biomedical publications and the subsequent abstraction of information extracted by text mining to databases. The knowledge graph is introduced to integrate heterogeneous biomedical data. Here, we describe e-TSN (Target significance and novelty explorer, http://www.lilab-ecust.cn/etsn/), a knowledge visualization web server integrating the largest database of associations between targets and diseases from the full scientific literature by constructing significance and novelty scoring methods based on bibliometric statistics. The platform aims to visualize target-disease knowledge graphs to assist in prioritizing candidate disease-related proteins. Approved drugs and associated bioactivities for each target of interest are also provided to facilitate the visualization of drug-target relationships. In summary, e-TSN is a fast and customizable visualization resource for investigating and analyzing the intricate target-disease networks, which could help researchers understand the mechanisms underlying complex disease phenotypes and improve the efficiency of drug discovery and development, especially during unexpected outbreaks of infectious diseases such as COVID-19.
... Shi et al. [19,18] use pyRDF2Vec to calculate semantic similarity between concepts in several datasets. Gurbuz et al. [8] evaluate many different techniques, including pyRDF2Vec, for explainable target-disease link prediction. Steenwinckel et al. [21] compare their newly proposed technique, INK, to state-of-the-art techniques such as pyRDF2Vec. ...
Preprint
This paper introduces pyRDF2Vec, a Python software package that reimplements the well-known RDF2Vec algorithm along with several of its extensions. By making the algorithm available in the most popular data science language, and by bundling all extensions into a single place, the use of RDF2Vec is simplified for data scientists. The package is released under an MIT license and structured in such a way to foster further research into sampling, walking, and embedding strategies, which are vital components of the RDF2Vec algorithm. Several optimisations have been implemented in pyRDF2Vec that allow for more efficient walk extraction than the original algorithm. Furthermore, best practices in terms of code styling, testing, and documentation were applied such that the package is future-proof as well as to facilitate external contributions.
Chapter
This paper introduces pyRDF2Vec, a Python software package that reimplements the well-known RDF2Vec algorithm along with several of its extensions. By making the algorithm available in the most popular data science language, and by bundling all extensions into a single place, the use of RDF2Vec is simplified for data scientists. The package is released under an MIT license and structured in such a way to foster further research into sampling, walking, and embedding strategies, which are vital components of the RDF2Vec algorithm. Several optimisations have been implemented in pyRDF2Vec that allow for more efficient walk extraction than the original algorithm. Furthermore, best practices in terms of code styling, testing, and documentation were applied such that the package is future-proof as well as to facilitate external contributions.
Keywords: RDF2Vec, walk-based embeddings, open source
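The walk-extraction step that the abstract names as a vital component of RDF2Vec can be sketched in plain Python. This is an illustrative toy under stated assumptions and deliberately does not use the pyRDF2Vec API: the function name, its parameters, and the example triples are hypothetical. In RDF2Vec the extracted walks are then treated as sentences and passed to a word2vec-style model to obtain embeddings; that step is omitted here.

```python
import random


def extract_walks(triples, start, depth, n_walks, seed=0):
    """Sample random walks of up to `depth` hops from `start`.

    `triples` is an iterable of (subject, predicate, object) tuples.
    Each walk alternates entities and predicates, e.g. (s, p, o, p2, o2).
    """
    rng = random.Random(seed)
    # Index outgoing edges by subject for fast lookup.
    out_edges = {}
    for s, p, o in triples:
        out_edges.setdefault(s, []).append((p, o))

    walks = []
    for _ in range(n_walks):
        walk, node = [start], start
        for _ in range(depth):
            edges = out_edges.get(node)
            if not edges:
                break  # dead end: stop this walk early
            predicate, node = rng.choice(edges)
            walk.extend([predicate, node])
        walks.append(tuple(walk))
    return walks


# Hypothetical toy KG with a single path, so every walk is identical.
toy_triples = [
    ("drugA", "targets", "proteinX"),
    ("proteinX", "associated_with", "diseaseZ"),
]
print(extract_walks(toy_triples, "drugA", depth=2, n_walks=3))
```

Sampling and walking strategies, which the abstract highlights as research directions, would replace the uniform `rng.choice` call in this sketch.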