A new method for prioritizing drug repositioning candidates extracted by literature-based discovery

November 2015

DOI:10.1109/BIBM.2015.7359766

Conference: Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference
At: DC, USA

Authors:

Majid Rastegar-Mojarad

Mayo Clinic - Rochester

Ravikumar Komandur Elayavilli

Mayo Foundation for Medical Education and Research

Dingcheng Li

Mayo Clinic - Rochester

Rashmi Prasad

University of Wisconsin - Milwaukee

A new method for prioritizing drug repositioning

candidates extracted by literature-based discovery

Majid Rastegar-Mojarad1,2, Ravikumar Komandur Elayavilli1, Dingcheng Li1, Rashmi Prasad2, Hongfang Liu1

1Department of Health Sciences Research, Mayo Clinic, USA

2University of Wisconsin-Milwaukee, Milwaukee, WI, USA

Email: {Mojarad.Majid, komandurelayavilli.ravikumar, Li.Dingcheng, Liu.Hongfang}@mayo.edu

prasadr@uwm.edu

Abstract— Drug repositioning has been a topic of great

attention to researchers and pharmaceutical companies due to its

significant impact on the cost of drug discovery. There are

several approaches to identify potentially novel drug candidates

through repurposing. Literature mining has played a critical role

in mining such information from scientific articles. In this paper,

we used drug-gene and gene-disease semantic predications

extracted from Medline abstracts to generate a list of potential

drug-disease pairs. We further ranked the generated pairs, by

assigning scores based on the predicates that qualify drug-gene

and gene-disease relationships. On comparing the top-ranked

drug-disease pairs against the Comparative Toxicogenomics

Database (CTD), a curated database for drug-disease relations,

we found that a significant percentage of top ranked pairs

appeared in CTD. Co-occurrence of these high-ranked pairs in

Medline abstracts further improves the confidence in our

approach to rank the inferred drug-disease relations higher in

the list. Finally, manual evaluation of top ten pairs ranked by our

approach revealed that nine of them have some biological

significance based on expert judgment.

Keywords: Drug repositioning; Literature-based discovery;

Semantic predication

I. INTRODUCTION

New drug development costs between 500 million and 2

billion dollars, takes 10-15 years [1], and the success rate is

less than 10% [2]. A well-known alternative way to reduce the

risk and cost of developing new drugs is drug repositioning [3],

i.e., finding new targets for drugs that are already available in

the market. Drug repositioning (or drug repurposing) reduces

the bulk of the effort during the early stages of drug

development, resulting in significant reduction of time and

cost. Drug repositioning alone accounts for approximately 30%

of the drugs and vaccines newly approved by the US Food and

Drug Administration (FDA) in recent years [4]. To identify

new indications for available drugs, several approaches have

been studied using various types of data such as clinical data

[5], genetic data [6], [7], and biomedical literature [8], [9].

Literature mining plays a critical role in identifying the

indirect (hidden) relationships between a drug and its potential

targets, since it is nearly impossible for experts to manually

review the ever-increasing body of scientific literature to

identify hidden relationships. Literature-based discovery

(LBD) [10] is a popular approach for unfolding potential novel

findings from the biomedical literature, and involves the

application of the relation of transitivity to discovering

relations. Specifically, LBD systems relate two entities, with a

common entity to provide the link between them. For example,

in order to generate a list of potential drug-disease relations, the

LBD system may attempt to find a common entity (often

genes) that potentially links them. Once the indirect connections

between the drug-disease pairs are established, it is necessary

to eliminate the false positives, and identify only true relations

(novel discoveries).

Distinguishing novel discoveries from the others is not a

trivial task. Typically, the LBD method consists of two steps:

1) extracting and mining relations from the text, and 2)

eliminating the false positives and identifying only the true

relations. As a final step, however, it is also important to

rigorously validate the candidate relations before

proceeding to laboratory or clinical investigations, since these

are both expensive and time consuming. The

effectiveness of an LBD system, therefore, lies in its rigorous

validation. Most prior studies lack such rigorous validation,

including ranking of the candidates generated

through the LBD process. Though there have been a few prior attempts

[12], [13] in this direction, this area remains largely

underexplored.

In this study, we intend to address the critical issue of the

validation of candidate pairs identified through LBD. We

propose and evaluate the effectiveness of two ranking methods

for prioritizing potential drug repositioning candidates

generated by LBD.

The rest of the paper proceeds as follows. First, we discuss

the background and related work in this domain. Here, we

provide an overview of drug repositioning and LBD and the

resources often used for LBD (the Semantic Medline Database in

particular) before we review some of the prior approaches for

ranking the discoveries by LBD systems. Subsequently, we

discuss our approach to rank the LBD-based drug-disease

pairs. Finally, we present the evaluation, results and discussion,

and highlight the limitations of the study.

II. BACKGROUND AND RELATED WORKS

Drug repositioning [15] – also known as repurposing or

reprofiling – is the process of discovering new indications for

existing or shelved drugs. It enables researchers to speed up the

process of developing drugs, with lower cost and risk. There

have been many approaches proposed for drug repositioning,

which could be categorized in different ways. Dudley et al. [16]

reviewed computational methods for drug repositioning and

categorized the methods into two classes: drug-based and

disease-based, based on whether drug or disease perspective

initiates the discovery. In another review paper [17], Hurle et

al. reviewed computational techniques for systematic analysis

of transcriptomics, side effects, and genetics data. Wei et al. [8]

categorized the drug repositioning methods into literature-

based and ontology-based.

Literature-based discovery (LBD) strives to find novel

connections or correlations between concepts by using

scientific literature. Many LBD studies have been conducted

[10], [18]–[20] in the biomedical domain to generate new

hypotheses that potentially could lead to new discoveries. In

1986, Don Swanson hypothesized that fish oil may have

beneficial effects in patients with Raynaud’s syndrome. He

came up with this hypothesis after reviewing the literature and

observing that (1) Raynaud’s syndrome (A) patients have blood

viscosity (B) disorder, and (2) Fish oil (C) can reduce blood

viscosity. Later, the hypothesis was verified by clinical trials.

Swanson designed a software tool called ArrowSmith [21] that

implements this model, known as Swanson’s ABC model [10], to

identify more such discoveries. LBD deploys general text mining

techniques such as named entity recognition and information

extraction. The subtasks in LBD are named entity recognition

(term recognition), term normalization, information extraction,

association mining, and ranking. One of the essential tasks for

an LBD system is to decide whether two concepts are correlated

(relation extraction). The most commonly used approaches are co-

occurrence analysis [20], association rules [22], TF-IDF, Z-

score, and the mutual information measure [19]. Some

approaches are available to identify associations between

concepts and terms, which do not co-occur with one another in

the biomedical literature [12][23]–[25]. Another approach to

identify correlated concepts is to use semantic predications [18].

Hristovski et al. [18] proposed an approach to augment co-

occurrence analysis with semantic predications provided by

two natural language processing systems. Ahlers et al. [13] used

Semantic Medline to propose discovery patterns for the use of

antipsychotic agents in treatment of cancer. Cohen et al. [12]

proposed an approach named Predication-based Semantic

Indexing to generate discovery patterns.

Semantic Medline Database (SemMedDB) [26] contains

approximately 70 million semantic predications, which were

extracted by a rule-based system, SemRep [11], from Medline

titles and abstracts. Each semantic predication is a subject-

relation (predicate)-object triple. Subject and object are

concepts from the UMLS Metathesaurus and predicate is a

relation from the UMLS Semantic Network. There are 30

different types of predicates in the Semantic Medline database, such

as affects, causes, associated with, and treats. Besides these

predicates, there are negated predicates, which indicate a negative

relation between subject and object.

SemMedDB is a relational database and has been used in

several studies to facilitate knowledge discovery [27]–[29].

Workman and Stoddart [27] proposed to use Semantic Medline

as a potential decision support system for point of care.

Previously, our group integrated semantic predications into a

system, called Ask Mayo Expert (AME), to retrieve the most

relevant literature to support the evidence-based clinical

decision making process at point of care [28]. In another study

[29], SemMedDB was utilized to investigate the significance of

extracting information from multiple sentences specifically in

the context of drug-disease relation discovery.

The challenging and expensive task after generating LBDs

is validation. The discoveries can be confirmed or rejected

through human judgment, laboratory methods, or clinical

investigations. Validation can be facilitated by ranking

and prioritizing the generated LBDs, which is the last step in an

LBD system. Several studies have proposed

ranking algorithms. Wren proposed an algorithm called

average minimum weight (AMW) [30]. The algorithm

calculates a weight for each discovery (A-C) based on the

strength of the A-B and B-C relations. The strength of each

relation is calculated based on mutual information. The

algorithm considers all possible B concepts that are

related to both A and C in the calculation. Another approach to

ranking the findings was proposed by Yetisgen-Yildiz and Pratt [31],

[32]. They used the number of B concepts that link A to C

as the indication of a strong correlation. The method,

called Linking Term Count with Average Minimum Weight

(LTC-AMW), uses AMW to break ties. Swanson et al. [33]

introduced a measure called literature cohesiveness, which ranks

the discoveries based on MeSH terms in the literature. AMW and

LTC-AMW are generic and can be used in different LBD

systems, but these algorithms do not consider semantics in their

calculation.

III. METHOD

Our approach to identify drug-repositioning candidates

from literature was inspired by Swanson’s ABC model [10].

Our premise is that the inferred correlation between concept A

and concept C depends on the strength of

the two underlying associations (A to B and B to C). In our study, the drug is

concept A, the disease is the target concept C, and the gene serves as the

intermediate concept B linking A and C. We retrieved the

drug-gene and gene-disease semantic predications from

SemMedDB [26] to infer the link between drug and disease.

Consider the following two examples of semantic predications

extracted from SemMedDB:

• Example 1: Strepsils (Drug), INTERACTS WITH,

CA2 (Gene)

• Example 2: CA2 (Gene), AUGMENTS, Chagas

(Disease)

From the pairs in the above examples, the LBD system

generates Strepsils - Chagas as a potential drug-disease pair.
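The inference step above amounts to a join on the shared gene. The following is a minimal sketch of that join, not the authors' implementation; the tuple layout, the function name `infer_drug_disease`, and the underscore form of the predicate names are assumptions made for illustration.

```python
from collections import defaultdict

# Toy predications in (subject, predicate, object) form. The two rows mirror
# Examples 1 and 2 in the text; the data layout itself is hypothetical.
drug_gene = [("Strepsils", "INTERACTS_WITH", "CA2")]
gene_disease = [("CA2", "AUGMENTS", "Chagas")]


def infer_drug_disease(drug_gene, gene_disease):
    """Join drug-gene and gene-disease predications on the shared gene (B)."""
    by_gene = defaultdict(list)
    for gene, predicate, disease in gene_disease:
        by_gene[gene].append((predicate, disease))

    candidates = []
    for drug, dg_predicate, gene in drug_gene:
        for gd_predicate, disease in by_gene.get(gene, []):
            # Keep both intermediate predicates: the ranking step needs them.
            candidates.append((drug, dg_predicate, gene, gd_predicate, disease))
    return candidates


pairs = infer_drug_disease(drug_gene, gene_disease)
```

Keeping the intermediate gene and both predicates in each candidate row, rather than only the drug and disease, preserves the information that the ranking relies on.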

While the ABC model is common to all the LBD studies,

using semantic predications allows us 1) to consider the

predicates that qualify the drug-gene and gene-disease

relationships, and 2) to explore a systematic approach to eliminate

potential false positive drug-disease pairs from the candidate list

and provide a meaningful ranking of the final drug-disease

pairs. From the initial list we can use a series of methods to

eliminate erroneous extractions. However, filtering approaches

are not exhaustive and still leave a large list of drug-

disease pairs. It is therefore important that we rank these pairs based on

certain parameters, which may help identify a threshold below

which the drug-disease candidates can be discarded.

As a first step, we used two approaches to filter the drug-

gene and gene-disease pairs. 1) Pairs qualified by negated

predicates, such as “did not inhibit”, were not

considered. 2) We also did not consider pairs qualified by

predicates such as “co-exists”, which do not semantically

define a relationship between the pairs. For subsequent steps

we relied on the notion that the assertions of NLP extraction

based on semantic predicates will also have the potential to

establish biological relatedness between the drug-disease pair.

Hence we used the predicates that qualify the binary

relationships as a feature to rank the final drug-disease

relationships. For example to rank the above pair, Strepsils –

Chagas, we used the predicate between drug-gene,

“INTERACTS WITH”, and the predicate between gene-

disease, “AUGMENTS”. The semantic predicates of both the

drug-gene and gene-disease pair play a determining role in

qualifying a drug-disease pair. We attempted to find a

meaningful correlation between the predicates that qualify the

drug-gene and gene-disease relationships, which we call

intermediate predicates, and the likelihood of generating

a true drug-disease pair. The predicate matters even more for

determining the relevance of drug-disease relations

given that an individual semantic

predication can occur in more than one document. In order to

assert a relationship that is inferred from two relationships drawn from

multiple documents, we propose that the predicate correlation

between the two relationships is one of the key factors.

Besides, the relation and the predicates may have many-to-

many relationships meaning that more than one predicate can

qualify a relation between the two entities. For example, there

is only one citation for the relationship between Strepsils and

CA2, while there are six citations that contain the relationship

between CA2 and Chagas. Out of these six, SemRep identified

three of them as “ASSOCIATED WITH”, two of them as,

“AUGMENTS”, and one as “AFFECTS” relationship. The

actual predicate type also plays a critical role in our ranking

schema.
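The filtering step and the per-pair predicate tally described above can be sketched as follows. This is an illustration only: the `NEG_` prefix for negated predicates, the name `COEXISTS_WITH`, and the helper names are assumptions about how the predications are encoded, and the rows are made up to reproduce the CA2-Chagas counts.

```python
from collections import Counter

# Assumed conventions (not stated in the paper): negated predicates carry a
# "NEG_" prefix and the co-existence predicate is named "COEXISTS_WITH".
EXCLUDED = {"COEXISTS_WITH"}


def keep(predicate):
    """Return True for predicates that survive both filters."""
    return not predicate.startswith("NEG_") and predicate not in EXCLUDED


def predicate_counts(predications):
    """Count citations per predicate for each (subject, object) pair."""
    counts = {}
    for subject, predicate, obj in predications:
        if keep(predicate):
            counts.setdefault((subject, obj), Counter())[predicate] += 1
    return counts


# Hypothetical rows: six citations as in the CA2-Chagas example, plus two
# rows that the filters should drop.
rows = (
    [("CA2", "ASSOCIATED_WITH", "Chagas")] * 3
    + [("CA2", "AUGMENTS", "Chagas")] * 2
    + [("CA2", "AFFECTS", "Chagas")]
    + [("CA2", "NEG_AFFECTS", "Chagas"), ("CA2", "COEXISTS_WITH", "Chagas")]
)
counts = predicate_counts(rows)
```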

As a final step, we rely on curated resources to further

refine our ranking approach. While NLP assertions do play a

role in identifying biologically related drug-disease pairs, it is

quite pertinent to take advantage of the existing curated

resources to further filter the irrelevant pairs. There are

numerous resources such as UMLS and Comparative

Toxicogenomics Database (CTD) [34], which catalog drug-

disease relationships. In this study we used UMLS as the gold

standard to evaluate the effect of intermediate predicates in

generating a true drug-disease pair. To identify already known

drug-disease relations in our generated list, we cross-referenced

the generated list of drug-disease pairs with UMLS drug-

disease relations. To limit our study to drug-repositioning

candidates, we only considered drug-disease relations in

UMLS whose type is “May_be_treated_by”. As

SemMedDB stores the Concept Unique Identifier (CUI) assigned

by UMLS to each biomedical entity, we used CUIs to cross-

reference our list with UMLS.
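Because both resources share CUIs, the cross-referencing reduces to a set intersection. The sketch below assumes the UMLS relations have been exported as dictionaries with `drug_cui`, `disease_cui`, and `type` fields; that layout and the CUIs shown are hypothetical.

```python
# Hypothetical CUIs; real identifiers would come from SemMedDB and UMLS.
generated_pairs = {("C0000001", "C0000002"), ("C0000003", "C0000004")}

# Assumed export format for UMLS relations (field names are illustrative).
umls_relations = [
    {"drug_cui": "C0000001", "disease_cui": "C0000002", "type": "May_be_treated_by"},
    {"drug_cui": "C0000003", "disease_cui": "C0000004", "type": "Some_other_relation"},
]


def known_treatment_pairs(relations):
    """Collect (drug, disease) CUI pairs whose relation type is May_be_treated_by."""
    return {
        (rel["drug_cui"], rel["disease_cui"])
        for rel in relations
        if rel["type"] == "May_be_treated_by"
    }


# Generated pairs that are already known treatments according to UMLS.
known = generated_pairs & known_treatment_pairs(umls_relations)
```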

In this study we explored two different ranking approaches

based on two assumptions. In the first approach we considered

the predicate of drug-gene and gene-disease to be independent

of each other, while in the second we considered the

dependencies between the predicates of the two pairs.

A. Ranking based on predicate independence

In this approach, we made the fundamental assumption that

the predicates of the drug-gene pair and the gene-disease pair are

independent of each other when estimating their relevance in

pairing a drug with a disease. Besides this assumption, we also

used the following criteria for scoring the relevance of a drug-

disease pair:

1. The percentage frequency of the individual predicates in

drug-gene (PpDG)¹ and gene-disease (PpGD) relations

was one major criterion for determining the relevance

of the drug-disease pair. We further refined this notion

by considering only those predicates whose drug-

disease pair is represented in the UMLS drug-disease

relations as a “May_be_treated_by” relation. We

denote the refined version by (PpDG-U)² and (PpGD-U).

One issue with the choice of UMLS-based

validation is the possible lag in the curation of

drug-disease relations in UMLS, which may

eliminate many predicates that are potentially relevant

for identifying the right drug-disease pairs.

2. As an additional step to normalize the

percentages, we also determined the respective

percentage frequency of the drug-gene (PpDG-S) and

gene-disease (PpGD-S)³ predicates in the literature-mined

relations in SemMedDB.

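3. The raw score for a given drug-disease pair inferred

from the individual pair (drug-gene (DG) and gene-

disease (GD)) is calculated as per equation (1).

$$\mathrm{Score} = \sum_{i=1}^{n} \frac{\log P_{pDG\text{-}U}^{(i)}}{\log P_{pDG\text{-}S}^{(i)}} + \sum_{j=1}^{m} \frac{\log P_{pGD\text{-}U}^{(j)}}{\log P_{pGD\text{-}S}^{(j)}} \qquad (1)$$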

where n is the number of semantic predications

between the drug and gene extracted from the literature and m is the

corresponding number for the gene-disease relationship. Figure 1 shows

the steps of calculating the independence scores. For example,

consider the above drug-disease pair (Strepsils - Chagas). In

order to calculate the score for the pair, we added the ratios of

log scores of the individual predicates as outlined in equation

(1). For this example, we added the score of the only predicate,

“INTERACTS WITH”, that defines the relationship for the

drug-gene pair (Example 1) to the scores of all six predicates

for the gene-disease pair (Example 2). As mentioned before,

more than one predicate may qualify a drug-gene/gene-disease

pair, in which case we take the summation of the ratios of log

scores of all of them. At this point we do not consider the

semantic relatedness of the predicates while calculating their

scores.
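As a rough illustration of the independence score, the sketch below sums one ratio of logs per predication, taking the UMLS-restricted frequency in the numerator and the SemMedDB frequency in the denominator. That orientation, the use of fractions strictly between 0 and 1, the function names, and all numeric values are assumptions for illustration; they are not taken from the paper.

```python
import math


def log_ratio(p_umls, p_semmed):
    """Ratio of logs for one predication; both inputs must lie strictly in (0, 1)."""
    return math.log(p_umls) / math.log(p_semmed)


def independence_score(dg_predicates, gd_predicates, dg_table, gd_table):
    """Sum the log ratios over every drug-gene and gene-disease predication.

    dg_table and gd_table map a predicate to a pair
    (frequency among UMLS-confirmed pairs, frequency in SemMedDB).
    """
    dg_part = sum(log_ratio(*dg_table[p]) for p in dg_predicates)
    gd_part = sum(log_ratio(*gd_table[p]) for p in gd_predicates)
    return dg_part + gd_part


# Strepsils - Chagas: one drug-gene predication and six gene-disease predications.
dg_predicates = ["INTERACTS_WITH"]
gd_predicates = ["ASSOCIATED_WITH"] * 3 + ["AUGMENTS"] * 2 + ["AFFECTS"]

# Made-up frequency tables, for illustration only.
dg_table = {"INTERACTS_WITH": (0.20, 0.10)}
gd_table = {
    "ASSOCIATED_WITH": (0.30, 0.40),
    "AUGMENTS": (0.05, 0.02),
    "AFFECTS": (0.10, 0.10),
}

score = independence_score(dg_predicates, gd_predicates, dg_table, gd_table)
```

Under these assumptions, a predicate whose two frequencies are equal contributes exactly 1, so a pair supported only by such predicates scores n + m.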

¹ The first “P” stands for percentage, the second “p” stands for predicate,

and “DG” stands for “drug-gene”.

² “U” stands for “UMLS”.

³ In this notation, “S” stands for “SemMedDB”.

B. Ranking based on predicate inter-dependence

In the second ranking, the predicates of the drug-gene pair and

gene-disease pair are treated as dependent on each other when estimating

their relevance in pairing a drug with a disease. The

steps are as follows:

1. Instead of the individual percentage frequencies, we

compute the percentage frequency of the combined

predicates between the drug-gene and gene-disease

pairs (PpDG-pGD). We limit this calculation to only those

drug-disease pairs that are represented in the UMLS drug-

disease relations, denoted

(PpDG-pGD-U).

Fig. 1. Steps of calculating independence scores

2. Our approach to normalizing the percentages was

very similar to the earlier one, except that we used the

percentage frequency of the combined predicates from

SemMedDB (PpDG-pGD-S), as outlined in the following

equation:

$$P_{pDG\text{-}pGD\text{-}S} = \frac{\#p_{DG} \times \#p_{GD}}{\sum_{i=1}^{n} \sum_{j=1}^{m} \#p_{DG,i} \times \#p_{GD,j}} \qquad (2)$$

where n and m are the numbers of distinct

predicates between drug-gene and gene-disease,

respectively, $\#p_{DG}$ is the frequency of the drug-

gene predicate in SemMedDB, and $\#p_{GD}$ is the

frequency of the gene-disease predicate.

3. Using the percentage frequencies, we calculated the raw

score for a given generated drug-disease pair as given

in equation (3).

$$\mathrm{Score} = \sum_{i=1}^{n} \frac{\log P_{pDG\text{-}pGD\text{-}U}^{(i)}}{\log P_{pDG\text{-}pGD\text{-}S}^{(i)}} \qquad (3)$$

where n is the number of predicate combinations that

generate that drug-disease pair. For example, if there

are 2 different predicates between drug-gene and 3

different predicates between gene-disease, 6 different

combinations can generate the same drug-disease pair.
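The combination step can be sketched as below: each drug-gene predicate is paired with each gene-disease predicate, and the combined frequency is normalized over all combinations. The product form of the normalization, the ratio-of-logs score, the predicate counts, and the function names are all assumptions made for illustration.

```python
import math
from itertools import product


def combined_frequency(dg_freq, gd_freq):
    """Share of each (drug-gene, gene-disease) predicate combination.

    dg_freq and gd_freq map a predicate to its raw frequency. The product
    form used here is an assumed reading of the normalization step.
    """
    total = sum(a * b for a, b in product(dg_freq.values(), gd_freq.values()))
    return {
        (dg, gd): dg_freq[dg] * gd_freq[gd] / total
        for dg, gd in product(dg_freq, gd_freq)
    }


def interdependence_score(combinations, p_umls, p_semmed):
    """Sum the ratio of logs over the combinations that generate one pair."""
    return sum(math.log(p_umls[c]) / math.log(p_semmed[c]) for c in combinations)


# Made-up counts: 2 drug-gene predicates and 3 gene-disease predicates.
dg_freq = {"INTERACTS_WITH": 40, "INHIBITS": 10}
gd_freq = {"ASSOCIATED_WITH": 30, "AUGMENTS": 15, "AFFECTS": 5}

p_semmed = combined_frequency(dg_freq, gd_freq)
combinations = list(p_semmed)  # 2 x 3 = 6 combinations
```

With 2 drug-gene predicates and 3 gene-disease predicates this yields the 6 combinations mentioned in the text, and the normalized frequencies sum to 1.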

C. Validation and evaluation

To validate our ranking methods, we used two resources,

CTD and Medline citations. CTD contains curated drug-disease

relations. We cross-referenced the ranked drug-disease pairs

with CTD and examined whether our

ranking correlates with a pair being a true drug-disease pair (i.e., present in

CTD). We also calculated the percentage of top-ranked drug-

disease pairs that appeared in CTD and compared it with the

same percentage for low-ranked pairs. These results were used to

validate and compare our two methods, predicate independence

and inter-dependence. In the second step, we measured the

correlation between the score assigned to each generated pair

and the co-occurrence of the drug and disease in Medline abstracts. The

logic behind this validation is that drug-disease pairs that co-occur more

often are more likely to be related, so our methods

should assign higher scores to those pairs. In order to count the

co-occurrences of drug-disease pairs, we indexed all

Medline abstracts via ElasticSearch and searched for the pairs in

titles and abstracts. Then the percentage of top-ranked pairs

that co-occurred in Medline citations was calculated and

compared with that of the low-ranked pairs. As the last step of

validation, we manually reviewed the top 10 ranked drug-disease pairs

and investigated the type of their relationship.
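The comparison of top-ranked and low-ranked pairs against a reference resource can be sketched as follows. The function names, the toy scores, and the reference set are hypothetical; the same helper would apply whether the reference is CTD or the set of pairs that co-occur in Medline.

```python
def hit_rate(pairs, reference):
    """Percentage of pairs that appear in the reference set."""
    return 100.0 * sum(pair in reference for pair in pairs) / len(pairs)


def compare_top_vs_bottom(scored_pairs, reference, k):
    """Return the hit rates of the k highest-scored and k lowest-scored pairs."""
    ranked = [
        pair
        for pair, _ in sorted(scored_pairs.items(), key=lambda kv: kv[1], reverse=True)
    ]
    return hit_rate(ranked[:k], reference), hit_rate(ranked[-k:], reference)


# Toy data: scores for four generated pairs and a small reference set.
scored_pairs = {
    ("drug1", "diseaseA"): 5.0,
    ("drug2", "diseaseB"): 4.0,
    ("drug3", "diseaseC"): 1.0,
    ("drug4", "diseaseD"): 0.5,
}
reference = {("drug1", "diseaseA"), ("drug2", "diseaseB"), ("drug3", "diseaseC")}

top_rate, bottom_rate = compare_top_vs_bottom(scored_pairs, reference, k=2)
```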

IV. RESULTS

All drug-gene and gene-disease semantic predications were

retrieved from SemMedDB. There were 19,993 drug-gene

pairs (12,666 unique) and 59,945 gene-disease pairs (33,489

unique). When we applied Swanson’s model to these pairs, it

resulted in the generation of 653,108 potential drug-disease

pairs (245,102 unique). All generated drug-disease pairs were

further cross-referenced with UMLS and 1,204 of the

generated pairs were found in this resource. Using these 1,204

pairs and following our two ranking methods, independence and

inter-dependence, we calculated the percentage frequency for

each drug-gene and gene-disease predicate. From these, using

eq. 1 and eq. 3, a score for each method was calculated

for each generated potential drug-disease pair.

To validate our ranking methods, we calculated the

correlation between the scores and the number of co-

occurrences of the pairs in Medline abstracts. The results showed

that the inter-dependence score is correlated with the co-occurrence

of the drug-disease pairs in Medline abstracts (t-test, p-

value < 2.2e-16). We calculated the percentage of high- and low-

ranked drug-disease pairs that co-occurred in Medline

abstracts. Figure 2 shows this comparison for both

methods. In this figure, the Y-axis shows the percentage of pairs

that co-occurred in Medline and the X-axis shows the number of top-

ranked pairs.

Fig. 2. Comparison of the percentage of high and low ranked drug-disease

pairs co-occurred in Medline abstracts.

We performed a similar experiment on the percentage of

high- and low-ranked drug-disease pairs that appear in CTD.

Figure 3 illustrates the result of this experiment.

Fig. 3. Comparison of the percentage of high and low ranked drug-disease

pairs appeared in CTD.

Table I presents the results of our manual investigation of

the top ten ranked drug-disease pairs.

TABLE I. TOP TEN RANKED DRUG-DISEASE PAIRS

Drug          Disease                  Type       Reference
Omalizumab    Asthma                   Treatment  [35]
Nifedipine    Tetanus                  Treatment  Wikipedia
Nifedipine    Ischemia                 Treatment  [36]
Omalizumab    Dermatitis, atopic       Treatment  [37]
Nifedipine    Heart failure            Treatment  [38]
Nifedipine    Renal tubular disorder   Relation   [39]
Calan         Hypertensive disease     Treatment
Airol         Asthma                   -
Ezetimibe     Coronary heart disease   Treatment  [40]
Cyclosporine  Asthma                   Treatment  [41]

V. DISCUSSION

In our study we found that the drug-disease pairs ranked by

the inter-dependence approach (especially the top-ranked pairs)

had stronger literature evidence than the pairs

ranked using the independence approach. Figure 2 shows

that 82% of the top 100 drug-disease pairs ranked using the inter-

dependence approach had strong literature evidence: these

pairs were found to co-occur within a single abstract in

Medline. However, there is a noticeable decline in this

percentage as we go below the top 100. We observed that pairs

ranked using the independence approach had relatively

lower co-occurrence evidence in the biomedical literature.

We observed a similar trend when we evaluated the

top-ranked pairs identified using both

approaches against a curated knowledge base, CTD.

Figure 3 further confirms the distinct advantage of the inter-

dependence ranking over the independence ranking. Finally,

manual evaluation of the top ten pairs ranked by the inter-dependence

approach revealed that the pairs have some biological

significance based on expert judgment. This indicates that

the inter-dependence method has a higher chance of identifying

biologically relevant drug-disease pairs. We would also like to draw

attention to the fact that nine of the top ten ranked drug-

disease pairs are valid relations, which belong to the DRUG-

TREATS-DISEASE relationship category.

There are two main limitations in this study. First, we did

not have a gold standard of drug-disease treatment pairs to

evaluate the performance of our approaches. Only evaluation

against a gold standard would allow us to accurately

benchmark the actual performance of the system. Second, there is an inherent

limitation either in the choice of resource (CTD)

or in the measure (literature co-occurrence) used to evaluate

the confidence levels of top-ranked drug-disease pairs

identified by the system. CTD, though a manually curated

resource, does not annotate the type of relationship between the

drug and disease. Hence, while evaluating our system against

CTD we ignored the semantic predications extracted by the

system, which would have resulted in loss of valuable

information. Alternatively, we relied on document-level co-

occurrence in the literature as a measure to validate drug-disease

relationships. Document-level co-occurrence is not

a strong indicator of a valid drug-disease relation. There are

also limitations in our ranking methods. More rigorous

statistical validation may further refine the notion of semantic

predication as evidence for relations between biomedical entities

in LBD. In the future, we plan to create a gold standard of drug-

disease treatment relations to evaluate our methods more

accurately and to compare our methods with other approaches.

We also intend to improve our methods so that they can determine a

threshold score below which pairs are considered

false positive candidates.

VI. CONCLUSION

In this study, we proposed and evaluated two methods for

ranking and prioritizing potential drug-repositioning

discoveries extracted from literature. We used drug-gene and

gene-disease predications, extracted by SemRep, to generate

potential drug-disease pairs. The predicates between drug-gene

and gene-disease pairs were used to rank the generated drug-

disease pairs. Our results showed that the combination of drug-

gene and gene-disease predicates can serve as a metric to rank more

likely true drug-repositioning candidates higher in the list.

ACKNOWLEDGMENT

This study was made possible by National Institutes of

Health grant R01 GM102282-01.

REFERENCES

[1] C. P. Adams and V. V. Brantner, “Estimating The Cost Of New Drug

Development: Is It Really $802 Million?,” Health Aff, vol. 25, no. 2,

pp. 420–428, Mar. 2006.

[2] J. Gilbert, P. Henske, and A. Singh, Rebuilding Big Pharma’s

business model. In Vivo, 2003.

[3] E. L. Tobinick, “The value of drug repositioning in the current

pharmaceutical market,” Drug News Perspect., Mar. 2009.

[4] G. Jin and S. T. C. Wong, “Toward better drug repositioning:

prioritizing and integrating existing methods into efficient pipelines,”

Drug Discovery Today, vol. 19, no. 5, pp. 637–644, May 2014.

[5] H. Xu, M. C. Aldrich, Q. Chen, H. Liu, N. B. Peterson, Q. Dai, M.

Levy, A. Shah, X. Han, X. Ruan, M. Jiang, Y. Li, J. S. Julien, J.

Warner, C. Friedman, D. M. Roden, and J. C. Denny, “Validating

drug repurposing signals using electronic health records: a case study

of metformin associated with reduced cancer mortality,” J Am Med

Inform Assoc, Jul. 2014.

[6] P. Sanseau, P. Agarwal, M. R. Barnes, T. Pastinen, J. B. Richards, L.

R. Cardon, and V. Mooser, “Use of genome-wide association studies

for drug repositioning,” Nat Biotech, vol. 30, no. 4, Apr. 2012.

[7] M. Rastegar-Mojarad, Z. Ye, J. M. Kolesar, S. J. Hebbring, and S. M.

Lin, “Opportunities for drug repositioning from phenome-wide

association studies,” Nat Biotech, vol. 33, no. 4, pp. 342–345, Apr.

2015.

[8] C.-P. Wei, K.-A. Chen, and L.-C. Chen, “Mining Biomedical

Literature and Ontologies for Drug Repositioning Discovery,” in

Advances in Knowledge Discovery and Data Mining, V. S. Tseng, T.

B. Ho, Z.-H. Zhou, A. L. P. Chen, and H.-Y. Kao, Eds. Springer

International Publishing, 2014, pp. 373–384.

[9] C. Andronis, A. Sharma, V. Virvilis, S. Deftereos, and A. Persidis,

“Literature mining, ontologies and information visualization for drug

repurposing,” Brief. Bioinformatics, vol. 12, no. 4, Jul. 2011.

[10] D. R. Swanson, “Migraine and magnesium: eleven neglected

connections,” Perspect. Biol. Med., vol. 31, no. 4, pp. 526–557, 1988.

[11] T. C. Rindflesch and M. Fiszman, “The interaction of domain

knowledge and linguistic structure in natural language processing:

interpreting hypernymic propositions in biomedical text,” J Biomed

Inform, vol. 36, no. 6, pp. 462–477, Dec. 2003.

[12] T. Cohen, D. Widdows, R. W. Schvaneveldt, P. Davies, and T. C.

Rindflesch, “Discovering discovery patterns with predication-based

Semantic Indexing,” Journal of Biomedical Informatics, vol. 45, no. 6,

pp. 1049–1065, Dec. 2012.

[13] C. B. Ahlers, D. Hristovski, H. Kilicoglu, and T. C. Rindflesch,

“Using the Literature-Based Discovery Paradigm to Investigate Drug

Mechanisms,” AMIA Annu Symp Proc, vol. 2007, pp. 6–10, 2007.

[14] O. Bodenreider, “The Unified Medical Language System (UMLS):

integrating biomedical terminology,” Nucl. Acids Res., vol. 32, no.

suppl 1, pp. D267–D270, Jan. 2004.

[15] T. T. Ashburn and K. B. Thor, “Drug repositioning: identifying and

developing new uses for existing drugs,” Nat Rev Drug Discov, vol. 3,

no. 8, pp. 673–683, Aug. 2004.

[16] J. T. Dudley, T. Deshpande, and A. J. Butte, “Exploiting drug-disease

relationships for computational drug repositioning,” Brief.

Bioinformatics, vol. 12, no. 4, pp. 303–311, Jul. 2011.

[17] M. R. Hurle, L. Yang, Q. Xie, D. K. Rajpal, P. Sanseau, and P.

Agarwal, “Computational Drug Repositioning: From Data to

Therapeutics,” Clinical Pharmacology & Therapeutics, Apr. 2013.

[18] D. Hristovski, C. Friedman, T. C. Rindflesch, and B. Peterlin,

“Exploiting Semantic Relations for Literature-Based Discovery,”

AMIA Annu Symp Proc, vol. 2006, pp. 349–353, 2006.

[19] M. Yetisgen-Yildiz and W. Pratt, “A new evaluation methodology for

literature-based discovery systems,” J Biomed Inform, vol. 42, no. 4,

pp. 633–643, Aug. 2009.

[20] M. Weeber, H. Klein, L. T. W. de Jong-van den Berg, and R. Vos, “Using

concepts in literature-based discovery: Simulating Swanson’s

Raynaud-fish oil and migraine-magnesium discoveries,” J. Am. Soc.

Inf. Sci. Tech, pp. 548–557, 2001.

[21] N. R. Smalheiser and D. R. Swanson, “Using ARROWSMITH: a

computer-assisted approach to formulating and assessing scientific

hypotheses,” Comput Methods Programs Biomed, vol. 57, no. 3, pp.

149–153, Nov. 1998.

[22] D. Hristovski, B. Peterlin, and S. Dzeroski, “Literature-based

Discovery Support System and Its Application to Disease Gene

Identification,” Proc AMIA Symp, p. 928, 2001.

[23] M. D. Gordon and S. Dumais, “Using Latent Semantic Indexing for

Literature Based Discovery,” J. Am. Soc. Inf. Sci., vol. 49, no. 8, pp.

674–685, Jun. 1998.

[24] R. J. Cole and P. D. Bruza, “A Bare Bones Approach to Literature-

Based Discovery: An Analysis of the Raynaud’s/Fish-Oil and

Migraine-Magnesium Discoveries in Semantic Space,” in Discovery

Science, A. Hoffmann, H. Motoda, and T. Scheffer, Eds. Springer

Berlin Heidelberg, 2005, pp. 84–98.

[25] T. Cohen, R. Schvaneveldt, and D. Widdows, “Reflective Random

Indexing and indirect inference: A scalable method for discovery of

implicit connections,” Journal of Biomedical Informatics, vol. 43, no.

2, pp. 240–256, Apr. 2010.

[26] H. Kilicoglu, D. Shin, M. Fiszman, G. Rosemblat, and T. C.

Rindflesch, “SemMedDB: a PubMed-scale repository of biomedical

semantic predications,” Bioinformatics, vol. 28, Dec. 2012.

[27] M. J. Cairelli, C. M. Miller, M. Fiszman, T. E. Workman, and T. C.

Rindflesch, “Semantic MEDLINE for discovery browsing: using

semantic predications and the literature-based discovery paradigm to

elucidate a mechanism for the obesity paradox,” AMIA Annu Symp

Proc, vol. 2013, pp. 164–173, 2013.

[28] M. Rastegar-Mojarad, D. Li, and H. Liu, “Operationalizing Semantic

Medline for meeting the information needs at point of care,” presented

at the AMIA Clinical Research Informatics Summit, 2015.

[29] M. Rastegar-Mojarad, R. Komandur Elayavilli, D. Li, and H. Liu,

“Assessing the Need of Discourse-Level Analysis in Identifying

Evidences for Drug-Disease Relations in Scientific Literature,”

presented at the Medinfo, 2015.

[30] J. D. Wren, “Extending the mutual information measure to rank

inferred literature relationships,” BMC Bioinformatics, vol. 5, no. 1, p.

145, Oct. 2004.

[31] W. Pratt and M. Yetisgen-Yildiz, “LitLinker: Capturing Connections

Across the Biomedical Literature,” in Proceedings of the 2nd

International Conference on Knowledge Capture, New York, NY,

USA, 2003, pp. 105–112.

[32] M. Yetisgen-Yildiz and W. Pratt, “Using statistical and knowledge-

based approaches for literature-based discovery,” J Biomed Inform,

vol. 39, no. 6, pp. 600–611, Dec. 2006.

[33] D. R. Swanson, N. R. Smalheiser, and V. I. Torvik, “Ranking indirect

connections in literature-based discovery: The role of Medical Subject

Headings (MeSH),” J. Am. Soc. Inf. Sci.

Technol., vol. 57, pp. 1427–1439, 2006.

[34] A. P. Davis, C. J. Grondin, K. Lennon-Hopkins, C. Saraceni-Richards,

D. Sciaky, B. L. King, T. C. Wiegers, and C. J. Mattingly, “The

Comparative Toxicogenomics Database’s 10th year anniversary:

update 2015,” Nucleic Acids Res., Oct. 2014.

[35] R. C. Strunk and G. R. Bloomberg, “Omalizumab for Asthma,” New

England Journal of Medicine, vol. 354, no. 25, Jun. 2006.

[36] R. A. Kloner, “Nifedipine in Ischemic Heart Disease,” Circulation,

vol. 92, no. 5, pp. 1074–1078, Sep. 1995.

[37] M. C. Fernández-Antón Martínez, V. Leis-Dosil, F. Alfageme-Roldán,

A. Paravisini, S. Sánchez-Ramón, and R. Suárez Fernández,

“Omalizumab for the treatment of atopic dermatitis,” Actas

Dermosifiliogr, vol. 103, no. 7, pp. 624–628, Sep. 2012.

[38] C. V. Leier, T. J. Patrick, J. Hermiller, K. D. Pacht, P. Huss, R. D.

Magorien, and D. V. Unverferth, “Nifedipine in congestive heart

failure: effects on resting and exercise hemodynamics and regional

blood flow,” Am. Heart J., vol. 108, no. 6, pp. 1461–1468, Dec. 1984.

[39] J. R. Diamond, J. Y. Cheung, and L. S. Fang, “Nifedipine-induced

renal dysfunction. Alterations in renal hemodynamics,” Am. J. Med.,

vol. 77, no. 5, pp. 905–909, Nov. 1984.

[40] C. M. Rotella, A. Zaninelli, C. Le Grazie, M. E. Hanson, and G. F.

Gensini, “Ezetimibe/simvastatin vs simvastatin in coronary heart

disease patients with or without diabetes,” Lipids Health Dis, vol. 9, p.

80, Jul. 2010.

[41] E. Nizankowska, J. Soja, G. Pinis, G. Bochenek, K. Sładek, B.

Domagała, A. Pajak, and A. Szczeklik, “Treatment of steroid-

dependent bronchial asthma with cyclosporin,” Eur. Respir. J., vol. 8,

no. 7, pp. 1091–1099, Jul. 1995.


