All content in this area was uploaded by Majid Rastegar-Mojarad on Jan 10, 2016
A new method for prioritizing drug repositioning
candidates extracted by literature-based discovery
Majid Rastegar-Mojarad1,2, Ravikumar Komandur Elayavilli1, Dingcheng Li1, Rashmi Prasad2, Hongfang Liu1
1Department of Health Sciences Research, Mayo Clinic, USA
2University of Wisconsin-Milwaukee, Milwaukee, WI, USA
Email: {Mojarad.Majid, komandurelayavilli.ravikumar, Li.Dingcheng, Liu.Hongfang}@mayo.edu
prasadr@uwm.edu
Abstract— Drug repositioning has been a topic of great
attention to researchers and pharmaceutical companies due to its
significant impact on the cost of drug discovery. There are
several approaches to identify potentially novel drug candidates
through repurposing. Literature mining has played a critical role
in mining such information from scientific articles. In this paper,
we used drug-gene and gene-disease semantic predications
extracted from Medline abstracts to generate a list of potential
drug-disease pairs. We further ranked the generated pairs, by
assigning scores based on the predicates that qualify drug-gene
and gene-disease relationships. On comparing the top-ranked
drug-disease pairs against the Comparative Toxicogenomics
Database (CTD), a curated database for drug-disease relations,
we found that a significant percentage of top ranked pairs
appeared in CTD. Co-occurrence of these high-ranked pairs in
Medline abstracts further improves the confidence in our
approach to rank the inferred drug-disease relations higher in
the list. Finally, manual evaluation of the top ten pairs ranked by our
approach revealed that nine of them have some biological
significance based on expert judgment.
Keywords: Drug repositioning; Literature-based discovery;
Semantic Predication
I. INTRODUCTION
Developing a new drug costs between 500 million and 2
billion dollars, takes 10-15 years [1], and has a success rate of
less than 10% [2]. A well-known way to reduce the
risk and cost of developing new drugs is drug repositioning [3],
i.e., finding new targets for drugs that are already available on
the market. Drug repositioning (or drug repurposing) avoids
the bulk of the effort during the early stages of drug
development, resulting in a significant reduction of time and
cost. Drug repositioning alone accounts for approximately 30%
of the drugs and vaccines newly approved by the US Food and
Drug Administration (FDA) in recent years [4]. To identify
new indications for available drugs, several approaches have
been studied using various types of data such as clinical data
[5], genetic data [6], [7], and biomedical literature [8], [9].
Literature mining plays a critical role in identifying the
indirect (hidden) relationships between a drug and its potential
targets, since it is nearly impossible for experts to manually
review the ever-increasing body of scientific literature to
identify hidden relationships. Literature-based discovery
(LBD) [10] is a popular approach for unfolding potential novel
findings from the biomedical literature, and involves the
application of the relation of transitivity to discovering
relations. Specifically, LBD systems relate two entities, with a
common entity to provide the link between them. For example,
in order to generate a list of potential drug-disease relations, the
LBD system may attempt to find a common entity (often
genes) that potentially links them. Once the indirect connections
between the drug-disease pairs are established, it is necessary
to eliminate the false positives and identify only the true relations
(novel discoveries).
Distinguishing novel discoveries from the others is not a
trivial task. Typically, the LBD method consists of two steps:
1) extracting and mining relations from the text, and 2)
eliminating the false positives and identifying only the true
relations. As a final step, however, it is also important to
rigorously validate the candidate relations before
proceeding to laboratory or clinical investigations, since these
are both expensive and time consuming. The
effectiveness of an LBD system, therefore, lies in its rigorous
validation. Most prior studies lack such rigorous validation,
including ranking of the candidates generated
through the LBD process. Though there are a few prior attempts
[12], [13] in this direction, this area remains largely
underexplored.
In this study, we intend to address the critical issue of the
validation of candidate pairs identified through LBD. We
propose and evaluate the effectiveness of two ranking methods
for prioritizing potential drug repositioning candidates
generated by LBD.
The rest of the paper proceeds as follows. First, we discuss
the background and related work in this domain. Here, we
provide an overview of drug repositioning, LBD, and the
resources often used for LBD (the Semantic Medline Database in
particular) before reviewing some of the prior approaches for
ranking the discoveries made by LBD systems. Subsequently, we
describe our approach to ranking the LBD-based drug-disease
pairs. Finally, we present the evaluation, results, and discussion,
and highlight the limitations of the study.
II. BACKGROUND AND RELATED WORKS
Drug repositioning [15] – also known as repurposing or
reprofiling – is the process of discovering new indications for
existing or shelved drugs. It enables researchers to speed up the
process of developing drugs, with lower cost and risk. There
have been many approaches proposed for drug repositioning,
which can be categorized in different ways. Dudley et al. [16]
reviewed computational methods for drug repositioning and
categorized the methods into two classes, drug-based and
disease-based, according to whether a drug or a disease perspective
initiates the discovery. In another review paper [17], Hurle et
al. reviewed computational techniques for the systematic analysis
of transcriptomics, side effects, and genetics data. Wei et al. [8]
categorized drug repositioning methods into literature-based
and ontology-based.
Literature-based discovery (LBD) strives to find novel
connections or correlations between concepts by using
scientific literature. Many LBD studies have been conducted
[10], [18]–[20] in the biomedical domain to generate new
hypotheses that potentially could lead to new discoveries. In
1986, Don Swanson hypothesized that fish oil may have
beneficial effects in patients with Raynaud’s syndrome. He
came up with this hypothesis after reviewing the literature and
observing that (1) Raynaud’s syndrome (A) patients have a blood
viscosity (B) disorder, and (2) fish oil (C) can reduce blood
viscosity. Later, the hypothesis was verified by clinical trials.
Swanson designed software called Arrowsmith [21] that
implemented this model, known as Swanson’s ABC model [10], to
identify more LBDs. LBD deploys general text mining
techniques such as named entity recognition and information
extraction. The subtasks in LBD are named entity recognition
(term recognition), term normalization, information extraction,
association mining, and ranking. One of the essential tasks for
an LBD system is to decide whether two concepts are correlated
(relation extraction). The most commonly used approaches are
co-occurrence analysis [20], association rules [22], TF-IDF,
Z-score, and the mutual information measure [19]. Some
approaches can identify associations between
concepts and terms that do not co-occur with one another in
the biomedical literature [12], [23]–[25]. Another approach to
identifying correlated concepts is to use semantic predications [18].
Hristovski et al. [18] proposed an approach to augment
co-occurrence analysis with semantic predications provided by
two natural language processing systems. Ahlers et al. [13] used
Semantic Medline to propose discovery patterns for the use of
antipsychotic agents in the treatment of cancer. Cohen et al. [12]
proposed an approach named Predication-based Semantic
Indexing to generate discovery patterns.
The Semantic Medline Database (SemMedDB) [26] contains
approximately 70 million semantic predications, which
were extracted by a rule-based system, SemRep [11], from Medline
titles and abstracts. Each semantic predication is a
subject-relation (predicate)-object triple. The subject and object are
concepts from the UMLS Metathesaurus and the predicate is a
relation from the UMLS Semantic Network. There are 30
different types of predicates in SemMedDB, such
as affects, causes, associated with, and treats. Besides these
predicates, there are negated predicates, which indicate a negative
relation between subject and object.
SemMedDB is a relational database and has been used in
several studies to facilitate knowledge discovery [27]–[29].
Workman and Stoddart [27] proposed to use Semantic Medline
as a potential decision support system for point of care.
Previously, our group integrated semantic predications into a
system, called Ask Mayo Expert (AME), to retrieve the most
relevant literature to support the evidence-based clinical
decision making process at point of care [28]. In another study
[29], SemMedDB was used to investigate the significance of
extracting information from multiple sentences, specifically in
the context of drug-disease relation discovery.
The challenging and expensive task after generating LBDs
is validation. The discoveries can be confirmed or rejected
through human judgment, laboratory methods, or clinical
investigations. Validation can be facilitated by ranking
and prioritizing the generated LBDs, which is the last step in an
LBD system. Several studies have proposed
ranking algorithms. Wren proposed an algorithm called
average minimum weight (AMW) [30]. The algorithm
calculates a weight for each discovery (A-C) based on the
strength of the A-B and B-C relations. The strength of each
relation is calculated based on mutual information. The
algorithm considers all possible B concepts that are
related to both A and C in the calculation. Another approach to
ranking the findings was proposed by Yetisgen-Yildiz and Pratt [31],
[32]. They used the number of B concepts that link A to C
as the indication of a strong correlation. The method,
called Linking Term Count with Average Minimum Weight
(LTC-AMW), uses AMW to break ties. Swanson et al. [33]
introduced a measure, called Literature Cohesiveness, that ranks
the discoveries based on MeSH terms in the literature. AMW and
LTC-AMW are generic and can be used in different LBD
systems, but these algorithms do not consider semantics in their
calculation.
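To make the two generic rankers concrete, the following is a minimal Python sketch. It assumes that AMW is the mean, over all linking concepts B, of the smaller of the A-B and B-C weights, which is how the name reads; the weights themselves (mutual information in [30]) are taken as given inputs. The data layout, function names, and toy weights are illustrative assumptions and are not taken from the cited papers.

```python
from statistics import mean


def average_minimum_weight(ab_weights, bc_weights):
    """Mean, over linking concepts B shared by A and C, of min(w(A,B), w(B,C))."""
    shared = set(ab_weights) & set(bc_weights)
    if not shared:
        return 0.0
    return mean(min(ab_weights[b], bc_weights[b]) for b in shared)


def linking_term_count(ab_weights, bc_weights):
    """Number of B concepts that connect A to C."""
    return len(set(ab_weights) & set(bc_weights))


def rank_ltc_amw(candidates):
    """Sort target concepts C by linking term count, breaking ties with AMW.

    candidates maps each C to a pair (ab_weights, bc_weights), where
    ab_weights[b] is the A-B strength and bc_weights[b] the B-C strength.
    """
    def sort_key(c):
        ab, bc = candidates[c]
        return (linking_term_count(ab, bc), average_minimum_weight(ab, bc))

    return sorted(candidates, key=sort_key, reverse=True)


# Toy weights (hypothetical, standing in for mutual-information values).
candidates = {
    "C1": ({"B1": 0.9, "B2": 0.4}, {"B1": 0.5, "B2": 0.8}),
    "C2": ({"B1": 0.9, "B3": 0.6}, {"B1": 0.7, "B3": 0.6}),
    "C3": ({"B1": 0.9}, {"B1": 0.9}),
}
ranking = rank_ltc_amw(candidates)
```

In this toy data C1 and C2 both have two linking terms, so the AMW tie-break decides their order, while C3 ranks last despite its high single weight. Neither measure looks at the predicate that qualifies a relation.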
III. METHOD
Our approach to identify drug-repositioning candidates
from literature was inspired by Swanson’s ABC model [10].
Our premise is that the strength of an inferred correlation between concept A
and concept C depends on the strength of
the two underlying associations (A to B and B to C). In our study, the drug is
concept A, the disease is the target concept C, and the gene serves as the
intermediate concept B linking A and C. We retrieved the
drug-gene and gene-disease semantic predications from
SemMedDB [26] to infer the link between drug and disease.
Consider the following two examples of semantic predications
extracted from SemMedDB:
• Example 1: Strepsils (Drug), INTERACTS WITH,
CA2 (Gene)
• Example 2: CA2 (Gene), AUGMENTS, Chagas
(Disease)
From the pairs in the above examples, the LBD system
generates Strepsils - Chagas as a potential drug-disease pair.
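The pair generation step can be sketched as a join on the shared gene. This is a minimal illustration, assuming predications are available as plain (subject, predicate, object) tuples; the function name and data layout are hypothetical, and the actual SemMedDB retrieval is not shown.

```python
from collections import defaultdict


def generate_drug_disease_pairs(drug_gene, gene_disease):
    """Join drug-gene and gene-disease predications on the shared gene.

    drug_gene: iterable of (drug, predicate, gene) triples.
    gene_disease: iterable of (gene, predicate, disease) triples.
    Returns {(drug, disease): [(gene, dg_predicate, gd_predicate), ...]}.
    """
    by_gene = defaultdict(list)
    for gene, predicate, disease in gene_disease:
        by_gene[gene].append((predicate, disease))

    pairs = defaultdict(list)
    for drug, dg_predicate, gene in drug_gene:
        for gd_predicate, disease in by_gene.get(gene, []):
            pairs[(drug, disease)].append((gene, dg_predicate, gd_predicate))
    return dict(pairs)


# The two example predications from the text.
drug_gene = [("Strepsils", "INTERACTS WITH", "CA2")]
gene_disease = [("CA2", "AUGMENTS", "Chagas")]
pairs = generate_drug_disease_pairs(drug_gene, gene_disease)
```

Keeping the linking gene and both predicates with each generated pair preserves the information that the ranking step needs later.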
While the ABC model is common to all LBD studies,
using semantic predications allows us 1) to consider the
predicates that qualify the drug-gene and gene-disease
relationships, and 2) to explore a systematic approach to eliminating
potential false positive drug-disease pairs from the candidate list
and providing a meaningful ranking of the final drug-disease
pairs. From the initial list we can use a series of methods to
eliminate erroneous extractions. However, filtering approaches
are not exhaustive and still leave a large list of
drug-disease pairs. It is therefore important to rank these pairs based on
certain parameters, which may help identify a threshold below
which the drug-disease candidates can be discarded.
As a first step, we used two approaches to filter the drug-gene
and gene-disease pairs. 1) Pairs qualified by negated predicates,
such as “did not inhibit”, were not
considered at all. 2) We also did not consider pairs qualified by
predicates such as “co-exists”, which do not semantically
define a relationship between the pairs. For subsequent steps
we relied on the notion that the assertions of NLP extraction
based on semantic predicates also have the potential to
establish biological relatedness between the drug-disease pair.
Hence we used the predicates that qualify the binary
relationships as a feature to rank the final drug-disease
relationships. For example, to rank the above pair, Strepsils –
Chagas, we used the predicate between drug-gene,
“INTERACTS WITH”, and the predicate between gene-disease,
“AUGMENTS”. The semantic predicates of both the
drug-gene and gene-disease pairs play a determining role in
qualifying a drug-disease pair. We attempted to find a
meaningful correlation between the predicates that qualify
drug-gene and gene-disease relationships, which we call
intermediate predicates, and the likelihood of generating
a true drug-disease pair. The predicate is even more
important for determining the relevance of drug-disease relations
given that an individual semantic
predication can occur in more than one document. In order to
assert a relationship that is inferred from two relationships drawn from
multiple documents, we propose that the predicate correlation
between the two relationships is one of the key factors.
Besides, relations and predicates may have a many-to-many
relationship, meaning that more than one predicate can
qualify a relation between two entities. For example, there
is only one citation for the relationship between Strepsils and
CA2, while there are six citations that contain the relationship
between CA2 and Chagas. Of these six, SemRep identified
three as “ASSOCIATED WITH”, two as
“AUGMENTS”, and one as “AFFECTS”. The
actual predicate type also plays a critical role in our ranking
scheme.
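The filtering step and the per-pair predicate tally can be sketched as follows. The predicate spellings are assumptions: the sketch supposes that negated predicates carry a `NEG_` prefix and that the co-existence predicate is spelled `COEXISTS_WITH`, with underscores in place of the spaces used in the text. The last two rows of the toy data are invented to show the filter at work.

```python
from collections import Counter, defaultdict

# Assumed spellings: a "NEG_" prefix marks negation and COEXISTS_WITH is the
# co-existence predicate. Adjust to the SemMedDB release being used.
NEGATION_PREFIX = "NEG_"
UNINFORMATIVE = {"COEXISTS_WITH"}


def is_usable(predicate):
    """Keep a predicate unless it is negated or semantically uninformative."""
    return not predicate.startswith(NEGATION_PREFIX) and predicate not in UNINFORMATIVE


def predicate_counts(predications):
    """Count predicates per (subject, object) pair, one predication per citation."""
    counts = defaultdict(Counter)
    for subject, predicate, obj in predications:
        if is_usable(predicate):
            counts[(subject, obj)][predicate] += 1
    return counts


# Six citations for CA2-Chagas, as in the example above.
gene_disease = (
    [("CA2", "ASSOCIATED_WITH", "Chagas")] * 3
    + [("CA2", "AUGMENTS", "Chagas")] * 2
    + [("CA2", "AFFECTS", "Chagas")]
    # The two rows below are invented; the filter should drop both.
    + [("CA2", "NEG_AFFECTS", "Chagas"), ("CA2", "COEXISTS_WITH", "Chagas")]
)
counts = predicate_counts(gene_disease)
```

The resulting counter for (CA2, Chagas) holds the three predicate types with their citation counts, which is the many-to-many structure that the ranking scores operate on.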
As a final step, we rely on curated resources to further
refine our ranking approach. While NLP assertions do play a
role in identifying biologically related drug-disease pairs, it is
also pertinent to take advantage of existing curated
resources to further filter out irrelevant pairs. Numerous
resources, such as UMLS and the Comparative
Toxicogenomics Database (CTD) [34], catalog drug-disease
relationships. In this study we used UMLS as the gold
standard to evaluate the effect of intermediate predicates in
generating a true drug-disease pair. To identify already known
drug-disease relations in our generated list, we cross-referenced
the generated list of drug-disease pairs with UMLS drug-disease
relations. To limit our study to drug-repositioning
candidates, we only considered drug-disease relations in
UMLS whose type is “May_be_treated_by”. As
SemMedDB stores the Concept Unique Identifier (CUI) assigned
by UMLS to each biomedical entity, we used CUIs to
cross-reference our list with UMLS.
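Because both resources share CUIs, the cross-referencing reduces to set operations on identifier pairs. The sketch below assumes both lists have already been normalized to (drug CUI, disease CUI) tuples; the identifiers are placeholders, not real CUIs, and the function name is hypothetical.

```python
def split_by_reference(generated_pairs, reference_pairs):
    """Split generated (drug CUI, disease CUI) pairs into known and remaining sets."""
    generated = set(generated_pairs)
    reference = set(reference_pairs)
    return generated & reference, generated - reference


# Placeholder identifiers, not real CUIs.
generated = [
    ("CUI_DRUG_A", "CUI_DIS_X"),
    ("CUI_DRUG_A", "CUI_DIS_Y"),
    ("CUI_DRUG_B", "CUI_DIS_X"),
]
# Pairs taken from "May_be_treated_by" relations, stored here as (drug, disease).
umls_may_be_treated_by = {("CUI_DRUG_A", "CUI_DIS_X"), ("CUI_DRUG_C", "CUI_DIS_Z")}

known, remaining = split_by_reference(generated, umls_may_be_treated_by)
```

The `known` set plays the role of the UMLS-confirmed pairs used to estimate predicate frequencies, and `remaining` holds the candidates that still need ranking.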
In this study we explored two different ranking approaches
based on two assumptions. In the first approach we considered
the predicate of drug-gene and gene-disease to be independent
of each other, while in the second we considered the
dependencies between the predicates of the two pairs.
A. Ranking based on predicate independence
In this approach, our fundamental assumption was that
the predicates of the drug-gene pair and the gene-disease pair are
independent of each other when estimating their relevance in
pairing a drug with a disease. In addition to this assumption, we
used the following criteria for scoring the relevance of a
drug-disease pair:
1. The percentage frequency of the individual predicates in
drug-gene (PpDG)1 and gene-disease (PpGD) relations
was one major criterion for determining the relevance
of the drug-disease pair. We further refined this notion
by considering only those predicates whose drug-disease
pair is represented in the UMLS drug-disease
relations as a “May_be_treated_by” relation. We
denote the refined versions by (PpDG-U)2 and (PpGD-U).
One issue with choosing UMLS-based
validation is the possible lag in the curation of
drug-disease relations in UMLS, which could
eliminate many potentially relevant
predicates for identifying the right drug-disease pairs.
2. As an additional step to normalize the
percentages, we also determined the respective
percentage frequency of the drug-gene (PpDG-S) and
gene-disease (PpGD-S)3 predicates in the literature-mined
relations in SemMedDB.
3. The raw score for a given drug-disease pair inferred
from the individual pairs (drug-gene (DG) and gene-disease
(GD)) is calculated as per equation 1.
$$\mathrm{Score}(\text{drug},\text{disease}) = \sum_{i=1}^{n} \frac{\log P_{pDG_i\text{-}U}}{\log P_{pDG_i\text{-}S}} + \sum_{j=1}^{m} \frac{\log P_{pGD_j\text{-}U}}{\log P_{pGD_j\text{-}S}} \qquad (1)$$
where n is the number of semantic predications
between the drug and the gene extracted from the literature and m is the
same number for the gene-disease relationship. Figure 1 shows
the steps of calculating the independence scores. For example,
consider the above drug-disease pair (Strepsils - Chagas). To
calculate the score for the pair, we added the ratio of
log scores of the individual predicates as outlined in equation
(1). For this example, we added the score of the only predicate,
“INTERACTS WITH”, that defines the relationship of the
drug-gene pair (Example 1) to the scores of all six predicates
of the gene-disease pair (Example 2). As mentioned before,
more than one predicate may qualify a drug-gene or gene-disease
pair, in which case we sum the ratio of log
scores of all of them. At this point we do not consider the
semantic relatedness of the predicates while calculating their
scores.
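The independence score can be sketched in a few lines of Python. This is one reading of equation (1), assumed here to be a sum of log(P_U) / log(P_S) ratios over every drug-gene and gene-disease predication; the frequencies are expressed as fractions in (0, 1), all numeric values are invented for illustration, and the function name is hypothetical.

```python
import math


def independence_score(dg_predicates, gd_predicates, p_dg_u, p_dg_s, p_gd_u, p_gd_s):
    """Sum of log(P_U) / log(P_S) over every drug-gene and gene-disease predication.

    dg_predicates and gd_predicates list one predicate per predication
    (n and m items). Each p_* dict maps a predicate to its frequency,
    given here as a fraction strictly between 0 and 1.
    """
    def ratio(predicate, p_u, p_s):
        return math.log(p_u[predicate]) / math.log(p_s[predicate])

    dg_part = sum(ratio(p, p_dg_u, p_dg_s) for p in dg_predicates)
    gd_part = sum(ratio(p, p_gd_u, p_gd_s) for p in gd_predicates)
    return dg_part + gd_part


# Invented frequencies, for illustration only.
p_dg_u = {"INTERACTS WITH": 0.30}
p_dg_s = {"INTERACTS WITH": 0.25}
p_gd_u = {"ASSOCIATED WITH": 0.40, "AUGMENTS": 0.05, "AFFECTS": 0.10}
p_gd_s = {"ASSOCIATED WITH": 0.35, "AUGMENTS": 0.04, "AFFECTS": 0.12}

dg = ["INTERACTS WITH"]                                        # n = 1
gd = ["ASSOCIATED WITH"] * 3 + ["AUGMENTS"] * 2 + ["AFFECTS"]  # m = 6
score = independence_score(dg, gd, p_dg_u, p_dg_s, p_gd_u, p_gd_s)
```

The Strepsils - Chagas example contributes one drug-gene term and six gene-disease terms, so a predicate that appears in several citations is counted once per citation.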
1 The first “P” stands for percentage and the second one stands for predicate
and “DG” stands for “Drug-Gene”.
2 “U” stands for “UMLS”.
3 In this notation, “S” stands for “SemMedDB”.
B. Ranking based on predicate inter-dependence
In the second ranking approach, we treat the predicates of the drug-gene pair and
the gene-disease pair as dependent on each other when estimating
their relevance in pairing a drug with a disease. The
steps are as follows:
1. Instead of the individual percentage frequencies, we
compute the percentage frequency of the combined
predicates between the drug-gene and gene-disease
pairs (PpDG-pGD). We limit this calculation to only those
drug-disease pairs that are represented in the UMLS
drug-disease relations, which we denote by
(PpDG-pGD-U).
Fig. 1. Steps of calculating independence scores
2. Our approach to normalizing the percentages was
very similar to the earlier one, except that we used the
percentage frequency of the combined predicates from
SemMedDB (PpDG-pGD-S), as outlined in the following
equation:
$$P_{pDG\text{-}pGD\text{-}S} = \frac{\#pDG \times \#pGD}{\sum_{i=1}^{n}\sum_{j=1}^{m} \#pDG_i \times \#pGD_j}, \qquad (2)$$
where n and m are the numbers of distinct
predicates between drug-gene and gene-disease,
respectively, $\#pDG$ is the frequency of the drug-gene
predicate in SemMedDB, and $\#pGD$ is the
frequency of the gene-disease predicate.
3. Using the percentage frequencies, we calculated the raw
score for a given generated drug-disease pair as given
in the following equation (3).
$$\mathrm{Score}(\text{drug},\text{disease}) = \sum_{k=1}^{n} \frac{\log P_{pDG\text{-}pGD\text{-}U,k}}{\log P_{pDG\text{-}pGD\text{-}S,k}} \qquad (3)$$
where n is the number of predicate combinations that
generate that drug-disease pair. For example, if there
are 2 different predicates between drug-gene and 3
different predicates between gene-disease, 6 different
combinations can generate the same drug-disease pair.
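The steps above can be sketched as follows. The sketch assumes that equation (2) normalizes the product of the two predicate frequencies by the sum of all such products, and that equation (3) sums log(P_U) / log(P_S) over the predicate combinations; both readings, the function names, the counts, and the P_U values are assumptions made for illustration.

```python
import math
from itertools import product


def combined_frequency(dg_freq, gd_freq):
    """Share of each (drug-gene, gene-disease) predicate combination.

    dg_freq and gd_freq map a predicate to its raw frequency. Each
    combination's product of frequencies is divided by the sum of the
    products over all combinations.
    """
    combos = list(product(dg_freq, gd_freq))
    total = sum(dg_freq[p] * gd_freq[q] for p, q in combos)
    return {(p, q): dg_freq[p] * gd_freq[q] / total for p, q in combos}


def interdependence_score(dg_predicates, gd_predicates, p_u, p_s):
    """Sum log(P_U) / log(P_S) over the distinct predicate combinations."""
    combos = list(product(sorted(set(dg_predicates)), sorted(set(gd_predicates))))
    score = sum(math.log(p_u[c]) / math.log(p_s[c]) for c in combos)
    return score, len(combos)


# Invented raw frequencies: 2 drug-gene and 3 gene-disease predicates.
dg_freq = {"INTERACTS WITH": 600, "INHIBITS": 400}
gd_freq = {"ASSOCIATED WITH": 500, "AUGMENTS": 300, "AFFECTS": 200}
p_s = combined_frequency(dg_freq, gd_freq)

# Invented UMLS-restricted frequencies, kept strictly below 1.
p_u = {combo: min(0.9, 1.2 * value) for combo, value in p_s.items()}

score, n_combos = interdependence_score(dg_freq, gd_freq, p_u, p_s)
```

With 2 drug-gene and 3 gene-disease predicates, `n_combos` is 6, matching the worked example in step 3, and the combined frequencies in `p_s` sum to one.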
C. Validation and evaluation
To validate our ranking methods, we used two resources,
CTD and Medline citations. CTD contains curated drug-disease
relations. We cross-referenced the ranked drug-disease pairs
with CTD and studied whether our ranking scores correlate with
a pair being a true drug-disease pair (i.e., existing in
CTD). We also calculated the percentage of top-ranked
drug-disease pairs that appeared in CTD and compared it with the
same percentage for low-ranked pairs. These results are used to
validate and compare our two methods, predicate independence
and inter-dependence. In the second step, we measured the
correlation between the score assigned to each generated pair
and the co-occurrence of the drug and disease in Medline abstracts. The
logic behind this validation is that drug-disease pairs that co-occur
more often are more likely to be related, so our methods
should assign higher scores to those pairs. To count the
co-occurrences of drug-disease pairs, we indexed all
Medline abstracts with ElasticSearch and searched for the pairs in
titles and abstracts. We then calculated the percentage of top-ranked
pairs that co-occurred in Medline citations and
compared it with that of the low-ranked pairs. As the last step of
validation, we manually reviewed the top 10 ranked drug-disease
pairs and investigated the type of their relationship.
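A document-level co-occurrence search of this kind can be expressed in the Elasticsearch query DSL as a `bool` query with two `multi_match` phrase clauses. The sketch below only builds the request body; the field names `title` and `abstract` are assumptions about how the index was laid out, and the paper does not give the actual query.

```python
def cooccurrence_query(drug, disease, fields=("title", "abstract")):
    """Build a query body matching documents that mention both terms as phrases.

    Each term must appear as a phrase in at least one of the given fields;
    the field names are assumptions about the index mapping.
    """
    def phrase(term):
        return {"multi_match": {"query": term, "type": "phrase", "fields": list(fields)}}

    return {"query": {"bool": {"must": [phrase(drug), phrase(disease)]}}}


body = cooccurrence_query("Nifedipine", "Heart failure")
```

Sending such a body to a count or search request would return the number of citations in which both names occur, which is the co-occurrence count used in the validation.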
IV. RESULTS
All drug-gene and gene-disease semantic predications were
retrieved from SemMedDB. There were 19,993 drug-gene
pairs (12,666 unique) and 59,945 gene-disease pairs (33,489
unique). When we applied Swanson’s model to these pairs, it
resulted in the generation of 653,108 potential drug-disease
pairs (245,102 unique). All generated drug-disease pairs were
further cross-referenced with UMLS, and 1,204 of the
generated pairs were found in this resource. Using these 1,204
pairs and following our two ranking methods, independence and
inter-dependence, we calculated the percentage frequency
of each drug-gene and gene-disease predicate. From these frequencies,
using eq. 1 and eq. 3, we calculated two scores (one per method)
for each generated potential drug-disease pair.
To validate our ranking methods, we calculated the
correlation between the scores and the number of
co-occurrences of pairs in Medline abstracts. The results showed
that the inter-dependence score is correlated with the co-occurrence
of the drug-disease pairs in Medline abstracts (t-test,
P-value < 2.2e-16). We also calculated the percentage of high- and
low-ranked drug-disease pairs that co-occurred in Medline
abstracts. Figure 2 shows this comparison for both
methods. In this figure, the Y-axis shows the percentage of pairs
that co-occurred in Medline and the X-axis shows the number of
top-ranked pairs.
Fig. 2. Comparison of the percentage of high- and low-ranked drug-disease
pairs that co-occurred in Medline abstracts.
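The text reports only a t-test and a P-value for the correlation, without naming the coefficient. One common reading, assumed in the sketch below, is a Pearson correlation tested with the statistic t = r * sqrt((n - 2) / (1 - r^2)). The toy scores and co-occurrence counts are invented, and the P-value lookup against the t distribution is left out.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Invented data: ranking scores and Medline co-occurrence counts for five pairs.
scores = [1.0, 2.0, 3.0, 4.0, 5.0]
cooccurrences = [0, 1, 1, 3, 4]

r = pearson_r(scores, cooccurrences)
t = correlation_t_statistic(r, len(scores))
```

With several hundred thousand generated pairs, even a modest correlation yields a very large t statistic, so a very small P-value by itself says little about the strength of the relationship.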
We ran a similar experiment on the percentage of
high- and low-ranked drug-disease pairs that appeared in CTD.
Figure 3 illustrates the result of this experiment.
Fig. 3. Comparison of the percentage of high- and low-ranked drug-disease
pairs that appeared in CTD.
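The comparison behind Figures 2 and 3 amounts to computing, for a cutoff k, the share of the k highest-scoring and k lowest-scoring pairs that are found in a reference set. A minimal sketch follows, with invented scores and an invented reference set standing in for CTD or the Medline co-occurrence list; the function names are hypothetical.

```python
def hit_rate(pairs, reference):
    """Percentage of pairs that appear in the reference set."""
    if not pairs:
        return 0.0
    return 100.0 * sum(pair in reference for pair in pairs) / len(pairs)


def top_vs_bottom(scores, reference, k):
    """Hit rates of the k highest-scoring and the k lowest-scoring pairs."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return hit_rate(ranked[:k], reference), hit_rate(ranked[-k:], reference)


# Invented scores for six generated pairs and an invented reference set.
scores = {
    ("d1", "x"): 9.1,
    ("d2", "x"): 7.4,
    ("d3", "y"): 6.0,
    ("d4", "y"): 2.2,
    ("d5", "z"): 1.5,
    ("d6", "z"): 0.3,
}
reference = {("d1", "x"), ("d3", "y"), ("d6", "z")}

top, bottom = top_vs_bottom(scores, reference, k=3)
```

A ranking method is doing useful work when the top hit rate is clearly higher than the bottom one across a range of cutoffs.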
Table I shows the results of our manual investigation of
the top ten ranked drug-disease pairs.
TABLE I. TOP TEN RANKED DRUG-DISEASE PAIRS
Drug         | Disease                | Type      | Reference
Omalizumab   | Asthma                 | Treatment | [35]
Nifedipine   | Tetanus                | Treatment | Wikipedia
Nifedipine   | Ischemia               | Treatment | [36]
Omalizumab   | Dermatitis, atopic     | Treatment | [37]
Nifedipine   | Heart failure          | Treatment | [38]
Nifedipine   | Renal tubular disorder | Relation  | [39]
Calan        | Hypertensive disease   | Treatment |
Airol        | Asthma                 | -         |
Ezetimibe    | Coronary heart disease | Treatment | [40]
Cyclosporine | Asthma                 | Treatment | [41]
V. DISCUSSION
In our study we found that drug-disease pairs identified
through LBD and ranked by inter-dependence (especially the
top-ranked pairs) had stronger literature evidence than the pairs
ranked using the independence approach. Figure 2 shows
that 82% of the top 100 drug-disease pairs ranked using the
inter-dependence approach had strong literature evidence. These
pairs were found to co-occur within a single abstract in
Medline. However, there is a noticeable decline in this
percentage beyond the top 100. We observed that pairs
ranked using the independence approach had relatively
lower co-occurrence evidence in the biomedical literature.
We observed a similar trend when we evaluated the
confidence levels of the top-ranked pairs identified using both
approaches against a curated knowledge base, CTD.
Figure 3 further confirms the distinct advantage of the
inter-dependence ranking over the independence ranking. Finally,
manual evaluation of the top ten pairs ranked by the inter-dependence
approach revealed that the pairs have some biological
significance based on expert judgment. This indicates that
the inter-dependence method has a higher chance of identifying
biologically relevant drug-disease pairs. We would also like to draw
attention to the fact that nine of the ten top-ranked
drug-disease pairs are valid relations, most of which belong to the
DRUG-TREATS-DISEASE relationship category.
There are two main limitations in this study. First, we did
not have a gold standard of drug-disease treatment pairs to
evaluate the performance of our approaches. Only an evaluation
against a gold standard will let us accurately
benchmark actual performance. Second, there is an inherent
limitation in both the choice of resource (CTD) and
the measure (literature co-occurrence) used to evaluate
the confidence levels of the top-ranked drug-disease pairs
identified by the system. CTD, though a manually curated
resource, does not annotate the type of relationship between the
drug and disease. Hence, while evaluating our system against
CTD we ignored the semantic predications extracted by the
system, which resulted in a loss of valuable
information. Alternatively, we relied on document-level
co-occurrence in the literature as a measure to validate drug-disease
relationships, but document-level co-occurrence is not
a strong indicator of a valid drug-disease relation. There are
also limitations in our ranking methods. More rigorous
statistical validation may further refine the notion of semantic
predication as evidence for relations between biomedical entities
in LBD. In the future, we plan to create a gold standard of
drug-disease treatment relations to evaluate our methods more
accurately and to compare them with other approaches.
We also intend to improve our methods so that they can determine a
threshold score below which pairs are considered
false positive candidates.
VI. CONCLUSION
In this study, we proposed and evaluated two methods for
ranking and prioritizing potential drug-repositioning
discoveries extracted from literature. We used drug-gene and
gene-disease predications, extracted by SemRep, to generate
potential drug-disease pairs. The predicates between drug-gene
and gene-disease pairs are used to rank the generated
drug-disease pairs. Our results showed that the combination of
drug-gene and gene-disease predicates can serve as a metric that ranks
likely true drug-repositioning candidates higher in the list.
ACKNOWLEDGMENT
This study was made possible by National Institutes of
Health grant R01 GM102282-01.
REFERENCES
[1] C. P. Adams and V. V. Brantner, “Estimating The Cost Of New Drug
Development: Is It Really $802 Million?,” Health Aff, vol. 25, no. 2,
pp. 420–428, Mar. 2006.
[2] J. Gilbert, P. Henske, and A. Singh, Rebuilding Big Pharma’s
business model. In Vivo, 2003.
[3] E. L. Tobinick, “The value of drug repositioning in the current
pharmaceutical market,” Drug News Perspect., Mar. 2009.
[4] G. Jin and S. T. C. Wong, “Toward better drug repositioning:
prioritizing and integrating existing methods into efficient pipelines,”
Drug Discovery Today, vol. 19, no. 5, pp. 637–644, May 2014.
[5] H. Xu, M. C. Aldrich, Q. Chen, H. Liu, N. B. Peterson, Q. Dai, M.
Levy, A. Shah, X. Han, X. Ruan, M. Jiang, Y. Li, J. S. Julien, J.
Warner, C. Friedman, D. M. Roden, and J. C. Denny, “Validating
drug repurposing signals using electronic health records: a case study
of metformin associated with reduced cancer mortality,” J Am Med
Inform Assoc, Jul. 2014.
[6] P. Sanseau, P. Agarwal, M. R. Barnes, T. Pastinen, J. B. Richards, L.
R. Cardon, and V. Mooser, “Use of genome-wide association studies
for drug repositioning,” Nat Biotech, vol. 30, no. 4, Apr. 2012.
[7] M. Rastegar-Mojarad, Z. Ye, J. M. Kolesar, S. J. Hebbring, and S. M.
Lin, “Opportunities for drug repositioning from phenome-wide
association studies,” Nat Biotech, vol. 33, no. 4, pp. 342–345, Apr.
2015.
[8] C.-P. Wei, K.-A. Chen, and L.-C. Chen, “Mining Biomedical
Literature and Ontologies for Drug Repositioning Discovery,” in
Advances in Knowledge Discovery and Data Mining, V. S. Tseng, T.
B. Ho, Z.-H. Zhou, A. L. P. Chen, and H.-Y. Kao, Eds. Springer
International Publishing, 2014, pp. 373–384.
[9] C. Andronis, A. Sharma, V. Virvilis, S. Deftereos, and A. Persidis,
“Literature mining, ontologies and information visualization for drug
repurposing,” Brief. Bioinformatics, vol. 12, no. 4, Jul. 2011.
[10] D. R. Swanson, “Migraine and magnesium: eleven neglected
connections,” Perspect. Biol. Med., vol. 31, no. 4, pp. 526–557, 1988.
[11] T. C. Rindflesch and M. Fiszman, “The interaction of domain
knowledge and linguistic structure in natural language processing:
interpreting hypernymic propositions in biomedical text,” J Biomed
Inform, vol. 36, no. 6, pp. 462–477, Dec. 2003.
[12] T. Cohen, D. Widdows, R. W. Schvaneveldt, P. Davies, and T. C.
Rindflesch, “Discovering discovery patterns with predication-based
Semantic Indexing,” Journal of Biomedical Informatics, vol. 45, no. 6,
pp. 1049–1065, Dec. 2012.
[13] C. B. Ahlers, D. Hristovski, H. Kilicoglu, and T. C. Rindflesch,
“Using the Literature-Based Discovery Paradigm to Investigate Drug
Mechanisms,” AMIA Annu Symp Proc, vol. 2007, pp. 6–10, 2007.
[14] O. Bodenreider, “The Unified Medical Language System (UMLS):
integrating biomedical terminology,” Nucl. Acids Res., vol. 32, no.
suppl 1, pp. D267–D270, Jan. 2004.
[15] T. T. Ashburn and K. B. Thor, “Drug repositioning: identifying and
developing new uses for existing drugs,” Nat Rev Drug Discov, vol. 3,
no. 8, pp. 673–683, Aug. 2004.
[16] J. T. Dudley, T. Deshpande, and A. J. Butte, “Exploiting drug-disease
relationships for computational drug repositioning,” Brief.
Bioinformatics, vol. 12, no. 4, pp. 303–311, Jul. 2011.
[17] M. R. Hurle, L. Yang, Q. Xie, D. K. Rajpal, P. Sanseau, and P.
Agarwal, “Computational Drug Repositioning: From Data to
Therapeutics,” Clinical Pharmacology & Therapeutics, Apr. 2013.
[18] D. Hristovski, C. Friedman, T. C. Rindflesch, and B. Peterlin,
“Exploiting Semantic Relations for Literature-Based Discovery,”
AMIA Annu Symp Proc, vol. 2006, pp. 349–353, 2006.
[19] M. Yetisgen-Yildiz and W. Pratt, “A new evaluation methodology for
literature-based discovery systems,” J Biomed Inform, vol. 42, no. 4,
pp. 633–643, Aug. 2009.
[20] M. Weeber, H. Klein, L. T. W. de Jong-van den Berg, and R. Vos, “Using
concepts in literature-based discovery: Simulating Swanson’s
Raynaud-fish oil and migraine-magnesium discoveries,” J. Am. Soc.
Inf. Sci. Tech., pp. 548–557, 2001.
[21] N. R. Smalheiser and D. R. Swanson, “Using ARROWSMITH: a
computer-assisted approach to formulating and assessing scientific
hypotheses,” Comput Methods Programs Biomed, vol. 57, no. 3, pp.
149–153, Nov. 1998.
[22] D. Hristovski, B. Peterlin, and S. Dzeroski, “Literature-based
Discovery Support System and Its Application to Disease Gene
Identification,” Proc AMIA Symp, p. 928, 2001.
[23] M. D. Gordon and S. Dumais, “Using Latent Semantic Indexing for
Literature Based Discovery,” J. Am. Soc. Inf. Sci., vol. 49, no. 8, pp.
674–685, Jun. 1998.
[24] R. J. Cole and P. D. Bruza, “A Bare Bones Approach to Literature-
Based Discovery: An Analysis of the Raynaud’s/Fish-Oil and
Migraine-Magnesium Discoveries in Semantic Space,” in Discovery
Science, A. Hoffmann, H. Motoda, and T. Scheffer, Eds. Springer
Berlin Heidelberg, 2005, pp. 84–98.
[25] T. Cohen, R. Schvaneveldt, and D. Widdows, “Reflective Random
Indexing and indirect inference: A scalable method for discovery of
implicit connections,” Journal of Biomedical Informatics, vol. 43, no.
2, pp. 240–256, Apr. 2010.
[26] H. Kilicoglu, D. Shin, M. Fiszman, G. Rosemblat, and T. C.
Rindflesch, “SemMedDB: a PubMed-scale repository of biomedical
semantic predications,” Bioinformatics, vol. 28, Dec. 2012.
[27] M. J. Cairelli, C. M. Miller, M. Fiszman, T. E. Workman, and T. C.
Rindflesch, “Semantic MEDLINE for discovery browsing: using
semantic predications and the literature-based discovery paradigm to
elucidate a mechanism for the obesity paradox,” AMIA Annu Symp
Proc, vol. 2013, pp. 164–173, 2013.
[28] M. Rastegar-Mojarad, D. Li, and H. Liu, “Operationalizing Semantic
Medline for meeting the information needs at point of care,” presented
at the AMIA Clinical Research Informatics Summit, 2015.
[29] M. Rastegar-Mojarad, R. Komandur Elayavilli, D. Li, and H. Liu,
“Assessing the Need of Discourse-Level Analysis in Identifying
Evidences for Drug-Disease Relations in Scientific Literature,”
presented at the Medinfo, 2015.
[30] J. D. Wren, “Extending the mutual information measure to rank
inferred literature relationships,” BMC Bioinformatics, vol. 5, no. 1, p.
145, Oct. 2004.
[31] W. Pratt and M. Yetisgen-Yildiz, “LitLinker: Capturing Connections
Across the Biomedical Literature,” in Proceedings of the 2nd
International Conference on Knowledge Capture, New York, NY,
USA, 2003, pp. 105–112.
[32] M. Yetisgen-Yildiz and W. Pratt, “Using statistical and knowledge-
based approaches for literature-based discovery,” J Biomed Inform,
vol. 39, no. 6, pp. 600–611, Dec. 2006.
[33] D. R. Swanson, N. R. Smalheiser, and V. I. Torvik, “Ranking indirect
connections in literature-based discovery: The role of Medical Subject
Headings (MeSH),” J. Am. Soc. Inf. Sci. Technol.,
vol. 57, pp. 1427–1439, 2006.
[34] A. P. Davis, C. J. Grondin, K. Lennon-Hopkins, C. Saraceni-Richards,
D. Sciaky, B. L. King, T. C. Wiegers, and C. J. Mattingly, “The
Comparative Toxicogenomics Database’s 10th year anniversary:
update 2015,” Nucleic Acids Res., Oct. 2014.
[35] R. C. Strunk and G. R. Bloomberg, “Omalizumab for Asthma,” New
England Journal of Medicine, vol. 354, no. 25, Jun. 2006.
[36] R. A. Kloner, “Nifedipine in Ischemic Heart Disease,” Circulation,
vol. 92, no. 5, pp. 1074–1078, Sep. 1995.
[37] M. C. Fernández-Antón Martínez, V. Leis-Dosil, F. Alfageme-Roldán,
A. Paravisini, S. Sánchez-Ramón, and R. Suárez Fernández,
“Omalizumab for the treatment of atopic dermatitis,” Actas
Dermosifiliogr, vol. 103, no. 7, pp. 624–628, Sep. 2012.
[38] C. V. Leier, T. J. Patrick, J. Hermiller, K. D. Pacht, P. Huss, R. D.
Magorien, and D. V. Unverferth, “Nifedipine in congestive heart
failure: effects on resting and exercise hemodynamics and regional
blood flow,” Am. Heart J., vol. 108, no. 6, pp. 1461–1468, Dec. 1984.
[39] J. R. Diamond, J. Y. Cheung, and L. S. Fang, “Nifedipine-induced
renal dysfunction. Alterations in renal hemodynamics,” Am. J. Med.,
vol. 77, no. 5, pp. 905–909, Nov. 1984.
[40] C. M. Rotella, A. Zaninelli, C. Le Grazie, M. E. Hanson, and G. F.
Gensini, “Ezetimibe/simvastatin vs simvastatin in coronary heart
disease patients with or without diabetes,” Lipids Health Dis, vol. 9, p.
80, Jul. 2010.
[41] E. Nizankowska, J. Soja, G. Pinis, G. Bochenek, K. Sładek, B.
Domagała, A. Pajak, and A. Szczeklik, “Treatment of steroid-
dependent bronchial asthma with cyclosporin,” Eur. Respir. J., vol. 8,
no. 7, pp. 1091–1099, Jul. 1995.