Application of the Shortest Path Algorithm for the Discovery of Breast Cancer-Related Genes

Current Bioinformatics, 2016, 11, 51-58
Lei Chen1,§, ZhiHao Xing2,§, Tao Huang3, Yang Shu4, GuoHua Huang5 and Hai-Peng Li*,6
1College of Information Engineering, Shanghai Maritime University, Shanghai 201306, People’s
Republic of China
2The Key Laboratory of Stem Cell Biology, Institute of Health Sciences, Shanghai Jiaotong University
School of Medicine and Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences,
Shanghai 200025, People’s Republic of China
3Department of Genetics and Genomics Sciences, Icahn School of Medicine at Mount Sinai, New York,
NY 10029, USA
4State Key Laboratory of Medical Genomics, Institute of Health Sciences, Shanghai Jiaotong
University School of Medicine and the Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences,
Shanghai 200025, People’s Republic of China
5Institute of Systems Biology, Shanghai University, Shanghai 200444, People’s Republic of China
6CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy
of Sciences, Shanghai 200031, People’s Republic of China
Abstract: Breast cancer, the most prevalent cancer in women, develops from breast tissue. Its incidence has increased in
recent years due to environmental risk factors. Thus, it is urgent to uncover the mechanism underlying breast cancer to
design effective treatments. Identification of all breast cancer-related genes is one way to help elucidate the underlying
breast cancer mechanism. In this study, a computational method was built and applied to discover new candidate breast
cancer-related genes. Based on the known breast cancer-related genes retrieved from public databases, the shortest path
algorithm was applied to discover new candidate genes in the protein-protein interaction network. Analysis of
the selected genes suggests that some of them have already been reported as breast cancer-related genes in recently
published literature, while others have direct or indirect associations with the initiation and development of breast cancer.
Keywords: Betweenness, breast cancer, disease gene, protein-protein interaction, shortest path algorithm, weighted network.
1. INTRODUCTION
Breast cancer is a malignant tumor that develops from
breast tissue. It contains a cluster of cancer cells that not only
can invade surrounding tissues but also spread to other parts
of the body, such as the regional lymph nodes, lungs, liver,
and bone marrow [1]. Most cases of breast cancer occur in
women; men can also develop the disease, although rarely,
and it accounts for only a small percentage of all male
tumors. In recent years, the incidence of this disease has
increased [2]. Breast cancer, which is the most prevalent
cancer in women, accounts for 22.9% of invasive cancers
and 16% of all cancers in females. In 2008, 458,503 deaths
were caused by breast cancer worldwide, accounting for 13.7%
of cancer deaths in women [3]. The incidence of breast cancer
in developed nations is higher than that in developing
countries. Possible contributory factors include lifestyle and
eating habits [4].
*Address correspondence to this author at the CAS-MPG Partner Institute
for Computational Biology, Shanghai Institutes for Biological Sciences,
Chinese Academy of Sciences, Shanghai 200031, People’s Republic of
China; Tel: 0086-021-54920465; E-mail: lihaipeng@picb.ac.cn
§These authors contributed equally to this work.
Breast cancer is a complicated disease with different
biological features. Although it is difficult to determine
whether an individual could develop breast cancer, several
hereditary and environmental risk factors have been reported
to affect the likelihood of developing the disease; among
them, female gender and older age [5] are the primary risk
factors. Other factors, such as lifestyle, obesity, childbearing
[6], estrogen exposure [7], radiation exposure [8] and genetic
factors, have also been reported to be associated with the
formation and development of breast cancer.
It is believed that breast cancer results from cumulative
genetic damage and genetic alterations, resulting in the
activation of proto-oncogenes and inactivation of tumor
suppressor genes. Genetic factors have been observed in one-
fourth of breast cancer patients [9]. Approximately 5-10% of
breast cancer cases may be due to mutations in high
susceptibility genes, such as BRCA1 and BRCA2 [10].
The BRCA1 and BRCA2 genes encode human tumor suppressor
proteins that help repair damaged DNA and ensure
genome stability. When mutations occur in these
genes, the BRCA proteins cannot function properly. Mutations in
BRCA1 and BRCA2 increase the risk of female breast cancer
and account for approximately 5 to 10 percent of breast
cancers [11]. There are several other genes that are also
associated with breast cancer, such as PTEN, TP53 and
ATM. The PTEN gene is involved in the regulation of cell
growth, apoptosis and metastasis [12,13]. Defective PTEN
protein causes cells to divide in an uncontrolled way and can
put people at a higher risk of cancerous breast tumors.
The TP53 gene encodes the p53 protein, which is
crucial in humans because it regulates the cell cycle and
functions as a tumor suppressor. People who inherit TP53
mutations (a rare condition known as Li-Fraumeni syndrome)
have a higher risk of breast cancer and several
other cancers [14]. The ATM gene also helps repair
damaged DNA: the kinase it encodes phosphorylates several key
proteins that initiate the activation of the DNA damage checkpoint,
resulting in DNA repair or apoptosis [15]. Patients with ATM mutations
have an increased risk of breast cancer [16].
Although breast cancer is one of the most common cancers, its
precise mechanism has not yet been
completely understood. Because breast cancer is a complex
disease and the number of human genes is large, it is
difficult to discover novel breast cancer-related genes by
experiments alone. By contrast, bioinformatics methods,
which have been applied to tackle various disease-related
problems [17-23], provide an alternative way to
efficiently screen large numbers of candidate genes at
the same time. Using certain bioinformatics methods [24,
25], the associations between the selected genes and breast
cancer can be analyzed, thereby measuring the likelihood that
they are novel breast cancer-related genes.
This study applied an existing bioinformatics method,
which has been used to investigate related genes of age-
related macular degeneration [18] and hepatocellular
carcinoma [23], to identify novel breast cancer-related genes
based on known ones. According to certain studies [26-30],
proteins that interact with each other tend to share
similar functions. Thus, a weighted network was constructed
based on the information of protein-protein interactions
retrieved from STRING (Search Tool for the Retrieval of
Interacting Genes/Proteins) [31]. The shortest path algorithm
was applied to this network to identify all of the
shortest paths connecting any two known breast cancer-related
genes, and genes occurring in at least one shortest path
were selected as candidate genes. Next, a randomization test
was used to filter these candidate genes. To analyze the
biological functions of the final selected genes, Gene
Ontology (GO) and KEGG (Kyoto Encyclopedia of
Genes and Genomes) pathway annotations were employed; both have
been widely used for enrichment analysis and for analyzing the
functions of genes [32, 33]. The GO and KEGG pathway
enrichment analysis of the candidate genes indicates that
certain GO terms and KEGG pathways are highly associated
with the development of breast cancer. Further analysis
suggests that some candidate genes have been reported as
breast cancer-related genes in recently
published literature. We hope that our results will help to
uncover the mechanism of breast cancer and provide new
insights into novel therapies.
2. MATERIALS AND METHODS
2.1. Materials
The breast cancer-related genes were collected from the
following two databases: UniProtKB (Protein Knowledgebase,
http://www.uniprot.org/uniprot/, Release 2013_12) [34] and
TSGene Database (Tumor Suppressor Gene Database, http://
bioinfo.mc.vanderbilt.edu/TSGene/cancer_type.cgi) [35]. In
detail, 224 reviewed genes and 130 tumor suppressor genes
that are related to breast cancer were retrieved from
UniProtKB, whereas 154 breast cancer-related genes were
obtained from the TSGene Database. All of the obtained
genes were combined. Finally, we obtained 369 breast
cancer-related genes, which are available in the Online
Supporting Information S1.
2.2. Methods to Identify Novel Genes and the Screen
Process
The method is based on the observation
that proteins that interact with each other tend to share
similar functions [26-30]. Thus, the information of protein-protein
interactions, including direct (physical) and indirect
(functional) interactions, retrieved from STRING [31], was
employed. These interactions are derived from genomics,
high-throughput experiments, (conserved) coexpression and
previous knowledge. In the obtained file, each
protein-protein interaction consists of two proteins
and one score. The score roughly estimates how likely a
given interaction is to describe a functional linkage between two
proteins: the higher the score of the interaction,
the stronger the functional linkage. Formally,
Q(p1,p2) denotes the score of the interaction between
proteins p1 and p2.
The model for the discovery of novel breast cancer-
related genes was built based on the information of protein-
protein interactions. Similar models have been applied for
the identification of related genes of other diseases, such as
age-related macular degeneration [18] and hepatocellular
carcinoma [23]. Here, we provide a brief description of the
method; readers can refer to the studies of Zhang et al.
and Jiang et al. [18, 23] for details.
1. According to the information of protein-protein
interaction retrieved from STRING, a weighted
network was constructed by taking proteins as nodes;
two nodes were adjacent if and only if the
corresponding proteins can interact with each other.
Because the score of each interaction ranged between
150 and 999, each edge with end-nodes v1 and v2 was
assigned a weight defined as 1000-Q(p1,p2), where p1
and p2 are corresponding proteins of nodes v1 and v2.
2. All of the shortest paths in the above constructed
network, which connected any two known breast
cancer-related genes, were obtained by the well-
known shortest path algorithm, Dijkstra’s algorithm
[36], which is integrated into Maple.
3. For each node/gene in the network, the number of
shortest paths that contained the node/gene as an
inner node was counted. This value was called
betweenness in the present study. Genes with
betweenness greater than zero were selected as
candidate genes, which were further examined
in the following steps.
4. To avoid false discoveries, we randomly selected 500
gene sets whose sizes were equal to the size of the
gene set consisting of known breast cancer-related
genes. Next, we calculated the betweenness for each
candidate gene from each of these 500 gene sets.
5. The permutation FDR was calculated for each
candidate gene. It was defined as the number of random
gene sets on which the gene's betweenness was larger
than its betweenness on the known breast cancer-related
gene set, divided by 500. Genes with small permutation
FDRs were selected as significant candidate genes.
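Steps 1 to 3 of the procedure above can be illustrated with a short sketch. It is written in Python with the standard library only, whereas the study used the Dijkstra implementation integrated into Maple. The function names (build_network, dijkstra, betweenness) and the toy interactions are hypothetical and are not taken from the original work. As a simplification, the sketch keeps one shortest path per unordered pair of known genes, whereas the method described above counts all shortest paths.

```python
import heapq
from collections import defaultdict


def build_network(interactions):
    """Step 1: adjacency map with edge weight 1000 - Q(p1, p2)."""
    graph = defaultdict(dict)
    for p1, p2, score in interactions:
        weight = 1000 - score  # a high-confidence interaction gives a short edge
        graph[p1][p2] = weight
        graph[p2][p1] = weight
    return graph


def dijkstra(graph, source):
    """Step 2: distances and predecessors for one shortest path per node."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def betweenness(graph, known_genes):
    """Step 3: count shortest paths that contain each node as an inner node."""
    counts = defaultdict(int)
    seeds = sorted(set(known_genes))
    for i, source in enumerate(seeds):
        dist, prev = dijkstra(graph, source)
        for target in seeds[i + 1:]:
            if target not in dist:
                continue  # the two genes are not connected
            node = prev[target]
            while node != source:  # walk back along the path, skipping both ends
                counts[node] += 1
                node = prev[node]
    return dict(counts)


# Toy data: (protein, protein, STRING-like score); all values are made up.
toy_interactions = [
    ("A", "X", 900), ("X", "B", 900), ("A", "B", 200),
    ("B", "Y", 950), ("Y", "C", 950), ("A", "C", 150),
]
known = ["A", "B", "C"]
scores = betweenness(build_network(toy_interactions), known)
candidates = sorted(g for g, b in scores.items() if b > 0 and g not in known)
print(candidates)  # ['X', 'Y']
```

In this toy network the direct edges A-B and A-C have low scores, so the shortest paths between the known genes run through X and Y, which therefore receive a betweenness greater than zero and become candidates.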
3. RESULTS AND DISCUSSION
3.1. Candidate Genes
According to the method, all of the shortest paths
connecting any two known breast cancer-related genes were
searched in the weighted network constructed by the
information of protein-protein interactions. The betweenness
of each node/gene in the network was computed based on
these paths. The value of a certain gene’s betweenness
indicates the strength of the direct and indirect associations
between the gene and known breast cancer-related genes
[37]. In detail, a high betweenness suggests a strong
association, whereas a low betweenness suggests a weak
association. Thus, we selected the 621 genes whose
betweenness was greater than zero and termed them
candidate genes. These genes are listed in the
Online Supporting Information S2.
At this point, we had obtained 621 candidate genes for breast
cancer. However, some of them may be false discoveries
because they may occupy special positions in the protein-protein
interaction network, resulting in high betweenness
regardless of which gene set is used. It was therefore necessary
to perform the randomization test
described in the fourth and fifth steps. As a result, we
obtained the permutation FDR for each of the 621 candidate
genes and listed these permutation FDRs in Online
Supporting Information S2. To exclude false discoveries, we
selected the 62 genes with permutation FDRs smaller than 0.01
as significant candidate genes. These 62 genes and their
betweennesses are listed in Table 1. In the following
sections, we analyze the likelihood that these genes are
novel breast cancer-related genes.
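The randomization test described in the fourth and fifth steps of Section 2.2 can be illustrated with a short sketch. It is an illustration under stated assumptions, not the implementation used in the study: permutation_fdr and its parameters are hypothetical names, betweenness_fn stands for any function that maps a set of genes to a dictionary of betweenness values, and the pool from which the random gene sets are drawn is an assumption, because the text does not specify it.

```python
import random


def permutation_fdr(candidates, observed, gene_pool, set_size,
                    betweenness_fn, n_sets=500, seed=0):
    """Fraction of random gene sets on which a candidate's betweenness is
    larger than its betweenness on the known disease gene set."""
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    pool = sorted(gene_pool)    # rng.sample needs a sequence
    exceed = {gene: 0 for gene in candidates}
    for _ in range(n_sets):
        random_genes = rng.sample(pool, set_size)
        random_bt = betweenness_fn(random_genes)
        for gene in candidates:
            # strictly "larger than", following the definition in step 5
            if random_bt.get(gene, 0) > observed[gene]:
                exceed[gene] += 1
    return {gene: exceed[gene] / n_sets for gene in candidates}


def toy_betweenness(genes):
    """Hypothetical stand-in for a real betweenness computation: gene HUB
    lies on 10 shortest paths whichever gene set is chosen."""
    return {"HUB": 10}


observed = {"HUB": 4, "G1": 7}  # made-up betweenness on the known gene set
fdr = permutation_fdr(["HUB", "G1"], observed,
                      gene_pool=["a", "b", "c", "d", "e"], set_size=2,
                      betweenness_fn=toy_betweenness)
significant = [gene for gene, value in fdr.items() if value < 0.01]
print(fdr, significant)
```

In this toy case HUB is discarded: its betweenness is high for every random gene set, so its permutation FDR is 1.0. G1 never appears on a path between random genes, so its permutation FDR is 0.0 and it passes the 0.01 threshold.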
3.2. Results from DAVID
As mentioned in Section 3.1, 62 genes were obtained as
significant candidate genes for breast cancer. To analyze the
relationship between them and breast cancer, the functional
annotation tool DAVID (Database for Annotation,
Visualization and Integrated Discovery) [38] was employed
to understand the biological meaning underlying these 62
genes. Specifically, DAVID analyzed the enrichment of the 62 genes
in GO terms and KEGG pathways, thereby inferring the
relationship between the genes and certain biological processes.
This method has been applied to study various disease-related
and gene-related problems [18, 23, 39, 40]. The
analysis results in this study can be found in Online
Supporting Information S3. The next two sections
provide the detailed analysis of the 62 genes on GO terms and
KEGG pathways, respectively.
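As a rough illustration of what an enrichment P-value measures, the sketch below computes a one-sided hypergeometric tail probability in Python. It is a simplified stand-in rather than DAVID's own procedure: DAVID reports a modified Fisher exact P-value (the EASE score), so the numbers would not match DAVID's output exactly. The function name enrichment_p_value and the background sizes in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(hits, list_size, term_size, background_size):
    """P(X >= hits) when list_size genes are drawn without replacement from
    background_size genes, of which term_size carry the annotation."""
    upper = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, i) * comb(background_size - term_size, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Example: 14 of the 62 genes share a term. The term size (1500) and the
# background size (20000) are invented for illustration only.
p = enrichment_p_value(hits=14, list_size=62, term_size=1500,
                       background_size=20000)
print(f"P = {p:.3g}")
```

A small P-value means that the observed overlap between the gene list and the term is unlikely to arise by chance from a random gene list of the same size.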
3.2.1. GO Term Analysis
From Online Supporting Information S3, 122 GO terms
were found to be enriched by the 62 genes. We analyzed and
discussed the top ten GO terms sorted by P-value, comprising
seven biological process (BP) GO terms, two cellular
component (CC) GO terms and one molecular function (MF)
GO term. The P-values of these GO terms were all less than 0.05,
indicating that the 62 significant candidate genes were
significantly enriched in these GO terms. Fig. (1) shows these
ten GO terms and the ‘count’ obtained by DAVID, which is defined
as the number of genes among the 62 genes that shared these GO terms.
The seven BP terms are (I) GO: 0010604 (positive
regulation of macromolecule metabolic process)
(“count”=14); (II) GO: 0010557 (positive regulation of
macromolecule biosynthetic process) (“count”=11); (III)
GO: 0031328 (positive regulation of cellular biosynthetic
process) (“count”=11); (IV) GO: 0009891 (positive
regulation of biosynthetic process) (“count”=11); (V) GO:
0051276 (chromosome organization) (“count”=9); (VI) GO:
0016569 (covalent chromatin modification) (“count”=5); and
(VII) GO: 0045941 (positive regulation of transcription)
(“count”=9).
Fig. (1). The top ten GO terms enriched by 62 genes sorted by P-value. The X-axis represents the GO term ID, while the Y-axis represents
the number of genes among the 62 genes that shared the GO terms.
The macromolecule biosynthetic process is
important in the development of breast cancer. Previous
studies have shown that the biosynthesis of macromolecules,
including carbohydrates, proteins, lipids and nucleic acids,
must be altered to support the rapid growth and survival of
tumor cells [41]. Moreover, up-regulated macromolecule
metabolism was found after treating breast cancer with
chemotherapy, providing mechanistic insights into the
resistance to chemotherapy and revealing the possible role of
the macromolecule biosynthetic process in breast cancer
[42]. The next two terms reflect the importance of the
biosynthetic process in breast cancer. As with
macromolecule biosynthesis, an increased biosynthetic
process can provide rapid ATP generation and increased
biosynthesis of macromolecules such as nucleic acids and
hormones for tumor cells [41]. In cancer cells, biosynthetic
pathways are shifted toward anabolic metabolism to meet
the needs of cell proliferation [43]. Together, these observations
indicate a relationship between biosynthetic pathways and breast
cancer and may help to explore new mechanisms in breast
cancer research.
GO: 0051276 (chromosome organization) and GO:
0016569 (covalent chromatin modification) are related to
chromatin structure. Dynamic chromatin structure underlies
most DNA-based biological processes, including DNA
replication and DNA-damage repair, transcriptional
regulation and chromosome condensation. Dysregulation of
these processes has been related to tumor initiation and
development. The mechanisms of chromatin remodeling
involve covalent histone modifications, DNA methylation
and many other processes [44, 45]. For example, a
relationship between the overexpression of EZH2, a
methyltransferase for H3K27, and tumor progression has
been observed in breast cancer [46]. These findings indicate that
chromatin structure plays a fundamental role and underlies many other
processes in breast cancer.
Table 1. Detailed information of the 62 significant candidate genes.
Ensembl ID        Gene Name   Betweenness   Ensembl ID        Gene Name   Betweenness
ENSP00000227507   CCND1       2697          ENSP00000261349   LRP6        1868
ENSP00000350720   SMARCA4     1523          ENSP00000364133   TGFBR1      1486
ENSP00000262158   SMAD7       1185          ENSP00000257904   CDK4        1060
ENSP00000222005   CDC37       1041          ENSP00000361405   MMP9        883
ENSP00000358918   SUFU        687           ENSP00000361777   SET         684
ENSP00000228916   SCNN1A      684           ENSP00000171887   TNS1        353
ENSP00000343392   XRCC3       343           ENSP00000298767   WAPAL       343
ENSP00000364403   UBR4        343           ENSP00000339299   TRIO        343
ENSP00000370867   TGM3        343           ENSP00000244926   SCGB1D2     343
ENSP00000317039   RMI1        343           ENSP00000347883   PLEKHA7     343
ENSP00000262741   PIK3R3      343           ENSP00000305556   PCBP1       343
ENSP00000217026   MYBL2       343           ENSP00000351997   MAP2K6      343
ENSP00000337354   LIPA        343           ENSP00000307940   EEF2        343
ENSP00000322180   DSCC1       343           ENSP00000415615   CSNK2B      343
ENSP00000339723   CIR1        343           ENSP00000219172   CENPT       343
ENSP00000306522   CAMTA1      343           ENSP00000416797   CAMSAP3     343
ENSP00000352011   CACNA1G     343           ENSP00000307004   APLF        343
ENSP00000357040   VANGL2      343           ENSP00000368350   TPT1        343
ENSP00000356591   SOAT1       343           ENSP00000396439   RING1       343
ENSP00000349959   RICTOR      343           ENSP00000323867   PRKAG1      343
ENSP00000355245   PAX9        343           ENSP00000279259   FAU         343
ENSP00000353344   ETS2        343           ENSP00000247843   YEATS4      343
ENSP00000280362   PTS         343           ENSP00000258180   KIAA0513    343
ENSP00000276927   IFNA1       343           ENSP00000354778   CNTNAP2     343
ENSP00000346566   CKAP5       343           ENSP00000295924   TIPARP      343
ENSP00000323076   NDUFAF3     343           ENSP00000370719   ITSN1       343
ENSP00000347232   BLM         343           ENSP00000322142   ING5        343
ENSP00000334854   ZACN        342           ENSP00000262188   SMARCD3     342
ENSP00000316948   CLK4        341           ENSP00000346240   FBXO2       166
ENSP00000262643   CCNE1       115           ENSP00000282903   PLOD2       6
ENSP00000345785   FZD9        1             ENSP00000362166   MEAF6       1
The inclusion of GO: 0045941 (positive regulation of
transcription) is consistent with expectations. As an important part
of the central dogma, transcription is involved in almost all
cellular processes, and changes in the expression of
many genes have been observed in breast cancer [47, 48].
The two CC terms are (I) GO: 0005654 (nucleoplasm)
(“count”=13); and (II) GO: 0005829 (cytosol) (“count”=14).
These terms suggest that proteins located in the nucleoplasm and
the cytosol may be involved in the development of breast cancer,
providing some clues about how breast cancer occurs.
The only MF term is GO: 0019207 (kinase regulator
activity) (“count”=5). The relationship between GO:
0019207 and breast cancer cell lines has been revealed in
one previous study [49]. In addition, kinase activity
plays an important role in fundamental cellular processes,
and disruption of kinase activity may be related to cancer. In
fact, MAP kinase, as an important signaling molecule, is highly
associated with breast cancer growth and apoptosis [50].
3.2.2. KEGG Pathway Analysis
From Online Supporting Information S3, 13 KEGG
pathways were found to be enriched by these 62 genes. Fig.
(2) shows the 13 KEGG pathways and the ‘count’ obtained
by DAVID that is defined as the number of genes among the
62 genes that shared these pathways.
From Fig. (2), these 13 KEGG pathways are (I) hsa05200
(Pathways in cancer) (“count”=8); (II) hsa05212 (Pancreatic
cancer) (“count”=4); (III) hsa05220 (Chronic myeloid
leukemia) (“count”=4); (IV) hsa04310 (Wnt signaling
pathway) (“count”=5); (V) hsa05210 (Colorectal cancer)
(“count”=4); (VI) hsa05222 (Small cell lung cancer)
(“count”=4); (VII) hsa05219 (Bladder cancer) (“count”=3);
(VIII) hsa05223 (Non-small cell lung cancer) (“count”=3);
(IX) hsa05214 (Glioma) (“count”=3); (X) hsa04115 (p53
signaling pathway) (“count”=3); (XI) hsa05218 (Melanoma)
(“count”=3); (XII) hsa05215 (Prostate cancer) (“count”=3);
(XIII) hsa00100 (Steroid biosynthesis) (“count”=2).
These 62 genes are enriched in many KEGG pathways that are
explicitly cancer-related, including hsa05200 (Pathways in cancer),
hsa05212 (Pancreatic cancer), hsa05220 (Chronic myeloid
leukemia), hsa05210 (Colorectal cancer), hsa05222 (Small
cell lung cancer), hsa05219 (Bladder cancer), hsa05223
(Non-small cell lung cancer), hsa04115 (p53 signaling
pathway), hsa05214 (Glioma), hsa05218 (Melanoma), and
hsa05215 (Prostate cancer), indicating that these genes are
highly associated with tumor initiation and development and
may play similar roles in breast cancer. Aberrant regulation
of the Wnt signaling pathway has been found in breast
cancer [51]. Blocking the Wnt signaling pathway with the
iCRT-3 inhibitor and SOX4 knockdown resulted in the inhibition
of cell proliferation and induced apoptosis in TNBC
(triple-negative breast cancer) [52], suggesting a close
relationship between the Wnt signaling pathway and breast
cancer. Steroid biosynthesis also plays an important role in
breast cancer. For example, prolonged exposure to estrogen,
a steroid, increases the risk of breast cancer, possibly by
facilitating the proliferation of breast cells [53].
Fig. (2). The 13 KEGG pathways enriched by 62 genes. The X-axis represents the KEGG pathway ID, while the Y-axis represents the
number of genes among the 62 genes that shared the KEGG pathways.
3.3. Analysis of Some Significant Candidate Genes
Table 1 lists the 62 significant candidate genes obtained
by our method. Among them, 11 genes had a betweenness
larger than 680, while the betweennesses of the remaining
genes were less than 400. This large gap suggests that these 11
genes are more likely than the others to be genuine breast
cancer-related genes. In addition, four of them (CCND1,
LRP6, TGFBR1 and SMAD7) have been recognized as
breast cancer-related genes according to the most recent
published literature, whereas the other 7 genes are highly
associated with the initiation and development of tumors and
may provide evidence for further research on breast cancer.
The following paragraphs provide the detailed analyses.
CCND1. The betweenness of this gene was 2,697, which
was the highest among the 62 significant candidate genes.
The CCND1 gene encodes the protein cyclin-D1, which
belongs to the highly conserved cyclin family. CCND1
exhibits periodic expression across the cell cycle and can
regulate cyclin-dependent kinases. Different cyclins work
together to control the entire cell cycle. Mutations and
amplification of the CCND1 gene are observed frequently in
many types of tumors and alter the normal cell cycle,
possibly contributing to tumorigenesis [54, 55]. Moreover,
overexpression of cyclin D1 has been found in breast cancer
and may serve as a marker for metastasis in clinical
treatment [56].
LRP6. LRP6 had a betweenness of 1,868, which was the
second highest value. The protein encoded by LRP6 is a member
of the low-density lipoprotein (LDL) receptor family. LDL receptors, which
are located on the cell surface, play an important role in the
endocytosis of lipoprotein. Through its interaction with the
Wnt signaling pathway, LRP6 is involved in the regulation of
proliferation and migration in cancer. Previous research has
shown that Wnt signaling activation caused by
overexpression of LRP6 can contribute to the tumorigenesis
of breast cancer [57].
SMARCA4. SMARCA4, with a betweenness of 1,523
(third highest), encodes a member of the SWI/SNF family. It
is related to several important tumor suppressor proteins, and
mutations in it are found in many cancer cell lines, including
breast cancer cell lines [58].
TGFBR1. TGFBR1 showed the fourth highest
betweenness value (1,486). TGFBR1 and type II TGF-beta
receptors together form a heteromeric complex, which can
transduce the TGF-beta signal from the membrane to the
cytoplasm. The TGFBR1*6A variant has been associated
with a high risk for breast cancer, so a better understanding
of the biological function of TGFBR1 signaling may help to
assess breast cancer risk and prevent breast cancer [59].
SMAD7. The betweenness of SMAD7 was 1,185 (the
fifth highest). In breast tumors, SMAD7 can act as a negative
regulator of TGF-β, and it is reported that the formation of
bone metastases is inhibited by the overexpression of Smad7
[60, 61].
CDK4. The betweenness of this gene was 1,060, which
was the sixth highest value. CDK4 encodes a protein
belonging to the Ser/Thr protein kinase family. CDK4 is
part of a protein kinase complex that controls the G1/S
phase transition [62]. A previous report has shown that the
maintenance of breast cancer requires the presence of CDK4
activity, and clinical therapies targeting CDK4 kinase
activity in cancer may be promising [63].
CDC37. The betweenness of CDC37 was 1,041 (the
seventh highest). CDC37 is a molecular chaperone that can
form a complex with Hsp90 and many protein kinases such
as CDK4. Moreover, CDC37/Hsp90 was reported to
contribute to the stabilization of newly synthesized CDK4
[64].
MMP9. The betweenness of MMP9 was 883, which was
the eighth highest value. MMP9 is a member of the matrix
metalloproteinase (MMP) family, which has been implicated in
many biological processes, including reproduction,
embryonic development and cancer metastasis. MMPs play
diverse roles in breast cancer. Unlike several
MMPs that come from stromal cells, MMP9 is mainly
produced by cancer cells in breast cancer. Additionally, in a
mouse breast cancer model, MMP9 from tumor cells was
shown to be required for invasion and pulmonary
metastasis [65].
SUFU. SUFU received the ninth highest betweenness
value (687). Suppressor of fused (SUFU) acts as a negative
regulator of the Hedgehog signaling pathway by binding to
Gli [66]. The Hedgehog signaling pathway (Hh) functions
primarily in embryogenesis and is otherwise restrained, except in
several processes such as adult tissue repair, and aberrant
reactivation of Hh has been associated with several types of
cancers [67]. Thus, as a negative regulator of this pathway,
SUFU may play a role in breast cancer.
SCNN1A. The betweenness of this gene was 684, which
was the tenth highest value. SCNN1A is involved in the
formation of sodium channels, which control fluid and
electrolyte transport across epithelia.
SET. The betweenness of this gene was 684, which was
the eleventh highest value. The encoded protein SET is a
nucleosome assembly protein that was reported to be an
inhibitor of several tumor suppressor genes [68, 69].
CONCLUSION
The discovery of disease-related genes is an important
research area in biomedicine and genomics. This study
applied an existing computational method to discover new
breast cancer-related genes. The results suggest that this
method is effective for tackling this problem. We hope that some
of the newly discovered genes will be confirmed
experimentally.
CONFLICT OF INTEREST
The authors confirm that this article content has no
conflict of interest.
ACKNOWLEDGEMENTS
This work was supported by the National Science
Foundation of China (61202021, 61373028, 11371008,
61303099), the Innovation Program of Shanghai Municipal
Education Commission (12YZ120) and the Shanghai Educational
Development Foundation (12CG55).
SUPPLEMENTARY MATERIAL
Supplementary material is available on the publisher’s
web site along with the published article.
REFERENCES
[1] Chetty R, Kalan MR. Malignant granular cell tumor of the breast. J
Surg Oncol 1992; 49: 135-7.
[2] Ottini L, Palli D, Rizzo S, et al. Male breast cancer. Crit Rev Oncol
Hematol 2010; 73: 141-55.
[3] Florescu A, Amir E, Bouganim N, Clemons M. Immune therapy
for breast cancer in 2010—hype or hope? Current Oncology 2011;
18: e9-e18.
[4] McPherson K, Steel CM, Dixon JM. ABC of breast diseases.
Breast cancer-epidemiology, risk factors, and genetics. BMJ 2000;
321: 624-8.
[5] Reeder JG, Vogel VG. Breast cancer prevention. Cancer Treat Res
2008; 141: 149-164.
[6] Collaborative Group on Hormonal Factors in Breast Cancer. Breast
cancer and breastfeeding: collaborative reanalysis of individual
data from 47 epidemiological studies in 30 countries, including
50302 women with breast cancer and 96973 women without the
disease. Lancet 2002; 360: 187-95.
[7] Yager JD, Davidson NE. Estrogen carcinogenesis in breast cancer.
N Engl J Med 2006; 354: 270-82.
[8] Haim A, Portnov BA. Light Pollution as a New Risk Factor for
Human Breast and Prostate Cancers. Springer 2013.
[9] Lichtenstein P, Holm NV, Verkasalo PK, et al. Environmental and
heritable factors in the causation of cancer—analyses of cohorts of
twins from Sweden, Denmark, and Finland. N Engl J Med 2000;
343: 78-85.
[10] Chang-Claude J. Inherited genetic susceptibility to breast cancer.
IARC Sci Publ 2001; 154: 177-90.
[11] Campeau PM, Foulkes WD, Tischkowitz MD. Hereditary breast
cancer: new genetic developments, new therapeutic avenues. Hum
Genet 2008; 124: 31-42.
[12] Chu EC, Tarnawski AS. PTEN regulatory functions in tumor
suppression and cell biology. Med Sci Monit 2004; 10: RA235-
241.
[13] Waite KA, Eng C. Protean PTEN: form and function. Am J Hum
Genet 2002; 70: 829-44.
[14] Borresen-Dale AL. TP53 and breast cancer. Hum Mutat 2003; 21:
292-300.
[15] Lee JH, Paull TT. Activation and regulation of ATM kinase
activity in response to DNA double-strand breaks. Oncogene 2007;
26: 7741-8.
[16] Chen J. Ataxia telangiectasia-related protein is involved in the
phosphorylation of BRCA1 following deoxyribonucleic acid
damage. Cancer Res 2000; 60: 5037-9.
[17] Chen L, Lu J, Huang T, et al. Finding candidate drugs for Hepatitis
C based on chemical-chemical and chemical-protein interactions.
PLoS One 2014; 9: e107767.
[18] Zhang J, Jiang M, Yuan F, et al. Identification of age-related
macular degeneration related genes by applying shortest path
algorithm in protein-protein interaction network. Biomed Res Int
2013; 2013: 523415.
[19] Chen L, Zeng WM, Cai YD, Feng KY, Chou KC. Predicting
Anatomical Therapeutic Chemical (ATC) Classification of Drugs
by Integrating Chemical-Chemical Interactions and Similarities.
PLoS One 2012; 7: e35254.
[20] Ashburn TT, Thor KB. Drug repositioning: identifying and
developing new uses for existing drugs. Nat Rev Drug Discov
2004; 3: 673-83.
[21] Chen L, Lu J, Zhang N, Huang T, Cai YD. A hybrid method for
prediction and repositioning of drug Anatomical Therapeutic
Chemical classes. Mol Biosyst 2014; 10: 868-77.
[22] Brouwers L, Iskar M, Zeller G, van Noort V, Bork P. Network
neighbors of drug targets contribute to drug side-effect similarity.
PLoS One 2011; 6: e22187.
[23] Jiang M, Chen Y, Zhang Y, et al. Identification of hepatocellular
carcinoma related genes with k-th shortest paths in a protein-
protein interaction network. Mol BioSyst 2013; 9: 2720-8.
[24] Hornberg JJ, Bruggeman FJ, Westerhoff HV, Lankelma J. Cancer:
a systems biology disease. Biosystems 2006; 83: 81-90.
[25] Chen L, Zeng W-M, Cai YD, Huang T. Prediction of Metabolic
Pathway Using Graph Property, Chemical Functional Group and
Chemical Structural Set. Curr Bioinfo 2013; 8: 200-7.
[26] Hu LL, Huang T, Cai YD, Chou KC. Prediction of Body Fluids
where Proteins are Secreted into Based on Protein Interaction
Network. PLoS One 2011; 6: e22989.
[27] Hu LL, Huang T, Liu XJ, Cai YD. Predicting Protein Phenotypes
Based on Protein-Protein Interaction Network. PLoS One 2011; 6:
e17668.
[28] Hu LL, Huang T, Shi X, et al. Predicting functions of proteins in
mouse based on weighted protein-protein interaction network and
protein hybrid properties. PLoS One 2011; 6: e14556.
[29] Deng M, Zhang K, Mehta S, Chen T, Sun F. Prediction of protein
function using protein-protein interaction data. J Comput Biol
2003; 10: 947-60.
[30] Ng KL, Ciou JS, Huang CH. Prediction of protein functions based
on function-function correlation relations. Comput Biol Med 2010;
40: 300-5.
[31] Jensen LJ, Kuhn M, Stark M, et al. STRING 8-a global view on
proteins and their functional interactions in 630 organisms. Nucleic
Acids Res 2009; 37: D412-6.
[32] Li G, Zhao Y, Wen L, et al. Identification and Characterization of
MicroRNAs in the Spleen of Common Carp Immune Organ. J Cell
Biochem 2014; 115: 1768-78.
[33] Shameer K, Sowdhamini R. Functional repertoire, molecular
pathways and diseases associated with 3D domain swapping in the
human proteome. J Clin Bioinforma 2012; 2: 8.
[34] Uniprot C. Activities at the Universal Protein Resource (UniProt).
Nucleic Acids Res 2014; 42: D191-D8.
[35] Zhao M, Sun J, Zhao Z. TSGene: a web resource for tumor
suppressor genes. Nucleic Acids Res 2013; 41: D970-D6.
[36] Gormen TH, Leiserson CE, Rivest RL, Stein C. Introduction to
algorithms. MIT press Cambridge: MA 1990.
[37] Craven JBM. Markov networks for detecting overlapping elements
in sequence data. The MIT Press: 2005.
[38] Huang da W, Sherman BT, Lempicki RA. Systematic and
integrative analysis of large gene lists using DAVID bioinformatics
resources. Nat Protoc 2009; 4: 44-57.
[39] Zhang J, Xing ZH, Ma M, et al. Gene ontology and KEGG
enrichment analyses of genes related to age-related macular
degeneration. Biomed Res Int 2014; 2014: 450386.
[40] Yang J, Chen L, Kong X, Huang T, Cai YD. Analysis of tumor
suppressor genes based on gene ontology and the KEGG pathway.
PLoS One 2014; 9: e107202.
[41] Cairns RA, Harris IS, Mak TW. Regulation of cancer cell
metabolism. Nat Rev Cancer 2011; 11: 85-95.
[42] Lee SC, Xu X, Lim YW, et al. Chemotherapy-induced tumor gene
expression changes in human breast cancers. Pharmacogenet
Genomics 2009; 19: 181-92.
[43] Fritz V, Fajas L. Metabolism and proliferation share common
regulatory pathways in cancer cells. Oncogene 2010; 29: 4369-77.
[44] Wang GG, Allis CD, Chi P. Chromatin remodeling and cancer, Part
I: Covalent histone modifications. Trends Mol Med 2007; 13: 363-
72.
[45] Wang GG, Allis CD, Chi P. Chromatin remodeling and cancer, Part
II: ATP-dependent chromatin remodeling. Trends Mol Med 2007;
13: 373-80.
[46] Moss TJ, Wallrath LL. Connections between epigenetic gene
silencing and human disease. Mutat Res 2007; 618: 163-74.
[47] Grieb BC, Chen X, Eischen CM. MTBP is Over-expressed in
Triple Negative Breast Cancer and Contributes to its Growth and
Survival. Mol Cancer Res 2014; 12: 1216-24.
[48] Ghanem T, Bracken J, Kasem A, Jiang WG, Mokbel K. mRNA
expression of DOK1-6 in human breast cancer. World J Clin Oncol
2014; 5: 156-63.
[49] Nagashima T, Oyama M, Kozuka-Hata H, et al. Phosphoproteome
and transcriptome analyses of ErbB ligand-stimulated MCF-7 cells.
Cancer Genomics-Proteomics 2008; 5: 161-8.
[50] Santen RJ, Song RX, McPherson R, et al. The role of mitogen-
activated protein (MAP) kinase in breast cancer. J Steroid Biochem
Mol Biol 2002; 80: 239-56.
[51] Polakis P. Wnt signaling in cancer. Cold Spring Harb Perspect Biol
2012; 4: a008052.
[52] Bilir B, Kucuk O, Moreno CS. Wnt signaling blockage inhibits cell
proliferation and migration, and induces apoptosis in triple-
negative breast cancer cells. J Transl Med 2013; 11: 280.
[53] Nandi S, Guzman RC, Yang J. Hormones and mammary
carcinogenesis in mice, rats, and humans: a unifying hypothesis.
Proc Natl Acad Sci U S A 1995; 92: 3650-7.
[54] Wang L, Wang Z, Gao X, et al. Association between Cyclin D1
polymorphism and oral cancer susceptibility: a meta-analysis.
Tumour Biol 2014; 35: 1149-55.
[55] Kopparapu PK, Boorjian SA, Robinson BD, et al. Expression of
cyclin d1 and its association with disease characteristics in bladder
cancer. Anticancer Res 2013; 33: 5235-42.
58 Current Bioinformatics, 2016, Vol. 11, No. 1 Chen et al.
[56] He Y, Liu Z, Qiao C, et al. Expression and significance of Wnt
signaling components and their target genes in breast carcinoma.
Mol Med Rep 2014; 9: 137-43.
[57] Zhang J, Li Y, Liu Q, Lu W, Bu G. Wnt signaling activation and
mammary gland hyperplasia in MMTV-LRP6 transgenic mice:
implication for breast cancer tumorigenesis. Oncogene 2010; 29:
539-49.
[58] Medina PP, Sanchez-Cespedes M. Involvement of the chromatin-
remodeling factor BRG1/SMARCA4 in human cancer. Epigenetics
2008; 3: 64-8.
[59] Moore-Smith L, Pasche B. TGFBR1 signaling and breast cancer. J
Mammary Gland Biol Neoplasia 2011; 16: 89-95.
[60] Javelaud D, Mohammad KS, McKenna CR, et al. Stable
overexpression of Smad7 in human melanoma cells impairs bone
metastasis. Cancer Res 2007; 67: 2317-24.
[61] Yin JJ, Selander K, Chirgwin JM, et al. TGF-beta signaling
blockade inhibits PTHrP secretion by breast cancer cells and bone
metastases development. J Clin Invest 1999; 103: 197-206.
[62] Reed S. Control of the G1/S transition. Cancer Surv 1996; 29: 7-23.
[63] Yu Q, Sicinska E, Geng Y, et al. Requirement for CDK4 kinase
function in breast cancer. Cancer cell 2006; 9: 23-32.
[64] Stepanova L, Leng X, Parker SB, Harper JW. Mammalian
p50Cdc37 is a protein kinase-targeting subunit of Hsp90 that binds
and stabilizes Cdk4. Genes & development 1996; 10: 1491-502.
[65] Mehner C, Hockla A, Miller E, et al. Tumor cell-produced matrix
metalloproteinase 9 (MMP-9) drives malignant progression and
metastasis of basal-like triple negative breast cancer. Oncotarget
2014; 5: 2736-49.
[66] Evangelista M, Tian H, de Sauvage FJ. The hedgehog signaling
pathway in cancer. Clinical cancer res 2006; 12: 5924-8.
[67] Scales SJ, de Sauvage FJ. Mechanisms of Hedgehog pathway
activation in cancer and implications for therapy. Trends in
pharmacological sciences 2009; 30: 303-12.
[68] Fan Z, Beresford PJ, Oh DY, Zhang D, Lieberman J. Tumor
suppressor NM23-H1 is a granzyme A-activated DNase during
CTL-mediated apoptosis, and the nucleosome assembly protein
SET is its inhibitor. Cell 2003; 112: 659-72.
[69] Chakravarti D, Hong R. SET-ting the stage for life and death. Cell
2003; 112: 589-91.
Received: March 13, 2015 Revised: April 23, 2015 Accepted: June 20, 2015