
STITCH 2: An interaction network database for small molecules and proteins

November 2009
Nucleic Acids Research 38(Database issue):D552-6

DOI:10.1093/nar/gkp937

Source
PubMed

License
CC BY-NC 2.0

Authors:

Michael Kuhn

International School of Management, Germany

Damian Szklarczyk

Swiss Institute of Bioinformatics

Andrea Franceschini

University of Zurich


Over the last years, the publicly available knowledge on interactions between small molecules and proteins has been steadily increasing. To create a network of interactions, STITCH aims to integrate the data dispersed over the literature and various databases of biological pathways, drug–target relationships and binding affinities. In STITCH 2, the number of relevant interactions is increased by incorporation of BindingDB, PharmGKB and the Comparative Toxicogenomics Database. The resulting network can be explored interactively or used as the basis for large-scale analyses. To facilitate links to other chemical databases, we adopt InChIKeys that allow identification of chemicals with a short, checksum-like string. STITCH 2.0 connects proteins from 630 organisms to over 74 000 different chemicals, including 2200 drugs. STITCH can be accessed at http://stitch.embl.de/.



Michael Kuhn, Damian Szklarczyk, Andrea Franceschini, Monica Campillos, Christian von Mering, Lars Juhl Jensen, Andreas Beyer and Peer Bork

Biotechnology Center, TU Dresden, 01062 Dresden, Germany; Novo Nordisk Foundation Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Blegdamsvej 3b, 2200 Copenhagen, Denmark; Institute of Molecular Biology and Swiss Institute of Bioinformatics, University of Zurich, Switzerland; European Molecular Biology Laboratory, Meyerhof street 1, 69117 Heidelberg and Max-Delbrück-Centre for Molecular Medicine, Robert-Rössle-Strasse 10, 13092 Berlin, Germany

Received September 15, 2009; Revised October 8, 2009; Accepted October 9, 2009


INTRODUCTION

The effects of small molecules on organisms have long been the focus of biochemistry and pharmacology. In recent years there has been a considerable increase in the number of high-throughput screens that have been performed using chemical libraries (1–3). At the same time, the molecular targets of individual chemicals are being studied in ever greater detail (4,5). There is also great interest in chemical biology approaches, using small molecules to perturb cellular functions (6). For the design and interpretation of these studies, the context of the chemicals and proteins needs to be considered. For example, in the case of high-content screening for specific cellular effects, it is important to know whether the active chemicals already have known activities that can explain the observed effects, or whether novel mechanisms of action might be present. Therefore, we have developed a ‘search tool for interactions of chemicals’ (STITCH) both as a large-scale, downloadable database of interaction data and as an interactive web tool for the exploration of interaction networks (Figure 1). Since its first release (7), STITCH has been accessed by over one hundred scientists each week and has been used as a source of protein–chemical associations, e.g. by Prathipati et al. (8), who used the STITCH network to automatically extract the targets of anti-tuberculosis compounds in Mycobacterium tuberculosis.

Here, we present the second version of STITCH. The previous version drew its protein–chemical interactions from the PDSP Ki Database (9), Protein Data Bank (PDB) (10), KEGG (11), Reactome (12), NCI-Nature Pathway Interaction Database (http://pid.nci.nih.gov), DrugBank (13) and MATADOR (14). In addition to these sources, we now further include interactions imported from GLIDA (15), PharmGKB (16,17), Comparative Toxicogenomics Database (CTD) (18) and BindingDB (19). These added databases mainly provide information on interactions between human proteins and drugs or drug-like molecules.

*To whom correspondence should be addressed. Tel: +49 6221 387 8526; Fax: +49 6221 387 517; Email: bork@embl.de

Nucleic Acids Research, 2010, Vol. 38, Database issue, D552–D556. Published online 6 November 2009.

© The Author(s) 2009. Published by Oxford University Press.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

The imported sources of information are scored separately and then combined with information from text-mining (7). Databases which contain manually annotated interactions receive high scores, while interactions based on experimental information are scored by the confidence or relevance of the reported interaction. The number of high-confidence (score ≥ 0.7) human chemical–protein interactions increased from 51 000 to 85 000. For these high-confidence interactions, the number of interacting human proteins increased from 5300 to 7400 (as STITCH is locus-based, only one gene product is counted per gene).
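The idea of scoring evidence sources separately, combining them and then applying a high-confidence cut-off can be illustrated with a small sketch. The text does not give the combination formula, so the noisy-OR rule below is an assumption chosen only for illustration, as are the function names `combine_scores` and `is_high_confidence`; the 0.7 threshold is the one quoted above, treated here as a lower bound.

```python
from math import prod

HIGH_CONFIDENCE = 0.7  # threshold quoted in the text, used as a lower bound


def combine_scores(channel_scores):
    """Combine per-source confidence scores with a noisy-OR rule.

    Assumption: sources are treated as independent, so the combined score
    is 1 - product(1 - s). The article does not state its actual formula.
    """
    scores = list(channel_scores)
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range [0, 1]: {s}")
    return 1.0 - prod(1.0 - s for s in scores)


def is_high_confidence(channel_scores):
    """Return True if the combined score reaches the 0.7 threshold."""
    return combine_scores(channel_scores) >= HIGH_CONFIDENCE


# Hypothetical scores for one chemical-protein pair from two sources.
print(is_high_confidence([0.5, 0.5]))  # combined 0.75
print(is_high_confidence([0.4, 0.3]))  # combined about 0.58
```

Under this rule, two moderately confident sources can together lift an interaction over the threshold, which is the usual motivation for combining evidence channels.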

INCREASING THE NUMBER OF SPECIFIED ACTIONS

The STITCH network is created by mapping interactions from the sources mentioned above and from text-mining onto a consolidated set of chemicals that has been derived from PubChem, assigning a confidence score for each interaction (7). The newly-derived protein–chemical and chemical–chemical associations are then complemented with protein–protein interactions from the STRING database (20). In the previous version of STITCH (7), we began to import ‘actions’ derived from natural language processing (NLP), pathway and interaction databases. These actions specify the nature of the interaction independent of the source of interaction information. For example, a ‘binding’ action could be derived from a binding affinity database and an ‘inhibition’ action could be imported from NLP. We have greatly extended the set of available actions by further importing action types from GLIDA (15), PharmGKB (16,17), CTD (18), BindingDB (19) and a manually annotated set of interactions. This set of interactions has been curated from DrugBank (13) records, results from NLP analysis of PubMed abstracts, Medical Subject Headings (MeSH) pharmacological actions, Anatomical Therapeutic Chemical classification (ATC) entries and a review paper on drugs and their targets (21). An action has been assigned to 81% of the high-confidence human chemical–protein interactions. The number of available edges with a high-confidence action annotation increased from 44 000 to 65 000 human chemical–protein interactions.

HANDLING OF CHEMICAL STRUCTURES

As described previously (7), STITCH creates a consolidated set of chemicals from PubChem (22) by merging stereo isomers and salt forms of the same molecule into one compound. This is done to ensure that all information about the same biologically active entity is merged. While this works very well for drugs that can be supplied in different formulations (e.g. different salt forms), it also has limitations, especially regarding carbohydrates. It is our long-term goal to associate interactions both with the individual isomer and the merged structure. For now, we have taken the step to explicitly display all the different compounds that have been deemed biologically equivalent (Figure 2).

Figure 1. Interaction network around aspirin. Human proteins predicted to interact with aspirin according to different sources of evidence are shown. Edges are colored according to the source of evidence (magenta: experimental information, cyan: manually curated databases, yellow: text-mining). Clicking on the node ‘aspirin’ will display a pop-up showing the structure and description.

Recently, the International Union of Pure and Applied Chemistry (IUPAC) has standardized an open format for chemical structures, namely the IUPAC International Chemical Identifier (InChI). In addition to the existing capability to search chemical structures using SMILES strings, we have now also implemented a search for InChIs. We use the tool Open Babel to convert InChIs to SMILES strings, which are in turn searched against our chemical database by using hashed fingerprints as implemented in the open-source Chemistry Development Kit (23). Furthermore, we have implemented a search for InChIKeys, which are short strings that represent an encoded (hashed) form of the chemical structure. InChIKeys consist of two parts, the first of which is based on the chemical connectivity, whereas the second part contains information about stereochemistry, tautomers and other structural variations. As STITCH currently considers structures with the same connectivity to be equivalent (thus merging stereo isomers), only the first part of the InChIKey is queried against our chemical database. We also use this part of the InChIKey to provide links to Google and ChemSpider.
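Matching on the first part of the InChIKey can be sketched in a few lines. This is a minimal illustration rather than the STITCH implementation: the function names `connectivity_block` and `group_by_connectivity` are hypothetical, and the sketch assumes the usual layout in which the first block is the 14 letters before the first hyphen. The first key below is the InChIKey of aspirin; the second is a made-up variant with a different second block, included only to show the grouping.

```python
from collections import defaultdict


def connectivity_block(inchikey):
    """Return the first part of an InChIKey (the text before the first hyphen)."""
    first, sep, _rest = inchikey.strip().upper().partition("-")
    if not sep or len(first) != 14 or not first.isalpha():
        raise ValueError(f"not a recognisable InChIKey: {inchikey!r}")
    return first


def group_by_connectivity(inchikeys):
    """Group InChIKeys that share the same connectivity block."""
    groups = defaultdict(list)
    for key in inchikeys:
        groups[connectivity_block(key)].append(key)
    return dict(groups)


keys = [
    "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",  # aspirin
    "BSYNRYMUTXBXSQ-ABCDEFGHSA-N",  # hypothetical variant, same first block
]
for block, members in group_by_connectivity(keys).items():
    print(block, len(members))
```

Because only the first block is compared, keys that differ in the second part fall into the same group, which mirrors the decision to treat structures with identical connectivity as equivalent. Salt forms generally differ in the first block as well, so their merging cannot be reproduced by this comparison alone.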

USER INTERFACE IMPROVEMENTS

Figure 2. Different structural scaffolds corresponding to aspirin. For the drug aspirin, a link to PubChem and a short description is shown. Different salts of aspirin that will have the same bioactivity have been consolidated and merged with the main, uncharged form. Below each chemical structure, the first part of the InChIKey is shown, corresponding to an encoded (hashed) description of the structure. This short string can be used to search for more information about the compound on the Internet.

Many proteins, especially drug targets, have a large number of high-scoring interactions with small molecules in the STITCH network. In this case, a network centered on such a protein will only show chemicals unless very many interaction partners are requested to be shown (Figure 3a). In order to allow the user to see more of the context of the query protein, we now offer the option to show a network in which proteins and chemicals each make up more than a third of the nodes (Figure 3b). When this option is selected, only a limited number of the highest-scoring chemicals are displayed. Further chemicals are omitted in favor of proteins (or vice versa) and their number is shown to the user. If the network consists of only chemicals, but no proteins are available at the current settings (e.g. due to a minimum score limit), then the option to show more context is not shown.
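One way to picture the "more context" option is a selection that reserves a share of the displayed nodes for each node type and fills the rest by score. The sketch below is an assumption about how such a rule could work, not a description of the STITCH code; `select_partners`, its parameters and the example scores are all hypothetical.

```python
import math


def select_partners(partners, limit, min_fraction=1 / 3):
    """Pick up to `limit` partners so that each node type gets a reserved share.

    partners: iterable of (name, kind, score), kind is "chemical" or "protein".
    Returns (shown, skipped), where `skipped` counts partners that outscore
    the weakest shown partner but were left out to make room for the other type.
    """
    reserve = math.ceil(limit * min_fraction)
    if 2 * reserve > limit:
        raise ValueError("limit is too small for the requested min_fraction")
    ranked = sorted(partners, key=lambda p: p[2], reverse=True)

    # Reserve the best-scoring partners of each type first.
    chosen = []
    for kind in ("chemical", "protein"):
        chosen.extend([p for p in ranked if p[1] == kind][:reserve])

    # Fill the remaining slots purely by score.
    for p in ranked:
        if len(chosen) >= limit:
            break
        if p not in chosen:
            chosen.append(p)

    shown = sorted(chosen, key=lambda p: p[2], reverse=True)
    if not shown:
        return [], 0
    weakest = shown[-1][2]
    skipped = sum(1 for p in ranked if p not in shown and p[2] > weakest)
    return shown, skipped


# Hypothetical example: six high-scoring chemicals and three weaker proteins.
partners = [(f"chem{i}", "chemical", 0.99 - 0.01 * i) for i in range(6)]
partners += [("protA", "protein", 0.80), ("protB", "protein", 0.70),
             ("protC", "protein", 0.60)]
shown, skipped = select_partners(partners, limit=6)
print([name for name, _, _ in shown], skipped)
```

In this example two chemicals that outscore the displayed proteins are left out, and the count of such omitted partners is what would be reported to the user.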

Previously, STITCH required the user to select an organism when searching for interactions with a chemical. This is no longer required. When no organism is selected, the organism with the highest-scoring interaction partners is selected. In case of multiple organisms with equal scores, human and several model organisms are preferentially selected. (Human is one of the highest-ranking species for 60% of the chemicals with protein–chemical interactions.) For example, the binding between the antipsychotic agent fluspiperone and the 5-hydroxytryptamine (serotonin) receptor 7 has only been studied in mouse and rat. Consequently, a user searching for this compound would be directed to the protein–chemical interaction network in mouse. It is also possible to restrict the search to different levels of the NCBI taxonomy (24), e.g. bacteria, fungi or rodents.
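The automatic organism choice described above amounts to a maximum with a tie-break. The following sketch assumes a hypothetical preference order and hypothetical scores; the text only says that human and several model organisms are preferred, so the exact order in `PREFERRED` and the function name `pick_organism` are illustrative.

```python
# Hypothetical preference order used to break ties between organisms.
PREFERRED = ["human", "mouse", "rat"]


def pick_organism(best_score_by_organism, preferred=PREFERRED):
    """Choose the organism with the highest-scoring interaction partners.

    best_score_by_organism: dict mapping organism name to its best score.
    Ties are broken by the preference list, then alphabetically.
    """
    if not best_score_by_organism:
        raise ValueError("no organisms with interactions")
    top = max(best_score_by_organism.values())
    tied = [org for org, s in best_score_by_organism.items() if s == top]
    for org in preferred:
        if org in tied:
            return org
    return sorted(tied)[0]


# Interaction studied only in mouse and rat, with equal (made-up) scores.
print(pick_organism({"mouse": 0.9, "rat": 0.9}))
```

With equal scores in mouse and rat and no human data, the tie-break sends the user to mouse, in line with the fluspiperone example.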

While central repositories of gene annotations exist, no such information is available in a centralized manner for chemicals. To be able to display text annotation for chemicals, we have imported information from the following databases: DrugBank (13), National Cancer Institute (NCI) thesaurus (25), MeSH descriptors and qualifiers. Using STITCH’s dictionary of chemical synonyms we mapped compounds from these databases to STITCH identifiers. In cases where descriptions are available for different forms of the same compound (e.g. different salt forms, which have been merged in STITCH), we have automatically assigned the description of the main compound. Any remaining inconsistencies were manually resolved. For each chemical we have assigned the text annotation from only one source, prioritizing sources as follows: NCI (descriptions), DrugBank (descriptions), DrugBank (pharmacology), DrugBank (drug category), MeSH (pharmacological action), NCI (tags) and MeSH (scope note). Descriptions are available for 33 352 chemicals, covering 33% of the chemicals with interactions.
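The source prioritization for text annotations is a first-match lookup over an ordered list. A minimal sketch, using the priority order given above; the data layout (a dictionary keyed by source and field) and the name `pick_annotation` are assumptions made for illustration, and the sample texts are invented.

```python
# Priority order as listed in the text, highest priority first.
PRIORITY = [
    ("NCI", "description"),
    ("DrugBank", "description"),
    ("DrugBank", "pharmacology"),
    ("DrugBank", "drug category"),
    ("MeSH", "pharmacological action"),
    ("NCI", "tags"),
    ("MeSH", "scope note"),
]


def pick_annotation(available):
    """Return ((source, field), text) for the highest-priority annotation.

    available: dict mapping (source, field) to annotation text.
    Returns None if no source provides a non-empty text.
    """
    for key in PRIORITY:
        text = available.get(key)
        if text:
            return key, text
    return None


# Invented annotation texts for one compound.
available = {
    ("MeSH", "scope note"): "A scope note.",
    ("DrugBank", "pharmacology"): "A pharmacology summary.",
}
print(pick_annotation(available))
```

Only one annotation is returned per chemical, so a lower-priority text is used only when every higher-priority source is missing.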

USE CASES

The STITCH homepage offers several short tutorials to introduce the different query options (e.g. searching for a single identifier or multiple chemical structures). A search for ‘aspirin’ on the homepage will lead to the interaction network shown in Figure 1. Here, the main interactors of the drug are shown in human (which is selected automatically as described above). The known main targets, PTGS1 and PTGS2, are connected by very high scores. While most interaction partners are backed up by evidence from manually curated databases and are therefore very reliable, one interaction is derived only from text-mining: COX1 is actually a false positive arising from an ambiguous synonym.

Taken together, STITCH 2 offers an enlarged set of protein–chemical interactions, extended inter-database operability, increased query options and an improved user interface. STITCH can be accessed at http://stitch.embl.de/. Users can explore the interaction network interactively or download the complete set of interactions. In addition, we provide an application programming interface (API) to let scripts resolve identifiers and retrieve interaction networks either as an image or in standard network formats (20).

Figure 3. Interactions of prostaglandin-endoperoxide synthase 1 (PTGS1). (a) The highest-scoring interaction partners of PTGS1 are non-steroidal anti-inflammatory drugs (NSAIDs). As the confidence scores for these interactions are very high, no interacting proteins are shown. (b) The user may ask STITCH to display more of the interaction context and to let at least one-third of the interaction partners be proteins. In this case, STITCH is skipping 19 high-scoring chemicals in order to include four interacting proteins. In both networks, the color of the edge corresponds to the type of connected nodes (e.g. green: chemical–protein interaction) and the width of the edge correlates with the confidence score.

FUNDING

Klaus Tschira Foundation (to M.K. and A.B.). Novo Nordisk Foundation Center for Protein Research (partial). Funding for open access charge: European Molecular Biology Laboratory.

Conflict of interest statement. None declared.

REFERENCES

1. Han,L., Wang,Y. and Bryant,S.H. (2009) A survey of across-target bioactivity results of small molecules in PubChem. Bioinformatics, 25, 2251–2255.

2. Zanzoni,A., Soler-López,M. and Aloy,P. (2009) A network medicine approach to human disease. FEBS Lett., 583, 1759–1765.

3. Peterson,R.T. (2008) Chemical biology and the limits of reductionism. Nature Chem. Biol., 4, 635–638.

4. Ovaa,H. and van Leeuwen,F. (2008) Chemical biology approaches to probe the proteome. Chembiochem: Eur. J. Chem. Biol., 9, 2913–2919.

5. Rix,U. and Superti-Furga,G. (2009) Target profiling of small molecules by chemical proteomics. Nature Chem. Biol., 5, 616–624.

6. Edwards,A.M., Bountra,C., Kerr,D.J. and Willson,T.M. (2009) Open access chemical and clinical probes to support drug discovery. Nature Chem. Biol., 5, 436–440.

7. Kuhn,M., von Mering,C., Campillos,M., Jensen,L.J. and Bork,P. (2008) STITCH: interaction networks of chemicals and proteins. Nucleic Acids Res., 36, D684–D688.

8. Prathipati,P., Ma,N.L., Manjunatha,U.H. and Bender,A. (2009) Fishing the target of antitubercular compounds: in silico target deconvolution model development and validation. J. Proteome Res., 8, 2788–2798.

9. Roth,B., Lopez,E., Patel,S. and Kroeze,W. (2000) The multiplicity of serotonin receptors: uselessly diverse molecules or an embarrassment of riches? Neuroscientist, 6, 262.

10. Berman,H.M., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T.N., Weissig,H., Shindyalov,I.N. and Bourne,P.E. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 242.

11. Kanehisa,M., Goto,S., Hattori,M., Aoki-Kinoshita,K.F., Itoh,M., Kawashima,S., Katayama,T., Araki,M. and Hirakawa,M. (2006) From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res., 34, D354–D357.

12. Joshi-Tope,G., Gillespie,M., Vastrik,I., D’Eustachio,P., Schmidt,E., de Bono,B., Jassal,B., Gopinath,G.R., Wu,G.R., Matthews,L. et al. (2005) Reactome: a knowledgebase of biological pathways. Nucleic Acids Res., 33, D428–D432.

13. Wishart,D.S., Knox,C., Guo,A.C., Shrivastava,S., Hassanali,M., Stothard,P., Chang,Z. and Woolsey,J. (2006) DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res., 34, D668–D672.

14. Günther,S., Kuhn,M., Dunkel,M., Campillos,M., Senger,C., Petsalaki,E., Ahmed,J., Urdiales,E.G., Gewiess,A., Jensen,L.J. et al. (2008) SuperTarget and Matador: Resources for exploring drug-target relationships. Nucleic Acids Res., 36, D919–D922.

15. Okuno,Y., Yang,J., Taneishi,K., Yabuuchi,H. and Tsujimoto,G. (2006) GLIDA: GPCR-ligand database for chemical genomic drug discovery. Nucleic Acids Res., 34, D673–D677.

16. Gong,L., Owen,R.P., Gor,W., Altman,R.B. and Klein,T.E. (2008) PharmGKB: an integrated resource of pharmacogenomic data and knowledge. Curr. Protoc. Bioinformatics, Chapter 14(Unit 14):17.

17. Hewett,M., Oliver,D.E., Rubin,D.L., Easton,K.L., Stuart,J.M., Altman,R.B. and Klein,T.E. (2002) PharmGKB: the Pharmacogenetics Knowledge Base. Nucleic Acids Res., 30, 163–165.

18. Davis,A.P., Murphy,C.G., Saraceni-Richards,C.A., Rosenstein,M.C., Wiegers,T.C. and Mattingly,C.J. (2009) Comparative Toxicogenomics Database: a knowledgebase and discovery tool for chemical-gene-disease networks. Nucleic Acids Res., 37, D786–D792.

19. Liu,T., Lin,Y., Wen,X., Jorissen,R.N. and Gilson,M.K. (2007) BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities. Nucleic Acids Res., 35, D198–D201.

20. Jensen,L.J., Kuhn,M., Stark,M., Chaffron,S., Creevey,C., Muller,J., Doerks,T., Julien,P., Roth,A., Simonovic,M. et al. (2009) STRING 8 - a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res., 37, D412–D416.

21. Imming,P., Sinning,C. and Meyer,A. (2006) Drugs, their targets and the nature and number of drug targets. Nature Rev. Drug Disc., 5, 821–834.

22. Wheeler,D.L., Barrett,T., Benson,D.A., Bryant,S.H., Canese,K., Chetvernin,V., Church,D.M., DiCuccio,M., Edgar,R., Federhen,S. et al. (2007) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 35, D5–D12.

23. Steinbeck,C., Hoppe,C., Kuhn,S., Floris,M., Guha,R. and Willighagen,E. (2006) Recent developments of the chemistry development kit (CDK) - an open-source java library for chemo- and bioinformatics. Curr. Pharm. Des., 12, 2111–2120.

24. Wheeler,D.L., Barrett,T., Benson,D.A., Bryant,S.H., Canese,K., Chetvernin,V., Church,D.M., DiCuccio,M., Edgar,R., Federhen,S. et al. (2007) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 35, D5–D12.

25. Sioutos,N., de Coronado,S., Haber,M.W., Hartel,F.W., Shaiu,W.L. and Wright,L.W. (2007) NCI Thesaurus: a semantic model integrating cancer-related clinical and molecular information. J. Biomed. Inform., 40, 30–43.

D556 Nucleic Acids Research, 2010, Vol. 38, Database issue

LSTM-SAGDTA: Predicting Drug-target Binding Affinity with an Attention Graph Neural Network and LSTM Approach

Article

Full-text available

Jan 2024
CURR PHARM DESIGN

Introduction: Drug development is a challenging and costly process, yet it plays a crucial role in improving healthcare outcomes. Drug development requires extensive research and testing to meet the demands for economic efficiency, cures, and pain relief. Methods: Drug development is a vital research area that necessitates innovation and collaboration to achieve significant breakthroughs. Computer-aided drug design provides a promising avenue for drug discovery and development by reducing costs and improving the efficiency of drug design and testing. Results: In this study, a novel model, namely LSTM-SAGDTA, capable of accurately predicting drug-target binding affinity, was developed. We employed SeqVec for characterizing the protein and utilized the graph neural networks to capture information on drug molecules. By introducing self-attentive graph pooling, the model achieved greater accuracy and efficiency in predicting drug-target binding affinity. Conclusion: Moreover, LSTM-SAGDTA obtained superior accuracy over current state-of-the-art methods only by using less training time. The results of experiments suggest that this method represents a high-precision solution for the DTA predictor.

A novel cholesterol metabolism-related ferroptosis pathway in hepatocellular carcinoma

Article

Full-text available

Jan 2024

Background Emerging studies have reported the contribution of cholesterol to hepatocellular carcinoma (HCC) progression. However, the specific role and mechanism of cholesterol metabolism on spontaneous and progressive HCC development from the point of view of ferroptosis are still worth exploring. The present study aimed to reveal a novel mechanism of cholesterol metabolism-related ferroptosis in hepatocellular carcinoma cells. Methods Two microarray datasets (GSE25097, GSE22058) related to HCC were downloaded from Gene Expression Omnibus (GEO) datasets. Metabolomics analysis was performed by ultra performance liquid chromatography - tandem mass spectrometer (UPLC-MS/MS). The cholesterol-related proteins were downloaded from HMBD. Ferroptosis-related genes were extracted from FerrDb database. Data sets were separated into two groups. GSE25097 was used to identify ferroptosis-related genes, and GSE22058 was used to verify results. During these processes, chemical–protein interaction (CPI), protein–protein interaction (PPI), the Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted. Multivariate logistic regression analysis was used to test the associated pathway. Results We identified 8 differentially expressed ferroptosis-related genes (HAMP, PTGS2, IL1B, ALOX15B, CDKN2A, RRM2, NQO1 and KIF20A) and 4 differentially expressed cholesterol-related genes (LCAT, CH25H, CEL and CYP7A1). Furthermore, based on the predicted results with STITCH, we identified indomethacin and IL1B as the essential node for cholesterol-mediated ferroptosis in hepatocellular carcinoma cell. Multivariate logistic regression analysis showed the activities of plasma IL1B in liver cancer patients enrolled have been significantly affected by the level of plasma cholesterol (P < 0.001) and the test result of IL1B is a predictor variable causing the changes of serum Fe levels (P < 0.001). 
Conclusions Our findings shed new light on the association between cholesterol metabolism and ferroptosis in HCC, and suggest that IL1B is the necessary node for cholesterol to lead to ferroptosis process in HCC. Also, we identified the potential role of indomethacin in adjuvant therapy of HCC with complications of abnormal cholesterol metabolism.

Network pharmacology integrated with experimental verification reveals the antipyretic characteristics and mechanism of Zi Xue powder

Article

Full-text available

Dec 2023

Context Zi Xue Powder (ZXP) is a traditional formula for the treatment of fever. However, the potential mechanism of action of ZXP remains unknown. Objective This study elucidates the antipyretic characteristics of ZXP and the mechanism by which ZXP alleviates fever. Materials and methods The key targets and underlying fever-reducing mechanisms of ZXP were predicted using network pharmacology and molecular docking. The targets of ZXP anti-fever active ingredient were obtained by searching TCMSP, STITCH and HERB. Moreover, male Sprague-Dawley rats were randomly divided into four groups: control, lipopolysaccharide (LPS), ZXP (0.54, 1.08, 2.16 g/kg), and positive control (acetaminophen, 0.045 g/kg); the fever model was established by intraperitoneal LPS injection. After the fever model was established at 0.5 h, the rats were administered treatment by gavage, and the anal temperature changes of each group were observed over 10 h after treatment. After 10 h, ELISA and Western blot analysis were used to further investigate the mechanism of ZXP. Results Network pharmacology analysis showed that MAPK was a crucial pathway through which ZXP suppresses fever. The results showed that ZXP (2.16 g/kg) decreased PGE2, CRH, TNF-a, IL-6, and IL-1β levels while increasing AVP level compared to the LPS group. Furthermore, the intervention of ZXP inhibited the activation of MAPK pathway in LPS-induced fever rats. Conclusions This study provides new insights into the mechanism by which ZXP reduces fever and provides important information and new research ideas for the discovery of antipyretic compounds from traditional Chinese medicine.

HHCDB: a database of human heterochromatin regions

Article

Full-text available

Oct 2023

Heterochromatin plays essential roles in eukaryotic genomes, such as regulating genes, maintaining genome integrity and silencing repetitive DNA elements. Identifying genome-wide heterochromatin regions is crucial for studying transcriptional regulation. We propose the Human Heterochromatin Chromatin Database (HHCDB) for archiving heterochromatin regions defined by specific or combined histone modifications (H3K27me3, H3K9me2, H3K9me3) according to a unified pipeline. 42 839 743 heterochromatin regions were identified from 578 samples derived from 241 cell-types/cell lines and 92 tissue types. Genomic information is provided in HHCDB, including chromatin location, gene structure, transcripts, distance from transcription start site, neighboring genes, CpG islands, transposable elements, 3D genomic structure and functional annotations. Furthermore, transcriptome data from 73 single cells were analyzed and integrated to explore cell type-specific heterochromatin-related genes. HHCDB affords rich visualization through the UCSC Genome Browser and our self-developed tools. We have also developed a specialized online analysis platform to mine differential heterochromatin regions in cancers. We performed several analyses to explore the function of cancer-specific heterochromatin-related genes, including clinical feature analysis, immune cell infiltration analysis and the construction of drug-target networks. HHCDB is a valuable resource for studying epigenetic regulation, 3D genomics and heterochromatin regulation in development and disease. HHCDB is freely accessible at http://hhcdb.edbc.org/.

Metabolomic insights in advanced cardiomyopathy of chronic chagasic and idiopathic patients that underwent heart transplant

Article

Full-text available

Apr 2024

Heart failure (HF) studies typically focus on ischemic and idiopathic heart diseases. Chronic chagasic cardiomyopathy (CCC) is a progressive degenerative inflammatory condition highly prevalent in Latin America that leads to a disturbance of cardiac conduction system. Despite its clinical and epidemiological importance, CCC molecular pathogenesis is poorly understood. Here we characterize and discriminate the plasma metabolomic profile of 15 patients with advanced HF referred for heart transplantation – 8 patients with CCC and 7 with idiopathic dilated cardiomyopathy (IDC) – using gas chromatography/quadrupole time-of-flight mass spectrometry. Compared to the 12 heart donor individuals, also included to represent the control (CTRL) scenario, patients with advanced HF exhibited a metabolic imbalance with 21 discriminating metabolites, mostly indicative of accumulation of fatty acids, amino acids and important components of the tricarboxylic acid (TCA) cycle. CCC vs. IDC analyses revealed a metabolic disparity between conditions, with 12 CCC distinctive metabolites vs. 11 IDC representative metabolites. Disturbances were mainly related to amino acid metabolism profile. Although mitochondrial dysfunction and loss of metabolic flexibility may be a central mechanistic event in advanced HF, metabolic imbalance differs between CCC and IDC populations, possibly explaining the dissimilar clinical course of Chagas’ patients.

Article

Full-text available

Apr 2024

Ischemic stroke (IS) is a common cerebrovascular disease whose pathogenesis involves a variety of immune molecules, immune channels and immune processes. 6-methyladenosine (m6A) modification regulates a variety of immune metabolic and immunopathological processes, but the role of m6A in IS is not yet understood. We downloaded the data set GSE58294 from the GEO database and screened for m6A-regulated differential expression genes. The RF algorithm was selected to screen the m6A key regulatory genes. Clinical prediction models were constructed and validated based on m6A key regulatory genes. IS patients were grouped according to the expression of m6A key regulatory genes, and immune markers of IS were identified based on immune infiltration characteristics and correlation. Finally, we performed functional enrichment, protein interaction network analysis and molecular prediction of the immune biomarkers. We identified a total of 7 differentially expressed genes in the dataset, namely METTL3, WTAP, YWHAG, TRA2A, YTHDF3, LRPPRC and HNRNPA2B1. The random forest algorithm indicated that all 7 genes were m6A key regulatory genes of IS, and the credibility of the above key regulatory genes was verified by constructing a clinical prediction model. Based on the expression of key regulatory genes, we divided IS patients into 2 groups. Based on the expression of the gene LRPPRC and the correlation of immune infiltration under different subgroups, LRPPRC was identified as an immune biomarker for IS. GO enrichment analyses indicate that LRPPRC is associated with a variety of cellular functions. Protein interaction network analysis and molecular prediction indicated that LRPPRC correlates with a variety of immune proteins, and LRPPRC may serve as a target for IS drug therapy. Our findings suggest that LRPPRC is an immune marker for IS. Further analysis based on LRPPRC could elucidate its role in the immune microenvironment of IS.

Experimental validation and comprehensive analysis of m6A methylation regulators in intervertebral disc degeneration subpopulation classification

Article

Full-text available

Apr 2024

Intervertebral disc degeneration (IVDD) is one of the most prevalent causes of chronic low back pain. The role of m6A methylation modification in IVDD remains unclear. We investigated immune-related m6A methylation regulators as IVDD biomarkers through comprehensive analysis and experimental validation of m6A methylation regulators in disc degeneration. The training dataset was downloaded from the GEO database and analysed for differentially expressed m6A methylation regulators and immunological features; the differentially expressed regulators were subsequently validated by a rat IVDD model and RT-qPCR. Key m6A methylation regulators were then screened using machine learning and LASSO regression analysis. Thereafter, a predictive model based on the key m6A methylation regulators was constructed on the training set and validated on the validation set. IVDD patients were then clustered based on the expression of key m6A regulators, and the expression of key m6A regulators and immune infiltrates between clusters was investigated to determine immune markers in IVDD. Finally, we investigated the potential role of the immune marker in IVDD through enrichment analysis, protein-to-protein network analysis, and molecular prediction. By analysing the training set, we revealed significant differences in gene expression of five methylation regulators including RBM15, YTHDC1, YTHDF3, HNRNPA2B1 and ALKBH5, and found characteristic immune infiltration of the differentially expressed genes; the result was validated by PCR. We then screened the differential m6A regulators in the training set and identified RBM15 and YTHDC1 as key m6A regulators. We then used RBM15 and YTHDC1 to construct a predictive model for IVDD and successfully validated it in the training set. Next, we clustered IVDD patients based on the expression of RBM15 and YTHDC1 and explored the immune infiltration characteristics between clusters as well as the expression of RBM15 and YTHDC1 in the clusters. YTHDC1 was finally identified as an immune biomarker for IVDD. We finally found that YTHDC1 may influence the immune microenvironment of IVDD through ABL1 and TXK. In summary, our results suggest that YTHDC1 is a potential biomarker for the development of IVDD and may provide new insights for the precise prevention and treatment of IVDD.

Exploration of compatibility rules and discovery of active ingredients in TCM formulas by network pharmacology

Article

Apr 2024

Exploring HMMR as a therapeutic frontier in breast cancer treatment, its interaction with various cell cycle genes, and targeting its overexpression through specific inhibitors

Article

Full-text available

Mar 2024

Among women, breast carcinoma is one of the most complex cancers, with one of the highest death rates worldwide. There have been significant improvements in treatment methods, but its early detection still remains an issue to be resolved. This study explores the multifaceted function of hyaluronan-mediated motility receptor (HMMR) in breast cancer progression. HMMR's association with key cell cycle regulators (AURKA, TPX2, and CDK1) underscores its pivotal role in cancer initiation and advancement. HMMR's involvement in microtubule assembly and cellular interactions, both extracellularly and intracellularly, provides critical insights into its contribution to cancer cell processes. Elevated HMMR expression triggered by inflammatory signals correlates with unfavorable prognosis in breast cancer and various other malignancies. Therefore, recognizing HMMR as a promising therapeutic target, the study validates the overexpression of HMMR in breast cancer and various pan cancers and its correlation with certain proteins such as AURKA, TPX2, and CDK1 through online databases. Furthermore, the pathways associated with HMMR were explored using pathway enrichment analysis, such as Gene Ontology, offering a foundation for the development of effective strategies in breast cancer treatment. The study further highlights compounds capable of inhibiting certain pathways, which, in turn, would inhibit the upregulation of HMMR in breast cancer. The results were further validated via MD simulations in addition to molecular docking to explore protein-protein/ligand interaction. Consequently, these findings imply that HMMR could play a pivotal role as a crucial oncogenic regulator, highlighting its potential as a promising target for the therapeutic intervention of breast carcinoma.

Integration of Omics Data and Network Models to Unveil Negative Aspects of SARS-CoV-2, from Pathogenic Mechanisms to Drug Repurposing

Article

Full-text available

Aug 2023

Simple Summary: SARS-CoV-2 caused the COVID-19 health emergency, affecting millions of people worldwide. Samples collected from hospitalized or dead patients from the early stages of the pandemic have been analyzed over time, and to date they still represent an invaluable source of information to shed light on the molecular mechanisms underlying the organ/tissue damage. In combination with clinical data, omics profiles and network models play a key role in providing a holistic view of the pathways, processes and functions most affected by viral infection. In fact, networks are being increasingly adopted for the integration of multiomics data, and recently their use has expanded to the identification of drug targets or the repositioning of existing drugs.

Abstract: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused the COVID-19 health emergency, affecting and killing millions of people worldwide. Following SARS-CoV-2 infection, COVID-19 patients show a spectrum of symptoms ranging from asymptomatic to very severe manifestations. In particular, bronchial and pulmonary cells, involved at the initial stage, trigger a hyper-inflammation phase, damaging a wide range of organs, including the heart, brain, liver, intestine and kidney. Due to the urgent need for solutions to limit the virus’ spread, most efforts were initially devoted to mapping outbreak trajectories and variant emergence, as well as to the rapid search for effective therapeutic strategies. Samples collected from hospitalized or dead COVID-19 patients from the early stages of the pandemic have been analyzed over time, and to date they still represent an invaluable source of information to shed light on the molecular mechanisms underlying the organ/tissue damage, the knowledge of which could offer new opportunities for diagnostics and therapeutic designs. For these purposes, in combination with clinical data, omics profiles and network models play a key role in providing a holistic view of the pathways, processes and functions most affected by viral infection. In fact, in addition to epidemiological purposes, networks are being increasingly adopted for the integration of multiomics data, and recently their use has expanded to the identification of drug targets or the repositioning of existing drugs. These topics will be covered here by exploring the landscape of SARS-CoV-2 survey-based studies using systems biology approaches derived from omics data, paying particular attention to those that have considered samples of human origin.

REACTOME: a knowledgebase of biological pathways

Article

Full-text available

Jan 2005
NUCLEIC ACIDS RES

Reactome, located at http://www.reactome.org, is a curated, peer-reviewed resource of human biological processes. Given the genetic makeup of an organism, the complete set of possible reactions constitutes its reactome. The basic unit of the Reactome database is a reaction; reactions are then grouped into causal chains to form pathways. The Reactome data model allows us to represent many diverse processes in the human system, including the pathways of intermediary metabolism, regulatory pathways, and signal transduction, and high-level processes, such as the cell cycle. Reactome provides a qualitative framework, on which quantitative data can be superimposed. Tools have been developed to facilitate custom data entry and annotation by expert biologists, and to allow visualization and exploration of the finished dataset as an interactive process map. Although our primary curational domain is pathways from Homo sapiens, we regularly create electronic projections of human pathways onto other organisms via putative orthologs, thus making Reactome relevant to model organism research communities. The database is publicly available under open source terms, which allows both its content and its software infrastructure to be freely used and redistributed.

SuperTarget and Matador: resources for exploring drug-target relationships

Article

Full-text available

Jan 2007

The molecular basis of drug action is often not well understood. This is partly because the very abundant and diverse information generated in the past decades on drugs is hidden in millions of medical articles or textbooks. Therefore, we developed a one-stop data warehouse, SuperTarget, that integrates drug-related information about medical indication areas, adverse drug effects, drug metabolization, pathways and Gene Ontology terms of the target proteins. An easy-to-use query interface enables the user to pose complex queries, for example to find drugs that target a certain pathway, interacting drugs that are metabolized by the same cytochrome P450 or drugs that target the same protein but are metabolized by different enzymes. Furthermore, we provide tools for 2D drug screening and sequence comparison of the targets. The database contains more than 2500 target proteins, which are annotated with about 7300 relations to 1500 drugs; the vast majority of entries have pointers to the respective literature source. A subset of these drugs has been annotated with additional binding information and indirect interactions and is available as a separate resource called Matador. SuperTarget and Matador are available at http://insilico.charite.de/supertarget and http://matador.embl.de.

Rix U, Superti-Furga G. Target profiling of small molecules by chemical proteomics. Nat Chem Biol 5: 616-624

Article

Full-text available

Oct 2009
NAT CHEM BIOL

The medical and pharmaceutical communities are facing a dire need for new druggable targets, while, paradoxically, the targets of some drugs that are in clinical use or development remain elusive. Many compounds have been found to be more promiscuous than originally anticipated, which can potentially lead to side effects, but which may also open up additional medical uses. As we move toward systems biology and personalized medicine, comprehensively determining small molecule-target interaction profiles and mapping these on signaling and metabolic pathways will become increasingly necessary. Chemical proteomics is a powerful mass spectrometry-based affinity chromatography approach for identifying proteome-wide small molecule-protein interactions. Here we will provide a critical overview of the basic concepts and recent advances in chemical proteomics and review recent applications, with a particular emphasis on kinase inhibitors and natural products.

Database resources of the National Center for Biotechnology Information

Article

Jan 2000
NUCLEIC ACIDS RES

David L Wheeler

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s Web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing pages, GeneMap’99, Davis Human–Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP) pages, Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, Cancer Genome Anatomy Project (CGAP) pages, SAGEmap, Online Mendelian Inheritance in Man (OMIM) and the Molecular Modeling Database (MMDB). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov

The Protein Data Bank

Article

Jan 2000

Helen Berman

The Protein Data Bank (PDB; http://www.rcsb.org/pdb/) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.

ChemInform Abstract: Chemical Biology Approaches to Probe the Proteome

Article

Mar 2009
ChemInform

ChemInform is a weekly Abstracting Service, delivering concise information at a glance that was extracted from about 200 leading journals. To access a ChemInform Abstract of an article which was published elsewhere, please select a “Full Text” option. The original article is trackable via the “References” option.

The Protein Data Bank/ Nucleic Acids Research

Article

Jan 2000

The Multiplicity of Serotonin Receptors: Uselessly Diverse Molecules or an Embarrassment of Riches?

Article

Aug 2000

A large number of 5-HT receptors (>15) have been identified by molecular cloning technology over the past 10 years. This review briefly summarizes available information regarding the functional and therapeutic implications of serotonin receptor diversity for neurology and psychiatry. 5-HT receptors are divided into seven main families: 5-HT1, 5-HT2, 5-HT3, 5-HT4, 5-HT5, 5-HT6, and 5-HT7. Several families (e.g., 5-HT1 family) have many members (e.g., 5-HT1A, 5-HT1B, 5-HT1D, 5-HT1E, 5-HT1F), each of which is encoded by a distinct gene product. In addition to the genomic diversity of 5-HT receptors, splice variants and editing isoforms exist for many of the 5-HT receptors, making the family even more diverse. Evidence that is summarized in this review suggests that 5-HT receptors represent novel therapeutic targets for a number of neurologic and psychiatric diseases including migraine headaches, chronic pain conditions, schizophrenia, anxiety, depression, eating disorders, obsessive compulsive disorder, pervasive developmental disorders, and obesity-related conditions (Type II diabetes, hypertension, obesity syndromes). It is possible that sub-type-selective serotonergic agents may revolutionize the treatment for a number of medical, psychiatric, and neurological disorders.

The Protein Data Bank

Article

Dec 1999
NUCLEIC ACIDS RES

The Protein Data Bank, 1999–

Chapter

Jan 2001

In 1998, members of the Research Collaboratory for Structural Bioinformatics became the managers of the Protein Data Bank archive. This chapter details the systems used for the deposition, annotation and distribution of the data in the archive. This chapter is also available as HTML from the International Tables Online site hosted by the IUCr.
