The Francis Crick Institute
Discussion
Started 17th Mar, 2019
Find a transcription factor for a list of genes
Hi,
The short question: I have a list of genes and would like to find which transcription factor(s), if any, regulate some (or all) of these genes.
We have done a differential expression analysis against the mouse genome. For the resulting gene lists, we would like to know whether some of the DE genes are regulated/controlled by specific transcription factors.
To be more exact, we are interested in knowing which TFs bind to the regulatory regions of the genes in our input list.
Which tools or databases do you use to answer this kind of question?
thanks
Assa
All replies (4)
Institute of Cancer Research
There are different approaches you can use to find TF binding sites or identify known transcription factors. Here are a few; which one you choose depends on the approach and purpose of your study.
- TRANSFAC (http://genexplain.com/transfac/) predicts binding sites based on the gene sequence.
- DBD (http://www.transcriptionfactor.org/index.cgi?Home) predicts TFs based on sequence domain families.
- cRegulome is a tool developed in my lab. It is based on Cistrome Cancer which, as the name suggests, is specific to cancer. Because it is built on ChIP-Seq data, it does not predict binding sites but reports them for 300+ TFs in different types of cancer.
Shiv Nadar University
Hi Assa,
- First, fetch the promoter regions of the DE genes [these are the input for the MEME analysis].
- Check for enrichment of TF binding sites (if you think there are a few TFs that bind and regulate these genes). I suggest using MEME-ChIP for a comprehensive analysis, as it gives much more information.
[or]
- Scan for known motifs using the FIMO tool in the MEME Suite [http://meme-suite.org/].
The MEME Suite provides JASPAR and other TF binding motif databases.
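To make the "scan and find the motifs" step concrete, here is a minimal sketch of position weight matrix scanning in plain Python. It is not FIMO (which also reports p-values and q-values), only an illustration of the underlying idea. The motif counts, the pseudocount of 0.5, the uniform background and the score threshold are all assumptions chosen for the example; in practice the count matrix would come from a database such as JASPAR.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Turn per-position base counts into log2-odds scores.

    counts: list of dicts, one per motif position, mapping base -> count.
    pseudocount and the uniform background are assumptions of this sketch.
    """
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(sequence, pwm, threshold):
    """Return (position, strand, score) for every window scoring >= threshold."""
    width = len(pwm)
    seq = sequence.upper()
    hits = []
    for pos in range(len(seq) - width + 1):
        window = seq[pos:pos + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            score = sum(pwm[i][base] for i, base in enumerate(site))
            if score >= threshold:
                hits.append((pos, strand, round(score, 2)))
    return hits


# Hypothetical 4-bp motif with consensus TGAC (toy counts, not a real TF motif).
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]
toy_pwm = log_odds_matrix(toy_counts)
forward_hits = scan("AATGACTT", toy_pwm, threshold=5.0)
reverse_hits = scan("AAGTCATT", toy_pwm, threshold=5.0)
```

Both strands are scanned because a TF binding site can sit on either strand of the promoter. For real analyses, FIMO with a curated motif database is the better choice, since a raw score threshold like the one above has no statistical meaning by itself.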
Best,
Mouli
Broad Institute of MIT and Harvard
You can use IPA (Ingenuity Pathway Analysis), which gives you upstream regulators for your differential gene set. It is pretty easy to use and has good tutorials. It does cost money, though.
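For readers without access to a commercial tool, one generic way to ask whether a TF's known targets are over-represented in a DE gene list is a hypergeometric test. The sketch below is not a description of how IPA works internally; it is only an illustration of the overlap idea. The function name and the numbers in the example are hypothetical, and the target sets would have to come from a curated or ChIP-seq based resource.

```python
from math import comb


def hypergeom_pvalue(overlap, n_de, n_targets, n_background):
    """P(X >= overlap) for X = number of TF targets among n_de genes
    drawn without replacement from n_background genes, of which
    n_targets are known targets of the TF."""
    upper = min(n_de, n_targets)
    total = comb(n_background, n_de)
    tail = sum(
        comb(n_targets, k) * comb(n_background - n_targets, n_de - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (assumed): 10 background genes, 4 TF targets,
# 3 DE genes, 2 of which are targets.
p_small = hypergeom_pvalue(overlap=2, n_de=3, n_targets=4, n_background=10)
```

When testing many TFs this way, the resulting p-values need multiple-testing correction, and the background should be the set of genes that could have been detected in the experiment rather than the whole genome.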
The Francis Crick Institute
The simple way is to retrieve the promoter sequences of the differentially expressed genes (typically 500 bp to 1 kb upstream of each gene), then run de novo motif analysis using MEME. This will give you a list of enriched motifs, which you can check against the known binding sites of transcription factors. However, this will not tell you exactly which transcription factor binds to the promoter region of a differentially expressed gene, especially if the binding site is for a transcription factor that belongs to a large family.
Bear in mind that whilst this can give you valuable information, it will only tell you which transcription factors may bind in close proximity to your target genes. Things are not that straightforward, especially in mammals with complex genomes such as the mouse, because genes can be induced by enhancer elements that are far away (several kb or more) from their target genes. Those enhancers can lie upstream or downstream, which complicates things even more. Finding the enhancer elements that control the expression of the genes in your list requires more complex and sophisticated analysis.
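As a rough illustration of the promoter retrieval step, the sketch below computes strand-aware upstream windows from gene coordinates. It assumes BED-style 0-based, half-open coordinates and treats the annotated gene start as the TSS; the `Gene` type, the gene names and the coordinates are hypothetical. The resulting intervals could then be passed to a tool that extracts sequences from the genome FASTA.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int   # 0-based, inclusive (BED-style)
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def promoter_window(gene, upstream=1000):
    """Return (chrom, start, end, strand) for the region upstream of the TSS."""
    if gene.strand == "+":
        tss = gene.start
        # Clip at 0 so genes near the chromosome start do not go negative.
        return gene.chrom, max(0, tss - upstream), tss, gene.strand
    if gene.strand == "-":
        # On the minus strand the TSS is at the high-coordinate end,
        # so "upstream" means larger coordinates.
        tss = gene.end
        return gene.chrom, tss, tss + upstream, gene.strand
    raise ValueError(f"unknown strand {gene.strand!r} for {gene.name}")


# Hypothetical genes, for illustration only.
genes = [
    Gene("GeneA", "chr1", 5000, 9000, "+"),
    Gene("GeneB", "chr2", 5000, 9000, "-"),
    Gene("GeneC", "chr3", 300, 900, "+"),
]
windows = [promoter_window(g, upstream=1000) for g in genes]
```

Note that the sketch does not clip minus-strand windows at the chromosome end, because that needs the chromosome lengths. Getting the strand wrong is a common mistake here: for minus-strand genes, taking the region before the lower coordinate returns sequence downstream of the gene, not the promoter.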