A semi-screenshot of the webpage obtained by following Step 2 of Section 3.3.

Context in source publication

Context 1
... 3. Click on the Submit button to see the predicted result. For instance, if you use the four protein sequences in the Example window as the input, a new screen (Figure 3) will appear after about 10 seconds. Its upper part lists the names of the subcellular locations, numbered from (1) to (22), that are covered by the current predictor. ...

Similar publications

Cover Page
Yogsothoth carteri (Yogsothothidae, Panacanthocystida, Centroplasthelida, Haptista, Diaphoretickes, Eukaryota)
License: CC BY-NC-ND 4.0
Yegor Shɨshkin-Skarð

Citations

Article
Protein subcellular localization is a novel and promising area and is defined as searching for the specific location of proteins inside the cell, such as in the nucleus, in the cytoplasm or on the cell membrane. With the rapid development of next-generation sequencing technology, more and more new protein sequences have been continuously discovered. It is no longer sufficient to merely use traditional wet experimental methods to predict the subcellular localization of these new proteins. Therefore, it is urgent to develop high-throughput computational methods to achieve quick and precise protein subcellular localization predictions. This review summarizes the development of prediction methods for protein subcellular localization over the past decades, expounds on the application of various machine learning methods in this field, and compares the properties and performance of various well-known predictors. The narrative of this review mainly revolves around three main types of methods, namely, the sequence-based methods, the knowledge-based methods, and the fusion methods. A special focus is on the gene ontology (GO)-based methods and the PLoc series methods. Finally, this review looks forward to the future development directions of protein subcellular localization prediction.
Article
Background: Protein subcellular localization prediction plays an important role in biology research. Since traditional methods are laborious and time-consuming, many machine learning-based prediction methods have been proposed. However, most of the proposed methods ignore the evolutionary information of proteins. To improve prediction accuracy, we present a deep learning-based method to predict protein subcellular locations. Results: Our method uses not only the amino acid composition of protein sequences but also their evolution matrices. It combines a bidirectional long short-term memory network that processes the entire protein sequence with a convolutional neural network that extracts features from protein sequences. The position-specific scoring matrix is used as a supplement to the protein sequences. Our method was trained and tested on two benchmark datasets. The experimental results show that our method yields accurate results on the two datasets, with an average precision of 0.7901, a ranking loss of 0.0758, and a coverage of 1.2848. Conclusion: The experimental results show that our method outperforms five currently available methods. These experiments indicate that our method is an acceptable alternative for predicting protein subcellular location.
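The ranking loss and coverage figures quoted in the abstract above are multi-label ranking metrics. The sketch below shows one common way to compute them for a single protein. It is illustrative only and not the authors' evaluation code: the function names, the tie handling, and the convention of counting coverage from 1 (some papers subtract 1) are assumptions.

```python
from typing import Sequence


def ranking_loss(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (true, false) label pairs where the false label
    scores at least as high as the true one (ties count as errors here)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return 0.0  # no pairs to compare
    wrong = sum(1 for p in pos for n in neg if n >= p)
    return wrong / (len(pos) * len(neg))


def coverage(y_true: Sequence[int], scores: Sequence[float]) -> int:
    """How many top-ranked labels are needed to include every true label."""
    relevant = [s for t, s in zip(y_true, scores) if t == 1]
    if not relevant:
        return 0
    lowest = min(relevant)  # score of the worst-ranked true label
    return sum(1 for s in scores if s >= lowest)


# Hypothetical example: four candidate locations, two of them correct.
labels = [1, 0, 1, 0]
predicted = [0.9, 0.8, 0.4, 0.1]
loss = ranking_loss(labels, predicted)
cov = coverage(labels, predicted)
```

Dataset-level values such as those reported in the abstract would then be averages of these per-protein quantities.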
Article
It is well known that DNA-protein binding (DPB) prediction is not only beneficial to understand the regulation mechanism of gene expression but also a challenging task in the field of computational biology. Traditional methods for DPB prediction that depend on manually extracted features may lead to classification errors. Recently, deep learning such as convolutional neural network (CNN) has been successfully applied to classification tasks and improved DPB prediction performance significantly. Yet, these methods are based on the original DNA sequence modeling, ignoring the hidden complex dependency and complementarity between multiple sequence features. In consideration of this problem, we propose a method to fuse different sequence features and analyze them systematically through multi-scale CNN. First, sliding windows of specified lengths are set on distinct DNA sequences to generate multiple sequence features with unequal lengths. Second, multiple feature sequences are fused and encoded for feature representation. Third, multi-scale CNN with different binding motif lengths is used to automatically learn and mine the influence of internal attributes and hidden complex relations between the fusion sequence features and make full use of the complementary advantages of extracted CNN features to predict DPB. When our model is applied to 690 ChIP-seq datasets, it achieves an average AUC of 0.9112, which is significantly better than the latest methods. The results show that our method is effective for DPB prediction and is freely available at http://121.5.71.120/mscDPB/.
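The first step described in the abstract above, sliding windows of several lengths over a DNA sequence, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `sliding_windows` and `multi_scale_features`, the window widths, and the stride of 1 are all assumptions made for the example.

```python
from typing import Dict, List


def sliding_windows(seq: str, width: int, stride: int = 1) -> List[str]:
    """Return every substring of length `width`, stepping by `stride`."""
    if width <= 0 or stride <= 0:
        raise ValueError("width and stride must be positive")
    return [seq[i:i + width] for i in range(0, len(seq) - width + 1, stride)]


def multi_scale_features(seq: str, widths: List[int]) -> Dict[int, List[str]]:
    """Build one window list per width; the lists have unequal lengths."""
    return {w: sliding_windows(seq, w) for w in widths}


# Hypothetical input: a short sequence and two window widths.
features = multi_scale_features("ACGTAC", [2, 3])
```

Each width yields a feature sequence of a different length, which is the property the abstract refers to before the fusion and encoding steps.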