"False positives" predicted by DeepFold.

Source publication
Preprint
Motivation: Understanding the relationship between protein structure and function is a fundamental problem in protein science. Given a protein of unknown function, fast identification of similar protein structures from the Protein Data Bank (PDB) is a critical step for inferring its biological function. Such structural neighbors can provide evolutio...

Context in source publication

Context 1
... addition to visualizing the structural motifs learned by DeepFold, we also examine the "false-positive" hits found by DeepFold. From Figure 7, we observe that the "wrongly" predicted top structures also share substantial structural similarity with the query proteins, indicating that DeepFold is potentially capable of finding remotely related structural neighbors that contain similar motif compositions. ...

Citations

... Protein structure prediction is usually categorized as being either template-based modeling such as DeepFold (J. Li, Cao, and Cheng 2015; Liu et al. 2017), FALCON (Wang et al. 2015), MTMG (J. Li and Cheng 2016), and I-TASSER (Roy, Kucukural, and Zhang 2010); or template-free (ab initio) modeling such as QUARK (Roy, Kucukural, and Zhang 2010; Xu and Zhang 2012) and UniCon3D. ...
Article
Quality Assessment (QA) plays an important role in protein structure prediction. Traditional multi-model QA methods usually rely on searching databases or comparing against other models to make predictions, and they usually fail when poor-quality models dominate the model pool. We propose a novel protein single-model QA method built on a new representation that converts raw atom information into a series of carbon-alpha (Cα) atoms with side-chain information, defined by their dihedral angles and bond lengths to the prior residue. An LSTM network predicts the quality by treating each amino acid as a time-step and taking the final value returned by the LSTM cells. To the best of our knowledge, this is the first attempt to use an LSTM model on the QA problem; furthermore, we use a new representation that has not been studied for QA. In addition to angles, we use sequence properties such as secondary structure, parsed from the protein structure at each time-step without using any database, which differs from all existing QA methods. Our model achieves an overall correlation of 0.651 on the CASP12 testing dataset. Our experiment points out new directions for the QA problem, and our method could be widely used for the protein structure prediction problem. The software is freely available at GitHub: https://github.com/caorenzhi/AngularQA
... Protein structure prediction is usually categorized as being either template-based modeling such as DeepFold (Li, Cao, and Cheng 2015; Liu et al. 2017), FALCON (Wang et al. 2015), MTMG (Li and Cheng 2016), and I-TASSER (Roy, Kucukural, and Zhang 2010); or template-free (ab initio) modeling such as QUARK (Roy, Kucukural, and Zhang 2010; Xu and Zhang 2012) and ...
Preprint
Quality Assessment (QA) plays an important role in protein structure prediction. Traditional protein QA methods depend on searching databases or comparing with other models to make predictions, and they usually fail. We propose a novel protein single-model QA method built on a new representation that converts raw atom information into a series of carbon-alpha (Cα) atoms with side-chain information, defined by their dihedral angles and bond lengths to the prior residue. An LSTM network predicts the quality by treating each amino acid as a time-step and taking the final value returned by the LSTM cells. To the best of our knowledge, this is the first attempt to use an LSTM model on the QA problem; furthermore, we use a new representation that has not been studied for QA. In addition to angles, we use sequence properties such as secondary structure at each time-step, without using any database. Our model achieves an overall correlation of 0.651 on the CASP12 testing dataset. Our experiment points out new directions for the QA problem, and our method could be widely used for the protein structure prediction problem. The software is freely available at GitHub: https://github.com/caorenzhi/AngularQA
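
The angular representation described in the abstract above (each residue defined by a dihedral angle and a bond length to the prior residue) can be illustrated with a minimal sketch. This is not the AngularQA code: the function names `dihedral`, `distance`, and `ca_features` are hypothetical, the sketch covers only Cα coordinates and omits the side-chain information, and padding the first residues with 0.0 is an assumption made here so that every time-step has the same shape.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def distance(p, q):
    """Euclidean distance between two 3D points."""
    d = _sub(p, q)
    return math.sqrt(_dot(d, d))

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) defined by four consecutive points."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b0, b1)          # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)          # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b1) / b1_len
    return math.atan2(y, x)

def ca_features(coords):
    """Per-residue (bond_length, dihedral) pairs from a list of C-alpha coordinates.

    Assumption of this sketch: residues without enough predecessors get 0.0.
    """
    features = []
    for i, point in enumerate(coords):
        length = distance(coords[i - 1], point) if i >= 1 else 0.0
        angle = dihedral(*coords[i - 3:i + 1]) if i >= 3 else 0.0
        features.append((length, angle))
    return features

# Toy coordinates (hypothetical, not taken from a real structure).
toy_chain = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_features = ca_features(toy_chain)
```

Each tuple in `toy_features` would correspond to one time-step of a sequence model such as the LSTM mentioned in the abstract; how the actual method encodes side chains and secondary structure is not specified here.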