Ben Shor's research works | Hebrew University of Jerusalem and other places

Integrative modeling meets deep learning: Recent advances in modeling protein assemblies

Literature Review

May 2024

11 Reads

Current Opinion in Structural Biology

Ben Shor

Dina Schneidman

CombFold filtering visualization
For each assembly tree, in each step, CombFold joins two previously assembled subcomplexes into many new subcomplexes by applying input transformations between pairs of subunits. The new subcomplexes are then filtered to discard suboptimal ones. The first filter discards subcomplexes that exceed a threshold of allowed steric clashes between amino acids of different subunits; in this example, the threshold is 5%. The second filter discards subcomplexes that do not satisfy enough of the distance restraints present in the subcomplex; here the threshold is 70%. The last filter scores each subcomplex based on the scores of the transformations used and the satisfaction rate of the distance restraints.
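The filtering cascade in this caption can be sketched as follows. This is a minimal illustration and not CombFold's actual code: the `Subcomplex` fields, the reading of the thresholds (at most 5% clashes, at least 70% of restraints satisfied), the product used as the final score, and the `keep` parameter are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subcomplex:
    # Hypothetical stand-ins for values an assembly step would compute.
    name: str
    clash_fraction: float        # fraction of residues clashing across subunits
    restraints_satisfied: float  # fraction of applicable distance restraints satisfied
    transformation_score: float  # combined score of the pairwise transformations used


def filter_subcomplexes(
    candidates: List[Subcomplex],
    max_clash: float = 0.05,
    min_restraints: float = 0.70,
    keep: int = 2,
) -> List[Subcomplex]:
    # Filter 1: discard candidates with too many steric clashes.
    survivors = [c for c in candidates if c.clash_fraction <= max_clash]
    # Filter 2: discard candidates that satisfy too few distance restraints.
    survivors = [c for c in survivors if c.restraints_satisfied >= min_restraints]
    # Filter 3: rank by an assumed combined score and keep the best few.
    survivors.sort(
        key=lambda c: c.transformation_score * c.restraints_satisfied,
        reverse=True,
    )
    return survivors[:keep]


candidates = [
    Subcomplex("a", 0.02, 0.90, 80.0),
    Subcomplex("b", 0.08, 1.00, 95.0),   # fails the clash filter
    Subcomplex("c", 0.01, 0.50, 90.0),   # fails the restraint filter
    Subcomplex("d", 0.04, 0.75, 100.0),
    Subcomplex("e", 0.00, 0.80, 50.0),
]
print([c.name for c in filter_subcomplexes(candidates)])  # ['d', 'a']
```

Applying the cheap hard filters before the ranking step keeps the number of scored candidates small, which matters when each join produces many new subcomplexes.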

Heteromeric benchmark datasets
Heteromeric complexes (colored by chain) from (a) Benchmark 1 and (b) Benchmark 2.

Accuracy of CombFold on Benchmark 2
(a) The Top-N (N = 1, 5, 10) success rate of CombFold (blue), AFMv3 (orange), and RosettaFold2 (green). (b) TM-score of AFMv3 models vs. CombFold models for Top-5 results. (c) The Top-N (N = 1, 5, 10) success rate of CombFold (blue), CombFold with crosslinks (turquoise), AlphaLink (purple), and HADDOCK (brown). (d) TM-score of CombFold models with crosslinks vs. without crosslinks for Top-1 results. (e) Interface contact similarity (ICS) of CombFold vs. AFMv3 for the Top-1 model. (f) Comparison of PRODIGY-predicted dissociation constants for interfaces of experimental structures vs. interfaces of structure models generated by CombFold. The Spearman correlation is 0.55. (g) Distributions of clashscores calculated using MolProbity for interfaces in CombFold output models (left, N = 17) and in the same models after relaxation (right, N = 17). Error bars indicate maxima, mean, and minima from top to bottom, respectively.
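The Top-N success rate shown in panels (a) and (c) can be computed along these lines. This is a minimal sketch, assuming a complex counts as a success when at least one of its N highest-ranked models exceeds a TM-score cutoff (0.7 is used as the default, matching the acceptable-quality threshold quoted elsewhere on this page). The function name and the toy scores are hypothetical.

```python
from typing import Dict, List


def top_n_success_rate(
    ranked_tm_scores: Dict[str, List[float]], n: int, threshold: float = 0.7
) -> float:
    """Fraction of complexes with at least one model above `threshold`
    among the first `n` models (lists are ordered best-ranked first)."""
    if not ranked_tm_scores:
        raise ValueError("no complexes given")
    hits = sum(
        1
        for scores in ranked_tm_scores.values()
        if any(s > threshold for s in scores[:n])
    )
    return hits / len(ranked_tm_scores)


# Toy data: TM-scores of models in ranked order, per complex (made up).
toy = {
    "complexA": [0.85, 0.60],
    "complexB": [0.50, 0.65, 0.75],
    "complexC": [0.30, 0.40],
    "complexD": [0.60, 0.72],
}
for n in (1, 2, 5):
    print(n, top_n_success_rate(toy, n))
```

Because the models are taken in their predicted rank order, the rate can only stay the same or increase as N grows.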

Accuracy of pairwise predictions for AFMv2 and AFMv3
DockQ scores of pairwise interactions predicted by AFM on Benchmark 1 (AFMv2, N = 469) and Benchmark 2 (AFMv3, N = 445), for which the PAE-based score is over 50. The median score is 0.70 and 0.78 for AFMv2 and AFMv3, respectively. Error bars indicate maxima, mean, and minima from top to bottom respectively.

Modeling the human Elongator holoenzyme complex
(a) CombFold prediction for the human Elongator holoenzyme complex. (b) Part of the complex structure in yeast, as determined by Cryo-EM (PDB 8ASV). (c) The interface between Elp1 (green) and Elp2 (orange) with a likely pathogenic mutation P914L in Elp1 is depicted as sticks (red). (d) The interface between Elp4 (light blue) and Elp6 (sky blue) with a pathogenic mutation R289W in Elp4 is depicted as sticks (red).

CombFold: predicting structures of large protein assemblies using a combinatorial assembly algorithm and AlphaFold2

February 2024

63 Reads

7 Citations

Nature Methods

Ben Shor

Dina Schneidman

Deep learning models, such as AlphaFold2 and RosettaFold, enable high-accuracy protein structure prediction. However, large protein complexes are still challenging to predict due to their size and the complexity of interactions between multiple subunits. Here we present CombFold, a combinatorial and hierarchical assembly algorithm for predicting structures of large protein complexes utilizing pairwise interactions between subunits predicted by AlphaFold2. CombFold accurately predicted (TM-score >0.7) 72% of the complexes among the top-10 predictions in two datasets of 60 large, asymmetric assemblies. Moreover, the structural coverage of predicted complexes was 20% higher compared to corresponding Protein Data Bank entries. We applied the method on complexes from Complex Portal with known stoichiometry but without known structure and obtained high-confidence predictions. CombFold supports the integration of distance restraints based on crosslinking mass spectrometry and fast enumeration of possible complex stoichiometries. CombFold’s high accuracy makes it a promising tool for expanding structural coverage beyond monomeric proteins.

Impact of AlphaFold on structure prediction of protein complexes: The CASP15‐CAPRI experiment

Article

October 2023

194 Reads

23 Citations

Proteins Structure Function and Bioinformatics

Marc Ferdinand Lensink

Guillaume Brysbaert

Nessim Raouraoua

[...]

Shoshana J. Wodak

We present the results for CAPRI Round 54, the 5th joint CASP‐CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homo‐trimers, 13 heterodimers including 3 antibody–antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21 941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High‐quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2‐Multimer software and the confidence metrics they provide. Notably, expanded sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integration of advanced modeling tools, enabled top performing groups to exceed the performance of a standard AlphaFold2‐Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.

Impact of AlphaFold on Structure Prediction of Protein Complexes: The CASP15-CAPRI Experiment

Preprint

July 2023

219 Reads

2 Citations

Marc Ferdinand Lensink

Guillaume Brysbaert

Nessim Raouraoua

[...]

Shoshana J. Wodak

We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homo-dimers, 3 homo-trimers, 13 hetero-dimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their 5 best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier, a remarkable improvement resulting from the wide use of the AlphaFold2 and AlphaFold-Multimer software. Creative use was made of the deep learning inference engines, affording the sampling of a much larger number of models and enriching the multiple sequence alignments with sequences from various sources. Wide use was also made of the AlphaFold confidence metrics to rank models, permitting top performing groups to exceed the results of the public AlphaFold-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.

Figure 1. The three stages of the CombFold assembly algorithm. The input is the sequences of the subunits in the complex. (1) Structure prediction of all pairwise and some larger subunit subsets using AFM. (2) Selection of representative subunit structures out of all predicted structures, followed by computation of all pairwise transformations present in predicted structures relative to the representative structures. (3) Combinatorial and hierarchical assembly of subunit structures using the computed pairwise transformations. In each iteration, new subcomplexes are assembled using a pairwise transformation to join two previously created subcomplexes.
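A single join from stage (3) can be sketched as follows. This is an illustrative sketch only, not the CombFold implementation: it assumes each subcomplex is stored as a mapping from subunit name to a rigid pose (a rotation matrix and a translation) that places the representative structure in the subcomplex frame, and that a pairwise transformation `t_ij` places representative subunit j relative to representative subunit i. The helper names are hypothetical, and the clash and restraint filtering of the resulting subcomplexes is omitted.

```python
from typing import Dict, List, Tuple

Transform = Tuple[List[List[float]], List[float]]  # (3x3 rotation, translation)


def compose(a: Transform, b: Transform) -> Transform:
    """Return the transform equivalent to applying b first, then a."""
    ra, ta = a
    rb, tb = b
    r = [[sum(ra[i][k] * rb[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = [sum(ra[i][k] * tb[k] for k in range(3)) + ta[i] for i in range(3)]
    return r, t


def invert(a: Transform) -> Transform:
    """Invert a rigid transform (the inverse of a rotation is its transpose)."""
    ra, ta = a
    rt = [[ra[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(rt[i][k] * ta[k] for k in range(3)) for i in range(3)]
    return rt, t


def join(sub_a: Dict[str, Transform], sub_b: Dict[str, Transform],
         i: str, j: str, t_ij: Transform) -> Dict[str, Transform]:
    """Merge sub_b into the frame of sub_a using the pairwise transform i -> j."""
    if set(sub_a) & set(sub_b):
        raise ValueError("subcomplexes must not share subunits")
    # Pose that subunit j must take in the frame of sub_a.
    target_j = compose(sub_a[i], t_ij)
    # Rigid motion that carries all of sub_b so that j lands on target_j.
    b_to_a = compose(target_j, invert(sub_b[j]))
    merged = dict(sub_a)
    for name, pose in sub_b.items():
        merged[name] = compose(b_to_a, pose)
    return merged


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sub_a = {"A": (IDENTITY, [0.0, 0.0, 0.0])}
sub_b = {"B": (IDENTITY, [5.0, 0.0, 0.0]), "C": (IDENTITY, [5.0, 3.0, 0.0])}
t_ab = (IDENTITY, [10.0, 0.0, 0.0])  # toy pairwise transformation A -> B
merged = join(sub_a, sub_b, "A", "B", t_ab)
print(merged["B"][1], merged["C"][1])
```

Moving the whole of `sub_b` with one rigid motion preserves the internal arrangement that was fixed in earlier iterations, which is what makes the assembly hierarchical.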

Figure 2. Accuracy of CombFold on Benchmark 1. (a) The Top-N (N=1, 5, 10) success rate of CombFold (blue) and AFM (orange). AFM only produces 5 predictions. (b) Predicted confidence vs. the TM-score for CombFold. (c) Success rate of AFM in producing pairwise interactions as measured by the pairwise connectivity vs. the TM-score of the models produced by CombFold. (d) TM-score of AFM models vs. CombFold models. (e) eIF2B:eIF2

Figure 3. Accuracy of CombFold on Benchmark 3. (a) The Top-N (N=1, 5, 10) success rate of CombFold (blue) and MoLPC (orange). (b) Top-1 success rate for homomers and heteromers. (c) TM-score comparison for CombFold and MoLPC. (d) Predicted confidence vs. the TM-score for CombFold. (e) The number of complex amino acids vs. the Top-1 TM-score. (f) The success rate of AFM in producing pairwise interactions as measured by the pairwise connectivity vs. the TM-score. (g) High-quality model of F1-ATPase (top) vs. the x-ray structure (bottom). The CombFold prediction contains 159 additional amino acids that are not modeled in the x-ray structure, providing full structural coverage. (h) Acceptable-quality model of Erwinia ligand-gated ion channel in complex with nanobodies (top) vs. x-ray structure (bottom). The channel is accurately modeled; however, the location of the nanobodies is incorrect. (i) Incorrect model of zinc resistance-associated protein from Salmonella enterica (top) vs. x-ray structure (bottom).

Figure 5. Modeling the human Elongator holoenzyme complex. (a) CombFold prediction for the human Elongator holoenzyme complex. (b) Part of the complex structure in yeast, as determined by Cryo-EM (PDB 8ASV). (c) The interface between Elp1 (green) and Elp2 (orange) with a likely pathogenic mutation P914L in Elp1 is depicted as sticks (red). (d) The interface between Elp4 (light blue) and Elp6 (sky blue) with a pathogenic mutation R289W in Elp4 is depicted as sticks (red).

Figure 6. Stoichiometry prediction using CombFold. (a) A structure of mitochondrial ATP synthase with bound native cardiolipin (PDB 6TDX). Circled is a symmetrical structure formed from 10 copies of subunit c. (b) CombFold predicted confidence as a function of the number of copies of subunit c. (c) A structure of PelC dodecamer (PDB 5T11). (d) CombFold predicted confidence for PelC dodecamer as a function of the number of copies in input stoichiometry.
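The scan over copy numbers in panels (b) and (d) can be sketched as below. This is a hedged illustration: it assumes that the copy number with the highest predicted confidence is taken as the predicted stoichiometry, which the caption does not state explicitly. `predict_confidence` is a hypothetical callable standing in for a full assembly run, and the confidence values are made up.

```python
from typing import Callable, Iterable, Tuple


def best_copy_number(
    predict_confidence: Callable[[int], float], copy_numbers: Iterable[int]
) -> Tuple[int, float]:
    """Return (copy number, confidence) with the highest predicted confidence."""
    scored = [(n, predict_confidence(n)) for n in copy_numbers]
    if not scored:
        raise ValueError("no copy numbers to evaluate")
    # max() keeps the first entry on ties, i.e. the first copy number scanned.
    return max(scored, key=lambda pair: pair[1])


# Made-up confidence values for illustration only.
toy_confidence = {8: 55.0, 9: 61.0, 10: 78.0, 11: 64.0, 12: 58.0}
print(best_copy_number(toy_confidence.__getitem__, range(8, 13)))  # (10, 78.0)
```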

Predicting structures of large protein assemblies using combinatorial assembly algorithm and AlphaFold2

May 2023

77 Reads

2 Citations

Ben Shor

Dina Schneidman

Deep learning models, such as AlphaFold2 and RosettaFold, enable high-accuracy protein structure prediction. However, large protein complexes are still challenging to predict due to their size and the complexity of interactions between multiple subunits. Here we present CombFold, a combinatorial and hierarchical assembly algorithm for predicting structures of large protein complexes utilizing pairwise interactions between subunits predicted by AlphaFold2. CombFold accurately predicted (TM-score > 0.7) 72% of the complexes among the Top-10 predictions in two datasets of 60 large, asymmetric assemblies. Moreover, the structural coverage of predicted complexes was 20% higher compared to corresponding PDB entries. We applied the method on complexes from Complex Portal with known stoichiometry but without known structure and obtained high-confidence predictions. CombFold supports the integration of distance restraints based on crosslinking mass spectrometry and fast enumeration of possible complex stoichiometries. CombFold’s high accuracy makes it a promising tool for expanding structural coverage beyond monomeric proteins.

... Two monomer protein structures with a TM-score > 0.5 are considered to have the same topology [71]. For complex models, they are considered acceptable quality if the TM-score is above 0.7 and high quality if the TM-score is above 0.8 [72]. When the reference structure is unknown, the predicted TM-score (pTM) [27] derived from AF2 assumes the existence of a distribution of probable structures and uses the pairwise error matrix to find the expected value of the TM-score for the predicted structure. ...
Reference:
Recent advances and challenges in protein complex model accuracy estimation
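The TM-score cutoffs quoted in this excerpt translate directly into a small classifier. A minimal sketch: the 0.7 and 0.8 thresholds come from the excerpt, while the function name and the "incorrect" label for everything at or below 0.7 are assumptions for illustration.

```python
def classify_complex_model(tm_score: float) -> str:
    """Label a complex model by its TM-score against the reference structure."""
    if not 0.0 <= tm_score <= 1.0:
        raise ValueError("TM-score must lie in [0, 1]")
    if tm_score > 0.8:
        return "high"
    if tm_score > 0.7:
        return "acceptable"
    return "incorrect"  # assumed label for models at or below the 0.7 cutoff


for score in (0.65, 0.75, 0.92):
    print(score, classify_complex_model(score))
```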

CombFold: predicting structures of large protein assemblies using a combinatorial assembly algorithm and AlphaFold2

Citing Article
February 2024

Nature Methods

Ben Shor

Dina Schneidman

... CASP (Critical Assessment of Techniques for Protein Structure Prediction) and CAPRI (Critical Assessment of PRedicted Interactions) are two worldwide experiments that rigorously test computational methods of predicting protein complex structures and estimate their accuracy. The latest competitions [33] offer widely accepted measures for assessing the overall (global) structural quality, interface quality, and local structural quality of predicted complex structures with respect to their true structures, as well as for evaluating the performance of EMA methods of estimating/predicting the accuracy of predicted complex structures, as discussed below. ...
Reference:
A Survey of Deep Learning Methods for Estimating the Accuracy of Protein Quaternary Structure Models

Impact of AlphaFold on structure prediction of protein complexes: The CASP15‐CAPRI experiment

Citing Article
October 2023

Proteins Structure Function and Bioinformatics

Marc Ferdinand Lensink

Guillaume Brysbaert

Nessim Raouraoua

[...]

Shoshana J. Wodak

... Complexes with weak evolutionary signals, lacking structural templates, or assembled from many heterogeneous subunits and containing over 1800 residues proved particularly challenging (Akdel et al., 2022; Ozden et al., 2023). These shortcomings were in part alleviated by optimizing MSA construction (Ozden et al., 2023), such as done in AFProfile (Bryant and Noé, 2023) and ESMPair; increasing the number and diversity of predicted complex structures, for example, by increasing the number of iterations through the AF2-Multimer network (recycles) or randomly disabling neurons (dropout) (Johansson-Akhe and Wallner, 2022; Wallner, 2023); and assembling higher order complexes from smaller interacting subcomponents individually predicted by AF2-Multimer, such as done in MolPC, CombFold (Shor and Schneidman-Duhovny, 2023), and the method developed by Jeppesen and André. However, none of these strategies individually yielded acceptable structures for all test complexes. ...
Reference:
Growing ecosystem of deep learning methods for modeling protein–protein interactions

Predicting structures of large protein assemblies using combinatorial assembly algorithm and AlphaFold2

Citing Preprint
May 2023

Ben Shor

Dina Schneidman

Publications (5)

Citations (3)