Figure - available from: Journal of Computer-Aided Molecular Design
Ring bending for elaboration of ring system flexibility: initial 3D structure generation produced a reasonable conformer for tetracycline (upper left); ring bends are identified among atoms of a ring system according to rules, with an example for cyclohexane shown (middle left); iterative application of the bends identifies new ring conformations effectively (bottom left). Ring twisting for macrocyclic search: the initial structure of cyclodecane (upper right) is shown with four atoms marked; those atoms seen through the 2–3 axis (middle right) are pushed through a twisting motion where atom 4 is forced around the axis; iterative application of this strategy results in an effective enumeration of ring conformers for cyclodecane
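The twisting motion in the right-hand panels can be pictured as rotating one marked atom about the axis defined by two others. The following is a minimal geometric sketch of that idea only, assuming rigid rotation via Rodrigues' formula on hypothetical coordinates. The caption describes atoms being pushed through the motion, which this sketch does not model: there is no force field here and no relaxation of neighboring atoms. The function name and the example coordinates are illustrative assumptions, not part of the published method.

```python
import math

def rotate_about_axis(point, axis_start, axis_end, angle_deg):
    """Rotate `point` about the line axis_start -> axis_end (Rodrigues' formula)."""
    # Unit vector along the rotation axis (e.g. the 2-3 bond direction).
    axis = [e - s for s, e in zip(axis_start, axis_end)]
    norm = math.sqrt(sum(c * c for c in axis))
    k = [c / norm for c in axis]
    # Express the point relative to a point on the axis.
    v = [p - s for p, s in zip(point, axis_start)]
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    # Shift back to the original coordinate frame.
    return [r + s for r, s in zip(rotated, axis_start)]

# Hypothetical placement: atoms 2 and 3 lie on the z axis, atom 4 sits off-axis.
atom2, atom3, atom4 = [0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]
twisted = rotate_about_axis(atom4, atom2, atom3, 90.0)  # approximately [0.0, 1.0, 0.0]
```

A rigid rotation like this preserves the distance of the moved atom from the axis, so in practice it would only serve as a starting displacement before any energy-based refinement.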

Source publication
Article
Full-text available
ForceGen is a template-free, non-stochastic approach for 2D to 3D structure generation and conformational elaboration for small molecules, including both non-macrocycles and macrocycles. For conformational search of non-macrocycles, ForceGen is both faster and more accurate than the best of all tested methods on a very large, independently curated...

Citations

... An additional challenge here is the presence of flexible macrocyclic ligands. Over the past several years, methods for computational modeling of macrocyclic ligands have made significant progress [2][3][4][5][6][7]. In particular, natural-product based and semisynthetic macrocycles of up to roughly 21-23 total rotatable bonds (including both macrocyclic bonds and exocyclic bonds) have been shown to be tractable, in terms of accuracy and speed of conformational search when utilizing multiple computing-cores [7]. However, larger peptidic macrocycles remain challenging, often requiring biophysical data (e.g. from NMR) to help restrain the conformational space to be explored [8]. Generally, the macrocycles studied here fell well within the tractable range of the ForceGen methodology [7]. ...
Article
Full-text available
Scaffold replacement as part of an optimization process that requires maintenance of potency, desirable biodistribution, metabolic stability, and considerations of synthesis at very large scale is a complex challenge. Here, we consider a set of over 1000 time-stamped compounds, beginning with a macrocyclic natural-product lead and ending with a broad-spectrum crop anti-fungal. We demonstrate the application of the QuanSA 3D-QSAR method employing an active learning procedure that combines two types of molecular selection. The first identifies compounds predicted to be most active of those most likely to be well-covered by the model. The second identifies compounds predicted to be most informative based on exhibiting low predicted activity but showing high 3D similarity to a highly active nearest-neighbor training molecule. Beginning with just 100 compounds, using a deterministic and automatic procedure, five rounds of 20-compound selection and model refinement identify the binding metabolic form of florylpicoxamid. We show how iterative refinement broadens the domain of applicability of the successive models while also enhancing predictive accuracy. We also demonstrate how a simple method requiring very sparse data can be used to generate relevant ideas for synthetic candidates.
... Such a uniform treatment of the data set may become inaccurate when dealing with subtle details and/or local structure/conformation effects playing a significant role in aggregate formation. With these limitations in mind, Jain et al. (2019) optimized a MARTINI model to explain the strikingly different aggregation dynamics and morphologies of the closely related pentapeptides FLPLF and FLGLF. They introduced information on local peptide behavior extracted from previous all-atom simulations, applied supportive dihedral angles, adjusted some bead types to finely tune solvent/peptide interactions, and introduced an additional bond between two peptides to mimic a dimer. ...
Chapter
Self-assembling peptides bear tremendous potential in the fields of material sciences, nanoscience, and medicine. In contrast to the popular building blocks used in supramolecular chemistry, which exploit rigid molecular structures with defined geometry, peptides are highly flexible. This feature renders the prediction of their most stable conformations and self-assembly ability, as well as an understanding of the mechanism behind aggregation, more challenging for experimental techniques. In this context, in silico techniques have progressed at a fast pace to provide highly valuable tools to study, predict, and visualize peptides’ behavior and their dynamics to assist with their design. In this chapter, we will provide an overview of popular computational techniques used to investigate the self-assembly of peptides and peptide-containing molecules. Together with the applications, we will briefly discuss the pros and cons of these methodologies and conclude with a perspective on the future directions that this exciting field can lead to.
Keywords: Self-assembly; Peptides; Peptide amphiphiles; Nanostructures; Chirality; Molecular modeling; Molecular dynamics; Coarse grain; Machine learning
... Over the past several years, methods for computational modeling of macrocyclic ligands have made significant progress [4][5][6][7][8][9]. In particular, natural-product based and semi-synthetic macrocycles of up to roughly 21-23 total rotatable bonds (including both macrocyclic bonds and exocyclic bonds) have been shown to be tractable, in terms of accuracy and speed of conformational search when utilizing multiple computing-cores [9]. However, larger peptidic macrocycles remain challenging, especially in cases where "ladders" of trans-annular hydrogen bonds do not form stabilizing networks. ...
... For comparison, the examples shown in Fig. 1 each have 60 or more total rotatable bonds, well beyond the tractable range without biophysical data to reduce the search space. Recently, we have shown how distance and dihedral restraints derived from NMR measurements can be used to elucidate low-energy solution ensembles for peptidic macrocycles [9][10][11]. Figure 2 illustrates how a preferred macrocycle conformation can be derived from either NMR-restrained conformational search [9] or from X-ray crystallography coupled with careful refinement of the bound macrocycle coordinates [12,13]. In many cases, obtaining an X-ray co-crystal structure of sufficient quality can be insurmountable. ...
Article
Full-text available
Systematic optimization of large macrocyclic peptide ligands is a serious challenge. Here, we describe an approach for lead-optimization using the PD-1/PD-L1 system as a retrospective example of moving from initial lead compound to clinical candidate. We show how conformational restraints can be derived by exploiting NMR data to identify low-energy solution ensembles of a lead compound. Such restraints can be used to focus conformational search for analogs in order to accurately predict bound ligand poses through molecular docking and thereby estimate ligand strain and protein-ligand intermolecular binding energy. We also describe an analogous ligand-based approach that employs molecular similarity optimization to predict bound poses. Both approaches are shown to be effective for prioritizing lead-compound analogs. Surprisingly, relatively small ligand modifications, which may have minimal effects on predicted bound pose or intermolecular interactions, often lead to large changes in estimated strain that have dominating effects on overall binding energy estimates. Effective macrocyclic conformational search is crucial, whether in the context of NMR-based restraints, X-ray ligand refinement, partial torsional restraint for docking/ligand-similarity calculations or agnostic search for nominal global minima. Lead optimization for peptidic macrocycles can be made more productive using a multi-disciplinary approach that combines biophysical data with practical and efficient computational methods.
... Global strain is calculated based on the difference between the bound-state energy (from the energy-surrogate calculation above) and the unbound-state minimum energy, which is the global minimum energy from an exhaustive conformational search of the ligand. This is calculated using the ForceGen conformational search method, which has been previously described. 34,35 For small, drug-like molecules, the -pquant level of conformational elaboration is likely to be sufficient to identify global minima in the vast majority of cases, based on the greater than 90% success rate of identifying close-to-crystallographic conformers (≤1.0 Å RMSD) beginning from random starting conformations. However, particularly for large, peptidic macrocycles, we adopted an iterative approach to conformational search in order to better ensure adequate sampling. ...
Article
Full-text available
The internal conformational strain incurred by ligands upon binding a target site has a critical impact on binding affinity, and expectations about the magnitude of ligand strain guide conformational search protocols. Estimates for bound ligand strain begin with modeled ligand atomic coordinates from X-ray co-crystal structures. By deriving low-energy conformational ensembles to fit X-ray diffraction data, calculated strain energies are substantially reduced compared with prior approaches. We show that the distribution of expected global strain energy values is dependent on molecular size in a superlinear manner. The distribution of strain energy follows a rectified normal distribution whose mean and variance are related to conformational complexity. The modeled strain distribution closely matches calculated strain values from experimental data comprising over 3000 protein-ligand complexes. The distributional model has direct implications for conformational search protocols as well as for directions in molecular design.
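The citing passage above defines global strain as the bound-state energy minus the global minimum energy found by conformational search, and reports success as the fraction of cases with a conformer within 1.0 Å RMSD of the crystallographic one. A minimal sketch of both calculations follows. The function names and the example energies are hypothetical, and the energies are assumed to be in a single consistent unit (for example kcal/mol) from the same energy model.

```python
def global_strain(bound_energy, search_energies):
    """Strain = bound-state energy minus the lowest energy found in the search."""
    if not search_energies:
        raise ValueError("conformational search produced no conformers")
    return bound_energy - min(search_energies)

def fraction_within_rmsd(best_rmsds, threshold=1.0):
    """Fraction of molecules whose closest conformer is within `threshold` (in Å)."""
    hits = sum(1 for r in best_rmsds if r <= threshold)
    return hits / len(best_rmsds)

# Hypothetical values for illustration only.
strain = global_strain(-10.0, [-12.5, -15.0, -11.0])   # 5.0
success = fraction_within_rmsd([0.4, 0.9, 1.0, 1.6])   # 0.75
```

The strain estimate is only as good as the search: if the search misses the true global minimum, the computed strain is an underestimate.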
... 30 Since then, other approaches have been explored to combine DG with NOE data as well as other experimental data. 24,31,32 In this study, we use the ETKDG 33 (experimental torsion knowledge distance geometry) conformer generator as implemented in the popular cheminformatics package RDKit 34 to combine it with NOE-derived interproton distances. ETKDG is a DG-type method, which incorporates experimental torsional preferences from crystal structures, resulting in good performance. ...
Article
Full-text available
Nuclear magnetic resonance (NMR) data from NOESY (nuclear Overhauser enhancement spectroscopy) and ROESY (rotating frame Overhauser enhancement spectroscopy) experiments can easily be combined with distance geometry (DG) based conformer generators by modifying the molecular distance bounds matrix. In this work, we extend the modern DG based conformer generator ETKDG, which has been shown to reproduce experimental crystal structures from small molecules to large macrocycles well, to include NOE-derived interproton distances. In noeETKDG, the experimentally derived interproton distances are incorporated into the distance bounds matrix as loose upper (or lower) bounds to generate large conformer sets. Various subselection techniques can subsequently be applied to yield a conformer bundle that best reproduces the NOE data. The approach is benchmarked using a set of 24 (mostly) cyclic peptides for which NOE-derived distances as well as reference solution structures obtained by other software are available. With respect to other packages currently available, the advantages of noeETKDG are its speed and that no prior force-field parametrization is required, which is especially useful for peptides with unnatural amino acids. The resulting conformer bundles can be further processed with the use of structural refinement techniques to improve the modeling of the intramolecular nonbonded interactions. The noeETKDG code is released as a fully open-source software package available at www.github.com/rinikerlab/customETKDG.
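The core idea in this abstract, tightening entries of a distance bounds matrix with NOE-derived distances applied as loose upper bounds, can be sketched without any cheminformatics library. The following is an illustration under stated assumptions, not the noeETKDG implementation: the function name, the `slack` parameter, and the example numbers are hypothetical, and the matrix layout (upper bounds above the diagonal, lower bounds below it) is a convention chosen for this sketch.

```python
def apply_noe_upper_bounds(bounds, restraints, slack=0.5):
    """Return a copy of `bounds` with NOE distances applied as loose upper bounds.

    bounds[i][j] with i < j holds the upper bound for the pair (i, j);
    bounds[j][i] holds the lower bound. Each restraint is (i, j, distance).
    """
    new_bounds = [row[:] for row in bounds]  # leave the input untouched
    for i, j, distance in restraints:
        lo_idx, hi_idx = min(i, j), max(i, j)
        upper = distance + slack                 # "loose": pad the measured distance
        lower = new_bounds[hi_idx][lo_idx]
        upper = max(upper, lower)                # never cross the existing lower bound
        if upper < new_bounds[lo_idx][hi_idx]:   # only ever tighten, never loosen
            new_bounds[lo_idx][hi_idx] = upper
    return new_bounds

# Hypothetical three-atom bounds matrix (distances in Å).
bounds = [
    [0.0, 10.0, 8.0],
    [1.8, 0.0, 6.0],
    [2.0, 1.5, 0.0],
]
tightened = apply_noe_upper_bounds(bounds, [(0, 2, 3.0)])
# tightened[0][2] is now 3.5; the other entries are unchanged.
```

Clamping against the lower bound keeps the matrix consistent for each pair, although a real pipeline would typically also re-smooth the matrix (triangle inequality) after modification.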
... The ForceGen methodology has been detailed previously. 38,39 The QuanSA initialization procedure (init) automatically builds multiple initial alignments, but it can be influenced by user knowledge and guidance. The -clknown parameter specifies a set of known poses for competitive ligands, in this case an alignment of six crystallographic ligands. ...
Article
Full-text available
We present results on the extent to which physics-based simulation (exemplified by FEP⁺) and focused machine learning (exemplified by QuanSA) are complementary for ligand affinity prediction. For both methods, predictions of activity for LFA-1 inhibitors from a medicinal chemistry lead optimization project were accurate within the applicable domain of each approach. A hybrid model that combined predictions by both approaches by simple averaging performed better than either method, with respect to both ranking and absolute pKi values. Two publicly available FEP⁺ benchmarks, covering 16 diverse biological targets, were used to test the generality of the synergy. By identifying training data specifically focused on relevant ligands, accurate QuanSA models were derived using ligand activity data known at the time of the original series publications. Results across the 16 benchmark targets demonstrated significant improvements both for ranking and for absolute pKi values using hybrid predictions that combined the FEP⁺ and QuanSA predicted affinity values. The results argue for a combined approach for affinity prediction that makes use of physics-driven methods as well as those driven by machine learning, each applied carefully on appropriate compounds, with hybrid prediction strategies being employed where possible.
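The hybrid model described here combines the two methods' predictions by simple averaging. A minimal sketch of that step is given below, assuming each method's predictions arrive as a mapping from a ligand identifier to a predicted pKi. The function name and the ligand names and values are hypothetical, and restricting the average to ligands predicted by both methods is a design choice of this sketch rather than something stated in the abstract.

```python
def hybrid_pki(fep_predictions, quansa_predictions):
    """Average two sets of predicted pKi values for ligands present in both."""
    shared = fep_predictions.keys() & quansa_predictions.keys()
    return {
        ligand: (fep_predictions[ligand] + quansa_predictions[ligand]) / 2.0
        for ligand in shared
    }

# Hypothetical predictions for illustration only.
fep = {"lig_a": 7.0, "lig_b": 6.0}
quansa = {"lig_a": 8.0, "lig_b": 5.0, "lig_c": 9.0}
combined = hybrid_pki(fep, quansa)  # {"lig_a": 7.5, "lig_b": 5.5}
```

Ligands covered by only one method are dropped here; a practical workflow might instead fall back to the single available prediction when a compound lies outside the other method's applicable domain.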
... using the DOCK 6.7 programs (http://dock.compbio.ucsf.edu/DOCK_6/index.html; Lang et al., 2015), Surflex-Dock 4.5 (https://www.biopharmics.com/; Jain et al., 2019) and Gold 5.1 (https://www.ccdc.cam.ac.uk/solutions/csd-discovery/components/gold, ...
Article
Sickle cell disease (SCD) results from a mutation in the globin portion of hemoglobin, caused by the replacement of adenine by thymine in a codon of the β-globin gene. In Brazil, SCD affects about 0.3% of the black and Caucasian population. To date, there is no specific treatment, and the available drugs have several serious adverse effects, which makes the search for new drugs an urgent need. The use of computational techniques can accelerate the drug development process by prioritizing molecules with affinity for essential targets. The adenosine A2b receptor (rA2b) has been studied in SCD because of its relationship with the red blood cell concentration of 2,3-diphosphoglycerate, which reduces the affinity of hemoglobin for oxygen (O2) and facilitates its availability to the tissues. Thus, the development of rA2b antagonists could be helpful for the treatment of SCD. However, there is still no 3D structure of rA2b, and homology modeling must be applied to overcome this limitation. In this scenario, this study aims to build a suitable 3D model of rA2b with SWISS MODEL and to evaluate the structural aspects of rA2b with known antagonists that may be useful for the identification of new potential antagonists, using molecular dynamics in a lipid bilayer environment with GROMACS 5.1.4. The complexes with antagonists ZINC223070016 and ZINC17974526 interacted with key residues through hydrophobic contacts and hydrogen bonds, which stabilized them at the rA2b binding site. This intermolecular profile can contribute to the development of more potent rA2b antagonists. Communicated by Ramaswamy H. Sarma
... Here, we used an iterative implementation of ForceGen to locate the global minimum solution-state conformation as it is uniquely positioned to handle macrocyclic peptides by non-stochastically searching the conformational space with physical movements. 60,61 For bound-state ligand conformations, atomic coordinates from X-ray crystallographic models generally require some type of re-refinement to overcome small coordinate deviations that lead to erroneously high force field-based energy values. 62 Historically, ligand heavy atoms have been restrained to their original positions with a square-welled quadratic positional penalty. ...
... The ForceGen and xGen methods employ a variant of the MMFF94s force field, typically applied using a dielectric constant consistent with an aqueous solution (80.0). 60,61 Because the macrocyclic search procedure makes use of force ...
... Blue solid lines are for the macrocyclic peptide data set, green dotted lines are for the non-peptidic macrocycle data set, and purple dashed lines are for the small molecule data set. All results are obtained from the electron density fitting approach (xGen). ...
Article
Full-text available
Macrocyclic peptides are an important modality in drug discovery, but molecular design is limited due to the complexity of their conformational landscape. To better understand conformational propensities, global strain energies were estimated for 156 protein-macrocyclic peptide cocrystal structures. Unexpectedly large strain energies were observed when the bound-state conformations were modeled with positional restraints. Instead, low-energy conformer ensembles were generated using xGen that fit experimental X-ray electron density maps and gave reasonable strain energy estimates. The ensembles featured significant conformational adjustments while still fitting the electron density as well or better than the original coordinates. Strain estimates suggest the interaction energy in protein-ligand complexes can offset a greater amount of strain for macrocyclic peptides than for small molecules and non-peptidic macrocycles. Across all molecular classes, the approximate upper bound on global strain energies had the same relationship with molecular size, and bound-state ensembles from xGen yielded favorable binding energy estimates.
... This is directly related to the approach we recently introduced for exploring the conformational space of macrocyclic ligands subject to NMR distance and torsion restraints. 7,8 The ForceGen method is nonstochastic, requires no templates, and does not make use of torsional libraries. Conformational search within macrocyclic rings is accomplished through direct physical movement, forcing rotation around rotatable bonds. ...
... Macrocyclic ligands can be challenging to model, due both to their size and to the complexity of generating accurate and complete conformational samples. 7,8 We made use of all 147 of 182 examples from the ForceGen benchmark for which electron density data were available. That set is dominated by synthetic and nonpeptidic natural-product ligands, and it has been augmented here with three additional cases of peptidic macrocycles, owing to their propensity to develop self-self interactions that are energetically important. ...
... The overlap integral of D and L is used as an additional weighted term alongside the normal behavior of MMFF94sf as implemented within the ForceGen method. 7,8 The default weighting for ligand refinement is 3.0, so, at a resolution of 2.0 Å, the ideal overlap of a carbon atom of a ligand onto a noise-free volume of carbon density yields a reward of approximately −8.0 kcal/mol. ...
Article
We report a new method for X-ray density ligand fitting and refinement that is suitable for a wide variety of small-molecule ligands, including macrocycles. The approach (called "xGen") augments a force field energy calculation with an electron-density fitting restraint that yields an energy reward during restrained conformational search. The resulting conformer pools balance goodness of fit with ligand strain. Real-space refinement from pre-existing ligand coordinates of 150 macrocycles resulted in occupancy weighted conformational ensembles that exhibited low strain energy. The xGen ensembles improved upon electron density fit compared with the PDB reference coordinates without making use of atom-specific B-factors. Similarly, on non-macrocycles, de novo fitting produced occupancy-weighted ensembles of many conformers that were generally better quality density fits than the deposited primary/alternate conformational pairs. The results suggest ubiquitous low-energy ligand conformational ensembles in X-ray diffraction data and provide an alternative to using B-factors as model parameters.
... Further, only molecules that could be docked with a reasonable chance of success were kept (e.g., docking accuracy decreases when molecules are too flexible, thus we kept molecules with fewer than 20 rotatable bonds and with a MW below 900 Da). We obtained a final collection of about 8,000 molecules acting in different therapeutic areas in 2D that were generated in 3D and protonated using the Surflex tools 31 . All compounds were docked with the 2019 version of Surflex-Dock 32 (pgeom option to explore in depth the catalytic site) into the furin X-ray structure co-crystallized with a peptide-like inhibitor 33 (PDB entry 5jxh) or co-crystallized with a small chemical compound 34 (PDB entry 5mim). ...
Preprint
Full-text available
In December 2019, a new coronavirus was identified in the Hubei province of central China and named SARS-CoV-2. This new virus induces COVID-19, a severe respiratory disease with a high death rate. The spike protein (S) of SARS-CoV-2 contains furin-like cleavage sites that are absent in the other SARS-like viruses. The viral infection requires the priming or cleavage of the S protein, and such processing seems essential for virus entry into the host cells. Furin is highly expressed in the lung tissue and the expression is further increased in lung cancer, suggesting the exploitation of this mechanism by the virus to mediate enhanced virulence, as shown by the higher risk of COVID-19 in these patients. In this study, we used structure-based virtual screening and a collection of about 8,000 unique approved and investigational drugs suitable for docking to search for molecules that could inhibit furin activity. Sulconazole, a broad-spectrum anti-fungal agent, was found to be of potential interest. Using Western blot analysis, Sulconazole was found to inhibit the cleavage of the cell surface furin substrate MT1-MMP, which contains two furin cleavage sites similar to those of the SARS-CoV-2 spike protein. Sulconazole and analogs could be interesting for repurposing studies and to probe the not yet fully understood molecular mechanisms involved in cell entry.