Article

The Energy Landscape of Modular Repeat Proteins: Topology Determines Folding Mechanism in the Ankyrin Family

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Proteins consisting of repeating amino acid motifs are abundant in all kingdoms of life, especially in higher eukaryotes. Repeat-containing proteins self-organize into elongated non-globular structures. Do the same general underlying principles that dictate the folding of globular domains apply also to these extended topologies? Using a simplified structure-based model capturing a perfectly funneled energy landscape, we surveyed the predicted mechanism of folding for ankyrin repeat containing proteins. The ankyrin family is one of the most extensively studied classes of non-globular folds. The model based only on native contacts reproduces most of the experimental observations on the folding of these proteins, including a folding mechanism that is reminiscent of a nucleation propagation growth. The confluence of simulation and experimental results suggests that the folding of non-globular proteins is accurately described by a funneled energy landscape, in which topology plays a determinant role in the folding mechanism.

... We use the designed ankyrin repeat protein 4ANK [15] as illustration. We adopt structurally motivated schemes for defining foldons in this system, namely that each repeat, or each half repeat, is one foldon [16,17]. For other types of proteins, different schemes may be more appropriate, and general schemes for approximate foldon assignment exist [13]. ...
... The ankyrin repeat (ANK) is a pervasive 33-residue motif found predominantly in eukaryotes [35]. It has been an excellent basis for constructing model systems for protein folding [16,36-38] and engineering [39-44]. Through detailed comparison of ANK sequences, a consensus sequence, one that best represents the entire family, has been defined [15]. ...
... Marchetti Bradley and Barrick, studying the Notch ankyrin domain (comprising 7 ANK repeats), concluded that the central three ANKs of that protein formed the (early) transition state, based on Φ-value analysis [46]. Ferreiro and coworkers, who computationally evaluated the folding of ANK proteins ranging from 3 to 7 repeats, concluded that the folding nucleus consists not of an integer number of repeats but of one ANK plus the first helix of the following ANK repeat [16]. In order to remain agnostic regarding the nature of the nucleus without introducing unnecessary complexity, we have chosen to characterize the foldon macrobasins at both the ANK and the half-ANK resolution. ...
Article
A general method for facilitating the interpretation of computer simulations of protein folding with minimally frustrated energy landscapes is detailed and applied to a designed ankyrin repeat protein (4ANK). In the method, groups of residues are assigned to foldons and these foldons are used to map the conformational space of the protein onto a set of discrete macrobasins. The free energies of the individual macrobasins are then calculated, informing practical kinetic analysis. Two simple assumptions about the universality of the rate for downhill transitions between macrobasins and the natural local connectivity between macrobasins lead to a scheme for predicting overall folding and unfolding rates, generating chevron plots under varying thermodynamic conditions, and inferring dominant kinetic folding pathways. To illustrate the approach, free energies of macrobasins were calculated from biased simulations of a non-additive structure-based model using two structurally motivated foldon definitions at the full and half ankyrin repeat resolutions. The calculated chevrons have features consistent with those measured in stopped-flow chemical denaturation experiments. The dominant inferred folding pathway has an "inside-out", nucleation-propagation-like character.
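The mapping from foldons to discrete macrobasins described in this abstract can be pictured with a minimal sketch. It assumes, purely for illustration, that each foldon is either folded (1) or unfolded (0) and that "local connectivity" means two macrobasins are connected when they differ in the state of exactly one foldon; the function names `macrobasins` and `neighbors` are hypothetical and are not taken from the paper.

```python
from itertools import product


def macrobasins(n_foldons):
    """Enumerate every folded/unfolded assignment of the foldons as a tuple of 0/1."""
    return list(product((0, 1), repeat=n_foldons))


def neighbors(state):
    """Macrobasins reached by folding or unfolding exactly one foldon.

    Assumption: connectivity is defined by single-foldon changes only.
    """
    return [state[:i] + (1 - state[i],) + state[i + 1:] for i in range(len(state))]


# Four foldons, e.g. one per repeat of a four-repeat protein (illustrative choice).
basins = macrobasins(4)
unfolded = (0, 0, 0, 0)
first_steps = neighbors(unfolded)
```

With one foldon per repeat, a four-repeat protein gives 2^4 = 16 macrobasins; halving the foldon size to half repeats would give 2^8 = 256. Free energies for each macrobasin would have to come from simulation and are not modeled here.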
... In previous studies, the one-dimensionality of the structure of repeat proteins, which are composed of repetitive units of similar size and structure packed together in a linear chain, enabled the deletion or addition of a particular repeating unit, illustrating their modular nature and their higher tolerance for manipulations relative to globular proteins (6-9). Furthermore, previous works in which repeat proteins were designed to be composed of identical consensus repeats had an additional advantage for dissecting folding energetics because their stability is more homogeneously distributed along the proteins (10-14). Despite their simplicity and homogeneity, which has been useful for folding studies, repeat proteins exhibit complex folding behavior similar to that observed in globular proteins, as reflected, for example, by their high stability, cooperative folding, and multiple folding pathways (11,15-23). ...
... For our simulations, we used a simple native-topology-based model (also termed the Gō model) that assumes a perfectly funneled energy landscape (33). This model has reproduced experimental kinetic rates and pathways, as well as captured other processes involved in folding, such as folding intermediates, protein dimerization, and assembly (19,34-38). More recently, this model was used to study the effects of confinement (39), tethering (40), and modification by natural posttranslational modifications on the folding of the modified protein (41,42). ...
... Our simulations show that these intricate behaviors can be captured by simple models (19,30,31), and point to the importance of the native topology in dictating the folding kinetics and cooperativity of these proteins. In addition, they show that the coupling in the folding of the different repeats, in cases where the interfaces are more stable than the repeats, arises from frustration in forming the interfaces between the two neighboring repeats. ...
Article
Full-text available
Repeat proteins have unique elongated structures that, unlike globular proteins, are quite modular. Despite their simple one-dimensional structure, repeat proteins exhibit intricate folding behavior with a complexity similar to that of globular proteins. Therefore, repeat proteins allow one to quantify fundamental aspects of the biophysics of protein folding. One important feature of repeat proteins is the interfaces between the repeating units. In particular, the distribution of stabilities within and between the repeats was previously suggested to affect their folding characteristics. In this study, we explore how the interface affects folding kinetics and cooperativity by investigating two families of repeat proteins, namely, the Ankyrin and tetratricopeptide repeat proteins, which differ in the number of interfacial contacts that are formed between their units as well as in their folding behavior. By using simple topology-based models, we show that modulating the energetic strength of the interface relative to that of the repeat itself can drastically change the protein stability, folding rate, and cooperativity. By further dissecting the interfacial contacts into several subsets, we isolated the effects of each of these groups on folding kinetics. Our study highlights the importance of interface connectivity in determining the folding behavior.
Article
Full-text available
Protein ubiquitination is central to the regulation of various pathways in eukaryotes. The process of ubiquitination and its cellular outcome were investigated in hundreds of proteins to date. Despite this, the evolution of this regulatory mechanism has not yet been addressed comprehensively. Here, we quantify the rates of evolutionary changes of ubiquitination and SUMOylation (Small Ubiquitin-like MOdifier) sites. We estimate the time at which they first appeared, and compare them to acetylation and phosphorylation sites and to unmodified residues. We observe that the various modification sites studied exhibit similar rates. Mammalian ubiquitination sites are weakly more conserved than unmodified lysine residues, and a higher degree of relative conservation is observed when analyzing bona fide ubiquitination sites. Various reasons can be proposed for the limited level of excess conservation of ubiquitination, including shifts in locations of the sites, the presence of alternative sites, and changes in the regulatory pathways. We observe that disappearance of sites may be compensated by the presence of a lysine residue in close proximity, which is significant when compared to evolutionary patterns of unmodified lysine residues, especially in disordered regions. This emphasizes the importance of analyzing a window in the vicinity of functional residues, as well as the capability of the ubiquitination machinery to ubiquitinate residues in a certain region. Using prokaryotic orthologs of ubiquitinated proteins, we study how ubiquitination sites were formed, and observe that while sometimes sequence additions and rearrangements are involved, in many cases the ubiquitination machinery utilizes an already existing sequence without significantly changing it. Finally, we examine the evolution of ubiquitination, which is linked with other modifications, to infer how these complex regulatory modules have evolved. 
Our study gives initial insights into the formation of ubiquitination sites, their degree of conservation in various species, and their co-evolution with other posttranslational modifications.
... As in the classical helix-coil transition of secondary structures, the intrinsic stability of the individual folding elements is low compared with the free energy of stabilization gained by forming an "interface" between neighbors (1). The delicate balance of free energy in each element allows their folding to decouple and subdomains to emerge (7,8). The present time-resolved experiments show how the one-dimensionality controls the dynamics. ...
... Mutations in this region affect both the rate and its urea dependence in a manner consistent with unfolding by two parallel routes. In principle, parallel routes are expected from the symmetric topology of repeat proteins (8). These have been experimentally traced in the shorter, four-ankyrin-repeat protein myotrophin (9). ...
... These have been experimentally traced in the shorter, four-ankyrin-repeat protein myotrophin (9). In repeat proteins such distinct populated folding routes are not guaranteed to appear because the routes are selected based on the local energetics, and small perturbations easily reroute the transitions (8-11). For one-dimensional systems the details of the kinetic routes taken through the landscape crucially depend on inhomogeneities in the distribution of energies and entropy losses for folding along the array. ...
Article
Full-text available
Energy landscape theory unites the study of protein folding with the theory of phase transitions. Dimensionality, whose key role in phase transitions is well known, comes to the fore in folding repeat proteins. Repeat proteins are made of near repetitions of 20–40 residues encoding recurring structural motifs. Each repeating element interacts only with its immediate neighbors, forming extended, globally one-dimensional structures that can be interrupted by thermally excited defects (1). The equilibrium properties of repeat proteins may be mapped onto a one-dimensional Ising model (2–4). Going beyond equilibrium, in this issue of PNAS, Werbeck et al. (5) delight us by studying the folding kinetics of a very long repeat protein, D34, a 12-ankyrin repeat fragment of AnkyrinR.
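The mapping of repeat-protein equilibrium onto a one-dimensional Ising model mentioned above can be illustrated with a minimal sketch. It assumes a homogeneous array in which every folded repeat contributes an intrinsic free energy `g_intrinsic` and every pair of adjacent folded repeats contributes an interfacial free energy `g_interface`; these parameter names, the units of kT, and the brute-force enumeration are illustrative choices, not values or methods taken from the cited work.

```python
import math
from itertools import product


def partition_function(n, g_intrinsic, g_interface, kT=1.0):
    """Sum Boltzmann weights over all 2**n folded(1)/unfolded(0) repeat patterns."""
    z = 0.0
    for s in product((0, 1), repeat=n):
        folded_repeats = sum(s)
        # An interface exists only where two adjacent repeats are both folded.
        folded_interfaces = sum(a * b for a, b in zip(s, s[1:]))
        g = g_intrinsic * folded_repeats + g_interface * folded_interfaces
        z += math.exp(-g / kT)
    return z


def fraction_fully_folded(n, g_intrinsic, g_interface, kT=1.0):
    """Equilibrium population of the state with every repeat folded."""
    g_native = g_intrinsic * n + g_interface * (n - 1)
    return math.exp(-g_native / kT) / partition_function(n, g_intrinsic, g_interface, kT)


# Hypothetical parameters: unstable repeats (+2 kT each), stabilizing interfaces (-5 kT each).
p_native = fraction_fully_folded(4, g_intrinsic=2.0, g_interface=-5.0)
```

Enumeration scales as 2^n, so it is only practical for short arrays; a transfer-matrix evaluation would be the usual choice for long ones. The sketch is meant only to show how unfavourable intrinsic terms and favourable interface terms combine in such a model.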
... We studied here the abundance, length distribution and energetics of ANK arrays in natural polypeptides. In contrast to most globular domains, repeat proteins are believed to distinctively evolve by duplication and deletion of internal repetitions [2], [21], [22], [23]. It was recently suggested that this horizontal evolution is accelerated compared to their vertical divergence in related species [24]. ...
... There is a large number of arrays of just one repeat unit, representing 19% of arrays, of which 50% were detected as single repeats in the natural sequence and the remainder are at least 67 residues apart from their nearest neighbour. Since it is known that ANK proteins require multiple repeats to acquire a stable fold [30], [31], [13], these may represent misdetections of ANK patterns in unrelated sequences, as shown later by their energetic distribution (see below). The abundance of arrays decreases roughly exponentially with array length with an anomalous peak around 23 repeats. ...
Preprint
Full-text available
Ankyrin-containing proteins are one of the most abundant repeat protein families present in all extant organisms. They are made with tandem copies of similar amino acid stretches that fold into elongated architectures. Here, we built and curated a dataset of 200 thousand proteins that contain 1.2 million Ankyrin regions and characterize the abundance, structure and energetics of the repetitive regions in natural proteins. We found that there is a continuous, roughly exponential variety of array lengths with an exceptional frequency at 24 repeats. We describe that individual repeats are seldom interrupted by long insertions and accept few deletions, consistent with the known tertiary structures. We found that longer arrays are made up of repeats that are more similar to each other than shorter arrays, and display more favourable folding energy, hinting at their evolutionary origin. The array distributions show that there is a physical upper limit to the size of an array of Ankyrin repeats of about 120 copies, consistent with the limit found in nature. Analysis of the identity patterns within the arrays suggests that they may have originated by sequential copies of more than one Ankyrin unit.
Author summary: Repeat proteins are coded in tandem copies of similar amino acid stretches. We built and curated a large dataset of Ankyrin-containing proteins, one of the most abundant families of repeat proteins, and characterized the structure of the arrays formed by the repetitions. We found that large arrays are constructed with repetitions that are more similar to each other than shorter arrays. Also, the larger the array, the more energetically favourable its folding energy is. We speculate about the mechanistic origin of large arrays and hint at their evolutionary dynamics.
... Repeat proteins such as ankyrin repeats and tetratricopeptide repeats (TPRs) can be viewed as quasi one-dimensional arrays of small structural elements (typically 20-40 residues). They fold into elongated, non-globular structures that are stabilised only by local interactions whether within repeats or between adjacent repeats [1-11]. This architecture contrasts with the three-dimensional connectivity of typical globular proteins, which contain many sequence-distant interactions that usually play critical roles in their folding, and it has been widely exploited in the design of repeat proteins [12-17]. ...
... This architecture contrasts with the three-dimensional connectivity of typical globular proteins, which contain many sequence-distant interactions that usually play critical roles in their folding, and it has been widely exploited in the design of repeat proteins [12-17]. The translational structural symmetry of repeat proteins is reflected in their energy landscapes [8,11] and makes them both amenable to gross manipulation without destroying the overall structure (e.g. addition or deletion of repeats [10,14,18-20]) as well as sensitive to even small perturbations (e.g. the folding route can be redirected by single amino-acid substitutions [2,4,21-24]). ...
Article
Full-text available
The simple topology and modular architecture of tandem-repeat proteins such as tetratricopeptide repeats (TPRs) and ankyrin repeats makes them straightforward to dissect and redesign. Repeat-protein stability can be manipulated in a predictable way using site-specific mutations. Here we explore a different type of modification - loop insertion - that will enable a simple route to functionalisation of this versatile scaffold. We previously showed that a single loop insertion has a dramatically different effect on stability depending on its location in the repeat array. Here we dissect this effect by a combination of multiple and alternated loop insertions to understand the origins of the context-dependent loss in stability. We find that the scaffold is remarkably robust in that its overall structure is maintained. However, adjacent repeats are now only weakly coupled, and consequently the increase in solvent protection, and thus stability, with increasing repeat number that defines the tandem-repeat protein class is lost. Our results also provide us with a rulebook with which we can apply these principles to the design of artificial repeat proteins with precisely tuned folding landscapes and functional capabilities, thereby paving the way for their exploitation as a versatile and truly modular platform in synthetic biology.
... One key finding of chevron and phi-value analysis of repeat proteins, predicted by Ferreiro, Komives and Wolynes in 2005, is that there is more than one low energy route between the folded and unfolded states [12]. This was first shown experimentally in 2007 for the ankyrin-repeat protein myotrophin [3]. ...
... The heterogeneous distribution of stabilities across their repeat array results in a polarised folding mechanism. However, as the stabilities are closely balanced, small perturbations can shift the folding flux from exclusively one pathway to another or allow flux through parallel pathways [3,12]. Pelizzola, A. Lowe, P. Bruscolini, LSI, submitted) (Figure 2B). ...
Article
Studying protein folding and protein design in globular proteins presents significant challenges because of the two related features, topological complexity and co-operativity. In contrast, tandem-repeat proteins have regular and modular structures composed of linearly arrayed motifs. This means that the biophysics of even giant repeat proteins is highly amenable to dissection and to rational design. Here we discuss what has been learnt about the folding mechanisms of tandem-repeat proteins. The defining features that have emerged are: (i) accessibility of multiple distinct routes between denatured and native states, both at equilibrium and under kinetic conditions; (ii) different routes are favoured for folding compared with unfolding; (iii) unfolding energy barriers are broad, reflecting stepwise unravelling of an array repeat by repeat; (iv) highly co-operative unfolding at equilibrium and the potential for exceptionally high thermodynamic stabilities by introducing consensus residues; (v) under force, helical-repeat structures are very weak with non-cooperative unfolding leading to elasticity and buffering effects. This level of understanding should enable us to create repeat proteins with made-to-measure folding mechanisms, in which one can dial into the sequence the order of repeat folding, number of pathways taken, step size (co-operativity) and fine-structure of the kinetic energy barriers.
... Large, multidomain proteins have been subjected to few single-molecule folding experiments other than force-probe characterization of their mechanics (Bertz et al., 2010; Kotamarthi et al., 2013; Shank et al., 2010; Wang et al., 2012). Tandem-repeat proteins such as ankyrin, tetratricopeptide, and HEAT repeats will be particularly interesting targets for single-molecule analysis because of a fundamental property distinguishing them from globular proteins, namely the copopulation of multiple, partly folded intermediates under equilibrium unfolding conditions and the accessibility of multiple pathways in the unfolding kinetics (Ferreiro et al., 2005; Lowe and Itzhaki, 2007; Tripp and Barrick, 2008; Werbeck and Itzhaki, 2007; Werbeck et al., 2008). This feature arises from the modular nature of repeat-protein structures and the high level of similarity between the modules at both sequence and structural levels. ...
... The absence of sequence-distant contacts likely affords repeat proteins extreme flexibility and particular molecular recognition capabilities (Ferreiro et al., 2005; Sivanandan and Naganathan, 2013), consistent with the proposal that they are a distinct class midway between globular structured proteins and intrinsically disordered proteins (Forwood et al., 2010). An example is IkBa, the two C-terminal ankyrin repeats of which possess features of intrinsic disorder (Lamboy et al., 2011, 2013). ...
Article
Full-text available
Here, using single-molecule FRET, we reveal previously hidden conformations of the ankyrin-repeat domain of AnkyrinR, a giant adaptor molecule that anchors integral membrane proteins to the spectrin-actin cytoskeleton through simultaneous binding of multiple partner proteins. We show that the ankyrin repeats switch between high-FRET and low-FRET states, controlled by an unstructured "safety pin" or "staple" from the adjacent domain of AnkyrinR. Opening of the safety pin leads to unravelling of the ankyrin repeat stack, a process that will dramatically affect the relative orientations of AnkyrinR binding partners and, hence, the anchoring of the spectrin-actin cytoskeleton to the membrane. Ankyrin repeats are one of the most ubiquitous molecular recognition platforms in nature, and it is therefore important to understand how their structures are adapted for function. Our results point to a striking mechanism by which the order-disorder transition and, thereby, the activity of repeat proteins can be regulated. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
... Because the distribution of stability between different subdomains of the array is very uneven, this order of unfolding events is strongly preferred, such that only very destabilizing mutations can shift the order, for example making unfolding of HEAT 14-15 competitive with HEAT 3-10. Similarly, our group and others have shown that alternative kinetic (un)folding routes are accessible for a number of ankyrin-repeat proteins such as myotrophin (7), D34 (35) and Notch (38), as was originally predicted based on the symmetric topology of repeat proteins (39). When intermediates are destabilized, unfolding of the HEAT repeats can occur in a single step, as occurs for example in the truncated variant HEAT 1-7. ...
... The origins and limits of cooperativity have been explored for ankyrin and TPR repeats as well as more recently for LRRs, using both experimental and computational approaches (4,5,8,39-42). These studies indicate that repeats that are distant from one another and therefore not directly in contact can nevertheless unfold in a coupled manner if they and the intervening units have similar intrinsic, low stabilities and the interfaces between them are strong. ...
Article
Here, we reveal a remarkable complexity in the unfolding of giant HEAT-repeat protein PR65/A, a molecular adaptor for the heterotrimeric PP2A phosphatases. The repeat array ruptures at multiple sites, leading to intermediate states with noncontiguous folded subdomains. There is a dominant sequence of unfolding, which reflects a nonuniform stability distribution across the repeat array and can be rationalized by theoretical models accounting for heterogeneous contact density in the folded structure. Unfolding of certain intermediates is, however, competitive, leading to parallel unfolding pathways. The low-stability, central repeats sample unfolded conformations under physiological conditions, suggesting how folding directs function: certain regions present rigid motifs for molecular recognition, whereas others have the flexibility with which to broaden the search area, as in the fly-casting mechanism. Partial unfolding of PR65/A also impacts catalysis by altering the proximity of bound catalytic subunit and substrate. Thus, the repeat array orchestrates the assembly and activity of PP2A.
... The fact that repeat proteins can be treated as quasi-one-dimensional objects, however, weakens the necessity for a deeply funnelled landscape, as conflicting interactions that may arise from the frustration of interactions far distant in sequence are not as predominant as in globular domains. Simple structure-based models of repeat-protein folding have been shown to predict folding behaviour consistent with the overall behaviour of repeat arrays seen in the laboratory (Ferreiro et al., 2005;Barrick et al., 2008). More complex models, such as all-atom simulations, have been applied to study the folding of ankyrin repeat proteins under force (Serquera et al., 2010). ...
Article
Full-text available
Ankyrin (ANK) repeat proteins are coded by tandem occurrences of patterns with around 33 amino acids. They often mediate protein–protein interactions in a diversity of biological systems. These proteins have an elongated non-globular shape and often display complex folding mechanisms. This work investigates the energy landscape of representative proteins of this class made up of 3, 4 and 6 ANK repeats using the energy-landscape visualisation method (ELViM). By combining biased and unbiased coarse-grained molecular dynamics AWSEM simulations that sample conformations along the folding trajectories with the ELViM structure-based phase space, one finds a three-dimensional representation of the globally funnelled energy surface. In this representation, it is possible to delineate distinct folding pathways. We show that ELViMs can project, in a natural way, the intricacies of the highly dimensional energy landscapes encoded by the highly symmetric ankyrin repeat proteins into useful low-dimensional representations. These projections can discriminate between multiplicities of specific parallel folding mechanisms that otherwise can be hidden in oversimplified depictions.
... The alternative pathways of binding and unbinding uncovered here for the ARM repeat protein β-catenin nicely mirror the alternative folding and unfolding pathways previously observed and predicted by our group and others for tandem-repeat proteins [74][75][76][77][78][79]. These phenomena reflect the energy landscapes of the repeating architecture, in which the internal translational symmetry affords multiple paths of similar energies, and, consequently, small perturbations such as conservative mutations are sufficient to shift the flux through each. ...
Article
Full-text available
The Wnt signalling pathway plays an important role in cell proliferation, differentiation, and fate decisions in embryonic development and the maintenance of adult tissues. The twelve armadillo (ARM) repeat-containing protein β-catenin acts as the signal transducer in this pathway. Here, we investigated the interaction between β-catenin and the intrinsically disordered transcription factor TCF7L2, comprising a very long nanomolar-affinity interface of approximately 4800 Å² that spans ten of the twelve ARM repeats of β-catenin. First, a fluorescence reporter system for the interaction was engineered and used to determine the kinetic rate constants for the association and dissociation. The association kinetics of TCF7L2 and β-catenin were monophasic and rapid (7.3 ± 0.1 × 10⁷ M⁻¹·s⁻¹), whereas dissociation was biphasic and slow (5.7 ± 0.4 × 10⁻⁴ s⁻¹, 15.2 ± 2.8 × 10⁻⁴ s⁻¹). This reporter system was then combined with site-directed mutagenesis to investigate the striking variability in the conformation adopted by TCF7L2 in the three different crystal structures of the TCF7L2–β-catenin complex. We found that the mutation had very little effect on the association kinetics, indicating that most interactions form after the rate-limiting barrier for association. Mutations of the N- and C-terminal subdomains of TCF7L2 that adopt relatively fixed conformations in the crystal structures had large effects on the dissociation kinetics, whereas the mutation of the labile sub-domain connecting them had negligible effect. These results point to a two-site avidity mechanism of binding with the linker region forming a “fuzzy” complex involving transient contacts that are not site-specific. Strikingly, the two mutations in the N-terminal subdomain that had the largest effects on the dissociation kinetics showed two additional phases, indicating partial flux through an alternative dissociation pathway that is inaccessible to the wild type.
The results presented here provide insights into the kinetics of the molecular recognition of a long intrinsically disordered region with an elongated repeat-protein surface, a process found to involve parallel routes with sequential steps in each.
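To make the reported biphasic dissociation concrete, the following minimal sketch converts the two dissociation rate constants quoted in the abstract into half-lives and evaluates a double-exponential decay. The function names and the equal default amplitudes are assumptions for illustration only; the abstract does not report the amplitudes of the two phases.

```python
import math

# Dissociation rate constants quoted in the abstract (units: s^-1).
K_OFF_SLOW = 5.7e-4
K_OFF_FAST = 15.2e-4


def half_life(rate_constant):
    """Half-life (s) of a first-order process with the given rate constant."""
    return math.log(2) / rate_constant


def fraction_bound(t, amp_slow=0.5):
    """Fraction of complex remaining at time t (s) for a biphasic decay.

    amp_slow is a hypothetical amplitude for the slow phase; the fast
    phase takes the remainder so that the fraction is 1 at t = 0.
    """
    amp_fast = 1.0 - amp_slow
    return (amp_slow * math.exp(-K_OFF_SLOW * t)
            + amp_fast * math.exp(-K_OFF_FAST * t))


if __name__ == "__main__":
    for name, k in (("slow", K_OFF_SLOW), ("fast", K_OFF_FAST)):
        print(f"{name} phase half-life: {half_life(k) / 60:.1f} min")
    for t in (0, 600, 1800, 3600):
        print(f"t = {t:5d} s, fraction bound = {fraction_bound(t):.3f}")
```

Both phases have half-lives on the order of minutes to tens of minutes, which is what "slow" means here relative to the rapid association step.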
... The alternative pathways of binding and unbinding uncovered here for the ARM repeat protein β-catenin nicely mirror the alternative folding and unfolding pathways previously observed by our group and others for tandem-repeat proteins and previously predicted (Ferreiro et al., 2005, 2008; Hutton et al., 2015; Lowe and Itzhaki, 2007; Tsytlonok et al., 2013; Werbeck et al., 2008). These phenomena reflect the energy landscapes of the repeating architecture, in which the internal symmetry affords multiple paths of similar energies and, consequently, small perturbations such as conservative mutations are sufficient to shift the flux through each. ...
Preprint
Full-text available
The Wnt signalling pathway plays an important role in cell proliferation, differentiation and fate decisions in embryonic development and in the maintenance of adult tissues, and the twelve Armadillo (ARM) repeat-containing protein beta-catenin acts as the signal transducer in this pathway. Here we investigate the interaction between beta-catenin and the intrinsically disordered transcription factor TCF7L2, comprising a very long nanomolar-affinity interface that spans ten of the twelve ARM repeats of beta-catenin. First, a fluorescence reporter system for the interaction was engineered and used to determine the kinetic rate constants for the association and dissociation. The association kinetics of TCF7L2 and β-catenin was monophasic and rapid ((7.3 ± 0.1) × 10⁷ M⁻¹ s⁻¹), whereas dissociation was biphasic and slow ((5.7 ± 0.4) × 10⁻⁴ s⁻¹ and (15.2 ± 2.8) × 10⁻⁴ s⁻¹). This reporter system was then combined with site-directed mutagenesis to investigate the striking variability in the conformation adopted by TCF7L2 in the three different crystal structures of the TCF7L2-beta-catenin complex. We found that mutation of the N- and C-terminal subdomains of TCF7L2 that adopt relatively fixed conformations in the crystal structures has a large effect on the dissociation kinetics, whereas mutation of the labile sub-domain connecting them has negligible effect. These results point to a two-site avidity mechanism of binding with the linker region forming a 'fuzzy' complex involving transient contacts that are not site-specific. Strikingly, two mutations in the N-terminal subdomain that have the largest effects on the dissociation kinetics showed two additional phases, indicating partial flux through an alternative dissociation pathway that is inaccessible to the wild type. 
The results presented here provide insights into the kinetics of molecular recognition of a long intrinsically disordered region with an elongated repeat-protein surface, a process found to involve parallel routes with sequential steps in each.
... Our findings regarding the importance of distance between intra-chain native contacts for the folding pathway also bear relevance for understanding the folding of naturally occurring tandem repeat proteins. The latter are often composed of highly similar domains such as in the case of the Ankyrin repeat family [39]. In natural multi-domain proteins, homologous domains are rarely positioned adjacently in the amino acid sequence. ...
Article
Full-text available
Natural proteins are characterised by a complex folding pathway defined uniquely for each fold. Designed coiled-coil protein origami (CCPO) cages are distinct from natural compact proteins, since their fold is prescribed by discrete long-range interactions between orthogonal pairwise-interacting coiled-coil (CC) modules within a single polypeptide chain. Here, we demonstrate that CCPO proteins fold in a stepwise sequential pathway. Molecular dynamics simulations and stopped-flow Förster resonance energy transfer (FRET) measurements reveal that CCPO folding is dominated by the effective intra-chain distance between CC modules in the primary sequence and subsequent folding intermediates, allowing identical CC modules to be employed for multiple cage edges and thus relaxing CCPO cage design requirements. The number of orthogonal modules required for constructing a CCPO tetrahedron can be reduced from six to as few as three different CC modules. The stepwise modular nature of the folding pathway offers insights into the folding of tandem repeat proteins and can be exploited for the design of modular protein structures based on a given set of orthogonal modules.
... There is a large number of arrays of just one repeat unit, representing 19% of arrays, of which 50% were detected as single repeats in the natural sequence and the remainder were at least 67 residues apart from their nearest neighbour. Since it is known that ANK proteins require multiple repeats to acquire a stable fold [15,32,33], these may represent misdetections of ANK patterns in unrelated sequences, as shown later by their energetic characterization (see below). The abundance of arrays decreases roughly exponentially with array length with an anomalous peak around 23 repeats. ...
Article
Full-text available
Ankyrin containing proteins are one of the most abundant repeat protein families present in all extant organisms. They are made with tandem copies of similar amino acid stretches that fold into elongated architectures. Here, we built and curated a dataset of 200 thousand proteins that contain 1.2 million Ankyrin regions and characterize the abundance, structure and energetics of the repetitive regions in natural proteins. We found that there is a continuous roughly exponential variety of array lengths with an exceptional frequency at 24 repeats. We described that individual repeats are seldom interrupted with long insertions and accept few deletions, in line with the known tertiary structures. We found that longer arrays are made up of repeats that are more similar to each other than shorter arrays, and display more favourable folding energy, hinting at their evolutionary origin. The array distributions show that there is a physical upper limit to the size of an array of repeats of about 120 copies, consistent with the limit found in nature. The identity patterns within the arrays suggest that they may have originated by sequential copies of more than one Ankyrin unit.
... It was shown that the order in which the repeats fold is governed by their relative stabilities, with the most stable repeats folding first, and consequently, the folding pathways can be redirected relatively straightforwardly by manipulating the stability distribution across the repeat array [23,25-27]. It follows also that under any given set of conditions there may be flux through multiple alternative pathways [23], as originally predicted by Wolynes and co-workers [28]. Moreover, the cooperativity of the folding process (both at equilibrium and under kinetic conditions) can also be readily tuned using appropriate mutations [29,30]. ...
Article
Full-text available
The term allostery was originally developed to describe structural changes in one binding site induced by the interaction of a partner molecule with a distant binding site, and it has been studied in depth in the field of enzymology. Here, we discuss the concept of action at a distance in relation to the folding and function of the solenoid class of tandem-repeat proteins such as tetratricopeptide repeats (TPRs) and ankyrin repeats. Distantly located repeats fold cooperatively, even though only nearest-neighbour interactions exist in these proteins. A number of repeat-protein scaffolds have been reported to display allosteric effects, transferred through the repeat array, that enable them to direct the activity of the multi-subunit enzymes within which they reside. We also highlight a recently identified group of tandem-repeat proteins, the RRPNN subclass of TPRs, recent crystal structures of which indicate that they function as allosteric switches to modulate multiple bacterial quorum-sensing mechanisms. We believe that the folding cooperativity of tandem-repeat proteins and the biophysical mechanisms that transform them into allosteric switches are intimately intertwined. This opinion piece aims to combine our understanding of the two areas and develop ideas on their common underlying principles. This article is part of a discussion meeting issue ‘Allostery and molecular machines’.
... The folding of natural repeat proteins has been characterized both experimentally and in silico (14-22). The best-studied consensus-designed repeat proteins are the consensus ankyrin repeats (referred to as DARPins (8) or CARPs (7)) and consensus tetratricopeptide repeats (CTPRs) (13,23). ...
Article
Full-text available
Consensus-designed tetratricopeptide repeat proteins (CTPRs) are highly stable, modular proteins that are strikingly amenable to rational engineering. They therefore have tremendous potential as building blocks for biomaterials and biomedicine. Here we explore the possibility of extending the loops between repeats to enable further diversification, and we investigate how this modification affects stability and folding cooperativity. We find that extending a single loop by up to 25 residues does not disrupt the overall protein structure, but, strikingly, the effect on stability is highly context-dependent: in a two-repeat array, destabilisation is relatively small and can be accounted for purely in entropic terms, whereas extending a loop in the middle of a large array is much more costly, due to weakening of the interaction between the repeats. Our findings provide new insights into structure and folding that will be important both for understanding the function of natural repeat proteins and for the design of artificial repeat proteins in biotechnology.
... It has been proposed that the brain operates at the edge of chaos, near a critical regime, where the maximum information function lies between randomness and regularity [5,6,7]. The concept of a nonlinear brain needs to be framed within energy landscape theory, originally built for a statistical description of proteins' potential surfaces [8,9]. Such a landscape is characterised not just by low-energy valleys (stationary points where the gradient vanishes), but also by high-energy peaks and transition states. ...
Article
Full-text available
The brain is a system at the edge of chaos equipped with nonlinear dynamics and functional energetic landscapes. However, doubts still exist concerning the type of attractors or the trajectories followed by particles in the nervous phase space. Starting from a system governed by differential equations in which a dissipative strange attractor coexists with an invariant conservative torus, we developed a 3D model of brain phase space which has the potential to be operationalized and assessed empirically. We achieved a system displaying both a torus and a strange attractor, depending just on the initial conditions. Further, the system generates a funnel-like attractor equipped with a fractal structure. Changes in three brain phase parameters lead to modifications in the funnel’s breadth or in torus/attractor superimposition. We have found that the higher frequencies of evoked activities are more deterministic due to the greater funnel breadth with decreasing degrees of freedom. In contrast, the resting state, formed by lower frequencies, represents greater degrees of freedom. Thus, our model explains a large repertoire of brain functions and activities, such as sensations/perceptions, memory and self-generated thoughts.
... The link with MFP becomes clearer when we consider that topology is another key factor governing folding reactions. Indeed, structures of transition state ensembles (Clementi et al., 2000; Koga and Takada, 2001), folding rates (Chavez et al., 2004), the existence of folding intermediates (Ferreiro et al., 2005) and dimerization mechanisms (Levy et al., 2004) are well-predicted in models where frustration has been removed and topological information of the native state is the sole input. ...
Article
Full-text available
The minimum frustration principle is a computational approach which states that, in the long timescales of evolution, proteins’ free-energy decreases more than expected by thermodynamic constraints as their amino acids assume conformations progressively closer to the lowest energetic state. Here we show that this general principle, borrowed from protein folding dynamics, can be fruitfully applied to nervous function too. Highlighting the foremost role of energetic requirements, macromolecular dynamics, and, above all, intertwined timescales in brain activity, the minimum frustration principle elucidates a wide range of mental processes, from sensations to memory retrieval. Brain functions are compared to trajectories which, in long nervous timescales, are attracted towards the low-energy bottom of funnel-like structures characterized both by robustness and plasticity. We discuss how the principle, as derived explicitly from evolution and selection of a funneling structure from microdynamics of contacts, is different from other brain models equipped with energy landscapes, such as the Bayesian and free-energy principle and the Hopfield networks. In sum, we make available a novel approach to brain function cast in a biologically informed fashion, with the potential to be operationalized and assessed empirically.
... The link with MFP becomes clearer when we consider that topology is a key factor governing folding reactions. Indeed, structures of transition state ensembles [102,103], folding rates [104], the existence of folding intermediates [105] and dimerization mechanisms [106] are well-predicted in models where frustration has been removed and topological information of the native state is the sole input. ...
Preprint
Full-text available
This manuscript has been published. Please quote as: Tozzi A, Fla Tor, Peters JF. 2016. Building a minimum frustration framework for brain functions in long timescales. J Neurosci Res. ------- The minimum frustration principle is a computational approach which states that, in the long timescales of evolution, proteins' free-energy decreases more than expected by chance, as their amino acids assume conformations progressively closer to the lowest energetic state. Here we show that this general principle, borrowed from the far-flung branch of protein folding dynamics, can be fruitfully applied to nervous function too. Highlighting the foremost role of energetic constraints, macromolecular dynamics, and, above all, intertwined timescales in brain activity, the minimum frustration principle elucidates a wide range of psychological processes, from sensations to memory retrieval. Brain functions are compared to trajectories which, in long nervous timescales, are attracted towards the low-energy bottom of funnel-like structures characterized both by robustness and plasticity. We discuss how the principle, as derived explicitly from evolution and selection of a funneling structure from microdynamics of contacts, is different from other successful brain models equipped with energy landscapes, such as the Bayesian/free-energy principle and the Hopfield networks. In sum, we make available a novel approach to brain function cast in a biologically informed fashion, with the potential to be operationalized and assessed empirically.
... Folding cooperativity of repeat proteins is highly influenced by the intrinsic stabilities of the different repeats and their interfaces. It has been computationally shown that these proteins are 'poised' at particular ratios of inter-repeat and intra-repeat interaction energies that allow them to undergo partial unfolding under physiological conditions, which would be a requirement to perform their biological functions [49,50]. We have mapped the interactions that are the most energetically favored between residues composing the canonical structure of ANK repeats. ...
Article
Full-text available
Ankyrin repeat containing proteins are one of the most abundant solenoid folds. Usually implicated in specific protein-protein interactions, these proteins are readily amenable for design, with promising biotechnological and biomedical applications. Studying repeat protein families presents technical challenges due to the high sequence divergence among the repeating units. We developed and applied a systematic method to consistently identify and annotate the structural repetitions over the members of the complete Ankyrin Repeat Protein Family, with increased sensitivity over previous studies. We statistically characterized the number of repeats, the folding of the repeat-arrays, their structural variations, insertions and deletions. An energetic analysis of the local frustration patterns reveals the basic features underlying fold stability and its relation to the functional binding regions. We found a strong linear correlation between the conservation of the energetic features in the repeat arrays and their sequence variations, and discuss new insights into the organization and function of these ubiquitous proteins.
... An elegant series of experiments on the folding of designed repeat-arrays shows that these typically fold in an all-or-none fashion: the longer the repeat-array, the more stable and more co-operative the folding transition is, namely the repeat-arrays behave as single domains [25,26]. Theoretical and computational studies predict that these highly symmetric proteins should fold through parallel pathways on funnelled energy landscapes [27,28] and these have been recently characterized experimentally for the ankyrin [25] and the tetratricopeptide repeat (TPR) [29] protein families. The folding landscapes of repeat proteins can be manipulated, for example by the addition of terminal stabilizing repeats. ...
Article
Structural domains are believed to be modules within proteins that can fold and function independently. Some proteins show tandem repetitions of apparent modular structure that do not fold independently, but rather co-operate in stabilizing structural forms that comprise several repeat-units. For many natural repeat-proteins, it has been shown that weak energetic links between repeats lead to the breakdown of co-operativity and the appearance of folding sub-domains within an apparently regular repeat array. The quasi-1D architecture of repeat-proteins is crucial in detailing how the local energetic balances can modulate the folding dynamics of these proteins, which can be related to the physiological behaviour of these ubiquitous biological systems.
... It is apparent that repeating proteins may have a tendency to form intermolecular domain swapping [10,32]. For ANK-N5C-281, Pro171 at position-5 was randomly selected. ...
Article
Full-text available
A highly diverse DNA library coding for ankyrin seven-repeat proteins (ANK-N5C) was designed and constructed by a PCR-based combinatorial assembly strategy. A bacterial melibiose fermentation assay was adapted for in vivo functional screen. We isolated a transcription blocker that completely inhibits the melibiose-dependent expression of α-galactosidase (MelA) and melibiose permease (MelB) of Escherichia coli by specifically preventing activation of the melAB operon. High-resolution crystal structural determination reveals that the designed ANK-N5C protein has a typical ankyrin fold, and the specific transcription blocker, ANK-N5C-281, forms a domain-swapped dimer. Functional tests suggest that the activity of MelR, a DNA-binding transcription activator and a member of AraC family of transcription factors, is inhibited by ANK-N5C-281 protein. All ANK-N5C proteins are expected to have a concave binding area with negative surface potential, suggesting that the designed ANK-N5C library proteins may facilitate the discovery of binders recognizing structural motifs with positive surface potential, like in DNA-binding proteins. Overall, our results show that the established library is a useful tool for the discovery of novel bioactive reagents.
... Interestingly, the idea of an overlapping element could have been one way for nature to avoid this complicated multi-state folding, by keeping the process cooperative even for large domains with several foldons (Figure 24). This feature of an overlapping element in proteins is also seen in α-spectrin (Batey & Clarke, 2006; Tripp & Barrick, 2007) and in ankyrin-repeat proteins (Ferreiro, Cho, Komives, & Wolynes, 2005). Complex two-state proteins with the addition of two (or more) overlapping foldons with access to parallel channels across the folding energy landscape. ...
... We found that the statistical couplings calculated from sequence variations in the ank family decay roughly exponentially (Fig. 4) as the separation between the repeats increases. The predicted global correlation length of ∼1.4 repeated units is remarkably close to that inferred from statistical mechanical analysis of folding experiments [Street and Barrick, 2009; Wetzel et al., 2008] and folding simulations [Ferreiro et al., 2005]. These predictions are based on approximating long-range covariations from sets of pair-wise inter-repeat interactions, allowing for the application of the procedure for arbitrarily large structures for which an exact calculation would be computationally prohibitive. ...
Article
Full-text available
Background: The analysis of correlations of amino acid occurrences in globular domains has led to the development of statistical tools that can identify native contacts - portions of the chains that come to close distance in folded structural ensembles. Here we introduce a direct coupling analysis for repeat proteins - natural systems for which the identification of folding domains remains challenging. Results: We show that the inherent translational symmetry of repeat protein sequences introduces a strong bias in the pair correlations at precisely the length scale of the repeat-unit. Equalizing for this bias in an objective way reveals true co-evolutionary signals from which local native contacts can be identified. Importantly, parameter values obtained for all other interactions are not significantly affected by the equalization. We quantify the robustness of the procedure and assign confidence levels to the interactions, identifying the minimum number of sequences needed to extract evolutionary information in several repeat protein families. Conclusions: The overall procedure can be used to reconstruct the interactions at distances larger than repeat-pairs, identifying the characteristics of the strongest couplings in each family, and can be applied to any system that appears translationally symmetric.
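The citing passage for this article reports that statistical couplings decay roughly exponentially with repeat separation, with a correlation length of about 1.4 repeat units. As a rough illustration of what that length scale implies, the sketch below evaluates a pure exponential exp(-d/ξ). The functional form with unit prefactor and the function name are assumptions; only the value ξ ≈ 1.4 comes from the quoted text.

```python
import math

# Correlation length quoted in the citing passage (in repeat units).
CORRELATION_LENGTH = 1.4


def relative_coupling(separation, xi=CORRELATION_LENGTH):
    """Coupling strength relative to zero separation, assuming exp(-d / xi)."""
    if separation < 0:
        raise ValueError("separation must be non-negative")
    return math.exp(-separation / xi)


if __name__ == "__main__":
    for d in range(1, 6):
        print(f"repeats apart: {d}, relative coupling: {relative_coupling(d):.3f}")
```

Under this assumed form, the coupling between repeats falls to roughly half at nearest-neighbour separation and becomes small within a few repeats, consistent with a picture dominated by short-range inter-repeat interactions.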
... They started out by creating a consensus ankyrin repeat sequence first. In natural ankyrin repeat proteins, each repeat has a slightly different sequence, ruining the almost perfect degeneracy that is possible with identical sequences (7). In their consensus repeats, the only breaking of degeneracies comes from the slightly different N- and C-terminal constraints. ...
... However, in the present work, we examine the structural similarity of the identified sequence repeats in two domains using structure-based inter-residue interaction parameters such as number of long-range contacts, surrounding hydrophobicity and pairwise interaction energy. Since it has already been shown that LRO is an important descriptor in predicting folding rate of proteins (Harihar and Selvaraj, 2009) and the folding principle of simple proteins can be applied to connected multidomain proteins as well (Kamagata et al., 2004; Ferreiro et al., 2005; Barrick et al., 2008), the present study predicts the folding rate of the two domains in repeats using the LRO parameter. We have used RADAR (Holm and Sander, 2000) for detection of repeats in the query sequence, which was confirmed by Pfam domain assignment (Punta et al., 2012). ...
Article
Full-text available
Domains are the main structural and functional units of larger proteins. They tend to be contiguous in primary structure and can fold and function independently. It has been observed that 10-20% of all encoded proteins contain duplicated domains and the average pairwise sequence identity between them is usually low. In the present study, we have analyzed the structural similarity between domain repeats of proteins with known structures available in the Protein Data Bank using structure-based inter-residue interaction measures such as the number of long-range contacts, surrounding hydrophobicity, and pairwise interaction energy. We used RADAR program for detecting the repeats in a protein sequence which were further validated using Pfam domain assignments. The sequence identity between the repeats in domains ranges from 20 to 40% and their secondary structural elements are well conserved. The number of long-range contacts, surrounding hydrophobicity calculations and pairwise interaction energy of the domain repeats clearly reveal the conservation of 3-D structure environment in the repeats of domains. The proportions of mainchain-mainchain hydrogen bonds and hydrophobic interactions are also highly conserved between the repeats. The present study has suggested that the computation of these structure-based parameters will give better clues about the tertiary environment of the repeats in domains. The folding rates of individual domains in the repeats predicted using the long-range order parameter indicate that the predicted folding rates correlate well with most of the experimentally observed folding rates for the analyzed independently folded domains.
... This region folds onto the dimerization domains of the NF-κB (p50/p65) and shares a binding interface with the DNA. We previously showed that ARs 5 and 6 of IκBα are weakly folded (38,42,43) and that mutations that restore the consensus for stable ARs can promote folding of this region (35,36). The stabilizing mutations had varied effects on NF-κB·IκBα binding, but all showed reduced ability to promote dissociation of NF-κB from the DNA. ...
Article
Full-text available
A hallmark of the NF-κB transcription response to inflammatory cytokines is the remarkably rapid rate of robust activation and subsequent signal repression. Although the rapidity of postinduction repression is explained partly by the fact that the gene for IκBα is strongly induced by NF-κB, the newly synthesized IκBα still must enter the nucleus and compete for binding to NF-κB with the very large number of κB sites in the DNA. We present results from real-time binding kinetic experiments, demonstrating that IκBα increases the dissociation rate of NF-κB from the DNA in a highly efficient kinetic process. Analysis of various IκBα mutant proteins shows that this process requires the C-terminal PEST sequence and the weakly folded fifth and sixth ankyrin repeats of IκBα. Mutational stabilization of these repeats reduces the efficiency with which IκBα enhances the dissociation rate.
Keywords: binding kinetics; disordered proteins; protein-DNA interaction; transcription; surface plasmon resonance
... In other words, there is a strong energetic bias toward the native basin that overcomes both the asperities of the landscape which stabilize kinetic traps and also ultimately the entropy of the chain. It has been shown that the structures of transition state ensemble (3,4), the folding rate variations (5), the existence of folding intermediates (6), dimerization mechanisms (7) and domain swapping events (8) are often well predicted in models where energetic frustration has been removed from the model landscape and topological information of the native state is the sole input. Still, inhomogeneity in the native contacts energetics, non-native interactions and the residual local frustration present in the native ensemble do contribute to the functional characteristics of proteins, ‘molding’ the roughness that underlies the detailed protein dynamics (9,10). ...
Article
Full-text available
The frustratometer is an energy landscape theory-inspired algorithm that aims at quantifying the location of frustration manifested in protein molecules. Frustration is a useful concept for gaining insight into a protein's biological behavior by analyzing how the energy is distributed in protein structures and how mutations or conformational changes shift the energetics. Sites of high local frustration often indicate biologically important regions involved in binding or allostery. In contrast, minimally frustrated linkages comprise a stable folding core of the molecule that is conserved in conformational changes. Here, we describe the implementation of these ideas in a webserver freely available at the National EMBNet node-Argentina, at URL: http://lfp.qb.fcen.uba.ar/embnet/.
... Ankyrin repeats (ARs) are one of the most common motifs of repeat proteins with a high degree of amino acid sequence homology. ARs fold into nearly identical helix 1-helix 2-loop structures and stack to form elongated superhelical domains that frequently mediate protein-protein interactions (1-3). Previously, single-molecule force spectroscopy experiments of a few AR proteins revealed that mechanically unfolded ARs refold rapidly and generate very robust refolding forces (4-9). Recently, we examined in detail the mechanical properties of a model synthetic AR protein, NI6C (7). ...
Article
Full-text available
The conserved TPLH tetrapeptide motif of ankyrin repeats (ARs) plays an important role in stabilizing AR proteins, and histidine (TPLH)-to-arginine (TPLR) mutations in this motif have been associated with a hereditary human anemia, spherocytosis. Here, we used a combination of atomic force microscopy-based single-molecule force spectroscopy and molecular dynamics simulations to examine the mechanical effects of His → Arg substitutions in TPLH motifs in a model AR protein, NI6C. Our molecular dynamics results show that the mutant protein is less mechanically stable than the WT protein. Our atomic force microscopy results indicate that the mechanical energy input necessary to fully unfold the mutant protein is only half of that necessary to unfold the WT protein (53 versus 106 kcal/mol). In addition, the ability of the mutant to generate refolding forces is also reduced. Moreover, the mutant protein subjected to cyclic stretch-relax measurements displays mechanical fatigue, which is absent in the WT protein. Taken together, these results indicate that the His → Arg substitutions in TPLH motifs compromise mechanical properties of ARs and suggest that the origin of hereditary spherocytosis may be related to mechanical failure of ARs.
Article
Full-text available
The complex topologies of large multi-domain globular proteins make the study of their folding and assembly particularly demanding. It is often characterized by complex kinetics and undesired side reactions, such as aggregation. The structural simplicity of tandem-repeat proteins, which are characterized by the repetition of a basic structural motif and are stabilized exclusively by sequentially localized contacts, has provided opportunities for dissecting their folding landscapes. In this study, we focus on the Erwinia chrysanthemi pectin methylesterase (342 residues), an all-β pectinolytic enzyme with a right-handed parallel β-helix structure. Chemicals and pressure were chosen as denaturants and a variety of optical techniques were used in conjunction with stopped-flow equipment to investigate the folding mechanism of the enzyme at 25 °C. Under equilibrium conditions, both chemical- and pressure-induced unfolding show two-state transitions, with average conformational stability (ΔG° = 35 ± 5 kJ·mol−1) but exceptionally high resistance to pressure (Pm = 800 ± 7 MPa). Stopped-flow kinetic experiments revealed a very rapid (τ < 1 ms) hydrophobic collapse accompanied by the formation of an extended secondary structure but did not reveal stable tertiary contacts. This is followed by three distinct cooperative phases and the significant population of two intermediate species. The kinetics followed by intrinsic fluorescence shows a lag phase, strongly indicating that these intermediates are productive species on a sequential folding pathway, for which we propose a plausible model. These combined data demonstrate that even a large repeat protein can fold in a highly cooperative manner.
Article
Engineered repeat proteins have proven to be a fertile ground for studying the competition between folding, misfolding and transient aggregation of tethered protein domains. We examine the interplay between folding and inter-domain interactions of engineered FiP35 WW domain repeat proteins with n = 1 through 5 repeats. We characterize protein expression, thermal and guanidinium melts, as well as laser T-jump kinetics. All experimental data are fitted by a global fitting model with two states per domain (U, N), plus a third state M to account for non-native states due to domain interactions present in all but the monomer. A detailed structural model is provided by coarse-grained simulated annealing using the AWSEM Hamiltonian. Tethered FiP35 WW domains with n = 2 and 3 domains are just slightly less stable than the monomer. The n = 4 oligomer is yet less stable, its expression yield is much lower than the monomer’s, and depends on the purification tag used. The n = 5 plasmid did not express at all, indicating sudden onset of aggregation past n = 4. Thus, tethered FiP35 has a critical nucleus size for inter-domain aggregation of n ≈ 4. According to our simulations, misfolded structures become increasingly prevalent as one proceeds from monomer to pentamer, with extended inter-domain beta sheets appearing first, then multi-sheet ‘intramolecular amyloid’ structures, and finally novel motifs containing alpha helices. We discuss the implications of our results for oligomeric aggregate formation and structure, transient aggregation of proteins whilst folding, as well as for protein evolution that starts with repeat proteins.
Article
Within the crowded and complex environment of the cell, a protein experiences stabilizing excluded-volume effects and destabilizing quinary interactions with other proteins. Which of these prevails must be determined on a case-by-case basis. PAPS synthases are dimeric and bifunctional enzymes, providing activated sulfate in the form of 3′-phosphoadenosine-5′-phosphosulfate (PAPS) for sulfation reactions. The human PAPS synthases PAPSS1 and PAPSS2 differ significantly in their protein stability, as PAPSS2 is a naturally fragile protein. PAPS synthases bind a series of nucleotide ligands, and some of them markedly stabilize these proteins. PAPS synthases are of biomedical relevance, as destabilizing point mutations give rise to several pathologies. Genetic defects in PAPSS2 have been linked to bone and cartilage malformations as well as a steroid sulfation defect. All this makes PAPS synthases ideal for studying protein unfolding, ligand binding, and the stabilizing and destabilizing factors in their cellular environment. This review provides an overview of current concepts of protein folding and stability and links this with our current understanding of the different disease mechanisms of PAPSS2-related pathologies, with perspectives for future research and application.
Article
Significance: We apply a statistical thermodynamic formalism to quantify the cooperativity of folding of de novo-designed helical repeat proteins (DHRs). This analysis provides a fundamental thermodynamic description of folding for de novo-designed proteins and permits comparison with naturally occurring repeat protein thermodynamics. We find that individual DHR units are intrinsically stable, unlike those of naturally occurring proteins. This observation reveals local (intrarepeat) interactions as a source of high stability in Rosetta-designed proteins and suggests that different types of DHR repeats may be combined in a single polypeptide chain, expanding the repertoire of folded DHRs for applications such as molecular recognition. Favorable intrinsic stability imparts a downhill shape to the energy landscape, suggesting that DHRs fold fast and through parallel pathways.
Article
The inherent conflict between non-covalent interactions and the large conformational entropy of the polypeptide chain forces folding reactions and their mechanisms to deviate significantly from chemical reactions. Accordingly, measures of structure in the transition state ensemble (TSE) are strongly influenced by the underlying distributions of microscopic folding pathways that are challenging to discern experimentally. Here, we present a detailed analysis of 150,000 folding transition paths of five proteins at three different thermodynamic conditions from an experimentally consistent statistical mechanical model. We find that the underlying TSE structural distributions are rarely unimodal and the average experimental measures arise from complex underlying distributions. Unfolding pathways also exhibit subtle differences from folding counterparts due to a combination of Hammond behavior and native-state movements. Local interactions and topological complexity, to a lesser extent, are found to determine pathway heterogeneity, underscoring the importance of the balance between local and non-local energetics in protein folding.
Article
The development of computationally efficient models is essential to obtain a detailed characterization of the mechanisms underlying the folding of proteins and the formation of amyloid fibrils. Structure-based computational models (Go-models) with Cα or all-atom resolution have successfully delineated the folding mechanisms of several globular proteins and offer an interesting alternative to computationally intensive simulations with explicit solvent description. Here we explore the limits of Go-model predictions by analyzing the folding of the non-globular repeat domain proteins Notch Ankyrin and p16INK4 and the formation of human islet polypeptide (hIAPP) fibrils. Folding trajectories of the repeat domain proteins revealed that all-atom resolution is required to capture the folding pathways and cooperativity reported in experimental studies. The all-atom Go-model was also successful in predicting the free energy landscape of hIAPP fibrillation, suggesting a “dock and lock” mechanism of fibril elongation. We finally explored how mutations can affect the co-assembly of hIAPP fibrils by simulating a heterogeneous system composed of wildtype and mutated hIAPP peptides. Overall, this study shows that all-atom Go-model based simulations have the potential to discern the effects of mutations and post-translational modifications in protein folding and association and may help in resolving the dichotomy between experimental and theoretical studies on protein folding and amyloid fibrillation.
Preprint
Our manuscript is currently under review. However, if you want to quote it before its publication, please write: Tozzi A, Peters JF. 2016. Towards Equations for Brain Dynamics and the Concept of Extended Connectome. viXra:1609.0045. The brain is a system at the edge of chaos equipped with nonlinear dynamics and functional energetic landscapes. However, doubts still exist concerning the type of attractors or the trajectories followed by particles in the nervous phase space. Starting from an unusual system governed by differential equations in which a dissipative strange attractor coexists with an invariant conservative torus, we developed a 3D model of brain phase space which has the potential to be operationalized and assessed empirically. We achieved a system displaying both a torus and a strange attractor, depending only on the initial conditions. Further, the system generates a funnel-like attractor equipped with a fractal structure. Changes in three easily detectable brain phase parameters (log frequency, log power and fractal slope) lead to modifications in the funnel's tightness or in the superimposition of the two conformations, which explain a large repertoire of brain functions and activities, such as sensations/perceptions, memory and self-generated thoughts. We anticipate that this system of ordinary differential equations will lead to multidimensional brain functional models.
Article
Repeat proteins contain tandem arrays of a small structural motif. In contrast to globular proteins, they are stabilized only by interactions between residues that are close in sequence. The modular structure of repeat proteins makes them ideal systems in which to study protein folding and stability. In this chapter, we review studies that enabled an understanding of key aspects of repeat protein stability and folding: the basis and limits of cooperativity, simple models to quantify and predict energy landscapes, and the connection between equilibrium stability and folding pathways. We also discuss the implications of these studies for the general understanding of protein folding.
Article
Protein energy landscapes are highly complex, yet the vast majority of states within them tend to be invisible to experimentalists. Here, using site-directed mutagenesis and exploiting the simplicity of tandem-repeat protein structures, we delineate a network of these states and the routes between them. We show that our target, gankyrin, a 226-residue 7-ankyrin-repeat protein, can access two alternative (un)folding pathways. We resolve intermediates as well as transition states, constituting a comprehensive series of snapshots that map early and late stages of the two pathways and show both to be polarized such that the repeat array progressively unravels from one end of the molecule or the other. Strikingly, we find that the protein folds via one pathway but unfolds via a different one. The origins of this behavior can be rationalized using the numerical results of a simple statistical mechanics model that allows us to visualize the equilibrium behavior as well as single-molecule folding/unfolding trajectories, thereby filling in the gaps that are not accessible to direct experimental observation. Our study highlights the complexity of repeat-protein folding arising from their symmetrical structures; at the same time, however, this structural simplicity enables us to dissect the complexity and thereby map the precise topography of the energy landscape in full breadth and remarkable detail. That we can recapitulate the key features of the folding mechanism by computational analysis of the native structure alone will help toward the ultimate goal of designed amino-acid sequences with made-to-measure folding mechanisms, the Holy Grail of protein folding.
Article
Myriad biological processes proceed through states that defy characterization by conventional atomic-resolution structural biological methods. The invisibility of these 'dark' states can arise from their transient nature, low equilibrium population, large molecular weight, and/or heterogeneity. Although they are invisible, these dark states underlie a range of processes, acting as encounter complexes between proteins and as intermediates in protein folding and aggregation. New methods have made these states accessible to high-resolution analysis by nuclear magnetic resonance (NMR) spectroscopy, as long as the dark state is in dynamic equilibrium with an NMR-visible species. These methods - paramagnetic NMR, relaxation dispersion, saturation transfer, lifetime line broadening, and hydrogen exchange - allow the exploration of otherwise invisible states in exchange with a visible species over a range of timescales, each taking advantage of some unique property of the dark state to amplify its effect on a particular NMR observable. In this review, we introduce these methods and explore two specific techniques - paramagnetic relaxation enhancement and dark state exchange saturation transfer - in greater detail.
Article
The folding behaviors and mechanisms of large multidomain proteins have remained largely uncharacterized, primarily because of the lack of appropriate research methods. To address these limitations, novel mechanical folding probes have been developed that are based on antiparallel coiled-coil polypeptides. Such probes can be conveniently inserted at the DNA level, at different positions within the protein of interest where they minimally disturb the host protein structure. During single-molecule force spectroscopy measurements, the forced unfolding of the probe captures the progress of the unfolding front through the host protein structure. This novel approach allows unfolding pathways of large proteins to be directly identified. As an example, this probe was used in a large multidomain protein with ten identical ankyrin repeats, and the unfolding pathway, its direction, and the order of sequential unfolding were unequivocally and precisely determined. This development facilitates the examination of the folding pathways of large proteins, which are predominant in the proteomes of all organisms, but have thus far eluded study because of the technical limitations encountered when using traditional techniques.
Article
IκBα inhibits the transcription factor NFκB by forming a very tightly bound complex in which the ankyrin repeat domain (ARD) of IκBα interacts primarily with the dimerization domain of NFκB. The first four ankyrin repeats (ARs) of the IκBα ARD are well-folded, but the AR5-6 region is intrinsically disordered according to amide H/D exchange and protein folding/unfolding experiments. We previously showed that mutations towards the consensus sequence for stable ankyrin repeats resulted in a "prefolded" mutant. To investigate whether the consensus mutations were solely able to order the AR5-6 region, we used a predictor of disordered protein regions, PONDR VL-XT, to select mutations that would alter the intrinsic disorder towards a more ordered structure (D → O mutants). The algorithm predicted two mutations, E282W and P261F, neither of which corresponds to the consensus sequence for ankyrin repeats. Amide exchange and CD were used to assess ordering. Although only the E282W was predicted to be more ordered by CD and amide exchange, stopped-flow fluorescence studies showed that both of the D → O mutants were less efficient at dissociating NFκB from DNA.
Article
The folding mechanisms of proteins with multi-state transitions, the role of the intermediate states, and the precise mechanism by which each transition occurs are significant ongoing research issues. In this study, we investigate ferredoxin-like fold proteins, which have a simple topology and multi-state transitions. We analyze the folding processes by means of a coarse-grained Gō model. We are able to reproduce the differences in the folding mechanisms between U1A, which has a high-free-energy intermediate state, and ADA2h and S6, which fold into the native structure through two-state transitions. The folding pathways of U1A, ADA2h, S6, and the S6 circular permutant, S6_p54-55, are reproduced and compared with experimental observations. We show that the ferredoxin-like fold contains two common regions constituting folding cores, as predicted in other studies, and that U1A produces an intermediate state due to the distinct cooperative folding of each core. However, because one of the cores of S6 loses its cooperativity and the two cores of ADA2h are tightly coupled, these proteins fold into the native structure through a two-state mechanism. Proteins 2013. © 2013 Wiley Periodicals, Inc.
Article
In this issue of Structure, Tsytlonok and colleagues describe the folding landscape of the giant HEAT-repeat protein PR65/A (a molecular adaptor of protein phosphatase 2A) by using experimental and theoretical methods. Both approaches agree in suggesting the presence of parallel folding pathways with several intermediates.
Article
Researchers in the field of rational protein design face a significant challenge, which arises from the two defining and inter-related features of typical globular protein structures, namely topological complexity and cooperativity. In striking contrast to globular proteins, tandem repeat proteins, such as ankyrin, tetratricopeptide and leucine-rich repeats, have regular, modular, linearly arrayed structures which makes it especially straightforward to dissect and redesign their properties. Here we review what we have learnt about the biophysics of natural repeat proteins and recent progress in applying that knowledge to engineer the thermodynamics, folding pathways and molecular recognition properties of tandem repeat proteins, and we discuss the wealth of possibilities presented for the extension of this modular construction process to build new molecules for use in medicine and biotechnology.
Article
Ankyrin repeat proteins (ARPs) are ubiquitous proteins that play critical regulatory roles in organisms and consist of repeating motifs (ankyrin repeats) stacked in non-globular, almost linear, “quasi one-dimensional” configurations. They also have highly unusual mechanical properties; notably, ARPs can behave as nano-springs. Both their essential cellular functions and distinctive nano-mechanical properties have aroused interest in ARPs for potential applications in medicine and nanotechnology. Further, the modular architecture of ARPs, which lack the long-range contacts that typically stabilize globular proteins, provides a new paradigm for understanding protein stability and folding mechanisms. In the present study, the stability of ARP p18 (p18) and fifty p18 fragments was investigated by all-atom molecular dynamics (MD) simulations in explicit water on a ∼3.3 microsecond timescale. The fragment simulations indicate that p18 α-helices are significantly stabilized by tertiary interactions, because in the absence of their native context they readily melt. All single p18 ARs and their structural elements are also unstable outside their native context. The minimal stable motifs are pairs of ARs, implying that inter-repeat contacts are essential for AR stability. Further, pairs of internal ARs are less stable than pairs that include a native capping AR. The MD simulations also provide indications of the functional roles of p18 turns and loops; the turns appear to be essential for the stability of the protein, while the loops both help to stabilize the p18 structure and are involved in recognition processes. Temperature-induced unfolding analysis shows that p18 melts from the N-terminus to the C-terminus.
Article
In this chapter we review recent studies of repeat proteins, a class of proteins consisting of tandem arrays of small structural motifs that stack approximately linearly to produce elongated structures. We discuss the observation that, despite lacking the long-range tertiary interactions that are thought to be the hallmark of globular protein stability, repeat proteins can be as stable and as cooperatively folded as their globular counterparts. The symmetry inherent in the structures of repeat arrays, however, means there can be many partly folded species (whether intermediates or transition states) that have similar stabilities. Consequently, they do have distinct folding properties compared with globular proteins, and these are manifest in their behaviour both at equilibrium and under kinetic conditions. Thus, when studying repeat proteins one appears to be probing a moving target: a relatively small perturbation, by mutation for example, can result in a shift to a different intermediate or transition state. The growing literature on these proteins illustrates how their modular architecture can be adapted to a remarkable array of biological and physical roles, both in vivo and in vitro. Further, their simple architecture makes them uniquely amenable to redesign of their stability, folding and function, promising exciting possibilities for future research.
Article
The 33-amino-acid ankyrin motif comprises a β-turn followed by two anti-parallel α-helices and a loop, and tandem arrays of the motif pack in a linear fashion to produce elongated structures characterized by short-range interactions. In this article we use site-directed mutagenesis to investigate the kinetic unfolding mechanism of D34, a 426-residue, 12-ankyrin repeat fragment of the protein ankyrinR. The data are consistent with a model in which the N-terminal half of the protein unfolds first by unraveling progressively from the start of the polypeptide chain to form an intermediate; in the next step, the C-terminal half of the protein unfolds via two pathways whose transition states have either the early or the late C-terminal ankyrin repeats folded. We conclude that the two halves of the protein unfold by different mechanisms because the N-terminal moiety folds and unfolds in the context of a folded C-terminal moiety, which therefore acts as a “seed” and confers a unique directionality on the process, whereas the C-terminal moiety folds and unfolds in the context of an unfolded N-terminal moiety and therefore behaves like a single-domain ankyrin repeat protein, having a high degree of symmetry and consequently more than one unfolding pathway accessible to it. Keywords: parallel pathways, protein engineering, protein folding, D34.
Article
The overall structure of the transition state and intermediate ensembles experimentally observed for Dihydrofolate Reductase and Interleukin-1beta can be obtained utilizing simplified models which have almost no energetic frustration. The predictive power of these models suggests that, even for these very large proteins with completely different folding mechanisms and functions, real protein sequences are sufficiently well designed and much of the structural heterogeneity observed in the intermediates and the transition state ensemble is determined by topological effects.
Article
To understand the kinetics of protein folding, we introduce the concept of a "transition coordinate", which is defined to be the coordinate along which the system progresses most slowly. As a practical implementation of this concept, we define the transmission coefficient for any conformation to be the probability for a chain with the given conformation to fold before it unfolds. Since the transmission coefficient can serve as the best possible measure of kinetic distance for a system, we present two methods by which we can determine how closely any parameter of the system approximates the transmission coefficient. As we determine that the transmission coefficient for a short-chain heteropolymer system is dominated by entropic factors, we have chosen to illustrate the methods mentioned by applying them to geometrical properties of the system such as the number of native contacts and the loop-length distribution. We find that these coordinates are not good approximations of the transmission coefficient and therefore cannot adequately describe the kinetics of protein folding.
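The definition of the transmission coefficient given above (the probability that a chain in a given conformation folds before it unfolds) can be illustrated numerically. The following is a minimal sketch on a toy model, not the heteropolymer system of the cited work: a one-dimensional random walk on sites 0 to n_sites, where site 0 stands for the unfolded state and site n_sites for the folded state. All names and parameters are hypothetical.

```python
import random

def estimate_pfold(start, n_sites, n_trials=2000, p_forward=0.5, seed=0):
    """Estimate the probability of reaching n_sites (folded) before 0 (unfolded).

    Many independent trial walks are launched from `start`; the returned value
    is the fraction of them that are absorbed at the folded boundary.
    """
    rng = random.Random(seed)
    folded = 0
    for _ in range(n_trials):
        x = start
        # Walk until one of the two absorbing boundaries is hit
        while 0 < x < n_sites:
            x += 1 if rng.random() < p_forward else -1
        if x == n_sites:
            folded += 1
    return folded / n_trials

def exact_pfold_unbiased(start, n_sites):
    """Analytical result for the unbiased walk (p_forward = 0.5)."""
    return start / n_sites

if __name__ == "__main__":
    for start in (2, 5, 8):
        print(start, estimate_pfold(start, 10), exact_pfold_unbiased(start, 10))
```

In this toy case the position itself is a perfect reaction coordinate, so the estimate tracks start / n_sites; the abstract's point is that for a real chain, simple geometric coordinates need not track this probability so well.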
Article
A lattice model of a protein is studied by a Monte Carlo simulation method. The native conformation of the lattice protein molecule is stabilized by specific long-range and short-range interactions. By comparing results of simulation for different relative weights of the long- and short-range interactions, it is concluded that the specific long-range interactions are essential for highly cooperative stabilization of the native conformation and that the short-range interactions accelerate the folding and unfolding transitions.
Article
The theory of spin glasses was used to study a simple model of protein folding. The phase diagram of the model was calculated, and the results of dynamics calculations are briefly reported. The relation of these results to folding experiments, the relation of these hypotheses to previous protein folding theories, and the implication of these hypotheses for protein folding prediction schemes are discussed.
Article
Ankyrin repeats are a 33-amino acid motif present in a number of proteins of diverse functions including transcription factors, cell differentiation molecules, and structural proteins. This motif has been shown to mediate protein interactions in the case of ankyrin as well as several other repeat-bearing proteins. In ankyrin, 24 tandemly arrayed repeats are arranged to form a globular, membrane-binding domain. This report provides evidence that the repeats in this domain fold into four independently folded subdomains of six repeats each. Limited proteolytic digestions of defined regions of the membrane-binding domain identified protease-sensitive sites, which divided this domain into subdomains of approximately six repeats each. Hydrodynamic measurements and circular dichroism spectroscopy of expressed subdomains confirmed that these six-repeat regions exist as folded, globular structures. The requirement of a complete set of six repeats for proper folding was determined using a series of protein constructs, which sequentially deleted repeats from the last subdomain. Deletion of even one repeat resulted in a 40% loss of alpha-helicity. Deletions removing three or more repeats abolished the helical signal completely. The spherical shapes of the intact domain and of the subdomains (inferred from hydrodynamic values) suggest that the four subdomains are organized in either a tetrahedral or square planar configuration. Two six-repeat subdomains were found to be required for high affinity association with the anion exchanger, suggesting that at least some of the protein interactions mediated by ankyrin repeats involve multiple subdomains.
Article
The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.
Article
Contents: 1. Introduction. 2. Levinthal's paradox and energy landscapes (2.1 Including randomness in the energy function; 2.2 Some effects of energetic correlations between structurally similar states). 3. Resolution of problems by funnel theory (3.1 Physical origin of free-energy barriers). 4. Generic mechanisms in folding (4.1 Collapse, generic and specific; 4.2 Helix formation; 4.3 Nematic ordering; 4.4 Microphase separation). 5. Signatures of a funneled energy landscape. 6. Statistical Hamiltonians and self-averaging. 7. Conclusions and future prospects. 8. Acknowledgments. 9. Appendix: Glossary of terms. 10. References.
The current explosion of research in molecular biology was made possible by the profound discovery that hereditary information is stored and passed on in the simple, one-dimensional (1D) sequence of DNA base pairs (Watson & Crick, 1953). The connection between heredity and biological function is made through the transmission of this 1D information, through RNA, to the protein sequence of amino acids. The information contained in this sequence is now known to be sufficient to completely determine a protein's geometrical 3D structure, at least for simpler proteins which are observed to reliably refold when denatured in vitro, i.e. without the aid of any cellular machinery such as chaperones or steric (geometrical) constraints due to the presence of a ribosomal surface (for example Anfinsen, 1973) (see Fig. 1). Folding to a specific structure is typically a prerequisite for a protein to function, and structural and functional probes are both often used in the laboratory to test for the in vitro yield of folded proteins in an experiment.
Article
We describe a method for predicting the structure of alpha beta class proteins in the absence of information from homologous structures. The method is based on an associative memory model for short to intermediate range in sequence contacts and a contact potential for long range in sequence contacts. The coefficients in the energy function are chosen to maximize the ratio of the folding temperature to the glass transition temperature. We use the resulting optimized model to predict the structure of three alpha beta protein domains ranging in length from 81 to 115 residues. The resulting predictions align with low rms deviations to large portions of the native state. We have also calculated the free energy as a function of similarity to the native state for one of these three domains, and we show that, as expected from the optimization criteria, the free energy surface resembles a rough funnel to the native state. Finally, we briefly demonstrate the effect of roughness in the energy landscape on the dynamics.
Article
Ankyrin repeat (AR) proteins mediate innumerable protein-protein interactions in virtually all phyla. This finding suggested the use of AR proteins as designed binding molecules. Based on sequence and structural analyses, we designed a consensus AR with fixed framework and randomized interacting residues. We generated several combinatorial libraries of AR proteins consisting of defined numbers of this repeat. Randomly chosen library members are expressed in soluble form in the cytoplasm of Escherichia coli constituting up to 30% of total cellular protein and show high thermodynamic stability. We determined the crystal structure of one of those library members to 2.0-Å resolution, providing insight into the consensus AR fold. Besides the highly complementary hydrophobic repeat-repeat interfaces and the absence of structural irregularities in the consensus AR protein, the regular and extended hydrogen bond networks in the beta-turn and loop regions are noteworthy. Furthermore, all residues found in the turn region of the Ramachandran plot are glycines. Many of these features also occur in natural AR proteins, but not in this rigorous and standardized fashion. We conclude that the AR domain fold is an intrinsically very stable and well-expressed scaffold, able to display randomized interacting residues. This scaffold represents an excellent basis for the design of novel binding molecules.
Article
Protein recognition and binding, which result in either transient or long-lived complexes, play a fundamental role in many biological functions, but sometimes also result in pathologic aggregates. We use a simplified simulation model to survey a range of systems where two highly flexible protein chains form a homodimer. In all cases, this model, which corresponds to a perfectly funneled energy landscape for folding and binding, reproduces the macroscopic experimental observations on whether folding and binding are coupled in one step or whether intermediates occur. Owing to the minimal frustration principle, we find that, as in the case of protein folding, the native topology is the major factor that governs the choice of binding mechanism. Even when the monomer is stable on its own, binding sometimes occurs fastest through unfolded intermediates, thus showing the speedup envisioned in the fly-casting scenario for molecular recognition.
Article
We report here the evolution of ankyrin repeat (AR) proteins in vitro for specific, high-affinity target binding. Using a consensus design strategy, we generated combinatorial libraries of AR proteins of varying repeat numbers with diversified binding surfaces. Libraries of two and three repeats, flanked by 'capping repeats,' were used in ribosome-display selections against maltose binding protein (MBP) and two eukaryotic kinases. We rapidly enriched target-specific binders with affinities in the low nanomolar range and determined the crystal structure of one of the selected AR proteins in complex with MBP at 2.3 Å resolution. The interaction relies on the randomized positions of the designed AR protein and is comparable to natural, heterodimeric protein-protein interactions. Thus, our AR protein libraries are valuable sources for binding molecules and, because of the very favorable biophysical properties of the designed AR proteins, an attractive alternative to antibody libraries.
Article
The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.
Article
In the past few years, new approaches to Monte Carlo simulations have produced substantial improvements in the efficiency of both simulation techniques and data analysis. This paper will focus on the recent renewal of interest in histogram methods and the new developments in this field. This approach to data analysis has proven very effective in improving the efficiency and ultimate accuracy of Monte Carlo calculations. The methods will be described along with several new applications.
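As an illustration of the kind of histogram method this abstract refers to, the sketch below reweights samples collected at one inverse temperature to estimate an average at a nearby one. It assumes the single-histogram reweighting idea (each sample weighted by exp(−(β − β0)·E)); the function name, arguments, and sample values are hypothetical and are not taken from the paper.

```python
import math

def reweight_mean(energies, observable, beta0, beta):
    """Estimate the mean of `observable` at inverse temperature `beta`
    from samples that were generated at inverse temperature `beta0`.

    Assumption: single-histogram reweighting, i.e. each sample i gets the
    weight exp(-(beta - beta0) * E_i).
    """
    # Work with log-weights and subtract the maximum for numerical stability.
    log_w = [-(beta - beta0) * e for e in energies]
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]
    numerator = sum(w * o for w, o in zip(weights, observable))
    return numerator / sum(weights)

# Hypothetical energy samples drawn at beta0 = 1.0.
samples = [1.0, 2.0, 3.0]
mean_at_beta0 = reweight_mean(samples, samples, beta0=1.0, beta=1.0)
mean_at_colder = reweight_mean(samples, samples, beta0=1.0, beta=2.0)
```

With `beta == beta0` the estimate reduces to the plain sample mean; moving to a larger `beta` (lower temperature) shifts weight toward low-energy samples. The estimate is only reliable when the two temperatures are close enough that the sampled energies overlap.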
Article
IκBα regulates the transcription factor NF-κB through the formation of stable IκBα/NF-κB complexes. Prior to induction, IκBα retains NF-κB in the cytoplasm until the NF-κB activation signal is received. After activation, NF-κB is removed from gene promoters through association with nuclear IκBα, restoring the preinduction state. The 2.3 Å crystal structure of IκBα in complex with the NF-κB p50/p65 heterodimer reveals mechanisms of these inhibitory activities. The presence of IκBα allows large en bloc movement of the NF-κB p65 subunit amino-terminal domain. This conformational change induces allosteric inhibition of NF-κB DNA binding. Amino acid residues immediately preceding the nuclear localization signals of both NF-κB p50 and p65 subunits are tethered to the IκBα amino-terminal ankyrin repeats, impeding NF-κB from nuclear import machinery recognition.
Article
The inhibitory protein, IκBα, sequesters the transcription factor, NF-κB, as an inactive complex in the cytoplasm. The structure of the IκBα ankyrin repeat domain, bound to a partially truncated NF-κB heterodimer (p50/p65), has been determined by X-ray crystallography at 2.7 Å resolution. It shows a stack of six IκBα ankyrin repeats facing the C-terminal domains of the NF-κB Rel homology regions. Contacts occur in discontinuous patches, suggesting a combinatorial quality for ankyrin repeat specificity. The first two repeats cover an α helically ordered segment containing the p65 nuclear localization signal. The position of the sixth ankyrin repeat shows that full-length IκBα will occlude the NF-κB DNA-binding cleft. The orientation of IκBα in the complex places its N- and C-terminal regions in appropriate locations for their known regulatory functions.
Article
The tumour suppressor p16 is a member of the INK4 family of inhibitors of the cyclin D-dependent kinases, CDK4 and CDK6, that are involved in the key growth control pathway of the eukaryotic cell cycle. The 156 amino acid residue protein is composed of four ankyrin repeats (a helix-turn-helix motif) that stack linearly as two four-helix bundles resulting in a non-globular, elongated molecule. The thermodynamic and kinetic properties of the folding of p16 are unusual. The protein has a very low free energy of unfolding in water, ΔG(D-N, H2O), of 3.1 kcal mol−1 at 25 °C. The rate-determining transition state of folding/unfolding is very compact (89% as compact as the native state). The other unusual feature is the very rapid rate of unfolding in the absence of denaturant of 0.8 s−1 at 25 °C. Thus, p16 has both thermodynamic and kinetic instability. These features may be essential for the regulatory function of the INK4 proteins and of other ankyrin-repeat-containing proteins that mediate a wide range of protein-protein interactions. The mechanisms of inactivation of p16 by eight cancer-associated mutations were dissected using a systematic method designed to probe the integrity of the secondary structure and the global fold. The structure and folding of p16 appear to be highly vulnerable to single point mutations, probably as a result of the protein's low stability. This vulnerability provides one explanation for the striking frequency of p16 mutations in tumours and in immortalised cell lines.
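To put the two numbers quoted above in context, the following sketch converts the free energy of unfolding into an equilibrium constant and, under an assumed simple two-state model, into an implied folding rate. The two-state assumption and all function names are illustrative choices, not statements from the abstract.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def folding_equilibrium_constant(dg_unfold_kcal, temp_k):
    """K = [native]/[denatured] from the free energy of unfolding."""
    return math.exp(dg_unfold_kcal / (R_KCAL * temp_k))

def two_state_folding_rate(k_unfold, dg_unfold_kcal, temp_k):
    """Folding rate implied by K = k_fold / k_unfold (two-state assumption)."""
    return k_unfold * folding_equilibrium_constant(dg_unfold_kcal, temp_k)

temp_k = 298.15                      # 25 degrees C
k_eq = folding_equilibrium_constant(3.1, temp_k)
fraction_unfolded = 1.0 / (1.0 + k_eq)
k_fold = two_state_folding_rate(0.8, 3.1, temp_k)  # per second
```

Under these assumptions the native state is favored by a factor of a little under 200, so roughly half a percent of molecules are unfolded at equilibrium, and the implied folding rate is about 150 s−1.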
Article
IκBα inhibits transcription factor NF-κB activity by specific binding to NF-κB heterodimers composed of p65 and p50 subunits. It binds with slightly lower affinity to p65 homodimers and with significantly lower affinity to homodimers of p50. We have employed a structure-based mutagenesis approach coupled with protein–protein interaction assays to determine the source of this dimer selectivity exhibited by IκBα. Mutation of amino acid residues in IκBα that contact NF-κB only marginally affects complex binding affinity, indicating a lack of hot spots in NF-κB/IκBα complex formation. Conversion of the weak binding NF-κB p50 homodimer into a high affinity binding partner of IκBα requires transfer of both the NLS polypeptide and amino acid residues Asn202 and Ser203 from the NF-κB p65 subunit. Involvement of Asn202 and Ser203 in complex formation is surprising as these amino acid residues occupy solvent exposed positions at a distance of 20 Å from IκBα in the crystal structures. However, the same amino acid residue positions have been genetically isolated as determinants of binding specificity in a homologous system in Drosophila. X-ray crystallographic and solvent accessibility experiments suggest that these solvent-exposed amino acid residues contribute to NF-κB/IκBα complex formation by modulating the NF-κB p65 subunit NLS polypeptide.
Article
Recent experimental results suggest that the native fold, or topology, plays a primary role in determining the structure of the transition state ensemble, at least for small, fast-folding proteins. To investigate the extent of the topological control of the folding process, we studied the folding of simplified models of five small globular proteins constructed using a Gō-like potential to retain the information about the native structures but drastically reduce the energetic frustration and energetic heterogeneity among residue-residue native interactions. By comparing the structure of the transition state ensemble (experimentally determined by Φ-values) and of the intermediates with those obtained using our models, we show that these energetically unfrustrated models can reproduce the global experimentally known features of the transition state ensembles and "en-route" intermediates, at least for the analyzed proteins. This result clearly indicates that, as long as the protein sequence is sufficiently minimally frustrated, topology plays a central role in determining the folding mechanism.
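A structure-based model of this kind depends only on which residue pairs are in contact in the native structure. The sketch below shows one minimal way to build such a contact list from Cα coordinates and to score a conformation by its fraction of native contacts, Q. The distance cutoff, sequence-separation threshold, and tolerance factor are assumed values chosen for illustration, and the coordinates are a toy example rather than a real protein.

```python
import math

def native_contacts(native_coords, cutoff=8.0, min_separation=3):
    """List residue pairs (i, j) that are within `cutoff` angstroms in the
    native structure and at least `min_separation` apart in sequence.
    Both thresholds are assumptions for this sketch."""
    n = len(native_coords)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
        if math.dist(native_coords[i], native_coords[j]) <= cutoff
    ]

def fraction_native(coords, native_coords, contacts, tolerance=1.2):
    """Q: fraction of native contacts whose current distance is within
    `tolerance` times the native distance."""
    formed = sum(
        1
        for i, j in contacts
        if math.dist(coords[i], coords[j])
        <= tolerance * math.dist(native_coords[i], native_coords[j])
    )
    return formed / len(contacts)

# Toy six-residue "native" fold and a fully extended chain (hypothetical data).
native = [(0, 0, 0), (3.8, 0, 0), (3.8, 3.8, 0),
          (0, 3.8, 0), (0, 3.8, 3.8), (3.8, 3.8, 3.8)]
extended = [(3.8 * i, 0, 0) for i in range(6)]

contacts = native_contacts(native)
q_native = fraction_native(native, native, contacts)
q_extended = fraction_native(extended, native, contacts)
```

Q equals 1 for the native structure and drops toward 0 as contacts break, which is why it is commonly used as a reaction coordinate in this type of simulation.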
Article
The INK4 (inhibitor of cyclin-dependent kinase 4) family consists of four tumor-suppressor proteins: p15INK4B, p16INK4A, p18INK4C, and p19INK4D. While their sequences and structures are highly homologous, they show appreciable differences in conformational flexibility, stability, and aggregation tendency. Here, p16 and p18 were first compared directly by NMR for line broadening and disappearance, then investigated by three different approaches in search of the causes of these differences. From denaturation experiments it was found that both proteins are marginally stable with low denaturation stability (1.94 and 2.98 kcal/mol, respectively). Heteronuclear 1H-15N nuclear Overhauser enhancement measurements revealed very limited conformational flexibility on the pico- to nanosecond time-scale for both p16 and p18. H/2H exchange of amide protons monitored by NMR on three proteins (p16, p18 as well as p15), however, revealed markedly different rates in the order p18<p16⩽p15. A subset of very slowly exchanging residues (about 19 in total) was identified in p18, including 16 residues in the region of the fourth ankyrin repeat, probably as a result of a stabilizing effect by the extra ankyrin repeat. Thus, while INK4 proteins may have similar low thermodynamic stability as well as limited flexibility on the pico- to nanosecond time-scale, they display pronounced differences in the conformational flexibility on the time-scale of minutes to hours. Further analyses suggested that differences in H/2H exchange rates reflect differences in the kinetic stability of the INK4 proteins, which in turn is related to differences in the aggregation tendency.
Article
The HO gene of Saccharomyces cerevisiae encodes the endonuclease that initiates mating-type switching. To prevent inopportune switching, HO transcription is restricted to a specific period in the haploid cell cycle, which is just after, and dependent on, the start of the mitotic cell cycle. A repeated promoter element (CACGA4) (refs 7-9) and two trans-acting activators (SWI4 and SWI6) have been identified, which are responsible for the periodic and start-dependent transcription of HO. To understand further the link between start and HO transcription, the SWI6 gene has been cloned and sequenced. The SWI6 protein is similar to the protein in Schizosaccharomyces pombe that is encoded by cdc10, an essential gene specifically required at the start of the cell cycle. The similarity between the SWI6 and cdc10 products, and their common involvement with 'start', suggests that they may share a common mechanism for sensing or executing this critical control step in the cell cycle. The SWI6 and cdc10 proteins also contain two copies of a repeated motif that occurs at least five times in the cytoplasmic domain of the Notch protein of Drosophila melanogaster.
Article
In cells that do not express immunoglobulin kappa light chain genes, the kappa enhancer binding protein NF-kappa B is found in cytosolic fractions and exhibits DNA binding activity only in the presence of a dissociating agent such as sodium deoxycholate. The dependence on deoxycholate is shown to result from association of NF-kappa B with a 60- to 70-kilodalton inhibitory protein (I kappa B). The fractionated inhibitor can inactivate NF-kappa B from various sources, including the nuclei of phorbol ester-treated cells, in a specific, saturable, and reversible manner. The cytoplasmic localization of the complex of NF-kappa B and I kappa B was supported by enucleation experiments. An active phorbol ester must therefore, presumably by activation of protein kinase C, cause dissociation of a cytoplasmic complex of NF-kappa B and I kappa B by modifying I kappa B. This releases active NF-kappa B, which can translocate into the nucleus to activate target enhancers. The data show the existence of a phorbol ester-responsive regulatory protein that acts by controlling the DNA binding activity and subcellular localization of a transcription factor.
Article
The understanding, and even the description, of protein folding is impeded by the complexity of the process. Much of this complexity can be described and understood by taking a statistical approach to the energetics of protein conformation, that is, to the energy landscape. The statistical energy landscape approach explains when and why unique behaviors, such as specific folding pathways, occur in some proteins and more generally explains the distinction between folding processes common to all sequences and those peculiar to individual sequences. This approach also gives new, quantitative insights into the interpretation of experiments and simulations of protein folding thermodynamics and kinetics. Specifically, the picture provides simple explanations for folding as a two-state first-order phase transition, for the origin of metastable collapsed unfolded states and for the curved Arrhenius plots observed in both laboratory experiments and discrete lattice simulations. The relation of these quantitative ideas to folding pathways, to uniexponential vs. multiexponential behavior in protein folding experiments and to the effect of mutations on folding is also discussed. The success of energy landscape ideas in protein structure prediction is also described. The use of the energy landscape approach for analyzing data is illustrated with a quantitative analysis of some recent simulations, and a qualitative analysis of experiments on the folding of three proteins. The work unifies several previously proposed ideas concerning the mechanism of protein folding and delimits the regions of validity of these ideas under different thermodynamic conditions.
Article
Based on pattern searches and systematic database screening, almost 650 different ankyrin-like (ANK) repeats from nearly all phyla have been identified; more than 150 of them are reported here for the first time. Their presence in functionally diverse proteins such as enzymes, toxins, and transcription factors strongly suggests domain shuffling, but their occurrence in prokaryotes and yeast excludes exon shuffling. The spreading mechanism remains unknown, but in at least three cases horizontal gene transfer appears to be involved. ANK repeats occur in at least four consecutive copies. The terminal repeats are more variable in sequence. One feature of the internal repeats is a predicted central hydrophobic alpha-helix, which is likely to interact with other repeats. The functions of the ankyrin-like repeats are compatible with a role in protein-protein interactions.
Article
The division cycle of eukaryotic cells is regulated by a family of protein kinases known as the cyclin-dependent kinases (CDKs). The sequential activation of individual members of this family and their consequent phosphorylation of critical substrates promotes orderly progression through the cell cycle. The complexes formed by CDK4 and the D-type cyclins have been strongly implicated in the control of cell proliferation during the G1 phase. CDK4 exists, in part, as a multi-protein complex with a D-type cyclin, proliferating cell nuclear antigen and a protein, p21 (refs 7-9). CDK4 associates separately with a protein of M(r) 16K, particularly in cells lacking a functional retinoblastoma protein. Here we report the isolation of a human p16 complementary DNA and demonstrate that p16 binds to CDK4 and inhibits the catalytic activity of the CDK4/cyclin D enzymes. p16 seems to act in a regulatory feedback circuit with CDK4, D-type cyclins and retinoblastoma protein.
Article
The transcription factor NF-kappa B has attracted widespread attention among researchers in many fields based on the following: its unusual and rapid regulation, the wide range of genes that it controls, its central role in immunological processes, the complexity of its subunits, and its apparent involvement in several diseases. A primary level of control for NF-kappa B is through interactions with an inhibitor protein called I kappa B. Recent evidence confirms the existence of multiple forms of I kappa B that appear to regulate NF-kappa B by distinct mechanisms. NF-kappa B can be activated by exposure of cells to LPS or inflammatory cytokines such as TNF or IL-1, viral infection or expression of certain viral gene products, UV irradiation, B or T cell activation, and by other physiological and nonphysiological stimuli. Activation of NF-kappa B to move into the nucleus is controlled by the targeted phosphorylation and subsequent degradation of I kappa B. Exciting new research has elaborated several important and unexpected findings that explain mechanisms involved in the activation of NF-kappa B. In the nucleus, NF-kappa B dimers bind to target DNA elements and activate transcription of genes encoding proteins involved with immune or inflammation responses and with cell growth control. Recent data provide evidence that NF-kappa B is constitutively active in several cell types, potentially playing unexpected roles in regulation of gene expression. In addition to advances in describing the mechanisms of NF-kappa B activation, excitement in NF-kappa B research has been generated by the first report of a crystal structure for one form of NF-kappa B, the first gene knockout studies for different forms of NF-kappa B and of I kappa B, and the implications for therapies of diseases thought to involve the inappropriate activation of NF-kappa B.
Article
Energy landscape theory predicts that the folding funnel for a small fast-folding alpha-helical protein will have a transition state half-way to the native state. Estimates of the position of the transition state along an appropriate reaction coordinate can be obtained from linear free energy relationships observed for folding and unfolding rate constants as a function of denaturant concentration. The experimental results of Huang and Oas for lambda repressor, Fersht and collaborators for CI2, and Gray and collaborators for cytochrome c indicate a free energy barrier midway between the folded and unfolded regions. This barrier arises from an entropic bottleneck for the folding process. In keeping with the experimental results, lattice simulations based on the folding funnel description show that the transition state is not just a single conformation, but rather an ensemble of a relatively large number of configurations that can be described by specific values of one or a few order parameters (e.g. the fraction of native contacts). Analysis of this transition state or bottleneck region from our lattice simulations and from atomistic models for small alpha-helical proteins by Boczko and Brooks indicates a broad distribution for native contact participation in the transition state ensemble centered around 50%. Importantly, however, the lattice-simulated transition state ensemble does include some particularly hot contacts, as seen in the experiments, which have been termed by others a folding nucleus. Linear free energy relations provide a crude spectroscopy of the transition state, allowing us to infer the values of a reaction coordinate based on the fraction of native contacts. This bottleneck may be thought of as a collection of delocalized nuclei where different native contacts will have different degrees of participation. The agreement between the experimental results and the theoretical predictions provides strong support for the landscape analysis.
Article
The recent elucidation of protein structures based upon repeating amino acid motifs, including the armadillo motif, the HEAT motif and tetratricopeptide repeats, reveals that they belong to the class of helical repeat proteins. These proteins share the common property of being assembled from tandem repeats of an alpha-helical structural unit, creating extended superhelical structures that are ideally suited to create a protein recognition interface.
Article
The ankyrin repeat is one of the most common protein sequence motifs. Recent X-ray and NMR structures of ankyrin-repeat proteins and their complexes have provided invaluable insights into the molecular basis of the extraordinary variety of biological activities of these molecules. In particular, they have begun to reveal how a large family of structurally related proteins can interact specifically with such a diverse array of macromolecular targets.
Article
The polypeptide chains that make up proteins have thousands of atoms and hence millions of possible inter-atomic interactions. It might be supposed that the resulting complexity would make prediction of protein structure and protein-folding mechanisms nearly impossible. But the fundamental physics underlying folding may be much simpler than this complexity would lead us to expect: folding rates and mechanisms appear to be largely determined by the topology of the native (folded) state, and new methods have shown great promise in predicting protein-folding mechanisms and the three-dimensional structures of proteins.
Article
The ankyrin repeat is an abundant, 33 residue sequence motif that forms a consecutive β-hairpin-helix-loop-helix (β2α2) fold. Most ankyrin repeat proteins consist of four or more complete repeats, which provide stabilizing interactions between adjacent modules. The cyclin-dependent kinase inhibitor and tumor suppressor p16INK4 (p16) is one of the smallest ankyrin repeat proteins with a known structure. It consists of four complete repeats plus short N and C-terminal flanking regions that are unstructured in solution. On the basis of preliminary proteolysis studies and predictions using a computer algorithm for identifying autonomous folding units, we have identified a fragment consisting of the third and fourth ankyrin repeats of p16, called p16C, that can fold independently, without the rest of the protein. Far-UV circular dichroism studies showed that p16C has a significant level of α-helical secondary structure, and two proline substitutions that disrupt the α-helical secondary structure in wild-type p16 disrupt the secondary structure in p16C. The thermal denaturation of p16C is cooperative and reversible, with a midpoint of transition at 30.5 (±1) °C. From urea-induced denaturation studies, the free energy of unfolding for p16C was estimated to be 1.7 (±0.3) kcal/mol at 20 °C. 1H-15N 2D NMR studies suggest that the ankyrin repeats in p16C are likely to fold into a structure similar to that of full-length p16. In order to define the minimum autonomous folding unit in p16, we have further dissected p16C into two complementary peptides, each containing a single ankyrin repeat. These peptides are unstructured in solution. Thus, p16C is the smallest ankyrin repeat module that is known to fold independently and, in general, we believe that the two-ankyrin repeat fold could be the minimum structural unit for all ankyrin repeat proteins. We further discuss the significance of p16C in protein folding and engineering.
Article
We perform folding simulations on 18 small proteins using a simple Gō-like protein model and analyze the folding rate constants, characteristics of the transition state ensemble, and those of the denatured states in terms of native topology and chain length. Near the folding transition temperature, the folding rate k_F scales as k_F ∼ exp(−c · RCO · N^0.6), where RCO and N are the relative contact order and number of residues, respectively. Here the topology (RCO) dependence of the rates is close to that found experimentally (k_F ∼ exp(−c · RCO)), while the chain length (N) dependence is in harmony with the predicted scaling property (k_F ∼ exp(−c · N^(2/3))). Thus, this may provide a unified scaling law in folding rates at the transition temperature, k_F ∼ exp(−c · RCO · N^(2/3)). The degree of residual structure in the denatured state is highly correlated with RCO; namely, proteins with smaller RCO tend to have more ordered structure in the denatured state. This is consistent with the observation that many helical proteins, such as myoglobin and protein A, have partial helices in the denatured states. The characteristics of the transition state ensemble calculated by the current model, which uses native topology but not sequence-specific information, are consistent with experimental Φ-value data for about half of the proteins.
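The scaling relation in this abstract can be made concrete with a short sketch. It computes the relative contact order using the commonly used definition (mean sequence separation of native contacts divided by chain length, assumed here because the abstract does not spell it out) and then evaluates ln k_F up to an additive constant. The constant `c` is not given in the abstract, so the default of 1.0 is a placeholder, and the contact list is hypothetical.

```python
def relative_contact_order(contacts, n_residues):
    """RCO: average sequence separation |i - j| over native contacts,
    normalized by the chain length (assumed standard definition)."""
    if not contacts:
        raise ValueError("at least one native contact is required")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * n_residues)

def log_relative_rate(rco, n_residues, c=1.0, exponent=0.6):
    """ln k_F up to an additive constant: -c * RCO * N**exponent.
    `c` is a placeholder value; `exponent` is 0.6 as fitted in the
    abstract, or 2/3 for the proposed unified law."""
    return -c * rco * n_residues ** exponent

# Hypothetical contact list for a 20-residue chain.
toy_contacts = [(1, 5), (2, 10)]
rco = relative_contact_order(toy_contacts, n_residues=20)
slow = log_relative_rate(rco, 20)
fast = log_relative_rate(0.1, 20)
```

In this form a protein with more local contacts (smaller RCO) or a shorter chain is predicted to fold faster, which is the qualitative content of the relation.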
Article
To define the boundaries of the Drosophila Notch ankyrin domain, examine the effects of repeat number on the folding of this domain, and examine the degree to which the modular architecture of ankyrin repeat proteins results in modular stability, we have investigated the thermodynamics of unfolding of polypeptides corresponding to different segments of the ankyrin repeats of Drosophila Notch. We find that a polypeptide containing the six previously identified ankyrin repeats unfolds cooperatively, but is of modest stability. However, inclusion of a putative seventh, C-terminal ankyrin sequence doubles the stability of the Notch ankyrin domain (a 1000-fold increase in the folding equilibrium constant), indicating that the seventh ankyrin repeat is an important part of the Notch ankyrin domain, and demonstrating long-range interactions among ankyrin repeats. This putative seven-repeat polypeptide also shows increases in enthalpy, denaturant dependence (m-value), and heat capacity of unfolding (ΔCp) of around 50% each, suggesting that deletion of the seventh repeat results in partial unfolding of the sixth ankyrin repeat, consistent with spectroscopic and hydrodynamic data reported in the preceding paper [Zweifel, M. E., and Barrick, D. (2001) Biochemistry 40, 14344-14356]. A polypeptide consisting of only the five N-terminal repeats has stability similar to the six-repeat construct, demonstrating that stability is distributed asymmetrically along the ankyrin domain. These data are consistent with highly cooperative two-state folding of these ankyrin polypeptides, despite their modular architecture.
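The link between a fold change in the equilibrium constant and a free energy difference is ΔΔG = RT ln(K2/K1). The sketch below applies it to the 1000-fold figure quoted above. The abstract does not state the temperature, so 25 °C is an assumption, and the function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_from_fold_change(fold_change, temp_k=298.15):
    """Free energy difference (kcal/mol) corresponding to a fold change
    in the folding equilibrium constant. The default temperature of
    298.15 K is an assumption, not a value from the abstract."""
    return R_KCAL * temp_k * math.log(fold_change)

ddg = ddg_from_fold_change(1000.0)
```

At the assumed temperature a 1000-fold change corresponds to roughly 4 kcal/mol, the magnitude of stabilization that the abstract attributes to the seventh repeat.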
Article
Proteins containing stretches of repeating amino acid sequences are prevalent throughout nature, yet little is known about the general folding and assembly mechanisms of these systems. Here we propose myotrophin as a model system to study the folding of ankyrin repeat proteins. Myotrophin is folded over a large pH range and is soluble at high concentrations. Thermal and urea denaturation studies show that the protein displays cooperative two-state folding properties despite its modular nature. Taken together with previous studies on other ankyrin repeat proteins, our data suggest that the two-state folding pathway may be characteristic of ankyrin repeat proteins and other integrated alpha-helical repeat proteins in general.
Article
The ankyrin repeat is one of the most common, modular, protein-protein interaction motifs in nature. To understand the structural determinants of this family of proteins and extract the consensus information that defines the architecture of this motif, we have designed a series of idealized ankyrin repeat proteins containing one, two, three, or four repeats by using statistical analysis of approximately 4,000 ankyrin repeat sequences from the PFAM database. Biophysical and x-ray crystallographic studies of the three and four repeat constructs (3ANK and 4ANK) to 1.26 and 1.5 Å resolution, respectively, demonstrate that these proteins are well-folded, monomeric, display high thermostability, and adopt a very regular, tightly packed ankyrin repeat fold. Mapping the degree of amino acid conservation at each position on the 4ANK structure shows that most nonconserved residues are clustered on the surface of the molecule that has been designated as the binding site in naturally occurring ankyrin repeat proteins. Thus, the consensus amino acid sequence contains all information required to define the ankyrin repeat fold. Our results suggest that statistical analysis and the consensus sequence approach can be used as an effective method to design proteins with complex topologies. These generic ankyrin repeat proteins can serve as prototypes for dissecting the rules of molecular recognition mediated by ankyrin repeats and for engineering proteins with novel biological functions.
Article
The ANK repeat is a ubiquitous 33-residue motif that adopts a β-hairpin helix-loop-helix fold. Multiple tandem repeats stack in a linear manner to produce an elongated structure that is stabilized predominantly by short-range interactions between residues close in sequence. The tumor suppressor p16INK4 consists of four repeats and represents the minimal ANK folding unit. We found from Φ-value analysis that p16 unfolded sequentially. The two N-terminal ANK repeats, which are distorted from the canonical ANK structure in all INK4 proteins and which are important for functional specificity, were mainly unstructured in the rate-limiting transition state for folding/unfolding, while the two C-terminal repeats were fully formed. A sequential unfolding mechanism could have implications for the cellular fate of wild-type and cancer-associated mutant p16 proteins.
Article
Although they are widely distributed across kingdoms and are involved in a myriad of essential processes, until recently, repeat proteins have received little attention in comparison to globular proteins. As the name indicates, repeat proteins contain strings of tandem repeats of a basic structural element. In this respect, their construction is quite different from that of globular proteins, in which sequentially distant elements coalesce to form the protein. The different families of repeat proteins use their diverse scaffolds to present highly specific binding surfaces through which protein-protein interactions are mediated. Recent studies seek to understand the stability, folding and design of this important class of proteins.
Article
We describe an efficient way to generate combinatorial libraries of stable, soluble and well-expressed ankyrin repeat (AR) proteins. Using a combination of sequence and structure consensus analyses, we designed a 33 amino acid residue AR module with seven randomized positions having a theoretical diversity of 7.2 × 10^7. Different numbers of this module were cloned between N and C-terminal capping repeats, i.e. ARs designed to shield the hydrophobic core of stacked AR modules. In this manner, combinatorial libraries of designed AR proteins consisting of four to six repeats were generated, thereby potentiating the theoretical diversity. All randomly chosen library members were expressed in soluble form in the cytoplasm of Escherichia coli in amounts up to 200 mg per liter of shake-flask culture. Virtually pure proteins were obtained in a single purification step. The designed AR proteins are monomeric and display CD spectra identical with those of natural AR proteins. At the same time, our AR proteins are highly thermostable, with Tm values ranging from 66 °C to well above 85 °C. Thus, our combinatorial library members possess the properties required for biotechnological applications. Moreover, the favorable biophysical properties and the modularity of the AR fold may account, partly, for the abundance of natural AR proteins.
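The statement that stacking modules potentiates the theoretical diversity can be illustrated with a short calculation. This is a minimal sketch, not the authors' procedure: it assumes that the quoted repeat counts (four to six) include the two capping repeats, that the caps are not randomized, and that each internal module is randomized independently. The function name `library_diversity` is hypothetical.

```python
# Per-module theoretical diversity, as quoted in the abstract above.
MODULE_DIVERSITY = 7.2e7


def library_diversity(total_repeats, n_caps=2):
    """Theoretical diversity of a library with `total_repeats` repeats.

    Assumption: `n_caps` of the repeats are fixed capping repeats and the
    remaining internal modules are randomized independently, so the
    per-module diversities multiply.
    """
    internal = total_repeats - n_caps
    if internal < 1:
        raise ValueError("need at least one randomized internal module")
    return MODULE_DIVERSITY ** internal


if __name__ == "__main__":
    for total in (4, 5, 6):
        print(f"{total} repeats: {library_diversity(total):.3e}")
```

Under these assumptions the diversity grows exponentially with the number of internal modules, which is why even small libraries of four to six repeats cover a very large sequence space.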
Article
The Notch receptor contains a conserved ankyrin repeat domain that is required for Notch-mediated signal transduction. The ankyrin domain of Drosophila Notch contains six ankyrin sequence repeats previously identified as closely matching the ankyrin repeat consensus sequence, and a putative seventh C-terminal sequence repeat that exhibits lower similarity to the consensus sequence. To better understand the role of the Notch ankyrin domain in Notch-mediated signaling and to examine how structure is distributed among the seven ankyrin sequence repeats, we have determined the crystal structure of this domain to 2.0 Å resolution. The seventh, C-terminal, ankyrin sequence repeat adopts a regular ankyrin fold, but the first, N-terminal ankyrin repeat, which contains a 15-residue insertion, appears to be largely disordered. The structure reveals a substantial interface between ankyrin polypeptides, showing a high degree of shape and charge complementarity, which may be related to homotypic interactions suggested from indirect studies. However, the Notch ankyrin domain remains largely monomeric in solution, demonstrating that this interface alone is not sufficient to promote tight association. Using the structure, we have classified reported mutations within the Notch ankyrin domain that are known to disrupt signaling into those that affect buried residues and those restricted to surface residues. We show that the buried substitutions greatly decrease protein stability, whereas the surface substitutions have only a marginal effect on stability. The surface substitutions are thus likely to interfere with Notch signaling by disrupting specific Notch-effector interactions and map the sites of these interactions.
Article
SMART (Simple Modular Architecture Research Tool) is a web tool (http://smart.embl.de/) for the identification and annotation of protein domains, and provides a platform for the comparative study of complex domain architectures in genes and proteins. The January 2004 release of SMART contains 685 protein domains. New developments in SMART are centred on the integration of data from completed metazoan genomes. SMART now uses predicted proteins from complete genomes in its source sequence databases, and integrates these with predictions of orthology. New visualization tools have been developed to allow analysis of gene intron–exon structure within the context of protein domain structure, and to align these displays to provide schematic comparisons of orthologous genes, or multiple transcripts from the same gene. Other improvements include the ability to query SMART by Gene Ontology terms, improved structure database searching and batch retrieval of multiple entries.
Article
The ankyrin repeat is one of the most frequently observed amino acid motifs in protein databases. This protein-protein interaction module is involved in a diverse set of cellular functions, and consequently, defects in ankyrin repeat proteins have been found in a number of human diseases. Recent biophysical, crystallographic, and NMR studies have been used to measure the stability and define the various topological features of this motif in an effort to understand the structural basis of ankyrin repeat-mediated protein-protein interactions. Characterization of the folding and assembly pathways suggests that ankyrin repeat domains generally undergo a two-state folding transition despite their modular structure. Also, the large number of available sequences has allowed the ankyrin repeat to be used as a template for consensus-based protein design. Such projects have been successful in revealing positions responsible for structure and function in the ankyrin repeat as well as creating a potential universal scaffold for molecular recognition.
Article
HIV-1 protease (PR) is a major drug target in combating AIDS, as it plays a key role in maturation and replication of the virus. Six FDA-approved drugs are currently in clinical use, all designed to inhibit enzyme activity by blocking the active site, which exists only in the dimer. An alternative inhibition mode would be required to overcome the emergence of drug-resistance through the accumulation of mutations. This might involve inhibiting the formation of the dimer itself. Here, the folding of HIV-1 PR dimer is studied with several simulation models appropriate for folding mechanism studies. Simulations with an off-lattice Gō-model, which corresponds to a perfectly funneled energy landscape, indicate that the enzyme is formed by association of structured monomers. All-atom molecular dynamics simulations strongly support the stability of an isolated monomer. The conjunction of results from a model that focuses on the protein topology and a detailed all-atom force-field model suggests, in contradiction to some reported equilibrium denaturation experiments, that monomer folding and dimerization are decoupled. The simulation result is, however, in agreement with the recent NMR detection of folded monomers of HIV-1 PR mutants with a destabilized interface. Accordingly, the design of dimerization inhibitors should not focus only on the flexible N and C termini that constitute most of the dimer interface, but also on other structured regions of the monomer. In particular, the relatively high phi values for residues 23-35 and 79-87 in both the folding and binding transition states, together with their proximity to the interface, highlight them as good targets for inhibitor design.
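The idea of an off-lattice Gō-model with a perfectly funneled landscape can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the force field used in the study: it assumes one bead per residue, a distance cutoff to define native contacts, and a 12-10 contact potential whose minimum sits at the native distance. The function names, the 8 Å cutoff and the sequence separation of 4 are all hypothetical choices.

```python
import math


def native_contacts(coords, cutoff=8.0, min_seq_sep=4):
    """List (i, j, r0) for bead pairs closer than `cutoff` in the native structure.

    Pairs closer than `min_seq_sep` along the chain are skipped, since they
    are near each other in every conformation.
    """
    contacts = []
    n = len(coords)
    for i in range(n):
        for j in range(i + min_seq_sep, n):
            r0 = math.dist(coords[i], coords[j])
            if r0 < cutoff:
                contacts.append((i, j, r0))
    return contacts


def go_energy(coords, contacts, epsilon=1.0):
    """Sum an attractive 12-10 term over native contacts only.

    Each term equals -epsilon when the pair sits at its native distance r0,
    so the native structure is the global minimum of this contact energy.
    """
    energy = 0.0
    for i, j, r0 in contacts:
        ratio = r0 / math.dist(coords[i], coords[j])
        energy += epsilon * (5.0 * ratio ** 12 - 6.0 * ratio ** 10)
    return energy


if __name__ == "__main__":
    # Toy six-bead "native" structure (coordinates in angstroms, made up).
    native = [(0, 0, 0), (3.8, 0, 0), (3.8, 3.8, 0),
              (0, 3.8, 0), (0, 3.8, 3.8), (3.8, 3.8, 3.8)]
    contacts = native_contacts(native)
    stretched = [(2 * x, 2 * y, 2 * z) for x, y, z in native]
    print(len(contacts), go_energy(native, contacts), go_energy(stretched, contacts))
```

Because only contacts present in the native structure contribute, no non-native interaction can trap the chain, which is what "perfectly funneled" means in this kind of model.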
Article
The crystal structure of IkappaBalpha in complex with the transcription factor, nuclear factor kappa-B (NF-kappaB) shows six ankyrin repeats, which are all ordered. Electron density was not observed for most of the residues within the PEST sequence, although it is required for high-affinity binding. To characterize the folded state of IkappaBalpha (67-317) when it is not in complex with NF-kappaB, we have carried out circular dichroism (CD) spectroscopy, 8-anilino-1-naphthalenesulphonic acid (ANS) binding, differential scanning calorimetry, and amide hydrogen/deuterium exchange experiments. The CD spectrum shows the presence of helical structure, consistent with other ankyrin repeat proteins. The large amounts of ANS binding and amide exchange suggest that the protein may have molten globule character. The amide exchange experiments show that the third ankyrin repeat is the most compact, the second and fourth repeats are somewhat less compact, and the first and sixth repeats are solvent exposed. The PEST extension is also highly solvent accessible. IkappaBalpha unfolds with a Tm of 42 °C, and forms a soluble aggregate that sequesters helical and variable loop parts of the first, fourth, and sixth repeats and the PEST extension. The second and third repeats, which conform most closely to a consensus for stable ankyrin repeats, appear to remain outside of the aggregate. The ramifications of these observations for the biological function of IkappaBalpha are discussed.
Article
Energy landscapes have been used to conceptually describe and model protein folding but have been difficult to measure experimentally, in large part because of the myriad of partly folded protein conformations that cannot be isolated and thermodynamically characterized. Here we experimentally determine a detailed energy landscape for protein folding. We generated a series of overlapping constructs containing subsets of the seven ankyrin repeats of the Drosophila Notch receptor, a protein domain whose linear arrangement of modular structural units can be fragmented without disrupting structure. To a good approximation, stabilities of each construct can be described as a sum of energy terms associated with each repeat. The magnitude of each energy term indicates that each repeat is intrinsically unstable but is strongly stabilized by interactions with its nearest neighbors. These linear energy terms define an equilibrium free energy landscape, which shows an early free energy barrier and suggests preferred low-energy routes for folding.
Keywords: repeat protein, Notch ankyrin domain, Ising model, energy landscape, protein stability
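The additive, nearest-neighbor picture described in this abstract can be sketched as a one-dimensional Ising-like calculation. This is a minimal illustration, not the fitted model from the study: the energy values are invented placeholders (in arbitrary units), each partly folded state is assumed to be a single contiguous block of folded repeats, and the names `block_free_energy` and `landscape` are hypothetical.

```python
def block_free_energy(intrinsic, interface, start, end):
    """Free energy of folding repeats start..end (inclusive) as one block.

    intrinsic[i] is the folding free energy of repeat i on its own
    (positive means unstable); interface[i] is the coupling between
    repeats i and i + 1 (negative means stabilizing).
    """
    g = sum(intrinsic[start:end + 1])
    g += sum(interface[start:end])  # interfaces inside the block only
    return g


def landscape(intrinsic, interface):
    """Free energy of every contiguous folded block, keyed by (start, end)."""
    n = len(intrinsic)
    return {
        (i, j): block_free_energy(intrinsic, interface, i, j)
        for i in range(n)
        for j in range(i, n)
    }


if __name__ == "__main__":
    # Placeholder values for seven repeats: each repeat alone is unstable,
    # each neighbor interface is strongly stabilizing.
    intrinsic = [6.0] * 7
    interface = [-9.0] * 6
    surface = landscape(intrinsic, interface)
    for size in range(1, 8):
        print(size, surface[(0, size - 1)])
```

With these placeholder numbers a single folded repeat is the highest point on the surface and the free energy falls as neighbors are added, which reproduces qualitatively the early barrier described above. Repeat-specific values would make some routes across the surface lower in energy than others.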