Article

Computational determination of side chain specificity of pockets in class I MHC molecules


Abstract

We show that a rapidly executable computational procedure provides the basis for a predictive understanding of antigenic peptide side chain specificity for binding to class I major histocompatibility complex (MHC) molecules. The procedure consists of a combined search to identify the joint conformations of peptide side chains and the side chains comprising the MHC pocket, followed by conformational selection using a target function based on solvation energies and modified electrostatic energies. The method was applied to the B pocket region of five MHC molecules, which were chosen to encompass the full range of specificities displayed by anchors at peptide position 2. These were a medium hydrophobic residue (Leu or Met) for HLA-A*0201, a basic residue (Arg or Lys) for HLA-B*2705, a small hydrophobic residue (Val) for HLA-A*6801, an acidic residue (Glu) for HLA-B*4001, and a bulky residue (Tyr) for H-2K(d). The observed anchors are correctly predicted in each case. The agreement for HLA-B40 and H-2K(d) is especially promising, since their structures have not yet been determined experimentally. Because the experimental determination of motifs by elution is difficult and these calculations take only hours on a high-speed workstation, the results open the possibility of routine determination of motifs computationally.
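As a small illustration of the position 2 anchor specificities listed in the abstract, the following sketch stores them as a lookup table and checks a peptide against it. The table contents come from the abstract; the names `P2_ANCHORS` and `matches_p2_anchor` and the example peptide strings are hypothetical, and a motif check of this kind is far simpler than the conformational search and target function the abstract describes.

```python
# Position 2 (pocket B) anchor residues as listed in the abstract,
# written in one-letter amino acid codes.
P2_ANCHORS = {
    "HLA-A*0201": {"L", "M"},  # medium hydrophobic (Leu or Met)
    "HLA-B*2705": {"R", "K"},  # basic (Arg or Lys)
    "HLA-A*6801": {"V"},       # small hydrophobic (Val)
    "HLA-B*4001": {"E"},       # acidic (Glu)
    "H-2K(d)": {"Y"},          # bulky (Tyr)
}


def matches_p2_anchor(peptide: str, allele: str) -> bool:
    """Return True if residue 2 of the peptide is a listed anchor for the allele."""
    if len(peptide) < 2:
        raise ValueError("peptide must have at least two residues")
    return peptide[1].upper() in P2_ANCHORS[allele]


# Hypothetical 9-mer strings used only to exercise the lookup.
print(matches_p2_anchor("ALAAAAAAA", "HLA-A*0201"))
print(matches_p2_anchor("ALAAAAAAA", "HLA-B*2705"))
```

A match at position 2 only says the peptide carries the expected anchor; it says nothing about the rest of the binding groove.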


... Methods of various degrees of accuracy have been developed to model the peptide structure in the groove [12][13][14][15], to evaluate the compatibility between a peptide and an MHC molecule [16][17][18], and even to estimate the free energy of binding of a peptide to an MHC molecule [15,17,19]. A threading protocol has been applied to select a binding peptide from a protein sequence [16] and to distinguish between binding and nonbinding peptides [20], using the known structure of an MHC-peptide complex as a general threading template. ...
... The sidechain conformations of the peptides were predicted based on the same energy functions as in Figure 5 (E_comb and E_nocomb). The influence of the interactions between peptide sidechains on the determination of the sidechain conformations is controversial. Traditionally, most algorithms included the interactions between sidechains; however, several studies suggested that the sidechain-mainchain interactions determine to a large extent the structure of the protein sidechains, whereas the influence of sidechain-sidechain interactions is less important [31,33,35,36,38]. ...
Article
Background: The binding of T-cell antigenic peptides to MHC molecules is a prerequisite for their immunogenicity. The ability to identify binding peptides based on the protein sequence is of great importance to the rational design of peptide vaccines. As the requirements for peptide binding cannot be fully explained by the peptide sequence per se, structural considerations should be taken into account and are expected to improve predictive algorithms. The first step in such an algorithm requires accurate and fast modeling of the peptide structure in the MHC-binding groove. Results: We have used 23 solved peptide-MHC class I complexes as a source of structural information in the development of a modeling algorithm. The peptide backbones and MHC structures were used as the templates for prediction. Sidechain conformations were built based on a rotamer library, using the 'dead end elimination' approach. A simple energy function selects the favorable combination of rotamers for a given sequence. It further selects the correct backbone structure from a limited library. The influence of different parameters on the prediction quality was assessed. With a specific rotamer library that incorporates information from the peptide sidechains in the solved complexes, the algorithm correctly identifies 85% (92%) of all (buried) sidechains and selects the correct backbones. Under cross-validation, 70% (78%) of all (buried) residues are correctly predicted and most of all backbones. The interaction between peptide sidechains has a negligible effect on the prediction quality. Conclusions: The structure of the peptide sidechains follows from the interactions with the MHC and the peptide backbone, as the prediction is hardly influenced by sidechain interactions. The proposed methodology was able to select the correct backbone from a limited set. 
The impairment in performance under cross-validation suggests that, currently, the specific rotamer library is not satisfactorily representative. The predictions might improve with an increase in the data.
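The 'dead end elimination' step mentioned in the abstract above can be sketched as follows. This is a minimal illustration assuming the original single-rotamer elimination criterion (a rotamer is discarded when its best-case energy is still worse than the worst-case energy of an alternative at the same position); the abstract does not state which variant or energy function was used, and the function name, argument layout, and toy energies are all hypothetical.

```python
def dead_end_elimination(self_energy, pair_energy):
    """Prune rotamers with the original dead-end elimination criterion.

    self_energy[i][r]         : energy of rotamer r at position i against the fixed template
    pair_energy[(i, j)][r][s] : interaction of rotamer r at i with rotamer s at j, for i < j
    Returns one set of surviving rotamer indices per position.
    """
    n = len(self_energy)
    alive = [set(range(len(energies))) for energies in self_energy]

    def pair(i, r, j, s):
        # Pair energies are stored once per unordered pair of positions.
        return pair_energy[(i, j)][r][s] if i < j else pair_energy[(j, i)][s][r]

    def bound(i, r, pick):
        # pick=min gives the best case for rotamer r, pick=max the worst case,
        # taken over the rotamers still alive at every other position.
        return self_energy[i][r] + sum(
            pick(pair(i, r, j, s) for s in alive[j]) for j in range(n) if j != i
        )

    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in list(alive[i]):
                # r is a dead end if some alternative t is better even in t's worst case.
                if any(bound(i, r, min) > bound(i, t, max) for t in alive[i] if t != r):
                    alive[i].discard(r)
                    changed = True
    return alive


# Toy example with two positions and two rotamers each (hypothetical energies).
toy_self = [[0.0, 5.0], [0.0, 1.0]]
toy_pair = {(0, 1): [[0.0, 0.5], [0.0, 0.5]]}
print(dead_end_elimination(toy_self, toy_pair))
```

The loop repeats until no further rotamer can be removed, because each elimination tightens the bounds for the remaining positions.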
... P2 and p9 are generally accepted as primary anchors for the A3 superfamily (Garrett et al., 1989; Matsumura et al., 1992; Falk and Rötzschke, 1993). The peptide side chain at p2 falls into pocket B and the C-terminus is buried in pocket F (Saper and Bjorkman, 1991; Vasmatzis et al., 1996). Peptides usually have a positively charged residue, Arg or Lys, at p9 and a variety of hydrophobic residues at p2. ...
... Because of this, larger residues like Leu are deleterious for A*6801 and A*1101 but are preferred for A*0301. The change from Glu 63 to Asn 63 in A*6801 and A*1101 also changes the conformation of the pocket and stops large amino acids from binding (Vasmatzis et al., 1996). A previous study of pocket B revealed that Val 67 was reoriented in A*6801 and affected amino acid selection (Guo et al., 1993). ...
Article
Quantitative structure-activity relationships (QSAR) is a well established ligand-based approach to drug design. It correlates changes in the chemical structure of a series of compounds with changes in their biological activities. Peptides of equal length which bind to a certain protein are an excellent target for QSAR. In the present review, we summarize our experience in QSAR studies of peptides acting as T-cell epitopes. T-cell epitopes are protein fragments presented on the cell surface which afford the immune system the opportunity to detect and respond to both intracellular and extracellular pathogens. Epitope-based vaccines are a new generation of vaccines with lower side effects. The process of antigen presentation, which includes proteasome cleavage, TAP and MHC binding, has been modeled and analyzed by QSAR. Derived QSAR models are highly predictive, allowing us to design and test in vitro MHC superbinders. All models have been implemented in servers for in silico prediction of MHC binders and T-cell epitopes. In practice, better initial in silico prediction leads to improved subsequent experimental research on epitope-based vaccines.
... Several databases, SYFPEITHI [43], JenPep [44] and MHCpep [45], provide peptide sequences associated with MHC alleles together with anchor positions and experimental data on affinity. These observations have extensively been used in peptide/MHC binding prediction [46][47][48] (a list of prediction programs and servers is available at "The IMGT Immunoinformatics page", http://imgt.cines.fr). Nevertheless exceptions have been found [49][50][51] and it has been noted that only 30% of peptides with the expected pattern really bind whereas some peptides without the expected pattern do bind [52]. ...
Article
One of the key elements in the adaptive immune response is the presentation of peptides by the major histocompatibility complex (MHC) to the T cell receptors (TR) at the surface of T cells. The characterization of the TR/peptide/MHC trimolecular complexes (TR/pMHC) is crucial to the fields of immunology, vaccination and immunotherapy. In order to facilitate data comparison and cross-referencing between experiments from different laboratories whatever the receptor, the chain type, the domain, or the species, IMGT, the international ImMunoGeneTics information system (http://imgt.cines.fr), has developed IMGT-ONTOLOGY, the first ontology in immunogenetics and immunoinformatics. In IMGT/3Dstructure-DB, the IMGT three-dimensional structure database, TR/pMHC molecular characterization and pMHC contact analysis are made according to the IMGT Scientific chart rules, based on the IMGT-ONTOLOGY concepts. IMGT/3Dstructure-DB provides the standardized IMGT gene and allele names (CLASSIFICATION), the standardized IMGT labels (DESCRIPTION) and the IMGT unique numbering (NUMEROTATION). As the IMGT structural unit is the domain, amino acids at conserved positions always have the same number in the IMGT databases, tools and Web resources. For the TR alpha and beta chains, the amino acids in contact with the peptide/MHC (pMHC) are defined according to the IMGT unique numbering for V-DOMAIN. The MHC cleft that binds the peptide is formed by two groove domains (G-DOMAIN), each one comprising four antiparallel beta strands and one alpha helix. The IMGT unique numbering for G-DOMAIN applies both to the first two domains (G-ALPHA1 and G-ALPHA2) of the MHC class I alpha chain, and to the first domain (G-ALPHA and G-BETA) of the two MHC class II chains, alpha and beta. Based on the IMGT unique numbering, we defined eleven contact sites for the analysis of the pMHC contacts. 
The TR/pMHC contact description, based on the IMGT numbering, can be queried in the IMGT/StructuralQuery tool, at http://imgt.cines.fr.
... In contrast, the MHC HLA-B*2705 (B27) prefers to bind peptides with a hydrophilic amino acid in one of its pockets (9). Historically, immunopeptidomes have been predicted by modelling the interaction of the MHC binding pocket and peptide, particularly focusing on biochemical attributes such as sidechain conformations, solvation energies, electrostatic interactions, and hydrophobicity (10,11). However, with improved computing power, larger datasets, and the need for interpolation due to the high polymorphism in MHC Class I alleles (12), artificial intelligence based methods have become more popular than such mechanistic means of prediction. ...
Article
Major Histocompatibility Complex (MHC) Class I molecules allow cells to present foreign and endogenous peptides to T-cells so that cells infected by pathogens can be identified and killed. Neural network tools such as NetMHC-4.0 and NetMHCpan-4.1 are used to predict whether peptides will bind to variants of MHC molecules. These tools are trained on data gathered from binding affinity and eluted ligand experiments. However, these tools do not track hydrophobicity, a significant biochemical factor relevant to peptide binding, in their predictions. A previous study had concluded that the peptides predicted to bind to HLA-A*0201 by NetMHC-4.0 were much more hydrophobic than expected. This paper expands that study by also focusing on HLA-B*2705 and HLA-B*0801, which prefer binding hydrophilic and balanced peptides respectively. The correlation of hydrophobicity of 9-mer peptides with their predicted binding strengths to these various HLAs was investigated. Two studies were performed, one using the data that the two neural networks were trained on, and the other using a sample of the human proteome. NetMHC-4.0 was found to have a statistically significant bias towards predicting highly hydrophobic peptides as strong binders to HLA-A*0201 and HLA-B*2705 in both studies. Machine learning metrics were used to identify the causes of this bias: hydrophobic false positives and hydrophilic false negatives. These results suggest that retraining the neural networks with biochemical attributes such as hydrophobicity and better training data could increase the accuracy of their predictions. This would increase their impact in applications such as vaccine design and neoantigen identification.
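To make the notion of 9-mer hydrophobicity concrete, here is a minimal sketch that averages a per-residue hydropathy value over a peptide. It assumes the Kyte-Doolittle scale; the abstract does not say which hydrophobicity measure the study used, and the function name `mean_hydropathy` is hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; higher means more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy over all residues of the peptide."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide.upper()) / len(peptide)


# Hypothetical 9-mers: one all-leucine, one all-lysine.
print(mean_hydropathy("LLLLLLLLL"))
print(mean_hydropathy("KKKKKKKKK"))
```

A score like this could then be correlated with predicted binding strength, which is the kind of comparison the abstract describes.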
... This energy function and its various applications have been described in previous publications. 29,35,36 Here we only sketch its principal components and the modifications made in the current application to protein docking. We use structure-derived atomic contact energies (ACE) to estimate the change in solvation energy of the two molecules on going from the unligated state to the complex, ...
Article
We present a rapidly executable minimal binding energy model for molecular docking and use it to explore the energy landscape in the vicinity of the binding sites of four different enzyme inhibitor complexes. The structures of the complexes are calculated starting with the crystal structures of the free monomers, using DOCK 4.0 to generate a large number of potential configurations, and screening with the binding energy target function. In order to investigate possible correlations between energy and variation from the native structure, we introduce a new measure of similarity, which removes many of the difficulties associated with root mean square deviation. The analysis uncovers energy gradients, or funnels, near the binding site, with decreasing energy as the degree of similarity between the native and docked structures increases. Such energy funnels can increase the number of random collisions that may evolve into productive stable complex, and indicate that short-range interactions in the precomplexes can contribute to the association rate. The finding could provide an explanation for the relatively rapid association rates that are observed even in the absence of long-range electrostatic steering. Proteins 1999; 34:255–267. © 1999 Wiley-Liss, Inc.
... Essentially, all that is required is the experimentally determined structure, or a convincing homology model, of an MHC peptide complex. DeLisi and coworkers were among the first to apply molecular dynamics to peptide-MHC binding and have subsequently developed a series of different methods [111, 112, 113]. Part of this work has concentrated on accurate docking using molecular dynamics and another part on determining free energies from peptide MHC complexes. ...
Article
The postgenomic era, as manifest, inter alia, by proteomics, offers unparalleled opportunities for the efficient discovery of safe, efficacious, and novel subunit vaccines targeting a tranche of modern major diseases. A negative corollary of this opportunity is the risk of becoming overwhelmed by this embarrassment of riches. Informatics techniques, working to address issues of both data management and through prediction to shortcut the experimental process, can be of enormous benefit in leveraging the proteomic revolution. In this disquisition, we evaluate proteomic approaches to the discovery of subunit vaccines, focussing on viral, bacterial, fungal, and parasite systems. We also adumbrate the impact that proteomic analysis of host-pathogen interactions can have. Finally, we review relevant methods to the prediction of immunome, with special emphasis on quantitative methods, and the subcellular localization of proteins within bacteria.
... The first step toward implementing our proposed method is to collect data on those properties that are hypothesized in the literature to play an important role in MHC-peptide binding. Using the extensive literature on X-ray crystallography of MHC molecules, e.g. [29][30][31][32][33][34][35], available for several MHC alleles, together with a substantial literature on structural correlates of MHC binding, e.g. [36][37][38][39], there is an opportunity to assemble a set of properties to serve as our starting set. Given a particular MHC allele, MHC-peptide binding is mainly determined by the peptide's backbone conformation and the interaction of the side chains with the MHC-binding grooves [36]. ...
Article
A key step in the development of an adaptive immune response to pathogens or vaccines is the binding of short peptides to molecules of the Major Histocompatibility Complex (MHC) for presentation to T lymphocytes, which are thereby activated and differentiate into effector and memory cells. The rational design of vaccines consists in part in the identification of appropriate peptides to effect this process. There are several algorithms currently in use for making such predictions, but these are limited to a small number of MHC molecules and have good but imperfect prediction power. We have undertaken an exploration of the power gained by taking advantage of a natural representation of the amino acids in terms of their biophysical properties. We used several well-known statistical classifiers with either a naive encoding of amino acids by name or an encoding by biophysical properties. In all cases, the encoding by biophysical properties leads to substantially lower misclassification error. Representation of amino acids using a few important bio-physio-chemical properties provides a natural basis for representing peptides and greatly improves peptide-MHC class I binding prediction.
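The contrast between the two encodings can be sketched as follows. This is an illustration only: the abstract does not list which biophysical properties were used, so the two properties chosen here (Kyte-Doolittle hydropathy and a formal side-chain charge near neutral pH) and all names in the code are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed property table: (Kyte-Doolittle hydropathy, formal side-chain charge near pH 7).
PROPERTIES = {
    "A": (1.8, 0), "C": (2.5, 0), "D": (-3.5, -1), "E": (-3.5, -1),
    "F": (2.8, 0), "G": (-0.4, 0), "H": (-3.2, 0), "I": (4.5, 0),
    "K": (-3.9, 1), "L": (3.8, 0), "M": (1.9, 0), "N": (-3.5, 0),
    "P": (-1.6, 0), "Q": (-3.5, 0), "R": (-4.5, 1), "S": (-0.8, 0),
    "T": (-0.7, 0), "V": (4.2, 0), "W": (-0.9, 0), "Y": (-1.3, 0),
}


def encode_by_name(peptide: str) -> list:
    """Naive encoding: one indicator per (position, amino acid name) pair."""
    features = []
    for aa in peptide.upper():
        features.extend(1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS)
    return features


def encode_by_properties(peptide: str) -> list:
    """Property encoding: a short numeric vector per position."""
    features = []
    for aa in peptide.upper():
        features.extend(float(value) for value in PROPERTIES[aa])
    return features


# Hypothetical 9-mer used only to compare feature vector lengths.
example = "ACDEFGHIK"
print(len(encode_by_name(example)), len(encode_by_properties(example)))
```

With the name encoding every pair of distinct residues is equally dissimilar, whereas the property encoding places chemically similar residues close together, which is one plausible reading of why the abstract reports lower misclassification error for it.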
Chapter
Major Histocompatibility Complex (MHC) Class I molecules provide a pathway for cells to present endogenous peptides to the immune system, allowing it to distinguish healthy cells from those infected by pathogens. Software tools based on neural networks such as NetMHC and NetMHCpan predict whether peptides will bind to variants of MHC molecules. These tools are trained with experimental data, consisting of the amino acid sequence of peptides and their observed binding strength. Such tools generally do not explicitly consider hydrophobicity, a significant biochemical factor relevant to peptide binding. It was observed that these tools predict that some highly hydrophobic peptides will be strong binders, which biochemical factors suggest is incorrect. This paper investigates the correlation of the hydrophobicity of 9-mer peptides with their predicted binding strength to the MHC variant HLA-A*0201 for these software tools. Two studies were performed, one using the data that the neural networks were trained on and the other using a sample of the human proteome. A significant bias within NetMHC-4.0 towards predicting highly hydrophobic peptides as strong binders was observed in both studies. This suggests that hydrophobicity should be included in the training data of the neural networks. Retraining the neural networks with such biochemical annotations of hydrophobicity could increase the accuracy of their predictions, increasing their impact in applications such as vaccine design and neoantigen identification.
Chapter
Unravelling of the human genome sequence together with simultaneous advances in understanding the molecular aspects of immunology have had a considerable impact on developing new perspectives in molecular medicine. An understanding of the genetic basis of complex, multi-factorial diseases is crucial for identifying predisposing factors, assessing the effect of gene-environment interactions, predicting individual response to specific drugs and discovering new targets for drug development. The Major Histocompatibility Complex (MHC) genes, which code for the antigen-presenting molecules that form the first step towards mounting an immune response, have been implicated in various infectious and autoimmune diseases via indirect or direct involvement with disease aetiology. The extreme genetic variability in the MHC genes, along with the 'en bloc' inheritance of its genes in the form of haplotypes, makes it a good marker system for exploring disease-predisposing genes. Such polymorphism could also provide a genetic basis for the observed inter-population and inter-individual variation in immune responsiveness and resultant disease susceptibility/resistance profiles. In this context, the amino acid residues in the peptide binding sites of the MHC-encoded HLA molecules are crucial for antigen presentation and subsequent immune response. Considerably greater understanding of the MHC-peptide interactions together with developments in functional genomics and proteomics have made it possible to develop strategies for the design of universal molecular vaccines as effective tools for prevention and control of disease.
Article
Introduction Hepatocellular injury is often progressive in patients with hepatitis B e antigen negative chronic hepatitis B (HBeAg −ve CHB). There are scant data on the association of core mutations occurring in patients with HBeAg −ve CHB with severity of liver disease. Materials and methods One hundred and eighteen patients with chronic infection who were HBeAg negative, anti-HBe, and HBV DNA positive were enrolled. Precore and core regions were amplified, sequenced, and analyzed for precore, T helper, cytotoxic T lymphocytes (CTLs), B-cell epitope, and core carboxy-terminal region mutations. Results The majority of patients were infected with HBV genotype D: 96 (81%) [D1: 16, D2: 55 and D5: 25] followed by genotype A1: 15 (13%) and genotype C: 7 (6%) [C1: 5 and unidentified subgenotype C: 2]. Classical (A1896) as well as nonclassical precore region mutations were detected in 30 (25%) and in 9 (7.6%) patients, respectively. Core immune escape, core carboxy-terminal mutations and truncations were detected in 61 (52%), 11 (9.3%), and 14 (12%) patients, respectively. Three core immune escape mutations were significantly higher in patients with coexisting precore stop codon compared with patients without precore stop codon mutation, cT12S (43 vs. 8%, p < 0.001), cS21T (16 vs. 3.4%, p < 0.026), and cE77D (30 vs. 4.5%, p < 0.002). When the frequency of core immune escape mutations was compared between CHB and decompensated patients, the cT12S (27 vs. 10%, p < 0.05), cS21T (16 vs. 1.35%, p < 0.01), cT67P/N (20 vs. 4%, p < 0.001), cE113D (11.37 vs. 1.35%, p < 0.05), and cP130T/Q (7 vs. 0%, p < 0.001) mutations were found to be significantly higher in decompensated patients. Conclusion Core immune-escape mutations cT12S, cS21T, cT67P, cE113D, and cP130T/Q are significantly higher in decompensated liver disease patients and could influence the severity of liver disease in HBeAg −ve CHB patients.
Article
Molecular dynamics (MD) simulation methods have been an effective source of generating biomolecular-level structural information in immunology, as feedback to understand basic science and to design new experiments, leading to the discovery of drugs and vaccines. Different soluble or surface-bound proteins secreted by immune cells exchange signals through the formation of specialized molecular complexes. Molecules involved in the complex formation are complement proteins, antibodies, T cell receptors, MHC encoded HLA molecules, endogenous peptide antigens, and pathogenic peptides. Understanding the molecular details of the complex formation is very important to systematic design of drugs and vaccines. Experimental data provide only macroscopic reasoning and in many cases fail to perceive subtle differences in behaviors of two apparently very similar systems. Formation of stable complexes depends on complementary residues in proteins and peptides and their matching conformations. Here we present a comprehensive review of applications of MD simulations in immunology. In addition, a short section on computational predictive methods to identify T cell epitopes has been included.
Chapter
One of the key elements in the adaptive immune response is the presentation of peptides by the major histocompatibility complex (MHC) to the T-cell receptors (TR) at the surface of T cells. The characterization of the TR/peptide/MHC trimolecular complexes (TR/pMHC) is crucial to the fields of immunology, vaccination, and immunotherapy. In order to facilitate data comparison and cross-referencing between experiments from different laboratories whatever the receptor, the chain type, the domain, or the species, IMGT®, the international ImMunoGeneTics information system® (http://imgt.cines.fr), has developed IMGT-ONTOLOGY, the first ontology in immunogenetics and immunoinformatics. In IMGT/3Dstructure-DB, the IMGT three-dimensional structure database, the molecular characterization of the TR/pMHC is made according to the IMGT Scientific chart rules that are based on the IMGT-ONTOLOGY concepts. IMGT/3Dstructure-DB provides the standardized IMGT gene and allele names (CLASSIFICATION), the standardized IMGT labels (DESCRIPTION), and the IMGT unique numbering (NUMEROTATION). As the IMGT structural unit is the domain, amino acids at conserved positions always have the same number in the IMGT® databases, tools, and Web resources. For the TR α and β chains, the amino acids in contact with the peptide/MHC (pMHC) are defined according to the IMGT unique numbering for V-DOMAIN. The MHC chain cleft that binds the peptide is formed by two groove domains (G-DOMAIN), each one comprising four antiparallel β strands and one α helix. The IMGT unique numbering for G-DOMAIN applies both to the first two domains (G-ALPHA1 and G-ALPHA2) of the MHC class I α chain, and to the first domain (G-ALPHA and G-BETA) of the MHC class II α chain and β chain, respectively. Based on the IMGT unique numbering, we defined 11 contact sites for the analysis of the pMHC contacts. 
The TR/pMHC contact description, based on the IMGT numbering, can be queried in the IMGT/StructuralQuery tool, at http://imgt.cines.fr.
Article
Atomistic Molecular Dynamics provides powerful and flexible tools for the prediction and analysis of molecular and macromolecular systems. Specifically, it provides a means by which we can measure theoretically that which cannot be measured experimentally: the dynamic time-evolution of complex systems comprising atoms and molecules. It is particularly suitable for the simulation and analysis of the otherwise inaccessible details of MHC-peptide interaction and, on a larger scale, the simulation of the immune synapse. Progress has been relatively tentative yet the emergence of truly high-performance computing and the development of coarse-grained simulation now offers us the hope of accurately predicting thermodynamic parameters and of simulating not merely a handful of proteins but larger, longer simulations comprising thousands of protein molecules and the cellular scale structures they form. We exemplify this within the context of immunoinformatics.
Article
The full repertoire of hepatitis B virus (HBV) peptides that bind to the common HLA class I molecules found in areas with a high prevalence of chronic HBV infection has not been determined. This information may be useful for designing immunotherapies for chronic hepatitis B. We identified amino acid residues under positive selection pressure in the HBV core gene by phylogenetic analysis of cloned DNA sequences obtained from HBV DNA extracted from the sera of Tongan subjects with inactive, HBeAg-negative chronic HBV infections. The repertoires of positively selected sites in groups of subjects who were homozygous for either HLA-B*4001 (n = 10) or HLA-B*5602 (n = 7) were compared. We identified 13 amino acid sites under positive selection pressure. A significant association between an HLA class I allele and the presence of nonsynonymous mutations was found at five of these sites. HLA-B*4001 was associated with mutations at E77 (P = 0.05) and E113 (P = 0.002), and HLA-B*5602 was associated with mutations at S21 (P = 0.02). In addition, amino acid mutations at V13 (P = 0.03) and E14 (P = 0.01) were more common in the seven subjects with an HLA-A*02 allele. In summary, we have developed an assay that can identify associations between HLA class I alleles and HBV core gene amino acids that mutate in response to selection pressure. This is consistent with published evidence that CD8+ T cells have a role in suppressing viral replication in inactive, HBeAg-negative chronic HBV infection. This assay may be useful for identifying the clinically significant HBV peptides that bind to common HLA class I molecules.
Article
We estimated effective atomic contact energies (ACE), the desolvation free energies required to transfer atoms from water to a protein's interior, using an adaptation of a method introduced by S. Miyazawa and R. L. Jernigan. The energies were obtained for 18 different atom types, which were resolved on the basis of the way their properties cluster in the 20 common amino acids. In addition to providing information on atoms at the highest resolution compatible with the amount and quality of data currently available, the method itself has several new features, including its reference state, the random crystal structure, which removes compositional bias, and a scaling factor that makes contact energies quantitatively comparable with experimentally measured energies. The high level of resolution, the explicit accounting of the local properties of protein interiors during determination of the energies, and the very high computational efficiency with which they can be assigned during any computation, should make the results presented here widely applicable. First we used ACE to calculate the free energies of transferring side-chains from protein interior into water. A comparison of the results thus obtained with the measured free energies of transferring side-chains from n-octanol to water, indicates that the magnitude of protein to water transfer free energies for hydrophobic side-chains is larger than that of n-octanol to water transfer free energies. The difference is consistent with observations made by D. Shortle and co-workers, who measured differential free energies of protein unfolding for site-specific mutants in which Ala or Gly was substituted for various hydrophobic side-chains. A direct comparison (calculated versus observed free energy differences) with those experiments finds slopes of 1.15 and 1.13 for Gly and Ala substitutions, respectively. Finally we compared calculated and observed binding free energies of nine protease-inhibitor complexes. 
This requires a full free energy function, which is created by adding direct electrostatic interactions and an appropriate entropic component to the solvation free energy term. The calculated free energies are typically within 10% of the observed values. Taken collectively, these results suggest that ACE should provide a reasonably accurate and rapidly evaluatable solvation component of free energy, and should thus make accessible a range of docking, design and protein folding calculations that would otherwise be difficult to perform.
Article
We report a new free energy decomposition that includes structure-derived atomic contact energies for the desolvation component, and show that it applies equally well to the analysis of single-domain protein folding and to the binding of flexible peptides to proteins. Specifically, we selected the 17 single-domain proteins for which the three-dimensional structures and thermodynamic unfolding free energies are available. By calculating all terms except the backbone conformational entropy change and comparing the result to the experimentally measured free energy, we estimated that the mean entropy gain by the backbone chain upon unfolding (ΔS_bb) is 5.3 cal/K per mole of residue, and that the average backbone entropy for glycine is 6.7 cal/K. Both numbers are in close agreement with recent estimates made by entirely different methods, suggesting a promising degree of consistency between data obtained from disparate sources. In addition, a quantitative analysis of the folding free energy indicates that the unfavorable backbone entropy for each of the proteins is balanced predominantly by favorable backbone interactions. Finally, because the binding of flexible peptides to receptors is physically similar to folding, the free energy function should, in principle, be equally applicable to flexible docking. By combining atomic contact energies, electrostatics, and sequence-dependent backbone entropy, we calculated a priori the free energy changes associated with the binding of four different peptides to the HLA-A2.1 MHC molecule and found agreement with experiment to within 10% without parameter adjustment.
Article
In this study, we exploited an elementary 2-dimensional square lattice model of HP polymers to test the premise of extracting contact energies from protein structures. Given a set of prespecified energies for H-H, H-P, and P-P contacts, all possible sequences of various lengths were exhaustively enumerated to find sequences that have unique lowest-energy conformations. The lowest-energy structures (or native structures) of such (native) sequences were used to extract contact energies using the Miyazawa-Jernigan procedure and a reference state defined here. The relative magnitudes of the original energies were restored reasonably well, but the extracted contact energies were independent of the absolute magnitudes of the initial energies. We turned to a more detailed characterization of the energy landscapes of the native sequences in light of a new theoretical framework on protein folding. Foldability of such sequences imposes two limits on the absolute value of the prespecified energies: a lower bound entailed by the minimum requirement for thermodynamic stability and an upper bound associated with the entrapment of the chain in local minima. We found that these two limits confine the prespecified energy values to a rather narrow range which, surprisingly, also contains the extracted energies in all the cases examined. These results indicate that the quasi-chemical approximation can be used to connect quantitatively the occurrence of various residue-residue contacts in an ensemble of native structures with the energies of the contacts. More importantly, they suggest that the extracted contact energies do contain information on structural stability and can be used to estimate actual structural energetics. This study also encourages the use of structure-derived contact energies in threading. The finding that there is a rather narrow range of energies that are optimal for folding a sequence also cautions against the use of arbitrary energy Hamiltonians in minimal folding models.
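The exhaustive enumeration described in this abstract can be sketched for very short chains as follows. This is a minimal illustration, not the authors' code: the function names, the energy table (H-H = -1, other contacts 0), and the four-bead example sequence are all assumptions. Each conformation is a self-avoiding walk on the square lattice, and its energy is the sum of contact energies over lattice neighbors that are not adjacent along the chain.

```python
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def self_avoiding_walks(n_beads):
    """Yield every self-avoiding walk of n_beads sites on the square lattice.

    The first step is fixed along +x, which removes the rotated copies
    of each conformation (mirror images are still generated).
    """
    if n_beads < 2:
        raise ValueError("need at least two beads")
    path = [(0, 0), (1, 0)]
    occupied = set(path)

    def extend():
        if len(path) == n_beads:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            site = (x + dx, y + dy)
            if site in occupied:
                continue
            path.append(site)
            occupied.add(site)
            yield from extend()
            path.pop()
            occupied.remove(site)

    yield from extend()


def contact_energy(sequence, path, energies):
    """Sum contact energies over non-bonded lattice neighbors."""
    index_at = {site: i for i, site in enumerate(path)}
    total = 0.0
    for i, (x, y) in enumerate(path):
        # Looking only right and up counts every neighbor pair exactly once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1:
                pair = "".join(sorted(sequence[i] + sequence[j]))
                total += energies[pair]
    return total


def ground_states(sequence, energies, tol=1e-9):
    """Return the lowest energy and all conformations that reach it."""
    best_energy, best_paths = None, []
    for path in self_avoiding_walks(len(sequence)):
        e = contact_energy(sequence, path, energies)
        if best_energy is None or e < best_energy - tol:
            best_energy, best_paths = e, [path]
        elif abs(e - best_energy) <= tol:
            best_paths.append(path)
    return best_energy, best_paths


# Hypothetical prespecified energies and a toy sequence.
ENERGIES = {"HH": -1.0, "HP": 0.0, "PP": 0.0}
energy, states = ground_states("HPPH", ENERGIES)
print(energy, len(states))
```

For this toy sequence the two lowest-energy conformations are mirror images of one another, so a test for a unique native structure would also need to factor out reflections. The enumeration grows exponentially with chain length, which is why such studies are restricted to short chains.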
Article
T cells circulate in blood and the lymphatic system, continually engaging cells through transient non-specific adhesion. In a normally functioning immune system, these interactions permit sufficient time for T-cell receptors (TCRs) to sample major histocompatibility complex (MHC)-peptide complexes for the presence of foreign antigen, with detection of the latter to some extent being triggered by a longer dwell time of the receptor on the complex. Precisely how this incremental stability, which may be relatively small, leads to activation is unclear, but it appears to be related to diffusion-mediated formation of ternary complex dimers. The formation of stable dimers can explain the high sensitivity of the response, but leaves a number of questions unaddressed, including the following: i) How can high sensitivity be reconciled with high specificity, and how can a short TCR dwell time be reconciled with a comparably short time for ternary complex pair formation? ii) What is the nature of the early signals on the plasma membrane that determine alternative responses, e.g. proliferation at one extreme and apoptosis at the other? iii) What are the cell-surface correlates of biphasic dose response functions, i.e. of responses that peak as a function of dose and then descend? This paper has two loosely coupled goals. One is to review and assess the mathematical and computational methods available for analyzing reactions with and between mobile membrane-bound receptors. These methods range from phenomenological to mechanistic, the latter being based on the details of atomic structure. The other is to apply these methods to address biological questions, such as those raised above, part of whose answer may lie in the kinetic competition between alternative reaction paths.
Article
The peptides that bind class I MHC molecules are restricted in length and often contain key amino acids, anchor residues, at particular positions. The side-chains of peptide anchor residues interact with the polymorphic complementary pockets in MHC peptide-binding grooves and provide the molecular basis for allele-specific recognition of antigenic peptides. We establish correlations between class I MHC specificities for anchor residues and class I MHC sequence markers that occur at the polymorphic positions lining the structural pockets. By analyzing the pocket structures of nine crystallized class I MHC molecules and the modeled structures of another 39 class I MHC molecules, we show that class I pockets can be classified into families that are distinguishable by their common physico-chemical properties and peptide side-chain selectivities. The identification of recurrent structural principles among class I pockets makes it possible to greatly expand the repertoire of known peptide-binding motifs of class I MHC molecules. The evolutionary strategies underlying the emergence of pocket families are briefly discussed.
Article
HLA class I alleles are studied by representing them in a metric space where each dimension corresponds to each one of the amino acid positions. Their similarity in reference to their ability to present peptides to T cells is then evaluated by calculating the correlation matrix between the amino-acid-composition tables (or binding affinity tables) for the sets of peptides presented by each allele. This correlation matrix is considered an empirical similarity matrix between HLA alleles, and is modeled in terms of possible structures defined in the metric space of HLA class I amino acid sequences. These geometric structures are adequate models of the peptide-binding data currently available. The following clusters of HLA class I molecules are identified in reference to their ability to present peptides: Cluster I) HLA-A3/ HLA-A11/ HLA-A31/ HLA-A33/ HLA-A68; Cluster II) HLA-B35/ HLA-B51/ HLA-B53/ HLA-B54/ HLA-B7; and Cluster III) HLA-A29/ HLA-B61/HLA-B44; the last cluster showing possible similarities between alleles from different loci. In modeling these natural clusters, the geometric structures with more predictive power confirm the importance of those positions in the peptide-binding groove, particularly those in the B pocket. In addition, other positions (46, 79, 113, 131, 144, and 177) appeared to bear some relevance in determining which peptides can be presented by which HLA alleles.
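The empirical similarity matrix described in this abstract can be illustrated with a small sketch that correlates flattened amino-acid-composition tables between alleles. The allele labels and the frequency values below are hypothetical placeholders, not data from the study, and Pearson correlation is assumed here as the correlation measure.

```python
import math


def pearson(u, v):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(u) != len(v) or len(u) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)


def similarity_matrix(tables):
    """Correlate every pair of flattened composition tables."""
    names = sorted(tables)
    return {(a, b): pearson(tables[a], tables[b]) for a in names for b in names}


# Hypothetical flattened composition tables (one frequency per
# amino acid/position entry); real tables would have 20 x 9 entries.
tables = {
    "allele_X": [0.1, 0.7, 0.2, 0.0],
    "allele_Y": [0.2, 0.6, 0.2, 0.0],
    "allele_Z": [0.7, 0.1, 0.0, 0.2],
}
sim = similarity_matrix(tables)
print(round(sim[("allele_X", "allele_Y")], 3))
print(round(sim[("allele_X", "allele_Z")], 3))
```

Alleles whose peptide repertoires have similar composition receive a high correlation, and groups of mutually similar alleles would correspond to clusters of the kind the abstract reports.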
Article
We present a rapidly executable minimal binding energy model for molecular docking and use it to explore the energy landscape in the vicinity of the binding sites of four different enzyme inhibitor complexes. The structures of the complexes are calculated starting with the crystal structures of the free monomers, using DOCK 4.0 to generate a large number of potential configurations, and screening with the binding energy target function. In order to investigate possible correlations between energy and variation from the native structure, we introduce a new measure of similarity, which removes many of the difficulties associated with root mean square deviation. The analysis uncovers energy gradients, or funnels, near the binding site, with decreasing energy as the degree of similarity between the native and docked structures increases. Such energy funnels can increase the number of random collisions that may evolve into a productive stable complex, and indicate that short-range interactions in the precomplexes can contribute to the association rate. The finding could provide an explanation for the relatively rapid association rates that are observed even in the absence of long-range electrostatic steering.
Article
MHC molecules are crucially involved in controlling the specific immune system. They are highly polymorphic receptors sampling peptides from the cellular environment and presenting these peptides for scrutiny by immune cells. Recent advances in combinatorial peptide chemistry have improved the description and prediction of peptide-MHC binding. It is envisioned that a complete mapping of human immune reactivities will be possible.
Article
We have developed an interactive docking program called VRDD. It offers various modes of displaying molecules in an immersive, three-dimensional virtual reality (VR) environment. It allows a user to interactively perform molecular docking aided by automatic docking and side chain conformational search. Binding free energies are computed in real time, and the program enables the user to explore only clash-free orientations of a ligand. VRDD also supplies visual and auditory feedback during docking and side chain search, indicating the levels of atomic overlap and interaction energy. The stunning VR graphics immerse users in the scene and can maximally stimulate their design intuition. We have tested VRDD on three cases with increasing complexity: a nine-residue-long peptide bound to a major histocompatibility complex (MHC) molecule, barstar bound to barnase, and an antibody bound to a hemagglutinin. Without prior knowledge, combinations of hand-docking and automatic refinement led to accurate complex structures for the first two complexes. The third case, for which all automatic docking algorithms failed to identify the correct complex in a previous blind test, also failed for VRDD. Our results show that the combination of VR docking and automatic docking can make unique contributions to molecular modeling.
Article
Major histocompatibility complex (MHC) molecules present peptides to T lymphocytes. It is of critical biological and medical importance to elucidate how different MHC alleles bind to a specific set of peptides. In this study we approach the problem from the algebraic and geometric point of view to analyse MHC-peptide-binding data accumulated over the years. The space of sequence properties (having a particular amino acid at a particular position) of MHC-peptide complexes conveys a geometric structure to these sequence properties in the form of a distance measure, which reveals the peptide binding requirements imposed by the polymorphic sequence characteristics of the MHC molecules. Comparison of the results of this study with our current knowledge of MHC-peptide binding constraints leads to robust agreement. This study provides the tools to quantitate these binding constraints giving a more detailed account of them and opening the way to make peptide binding predictions for MHC alleles for which there is no peptide elution data. In addition, the geometric representation of MHC-peptide complex sequence data gives a distance measure between amino acids in reference to their ability to meet MHC binding requirements. The algebraic and geometric view of amino acid sequences provides a theoretical framework to study the function of proteins when there is enough variation in this sequence to account for the variation in their function, as it is the case with MHC molecules in regard to their ability to present peptides.
Article
A comprehensive docking study was performed on 27 distinct protein-protein complexes. For 13 test systems, docking was performed with the unbound X-ray structures of both the receptor and the ligand. For the remaining systems, the unbound X-ray structure of only one molecule was available; therefore the bound structure for the other molecule was used. Our method optimizes desolvation, shape complementarity, and electrostatics using a Fast Fourier Transform algorithm. A global search in the rotational and translational space without any knowledge of the binding sites was performed for all proteins except nine antibodies recognizing antigens. For these antibodies, we docked their well-characterized binding site (the complementarity-determining region, defined without information on the antigen) to the entire surface of the antigen. For 24 systems, we were able to find near-native ligand orientations (interface Cα root mean square deviation less than 2.5 Å from the crystal complex) among the top 2,000 choices. For three systems, our algorithm could identify the correct complex structure unambiguously. For 13 other complexes, we either ranked a near-native structure in the top 20 or obtained 20 or more near-native structures in the top 2,000 or both. The key feature of our algorithm is the use of target functions that are highly tolerant to conformational changes upon binding. If combined with a post-processing method, our algorithm may provide a general solution to the unbound docking problem. Our program, called ZDOCK, is freely available to academic users (http://zlab.bu.edu/~rong/dock/).
Article
Activation of a cytotoxic T cell requires specific binding of antigenic peptides to major histocompatibility complex (MHC) molecules. This paper reports a study of peptides binding to members of the HLA-A3 superfamily using a recently developed 2D-QSAR method, called the additive method. Four alleles with high phenotype frequency were included in the study: A*0301, A*1101, A*3101 and A*6801. The influence of each of the 20 amino acids at each position of the peptide on binding was studied. A refined A3 supertype motif was defined in the study.
Article
JenPep is a relational database containing a compendium of thermodynamic binding data for the interaction of peptides with a range of important immunological molecules: the major histocompatibility complex, TAP transporter, and T cell receptor. The database also includes annotated lists of B cell and T cell epitopes. Version 2.0 of the database is implemented in a bespoke PostgreSQL database system and is fully searchable online via a Perl/HTML interface (URL: http://www.jenner.ac.uk/JenPep).
Article
Major histocompatibility complex class I (MHCI) and class II (MHCII) molecules display peptides on antigen-presenting cell surfaces for subsequent T-cell recognition. Within the human population, allelic variation among the classical MHCI and II gene products is the basis for differential peptide binding, thymic repertoire bias and allograft rejection. While available 3D structural analysis suggests that polymorphisms are found primarily within the peptide-binding site, a broader informatic approach pinpointing functional polymorphisms relevant for immune recognition is currently lacking. To this end, we have now analyzed known human class I (774) and class II (485) alleles at each amino acid position using a variability metric (V). Polymorphisms (V>1) have been identified in residues that contact the peptide and/or T-cell receptor (TCR). Using sequence logos to investigate TCR contact sites on HLA molecules, we have identified conserved MHCI residues distinct from those of conserved MHCII residues. In addition, specific class II (HLA-DP, -DQ, -DR) and class I (HLA-A, -B, -C) contacts for TCR binding are revealed. We discuss these findings in the context of TCR restriction and alloreactivity.
Article
As torrents of new data now emerge from microbial genomics, bioinformatic prediction of immunogenic epitopes remains challenging but vital. In silico methods often produce paradoxically inconsistent results: good prediction rates on certain test sets but not others. The inherent complexity of immune presentation and recognition processes complicates epitope prediction. Two encouraging developments - data driven artificial intelligence sequence-based methods for epitope prediction and molecular modeling methods based on three-dimensional protein structures - offer hope for the future.
Article
Identification of immunodominant peptides is the first step in the rational design of peptide vaccines aimed at T-cell immunity. The advances in sequencing techniques and the accumulation of many protein sequences without the purified protein challenge the development of computer algorithms to identify dominant T-cell epitopes based on sequence data alone. Here, we focus on antigenic peptides recognized by cytotoxic T cells. The selection of T-cell epitopes along a protein sequence is influenced by the specificity of each of the processing stages that precede antigen presentation. The most selective of these processing stages is the binding of the peptides to the major histocompatibility complex molecules, and therefore many of the predictive algorithms focus on this stage. Most of these algorithms are based on known binding peptides whose sequences have been used for the characterization of binding motifs or profiles. Here, we describe a structure-based algorithm that does not rely on previous binding data. It is based on observations from crystal structures that many of the bound peptides adopt similar conformations and placements within the MHC groove. The algorithm uses a structural template of the peptide in the MHC groove upon which peptide candidates are threaded and their fit to the MHC groove is evaluated by statistical pairwise potentials. It can rank all possible peptides along a protein sequence or within a suspected group of peptides, directing the experimental efforts towards the most promising peptides. This approach is especially useful when no previous peptide binding data are available.
Article
The underlying assumption in quantitative structure-activity relationship (QSAR) methodology is that related chemical structures exhibit related biological activities. We review here two QSAR methods in terms of their applicability for human MHC supermotif definition. Supermotifs are motifs that characterise binding to more than one allele. Supermotif definition is the initial in silico step of epitope-based vaccine design. The first QSAR method we review here--the additive method--is based on the assumption that the binding affinity of a peptide depends on contributions from both amino acids and the interactions between them. The second method is a 3D-QSAR method: comparative molecular similarity indices analysis (CoMSIA). Both methods were applied to 771 peptides binding to 9 HLA alleles. Five of the alleles (A*0201, A*0202, A*0203, A*0206 and A*6802) belong to the HLA-A2 superfamily and the other four (A*0301, A*1101, A*3101 and A*6801) to the HLA-A3 superfamily. For each superfamily, supermotifs defined by the two QSAR methods agree closely and are supported by many experimental data.
Article
Interaction free energies are crucial for analyzing binding propensities in proteins. Although the problem of computing binding free energies remains open, approximate estimates have become very useful for filtering potential binding complexes. We report on the implementation of a fast computational estimate of the binding free energy based on a statistically determined desolvation contact potential and Coulomb electrostatics with a distance-dependent dielectric constant, and validated in the Critical Assessment of PRotein Interactions experiment. The application also reports residue contact free energies that rapidly highlight the hotspots of the interaction. Availability: The program was written in Fortran. The executable and full documentation is freely available at http://structure.pitt.edu/software/FastContact
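The electrostatic term mentioned in this abstract, Coulomb's law with a distance-dependent dielectric, can be sketched as below. The abstract does not give the functional form of the dielectric, so the common choice ε(r) = 4r (with r in Å) is assumed here; the function name and the example charges are likewise illustrative and are not taken from the program itself.

```python
# Coulomb constant in kcal * Angstrom / (mol * e^2).
COULOMB_CONSTANT = 332.0636


def coulomb_energy(q1, q2, r, dielectric_scale=4.0):
    """Pair energy in kcal/mol with an assumed dielectric eps(r) = scale * r.

    q1 and q2 are partial charges in units of e; r is the distance in Angstrom.
    Because eps grows linearly with r, the energy falls off as 1 / r**2.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    epsilon = dielectric_scale * r
    return COULOMB_CONSTANT * q1 * q2 / (epsilon * r)


# A hypothetical salt bridge: unit charges of opposite sign, 4 Angstrom apart.
salt_bridge = coulomb_energy(+1.0, -1.0, 4.0)
print(f"{salt_bridge:.2f} kcal/mol")
```

The linear dielectric is a cheap way to mimic solvent screening without an explicit solvent model, which suits a fast filter of candidate complexes.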
Article
Shape complementarity is the most basic ingredient of the scoring functions for protein-protein docking. Most grid-based docking algorithms use the total number of grid points at the binding interface to quantify shape complementarity. We have developed a novel Pairwise Shape Complementarity (PSC) function that is conceptually simple and rapid to compute. The favorable component of PSC is the total number of atom pairs between the receptor and the ligand within a distance cutoff. When applied to a benchmark of 49 test cases, PSC consistently ranks near-native structures higher and produces more near-native structures than the traditional grid-based function, and the improvement was seen across all prediction levels and in all categories of the benchmark. Without any post-processing or biological information about the binding site except the complementarity-determining region of antibodies, PSC predicts the complex structure correctly for 6 test cases, and ranks at least one near-native structure in the top 20 predictions for 18 test cases. Our docking program ZDOCK has been parallelized and the average computing time is 4 minutes using sixteen IBM SP3 processors. Both ZDOCK and the benchmark are freely available to academic users (http://zlab.bu.edu/~rong/dock).
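The favorable component of PSC, as described in this abstract, is a count of receptor-ligand atom pairs within a distance cutoff. A direct (non-grid, non-FFT) sketch of that count is shown below. The cutoff of 6 Å, the function name, and the toy coordinates are assumptions for illustration; the abstract does not state the cutoff, and the published program evaluates the score on a grid rather than with an explicit double loop.

```python
def count_close_pairs(receptor_atoms, ligand_atoms, cutoff=6.0):
    """Count receptor-ligand atom pairs separated by at most `cutoff`.

    Atoms are (x, y, z) tuples in Angstrom. Squared distances are compared
    so that no square root is needed.
    """
    cutoff_sq = cutoff * cutoff
    count = 0
    for rx, ry, rz in receptor_atoms:
        for lx, ly, lz in ligand_atoms:
            dist_sq = (rx - lx) ** 2 + (ry - ly) ** 2 + (rz - lz) ** 2
            if dist_sq <= cutoff_sq:
                count += 1
    return count


# Toy coordinates (hypothetical, not from any structure).
receptor = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (20.0, 0.0, 0.0)]
favorable_pairs = count_close_pairs(receptor, ligand)
print(favorable_pairs)
```

A complete scoring function would also need a penalty for steric overlap, which this sketch omits.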
Article
Peptide binding to class I major histocompatibility complex (MHCI) molecules is a key step in the immune response and the structural details of this interaction are of importance in the design of peptide vaccines. Algorithms based on primary sequence have had success in predicting potential antigenic peptides for MHCI, but such algorithms have limited accuracy and provide no structural information. Here, we present an algorithm, PePSSI (peptide-MHC prediction of structure through solvated interfaces), for the prediction of peptide structure when bound to the MHCI molecule, HLA-A2. The algorithm combines sampling of peptide backbone conformations and flexible movement of MHC side chains and is unique among other prediction algorithms in its incorporation of explicit water molecules at the peptide-MHC interface. In an initial test of the algorithm, PePSSI was used to predict the conformation of eight peptides bound to HLA-A2, for which X-ray data are available. Comparison of the predicted and X-ray conformations of these peptides gave RMSD values between 1.301 and 2.475 Å. Binding conformations of 266 peptides with known binding affinities for HLA-A2 were then predicted using PePSSI. Structural analyses of these peptide-HLA-A2 conformations showed that peptide binding affinity is positively correlated with the number of peptide-MHC contacts and negatively correlated with the number of interfacial water molecules. These results are consistent with the relatively hydrophobic binding nature of the HLA-A2 peptide binding interface. In summary, PePSSI is capable of rapid and accurate prediction of peptide-MHC binding conformations, which may in turn allow estimation of MHCI-peptide binding affinity.
Article
It is increasingly clear that both transient and long-lasting interactions between biomacromolecules and their molecular partners are the most fundamental of all biological mechanisms and lie at the conceptual heart of protein function. In particular, the protein-binding site is the most fascinating and important mechanistic arbiter of protein function. In this review, I examine the nature of protein-binding sites found in both ligand-binding receptors and substrate-binding enzymes. I highlight two important concepts underlying the identification and analysis of binding sites. The first is based on knowledge: when one knows the location of a binding site in one protein, one can "inherit" the site from one protein to another. The second approach involves the a priori prediction of a binding site from a sequence or a structure. The full and complete analysis of binding sites will necessarily involve the full range of informatic techniques ranging from sequence-based bioinformatic analysis through structural bioinformatics to computational chemistry and molecular physics. Integration of both diverse experimental and diverse theoretical approaches is thus a mandatory requirement in the evaluation of binding sites and the binding events that occur within them.
Article
The major histocompatibility complex (MHC) harbours genes whose primary function in regulating immune responsiveness to infection is to present foreign antigens to cytotoxic T lymphocytes (CTLs) and T helper cells. In the case of infection by human immunodeficiency virus (HIV), defining the optimal HIV epitopes that are recognised by CTLs is important for vaccine design, and this in turn will depend on the characteristics of the predominant infecting virus. Moreover, the particular MHC human leukocyte antigens (HLAs) expressed by a geographical population are important since these are likely to determine which HIV epitopes are immunodominant in the anti-HIV immune response. Consideration of these aspects has led to the dawn of a new era of MHC-based vaccine design, in which the CTL epitopes are selected on the basis of the frequency of restricting MHC alleles. This article reviews data on the distribution patterns of molecular subtypes of HLA class I and class II extended haplotypes, discussing distribution among Asian Indians but with reference to global distributions. These data provide a genetic basis for the possible predisposition and fast progression of HIV infections in the Indian population. Since there is selective predominance of different HLA alleles and haplotypes in different populations, a dedicated screening effort is required at the global level to develop MHC-based vaccines against infectious diseases. It is hoped that this might lead to the development of multivalent, poly-epitope, subtype-specific HIV vaccines that are specific for the target geographical location.
Article
We present an analysis that synthesizes information on the sequence, structure, and motifs of antigenic peptides, which previously appeared to be in conflict. Fourier analysis of T-cell antigenic peptides indicates a periodic variation in amino acid polarities of 3-3.6 residues per period, suggesting an amphipathic alpha-helical structure. However, the diffraction patterns of major histocompatibility complex (MHC) molecules indicate that their ligands are in an extended non-alpha-helical conformation. We present two mutually consistent structural explanations for the source of the alpha-helical periodicity, based on an observation that the side chains of MHC-bound peptides generally partition with hydrophobic (hydrophilic) side chains pointing into (out of) the cleft. First, an analysis of haplotype-dependent peptide motifs indicates that the locations of their defining residues tend to force a period 3-4 variation in hydrophobicity along the peptide sequence, in a manner consistent with the spacing of pockets in the MHC. Second, recent crystallographic determination of the structure of a peptide bound to a class II MHC molecule reveals an extended but regularly twisted peptide with a rotation angle of about 130 degrees. We show that similar structures with rotation angles of 100-130 degrees are energetically acceptable and also span the length of the MHC cleft. These results provide a sound physical chemical and structural basis for the existence of a haplotype-independent antigenic motif which can be particularly important in limiting the search time for antigenic peptides.
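The link this abstract draws between the twist of an extended peptide and the periodicity of its sequence is simple arithmetic: a rotation of θ degrees per residue brings a side chain back to the same face after 360/θ residues. The short sketch below makes the conversion explicit; the function name is an assumption, and the angles are the ones quoted in the abstract plus 120 degrees for comparison.

```python
def residues_per_period(rotation_angle_deg):
    """Number of residues needed for one full 360-degree turn."""
    if rotation_angle_deg <= 0:
        raise ValueError("rotation angle must be positive")
    return 360.0 / rotation_angle_deg


for angle in (100.0, 120.0, 130.0):
    print(f"{angle:.0f} degrees per residue -> "
          f"period of {residues_per_period(angle):.2f} residues")
```

Rotation angles of 100 to 120 degrees correspond to periods of 3.6 to 3 residues, the range found by the Fourier analysis, while 130 degrees gives a period of about 2.8 residues.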
Article
The Protein Data Bank is a computer-based archival file for macromolecular structures. The Bank stores in a uniform format atomic co-ordinates and partial bond connectivities, as derived from crystallographic studies. Text included in each data entry gives pertinent information for the structure at hand (e.g. species from which the molecule has been obtained, resolution of diffraction data, literature citations and specifications of secondary structure). In addition to atomic co-ordinates and connectivities, the Protein Data Bank stores structure factors and phases, although these latter data are not placed in any uniform format. Input of data to the Bank and general maintenance functions are carried out at Brookhaven National Laboratory. All data stored in the Bank are available on magnetic tape for public distribution, from Brookhaven (to laboratories in the Americas), Tokyo (Japan), and Cambridge (Europe and worldwide). A master file is maintained at Brookhaven and duplicate copies are stored in Cambridge and Tokyo. In the future, it is hoped to expand the scope of the Protein Data Bank to make available co-ordinates for standard structural types (e.g. alpha-helix, RNA double-stranded helix) and representative computer programs of utility in the study and interpretation of macromolecular structures.
Article
Class I major histocompatibility complex (MHC) molecules interact with self and foreign peptides of diverse amino acid sequences yet exhibit distinct allele-specific selectivity for peptide binding. The structures of the peptide-binding specificity pockets (subsites) in the groove of murine H-2Kb as well as human histocompatibility antigen class I molecules have been analyzed. Deep but highly conserved pockets at each end of the groove bind the amino and carboxyl termini of peptide through extensive hydrogen bonding and, hence, dictate the orientation of peptide binding. A deep polymorphic pocket in the middle of the groove provides the chemical and structural complementarity for one of the peptide's anchor residues, thereby playing a major role in allele-specific peptide binding. Although one or two shallow pockets in the groove may also interact with specific peptide side chains, their role in the selection of peptide is minor. Thus, usage of a limited number of both deep and shallow pockets in multiple combinations appears to allow the binding of a broad range of peptides. This binding occurs with high affinity, primarily because of extensive interactions with the peptide backbone and the conserved hydrogen bonding network at both termini of the peptide. Interactions between the anchor residue (or residues) and the corresponding allele-specific pocket provide sufficient extra binding affinity not only to enhance specificity but also to endure the presentation of the peptide at the cell surface for recognition by T cells.
Article
The x-ray structures of a murine MHC class I molecule (H-2Kb) were determined in complex with two different viral peptides, derived from the vesicular stomatitis virus nucleoprotein (52-59), VSV-8, and the Sendai virus nucleoprotein (324-332), SEV-9. The H-2Kb complexes were refined at 2.3 Å for VSV-8 and 2.5 Å for SEV-9. The structure of H-2Kb exhibits a high degree of similarity with human HLA class I, although the individual domains can have slightly altered dispositions. Both peptides bind in extended conformations with most of their surfaces buried in the H-2Kb binding groove. The nonamer peptide maintains the same amino- and carboxyl-terminal interactions as the octamer primarily by the insertion of a bulge in the center of an otherwise beta conformation. Most of the specific interactions are between side-chain atoms of H-2Kb and main-chain atoms of peptide. This binding scheme accounts in large part for the enormous diversity of peptide sequences that bind with high affinity to class I molecules. Small but significant conformational changes in H-2Kb are associated with peptide binding, and these synergistic movements may be an integral part of the T cell receptor recognition process.
Article
This chapter presents an overview of T cell recognition and function. Complex organisms coevolved with a wide variety of microorganisms—viruses, bacteria, and simple eukaryotes—that reside within the cells or extracellular fluids of the host. In some cases, hosts and parasites mutually benefit from the relationship. In other cases, parasites are purely destructive. With few exceptions, T cells expressing CD4 recognize antigens in conjunction with class II molecules encoded by the major histocompatibility complex (MHC), whereas those expressing CD8 recognize antigen in conjunction with MHC class I molecules. This targeting is achieved by the direct interaction of CD4 and CD8 molecules with class II and I MHC molecules, respectively, bearing the antigenic determinants recognized by the T cell antigen receptor. The chapter outlines the history of the discovery of MHC restriction. The discovery of the MHC class I-restricted nature of TCD8+ recognition of virus-infected cells caused great excitement because it represented the first function to be ascribed to MHC class I gene products since transplantation rejection. Of the many clever models proposed to account for the phenomenon of MHC restriction, most proposed that the viral antigens recognized by TCD8+ were membrane glycoproteins. All of the viruses used in the initial studies were membrane viruses that expressed such proteins as part of their infectious cycles.
Article
We report here the determination and refinement to 1.9 A resolution, by X-ray cryo-crystallography, of the structure of HLA-Aw68. The averaged image from the collection of bound, endogenous peptides clearly shows the atomic structure at the first three and last two amino acids in the peptides but no connected electron density in between. This suggests that bound peptides, held at both ends, take alternative pathways and could be of different lengths by bulging out in the middle. Peptides eluted from HLA-Aw68 include peptides of 9, 10 and 11 amino acids, a direct indication of the length heterogeneity of tightly bound peptides. Peptide sequencing shows relatively conserved 'anchor' residues at position 2 and the carboxy-terminal residue. Conserved binding sites for the peptide N and C termini at the ends of the class I major histocompatibility complex binding groove are apparently dominant in producing the long half-lives of peptide binding and the peptide-dependent stabilization of the class I molecule's structure.
Article
Cell surface complexes of class I MHC molecules and bound peptide antigens serve as specific recognition elements controlling the cytotoxic immune response. The 2.1 A structure of the human class I MHC molecule HLA-B27 provides a detailed composite image of a co-crystallized collection of HLA-B27-bound peptides, indicating that they share a common main-chain structure and length. It also permits direct visualization of the conservation of arginine as an "anchor" side chain at the second peptide position, which is bound in a potentially HLA-B27-specific pocket and may therefore have a role in the association of HLA-B27 with several diseases. Tight peptide binding to class I MHC molecules appears to result from the extensive contacts found at the ends of the cleft between peptide main-chain atoms and conserved MHC side chains, which also involve the peptide in stabilizing the three-dimensional fold of HLA-B27. The concentration of binding interactions at the peptide termini permits extensive sequence (and probably some length) variability in the center of the peptide, where it is exposed for T cell recognition.
Article
A pool of endogenous peptides bound to the human class I MHC molecule, HLA-B27, has been isolated. Microsequence analysis of the pool and of 11 HPLC-purified peptides provides information on the binding specificity of the HLA-B27 molecule. The peptides all seem to be nonamers, seven of which match to protein sequences in a database search. These self peptides derive from abundant cytosolic or nuclear proteins, such as histone, ribosomal proteins, and members of the 90K heat-shock protein family.
Article
The three-dimensional structure of the human histocompatibility antigen HLA-A2 was determined at 3.5 A resolution by a combination of isomorphous replacement and iterative real-space averaging of two crystal forms. The monoclinic crystal form has now been refined by least-squares methods to an R-factor of 0.169 for data from 6 to 2.6 A resolution. A superposition of the structurally similar domains found in the heterodimer, alpha 1 onto alpha 2 and alpha 3 onto beta 2m, as well as the latter pair onto the ancestrally related immunoglobulin constant domain, reveals that differences are mainly in the turn regions. Structural features of the alpha 1 and alpha 2 domains, such as conserved salt-bridges that contribute to stability, specific loops that form contacts with other domains, and the antigen-binding groove formed from two adjacent helical regions on top of an eight-stranded beta-sheet, are analyzed. The interfaces between the domains, especially those between beta 2m and the HLA heavy chain presumably involved in beta 2m exchange and heterodimer assembly, are described in detail. A detailed examination of the binding groove confirms that the solvent-accessible amino acid side-chains that are most polymorphic in mouse and human alleles fill up the central and widest portion of the binding groove, while conserved side-chains are clustered at the narrower ends of the groove. Six pockets or sub-sites in the antigen-binding groove, of diverse shape and composition, appear suited for binding side-chains from antigenic peptides. Three pockets contain predominantly non-polar atoms; but others, especially those at the extreme ends of the groove, have clusters of polar atoms in close proximity to the "extra" electron density in the binding site. A possible role for beta 2m in stabilizing permissible peptide complexes during folding and assembly is presented.
Article
We have determined the structure of a second human histocompatibility glycoprotein, HLA-Aw68, by X-ray crystallography and refined it to a resolution of 2.6 A. Overall, the structure is extremely similar to that of HLA-A2 (refs 1, 2; and M.A.S. et al., manuscript in preparation), although the 11 amino-acid substitutions at polymorphic residues in the antigen-binding cleft alter the detailed shape and electrostatic charge of that site. A prominent negatively charged pocket within the cleft extends underneath the alpha-helix of the alpha 1-domain, providing a potential subsite for recognizing a positively charged side chain or peptide N terminus. Uninterpreted electron density, presumably representing an unknown 'antigen(s)', which seems to be different from that seen in the HLA-A2 structure, occupies the cleft and extends into the negatively charged pocket in HLA-Aw68. The structures of HLA-Aw68 and HLA-A2 demonstrate how polymorphism creates and alters subsites (pockets) positioned to bind peptide side chains, thereby suggesting the structural basis for allelic specificity in foreign antigen binding.
Article
Solution at 2.4 A resolution of the structure of H-2Db with the influenza virus peptide NP366-374 (ASNENMETM) and comparison with the H-2Kb-VSV (RGYVYQGL) structure allow description of the molecular details of MHC class I peptide binding interactions for mice of the H-2b haplotype, revealing a strategy that maximizes the repertoire of peptides that can be presented. The H-2Db cleft has a mouse-specific hydrophobic ridge that causes a compensatory arch in the backbone of the peptide, exposing the arch residues to TCR contact and requiring the peptide to be at least 9 residues. This ridge occurs in about 40% of the known murine D and L allelic molecules, classifying them as a structural subgroup.
Article
Distinct amino acid (aa) residue motifs for peptides binding to HLA-A1 and HLA-B8 were identified by sequence analyses of reversed-phase HPLC fractions containing endogenous peptides derived from these HLA molecules. Fifteen different primary sequences were determined for HLA-A1-associated peptides, 12 of which were nine aa in length. Common features among these peptide sequences were Tyr at the COOH-terminus, a negatively charged aa (usually Glu) at position 3 (P3), and Pro at P4. Twenty-seven different primary sequence assignments were made for HLA-B8-associated peptides, most of which were eight aa in length. Lys, and in a few cases Arg, predominated at P3 and P5; Leu and Pro predominated at P2, and Leu was the preferred COOH-terminal residue. Unlike all other human class I molecules whose peptide-binding properties have been studied, both HLA-A1 and HLA-B8 endogenous peptide sequences have a dominant anchor residue at P3, and these aa are opposite in charge to the aa at position 156 of the peptide-binding site. Synthetic peptides corresponding to endogenous peptide sequences bound to their respective HLA molecules in vitro, indicating that they derive from peptides bound to HLA and not from copurifying contaminants. Eight of the HLA-A1 and HLA-B8 endogenous peptide sequences matched intracellularly expressed proteins found in protein sequence data bases. The HLA-A1 peptide-binding motif was then used to identify potential antigenic peptides from influenza A viral proteins that bound to HLA-A1 in vitro.
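The motif-based search described in this abstract can be illustrated with a minimal sketch. It encodes only the HLA-A1 features named above (a negatively charged residue at P3 and Tyr at the COOH-terminus of a nonamer) and treats Pro at P4 as an optional preference, which is an assumption made for this example. The function name `scan_motif`, the dictionaries `ANCHORS` and `PREFERRED`, and the test sequence are all hypothetical and are not taken from the cited work.

```python
# Simplified HLA-A1 nonamer motif from the abstract: acidic residue at P3,
# Tyr at the C-terminus (P9). Positions are 1-based peptide positions.
ANCHORS = {3: set("DE"), 9: set("Y")}
# Assumption for this sketch: Pro at P4 is counted as a preference, not a requirement.
PREFERRED = {4: set("P")}


def scan_motif(sequence, length=9, anchors=ANCHORS, preferred=PREFERRED):
    """Slide a window over `sequence` and return (start, peptide, n_preferred)
    for every window that satisfies all anchor positions. `start` is 1-based."""
    hits = []
    for start in range(len(sequence) - length + 1):
        peptide = sequence[start:start + length]
        if all(peptide[pos - 1] in allowed for pos, allowed in anchors.items()):
            n_preferred = sum(
                peptide[pos - 1] in allowed for pos, allowed in preferred.items()
            )
            hits.append((start + 1, peptide, n_preferred))
    return hits


if __name__ == "__main__":
    # Invented toy sequence, used only to exercise the scan.
    for start, peptide, n_preferred in scan_motif("GSCTEPSALLYQWA"):
        print(start, peptide, n_preferred)
```

A scan of this kind only shortlists candidates; as the abstract notes, binding was then confirmed in vitro with synthetic peptides.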
Article
HLA B8-restricted cytotoxic T lymphocytes (CTL) specific for influenza A virus were generated and shown to recognize the nucleoprotein (NP). The dominant epitope was mapped using recombinant vaccinia viruses that expressed fragments of the NP and then synthetic peptides based on the NP amino acid sequence. The peptide 380-393 was first identified and further refined; it was shown that the glutamic acid at position 380 was essential for recognition by CTL and that the nonamer 380-388 was the optimum peptide. Six HLA B8-positive influenza immune donors that we have tested respond to this peptide as part of their influenza-specific CTL response. The amino acid sequence of the peptide epitope was compared to six other known virus peptides known to be restricted by HLA B8 and a sequence homology was identified, which predicted nonamer and octamer epitope sequences. Probable anchor residues were identified at peptide residues 3 (lysine/arginine), 5 (lysine/arginine) and 9 (leucine/isoleucine). Support for this pattern came from sequencing peptides eluted from purified HLA B8 molecules, where lysines were predominant at positions 3 and 5. One of the predicted epitope peptides was made and shown to be recognized by specific CTL. These and the two others were shown to compete with NP 380-388 for binding to HLA B8. A model was made of the HLA B8 molecule and negatively charged pockets predicted, which could accommodate the positively charged side chains of the peptide anchor residues.
Article
Complexes of five peptides (from HIV-1, influenza A virus, HTLV-1, and hepatitis B virus proteins) bound to the human class I MHC molecule HLA-A2 have been studied by X-ray crystallography. While the peptide termini and their second and C-terminal anchor side chains are bound similarly in all five cases, the main chain and side chain conformations of each peptide are strikingly different in the center of the binding site, and these differences are accessible to direct TCR recognition. Each of the central peptide residues is seen to point up for some bound peptides, but down or sideways for others. Thus, although fixed at its ends, the structure of an MHC-bound peptide appears to be a highly complex function of its entire sequence, potentially sensitive to even small sequence differences. In contrast, MHC structural variation is relatively limited. These results offer a structural framework for understanding the role of nonanchor peptide side chains in both peptide-MHC binding affinity and TCR recognition.
Article
An influenza virus matrix peptide in which either the charged amino or carboxyl terminus was substituted by methyl groups promoted folding of the class I human histocompatibility antigen (HLA-A2). A peptide modified at both termini did not promote stable folding. The thermal stability of HLA-A2 complexed with peptides that did not have either terminus was approximately 22 degrees C lower than that of the control peptide, whereas matrix peptide in which both anchor positions were substituted by alanines had its stability decreased by only 5.5 degrees C. Thus, the conserved major histocompatibility complex class I residues at both ends of the peptide binding site form energetically important sites for binding the termini of short peptides.
Article
Allele-specific motifs for the human MHC class I molecules, HLA-A1, A3, A11, and A24 were characterized by three complementary approaches. First, amino acid sequence analysis of acid eluted peptide pools from affinity purified class I molecules defined putative motifs 9 or 10 amino acids in length and bearing critical anchor residues at position 2 and at the COOH-terminal. These motifs were distinct, with the exception of the HLA-A3 and A11 motifs that were very similar to each other. Second, the correctness of these putative motifs was verified by analyzing the binding capacity of polyalanine peptide analogues to purified HLA-A molecules. Several alternative anchor residues that were not obvious from the pooled peptide sequencing analysis were identified. Third, sequences of individual peptides eluted from HLA-A1, A11, and A24 were determined by tandem mass spectrometry. Nonamers were the predominant species, although peptides of 8, 10, 11, and 12 amino acids in length were also identified. These peptides displayed anchor residues predicted by the specific motifs at position 2 and at the COOH-terminal, regardless of peptide length. Synthetic versions of the naturally processed peptides were shown to bind to the appropriate HLA-A alleles with IC50 values in the 0.3- to 200-nM range. A rational approach to search Ags with known amino acid sequences for epitopes restricted by some of the most common HLA-A types and of potential clinical importance is now feasible.
Article
Naturally processed peptides, bound to HLA-A2, A68, B40 molecules, were isolated from c-myc-transfected lymphoblastoid B cell lines for sequence analysis. Forty-three sequences of bound peptides could be grouped into three structural motifs. One of the peptide sequences obtained, SLLPAIVEL, was identical to a previously reported peptide bound to HLA-A2.1 and was used for grouping HLA-A2-bound peptides. A second motif, identical to that previously reported for HLA-A68-bound peptides, was also observed. A distinct third motif, consistent with the structure of the HLA-B40 "45 pocket," was observed. The peptides within this group contained glutamate in position 2, usually followed by a hydrophobic residue in positions 3 and 9. Within this motif group of peptides bound to MHC class I molecules, one peptide, HEETPPTTS, was 100% homologous to residues 243-251 of the c-myc protein.
Article
Two major components are required for a successful prediction of the three-dimensional structure of peptides and proteins: an efficient global optimization procedure which is capable of finding an appropriate local minimum for the strongly anisotropic function of hundreds of variables, and a set of free energy components for a protein molecule in solution which are computationally inexpensive enough to be used in the search procedure, yet sufficiently accurate to ensure the uniqueness of the native conformation. We here found an efficient way to make a random step in a Monte Carlo procedure given knowledge of the energy or statistical properties of conformational subspaces (e.g. phi-psi zones or side-chain torsion angles). This biased probability Monte Carlo (BPMC) procedure randomly selects the subspace first, then makes a step to a new random position independent of the previous position, but according to the predefined continuous probability distribution. The random step is followed by a local minimization in torsion angle space. The positions, sizes and preferences for high-probability zones on phi-psi maps and chi-angle maps were calculated for different residue types from the representative set of 191 and 161 protein 3D-structures, respectively. A fast and precise method to evaluate the electrostatic energy of a protein in solution is developed and combined with the BPMC procedure. The method is based on the modified spherical image charge approximation, efficiently projected onto a molecule of arbitrary shape. Comparison with the finite-difference solutions of the Poisson-Boltzmann equation shows high accuracy for our approach. The BPMC procedure is applied successfully to the structure prediction of 12- and 16-residue synthetic peptides and the determination of protein structure from NMR data, with the immunoglobulin binding domain of streptococcal protein G as an example. 
The BPMC runs display much better convergence properties than the non-biased simulations. The advantage of a true global optimization procedure for NMR structure determination is its ability to cope with local minima originating from data errors and ambiguities in NMR data.
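The three-part move described in this abstract (pick a subspace at random, draw a new position from a predefined distribution independent of the previous position, then minimize locally) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the energy function, the zone table `ZONES`, the coordinate-descent minimizer, and the Metropolis acceptance rule are all stand-ins chosen for this example.

```python
import math
import random

# Hypothetical high-probability zones for a torsion angle:
# (center in degrees, width in degrees, relative weight).
ZONES = [(-60.0, 15.0, 0.5), (60.0, 15.0, 0.2), (180.0, 15.0, 0.3)]


def wrap(angle):
    """Map an angle in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def toy_energy(angles):
    """Illustrative energy: threefold torsion term plus neighbour coupling."""
    energy = sum(1.0 + math.cos(math.radians(3.0 * a)) for a in angles)
    energy += sum(0.5 * math.cos(math.radians(a - b))
                  for a, b in zip(angles, angles[1:]))
    return energy


def biased_draw(rng):
    """Draw an angle from the zone mixture, independent of any previous value."""
    center, width, _ = rng.choices(ZONES, weights=[z[2] for z in ZONES], k=1)[0]
    return wrap(rng.gauss(center, width))


def local_minimize(angles, energy, step=5.0, min_step=0.1):
    """Crude coordinate descent standing in for torsion-space minimization."""
    x, best = list(angles), energy(angles)
    while step >= min_step:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] = wrap(trial[i] + delta)
                e = energy(trial)
                if e < best - 1e-12:
                    x, best, improved = trial, e, True
        if not improved:
            step /= 2.0
    return x, best


def bpmc(n_angles, n_steps, temperature=1.0, seed=0):
    """Biased-probability Monte Carlo sketch; returns (best_angles, best_energy)."""
    rng = random.Random(seed)
    current, e_cur = local_minimize(
        [biased_draw(rng) for _ in range(n_angles)], toy_energy)
    best, e_best = list(current), e_cur
    for _ in range(n_steps):
        trial = list(current)
        i = rng.randrange(n_angles)        # 1. choose the subspace at random
        trial[i] = biased_draw(rng)        # 2. jump, ignoring the old position
        trial, e_trial = local_minimize(trial, toy_energy)  # 3. minimize locally
        # Assumed Metropolis criterion on the minimized energies.
        if e_trial <= e_cur or rng.random() < math.exp(-(e_trial - e_cur) / temperature):
            current, e_cur = trial, e_trial
            if e_cur < e_best:
                best, e_best = list(current), e_cur
    return best, e_best
```

Calling `bpmc(4, 50, seed=1)` runs 50 such moves over four toy torsion angles. Because each proposal is drawn from the zone distribution rather than as a small perturbation, a single step can carry a variable between distant minima.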
Article
Coordinates from x-ray structures of HLA-A*6801, HLA-A*0201, and HLA-B*2705 were analyzed to examine the basis for their selectivity in peptide binding. The pocket that binds the side chain of the peptide's second amino acid residue (P2 residue) shows a preference for Val, Leu, and Arg in these three HLA subtypes, respectively. The Arg-specific pocket of HLA-B*2705 differs markedly from those of HLA-A*0201 and HLA-A*6801, as a result of numerous differences in the side chains that form the pocket's surface. The cause of the specificity differences between HLA-A*0201 and HLA-A*6801 is more subtle and depends both on a change in conformation of pocket residue Val-67 and on a sequence difference at residue 9. The Val-67 conformational change appears to be caused by a shift in the position of the alpha 1-domain alpha-helix relative to the beta-sheet in the cleft and may, in fact, depend on amino acid differences remote from the P2 pocket. Analysis of the stereochemistry of the P2 side chain interacting with its binding pocket permits an estimate to be made of its contribution to the free-energy change of peptide binding.
Article
The crystal structures of major histocompatibility complex (MHC) molecules contain a groove occupied by heterogeneous material thought to represent peptides central to immune recognition, although until now relatively little characterization of the peptides has been possible. Exact information about the contents of MHC grooves is now provided. Moreover, each MHC class I allele has its individual rules to which peptides presented in the groove adhere.
Getting the inside out: the transporter associated with antigen processing (TAP) and the presentation of viral antigen (commentary)
  • Hill
Emerging principles for the recognition of peptide antigens by MHC class I molecules
  • Matsumura