Comparative structure of IRGM loci. The structures of the IRGM loci are shown in the context of a generally accepted primate phylogenetic tree. ORF, ERV9, intronic sequence, Alu sequence, and 5′ untranslated region (UTR) are depicted in green, black, white, yellow and blue, respectively. Red denotes pseudogenes, based on the accumulation of deleterious mutations in the ORF. Shaded orange indicates an atypical GTPase, because of mutations leading to the loss of a canonical GTPase binding motif (see Figure S1). The first ATG codon (green arrow) after the Alu repeat sequence is used as the putative start codon for the open reading frame of IRGM. The transcription start site is marked with a green flag. FS indicates a frameshift mutation. TGA and TAA denote the positions of stop codons (arrows). The shaded white, blue and green colors indicate predicted intron, UTR or exon, respectively. The genomic loci are not drawn to scale, with the exception of the full-length sequence of the IRGM ORF. doi:10.1371/journal.pgen.1000403.g001


Source publication
Article
Immunity-related GTPases (IRG) play an important role in defense against intracellular pathogens. One member of this gene family in humans, IRGM, has been recently implicated as a risk factor for Crohn's disease. We analyzed the detailed structure of this gene family among primates and showed that most of the IRG gene cluster was deleted early in p...

Contexts in source publication

Context 1
... Related GTPases (IRG), a family of genes induced by interferons, are one of the strongest resistance systems to intracellular pathogens [1–4]. The IRGM gene has been shown to have a role in the autophagy-targeted destruction of Mycobacterium bovis BCG [5]. Recently, whole genome association studies have shown that specific IRGM haplotypes associate with increased risk for Crohn’s disease [6,7]. The IRG gene family exists as multiple copies (3–21) in most mammalian species but has been reduced to two copies, IRGC and a truncated gene IRGM, in humans [8]. Analysis of mammalian genomes (dog, rat and mouse) has shown that all IRG genes except IRGC are organized in tandem gene clusters mapping to mouse chromosomes 11 and 18 (both syntenic to human chromosome 5) [8]. A comparison of the mouse and human genomes identified 21 genes in mouse but only a single syntenic truncated IRGM copy and IRGC in human [8]. We investigated the copy number and sequence organization of the IRG gene family in multiple nonhuman primate species in order to reconstruct the evolutionary history of this locus. Sequence analysis of two different prosimian species (Microcebus murinus and Lemur catta) confirmed the mammalian archetypical organization with three IRGM paralogs in each species (Figure 1). FISH analysis showed that genes in these species are organized as part of a tandem gene family similar to the organization observed within the mouse genome (Figure 2). In contrast, FISH and sequence analysis of various monkey and great ape species (see Text S1) confirmed a single copy in each of these species. Based on the estimated divergence of strepsirrhine and platyrrhine primate lineages, we conclude the IRGM gene cluster contracted to a single truncated copy 40–50 million years ago within the anthropoid lineage of evolution. We next compared the structure of the IRGM gene in various primate species.
One of the three mouse lemur IRGM genes (IRGM9) preserves a complete ORF based on the mouse model and shows the greatest homology to mouse Irgm1. The ORF encodes a putative 47 kD protein including a classical N-terminal region as well as classical motifs at the end of the carboxyl-terminus associated with most functional murine IRGM loci [8,9] (see Text S1). The second mouse lemur gene, IRGM8, is likely a pseudogene because of a mutation generating a stop codon within the G domain and a frameshift mutation at the C terminus. The third mouse lemur gene, IRGM7, is atypical because it has substitutions in the G domain that disrupt the G1 motif, which interacts with the nucleotide phosphates and is highly conserved in P-loop GTPases [10] (Figure S1 and Text S1). In contrast to mouse and prosimian species, all anthropoid primate lineages show the presence of an AluS repeat immediately after the splicing acceptor that disrupts the ORF of the sole remaining IRGM gene (Figures 1 and S2). Sequencing of the IRGM locus in four New World monkey species revealed the presence of the same two stop codons disrupting the ORF of IRGM in all species. We similarly identified a common frameshift mutation resulting in premature stop codons within the IRGM locus in eleven diverse Old World monkey species, suggesting that IRGM had become pseudogenized before the radiation of these species. Sequencing of the gene in multiple individuals of the same species (five unrelated rhesus macaques and baboons) suggested that the frameshift mutations were fixed (Figure S3 and Text S1). In total, these data argue that the IRGM locus has been nonfunctional since the divergence of the New World and Old World monkey lineages (35–40 million years ago), likely as a result of an Alu repeat integration event that disrupted the ORF of the gene in the anthropoid ancestor (Figure 1).
In contrast to New World and Old World monkeys, sequencing of the IRGM locus in humans and African great ape species reveals a restored, albeit truncated, ORF of ~20 kD. This is consistent with an antiserum raised against peptides from the human IRGM protein that detected a specific signal at ~20 kD by Western blot [11]. In contrast to humans and the African great apes, analysis of the orangutan genome assembly predicted a nonfunctional protein: a C to T transition at nucleotide position 150 with respect to the start codon results in a premature shared stop codon in the ORF (Figure 1 and Text S1). This is the same substitution identified among all Old World monkey genomes, suggesting that ancestral ape species carried a pseudogene. We resequenced the IRGM gene in twelve different orangutans and five different gibbon species. Six of the twelve orangutan individuals and one of the five gibbon species are heterozygous for the C to T substitution. In addition, we noted that all ape IRGM copies also shared a new translation initiation codon with a preferred Kozak sequence immediately after the Alu integration. These data indicate that the gene can exist as either a pseudogene or as a complete 20 kD ORF among these Asian ape lineages as a result of either balancing selection or recurrent mutational events. It will be necessary to examine a larger number of individuals within each species to establish the evolutionary history of this locus among the Asian apes. We noticed an important structural difference in the gene organization for species that regained putative IRGM function when compared to those primates with a pseudogenized version. In the common ancestor of humans and great apes, an ERV9 retroviral element integrated within the 5′ end of the IRGM gene (Figure 1). We reasoned that this structural difference may have conferred expression differences and analyzed the RT-PCR expression profile of IRGM in human, macaque and marmoset.
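The effect of a single C to T transition on an ORF, as described above for orangutan, can be illustrated with a minimal Python sketch. The sequence below is a short toy ORF invented for illustration, not the IRGM sequence, and the helper names `first_stop_codon` and `substitute` are hypothetical. The sketch only shows how one substitution (here CGA to TGA) can introduce a premature in-frame stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(orf):
    """Return the 1-based nucleotide position of the first in-frame
    stop codon, reading from the first base of `orf`, or None."""
    orf = orf.upper()
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i + 1
    return None


def substitute(seq, position, base):
    """Return `seq` with the base at 1-based `position` replaced by `base`."""
    return seq[:position - 1] + base + seq[position:]


# Toy ORF (assumption, NOT real IRGM sequence): ATG GCC CGA GGT CTG TAA
toy_orf = "ATGGCCCGAGGTCTGTAA"

# A C-to-T transition at position 7 turns the codon CGA into the stop codon TGA.
mutant = substitute(toy_orf, 7, "T")

print(first_stop_codon(toy_orf))  # stop codon at the natural end of the ORF
print(first_stop_codon(mutant))   # premature stop created by the transition
```

Applied to a real alignment, the same scan would be run from the annotated start codon of each species' sequence; the position 150 reported in the text would then be checked against the codon containing it.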
Full-length cDNA sequencing and 5′ RACE revealed that the human transcription start signal mapped specifically within the ERV9 repeat element (Figure 1 and Figure S4), resulting in the addition of a novel 5′ UTR exon and an alternative splice form. Although there are five distinct, alternative splice forms of human IRGM, all human copies share this first intron. In humans, we observe constitutive levels of expression of IRGM in all tissues examined, with the highest expression of IRGM in the testis (Figure 3A) [8]. Although IRGM does not encode a functional protein in marmoset and macaque, we find evidence of low levels of expression, albeit in a more restricted manner (Figure 3B). Macaque and marmoset, for example, show no expression in the kidney, with marmoset IRGM expression restricted to testis and lung. Furthermore, we find no evidence in macaque of splicing of the first intron based on the human IRGM gene model (Figure 3C) but rather evidence that the first intron remains as a continuous unspliced transcript. We also failed to confirm 3′ downstream splicing events of macaque IRGM, suggesting that even if stop codons were reverted, a full-length cDNA (comparable to human) could no longer be produced. These data strongly suggest that ERV9 integration significantly reshaped the expression and splicing pattern of IRGM in the common ancestor of humans and apes (Figure 3). We note that structural changes of the human IRGM locus continue to occur within the human lineage, with a 20.1 kb LTR-rich deletion polymorphism, recently identified and sequenced, located 2.82 kb upstream of the ERV9 promoter region [12]. Our preliminary data suggest that this deletion polymorphism alters the relative proportion of alternative splicing of IRGM transcripts (Figures S5 and S6). We tested for natural selection on IRGM coding sequence using maximum likelihood models to estimate evolutionary rates for individual branches in the phylogeny as well as specific codon changes [13,14].
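Comparisons between maximum likelihood models of this kind are commonly made with a likelihood ratio test, in which twice the difference in log-likelihood between a model with a freely estimated rate ratio and a model with the ratio fixed at the neutral value of 1 is compared to a chi-square distribution. The excerpt does not state which test was used, so the following Python sketch is an assumption about the general approach, and the log-likelihood values in it are hypothetical placeholders rather than results from the paper.

```python
import math


def lrt_p_value(lnl_null, lnl_alt):
    """P-value of a likelihood ratio test with one degree of freedom.

    lnl_null: log-likelihood of the constrained model (ratio fixed at 1).
    lnl_alt:  log-likelihood of the model with the ratio estimated freely.
    For one degree of freedom the chi-square survival function reduces to
    erfc(sqrt(x / 2)), so only the standard library is needed.
    """
    statistic = 2.0 * (lnl_alt - lnl_null)
    if statistic <= 0:
        return 1.0
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical log-likelihoods, for illustration only.
lnl_neutral = -2510.0  # ratio fixed at 1 (neutral evolution)
lnl_free = -2500.0     # ratio estimated from the data

p = lrt_p_value(lnl_neutral, lnl_free)
print(f"LRT statistic = {2 * (lnl_free - lnl_neutral):.1f}, P = {p:.2e}")
```

A small P-value in such a test indicates that the branch is evolving at a rate ratio significantly different from 1, which is the sense in which a group of genes can be called distinguishable from neutral evolution.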
Based on the structural differences in IRGM organization, we first divided our species into three groups: Group 1 consists of species that carry a single copy of IRGM with the ERV9 element (human (Hs), chimpanzee (Ptr), gorilla (Ggo) and orangutan (Ppy)); Group 2 consists of species that carry a single copy of IRGM but lack the ERV9 element (macaque (Rh), baboon (Pha) and marmoset (Cja)); while Group 3 was formed by species (dog and mouse lemur) that had multiple copies in a tandem orientation (Figure 4). Phylogenetic branch estimates of dN/dS revealed striking differences between Group 2 (ω = 0.9254) and Group 3 (ω = 0.3866), with an intermediate value for Group 1 (ω = 0.6073). Group 3 was found to be under constrained evolution (ω = 0.3866) and was significantly different (P = 6.09E-12) from a model of neutral evolution. In contrast, the evolution of Group 1 and Group 2 genes was indistinguishable from a model of neutral evolution (see Text S1). There are two possible interpretations ...
Context 2
Phylogenetic branch estimates of dN/dS revealed striking differences between Group 2 (ω = 0.9254) and Group 3 (ω = 0.3866), with an intermediate value for Group 1 (ω = 0.6073). Group 3 was found to be under constrained evolution (ω = 0.3866) and was significantly different (P = 6.09E-12) from a model of neutral evolution. In contrast, the evolution of Group 1 and Group 2 genes was indistinguishable from a model of neutral evolution (see Text S1). There are two possible interpretations of our results. First, the IRGM gene is not functional in humans, having lost its role in intracellular parasite resistance ~40 million years ago when the gene family experienced a contraction from a set of three tandem genes to a sole, unique member whose ORF was disrupted by an AluSc repeat in the anthropoid primate ancestor. In light of the detailed functional studies [11] and the recent associations of this gene with Crohn’s disease [6,7], we feel that this interpretation is unlikely. For example, McCarroll and colleagues recently demonstrated that a 20.1 kb deletion upstream of IRGM associates with Crohn’s disease as strongly as the most strongly associated SNP, and that the deletion haplotype showed a distinct pattern of IRGM gene expression consistent with its putative role in autophagy and Crohn’s disease. An alternate scenario is that the IRGM gene became nonfunctional ~40 million years ago (leading to pseudogene copies in Old World and New World monkeys) but was resurrected ~20 million years ago in the common ancestor of humans and apes (Figure 5). In addition to the genetic and functional data, several lines of evidence support this seemingly unusual scenario. First, we find evidence of a restored ORF in humans and African great apes. Second, this change coincided with the integration of the ERV9 element that serves as the functional promoter for the human IRGM gene. Such retroposon-induced alterations of gene expression are not without precedent in mammalian species [15,16].
Third, we find that ape/human codon evolution is consistent with a model of nucleotide constraint resulting in depressed dN/dS ratios in the hominid branch (Figure 4) when compared to the Old World and New World species. It is intriguing that the orangutan and gibbon populations possess both a functional ...
Context 3
... Related GTPases ( IRG ), a family of genes induced by interferons, are one of the strongest resistance systems to intracellular pathogens [1–4]. The IRGM gene has been shown to have a role in the autophagy-targeted destruction of Mycobacterium bovis BCG [5]. Recently, whole genome association studies have shown that specific IRGM haplotypes associate with increased risk for Crohn’s disease [6,7]. The IRG gene family exists as multiple copies (3–21) in most mammalian species but has been reduced to two copies, IRGC and a truncated gene IRGM , in humans [8]. Analysis of mammalian genomes (dog, rat and mouse) has shown that all IRG genes except IRGC are organized in tandem gene clusters mapping to mouse chromosomes 11 and 18 (both syntenic to human chromosome 5) [8]. A comparison of the mouse and human genomes identified 21 genes in mouse but only a single syntenic truncated IRGM copy and IRGC in human [8]. We investigated the copy number and sequence organization of the IRG gene family in multiple nonhuman primate species in order to reconstruct the evolutionary history of this locus. Sequence analysis of two different prosimian species ( Microcebus murinus and Lemur catta ) confirmed the mammalian archetypical organization with three IRGM paralogs in each species (Figure 1). FISH analysis showed that genes in these species are organized as part of a tandem gene family similar to the organization observed within the mouse genome (Figure 2). In contrast, FISH and sequence analysis of various monkey and great ape species (see Text S1) confirmed a single copy in each of these species. Based on the estimated divergence of strepsirrhine and platyrrhine primate lineages, we conclude the IRGM gene cluster contracted to a single truncated copy 40–50 million years ago within the anthropoid lineage of evolution. We next compared the structure of the IRGM gene in various primate species. 
One of the three mouse lemur IRGM genes ( IRGM9 ) preserves a complete ORF based on the mouse model and shows the greatest homology to mouse Irgm1 . The ORF encodes a putative 47 kD protein including a classical N-terminal region as well as classical motifs at the end of the carboxyl- terminus associated with most functional murine IRGM loci [8,9] (see Text S1). The second mouse lemur gene, IRGM8 , is likely a pseudogene because of a mutation generating a stop codon within the G domain and a frameshift mutation at the C terminus. The third mouse lemur gene, IRGM7 , is atypical because it has substitutions in the G domain that disrupt the G1 motif that interacts with the nucleotide phosphates and is highly conserved in P-loop GTPases [10] (Figure S1 and Text S1). In contrast to mouse and prosimian species, all anthropoid primate lineages show the presence of an AluS repeat immediately after the splicing acceptor that disrupts the ORF of the sole remaining IRGM gene (Figure 1 and S2). Sequencing of the IRGM locus in four New World monkey species revealed the presence of the same two stop codons disrupting the ORF of IRGM in all species. We similarly identified a common frameshift mutation resulting in premature stop codons within the IRGM locus in eleven diverse Old World monkey species suggesting that IRGM had become pseudogenized before the radiation of these species. Sequencing of the gene in multiple individuals in the same species (five unrelated Rhesus macaque and baboon) suggested that the frameshift mutations were fixed (Figure S3 and Text S1). In total, these data argue that the IRGM locus has been nonfunctional since the divergence of the New World and Old World monkey lineages (35–40 million years ago) likely as a result of an Alu repeat integration event that disrupted the ORF of the gene in the anthropoid ancestor (Figure 1). 
In contrast to New World and Old World monkeys, sequencing of the IRGM locus in humans and African great ape species reveals a restored, albeit truncated, ORF of , 20 kD in length. This is consistent with an antiserum raised against peptides from the human IRGM protein that detected a specific signal at , 20 kD by Western blot [11]. In contrast to humans and the African great apes, analysis of the orangutan genome assembly predicted a nonfunctional protein (C to T transition at nucleotide position 150 with respect to the start codon resulting in a premature shared stop codon in the ORF (Figure 1 and Text S1). This is the same substitution identified among all Old World monkey genomes suggesting that ancestral ape species carried a pseudogene. We resequenced the IRGM gene in twelve different orangutans and five different gibbon species. Six of the twelve individuals from orangutan and one of the five species from gibbon are heterozygous for the C to T substitution. In addition, we noted that all ape IRGM copies also shared a new translation initiation codon with a preferred Kozak sequence immediately after the Alu integration. These data indicate that the gene can exist as either a pseudogene or as a complete 20 kD ORF among these Asian ape lineages as a result of either balancing selection or recurrent mutational events. It will be necessary to examine a larger number of individuals within each species to establish the evolutionary history of this locus among the Asian apes. We noticed an important structural difference in the gene organization for species that regained putative IRGM function when compared to those primates with a pseudogenized version. In the common ancestor of humans and great apes, an ERV9 retroviral element integrated within the 5 9 end of the IRGM gene (Figure 1). We reasoned that this structural difference may have conferred expression differences and analyzed the RT-PCR expression profile of IRGM in human, macaque and marmoset. 
Full-length cDNA sequencing and 5 9 RACE revealed that the human transcription start signal mapped specifically within the ERV9 repeat element (Figure 1 and Figure S4) resulting in the addition of a novel 5 9 UTR exon and an alternative splice form. Although there are five distinct, alternative splice forms of human IRGM , all human copies share this first intron. In humans, we observe constitutive levels of expression of IRGM in all tissues examined, with the highest expression of IRGM in the testis (Figure 3A) [8]. Although IRGM does not encode a functional protein in marmoset and macaque, we find evidence of low levels of expression, albeit in a more restricted manner (Figure 3B). Macaque and marmoset, for example, show no expression in the kidney with marmoset IRGM expression restricted to testis and lung. Furthermore, we find no evidence in macaque of splicing of the first intron based on the human IRGM gene model (Figure 3C) but rather evidence that the first intron remains as a continuous unspliced transcript. We also failed to confirm 3 9 downstream splicing events of macaque IRGM suggesting that even if stop codons were reverted, a full-length cDNA (comparable to human) could no longer be produced. These data strongly suggest that ERV9 integration significantly reshaped the expression and splicing pattern of IRGM in the common ancestor of humans and apes (Figure 3). We note that structural changes of the human IRGM locus continues to occur within the human lineage with a 20.1 kb LTR-rich deletion polymorphism, recently identified and sequenced, located 2.82 kb upstream of the ERV9 promoter region [12]. Our preliminary data suggest that this deletion polymorphism alters the relative proportion of alternative splicing of IRGM transcripts (Figures S5 and S6). We tested for natural selection on IRGM coding sequence using maximum likelihood models to estimate evolutionary rates for individual branches in the phylogeny as well as specific codon changes [13,14]. 
Based on the structural differences in IRGM organization, we first divided our species into three groups: Group 1 consists ...
Context 4
... [8]. We investigated the copy number and sequence organization of the IRG gene family in multiple nonhuman primate species in order to reconstruct the evolutionary history of this locus. Sequence analysis of two different prosimian species ( Microcebus murinus and Lemur catta ) confirmed the mammalian archetypical organization with three IRGM paralogs in each species (Figure 1). FISH analysis showed that genes in these species are organized as part of a tandem gene family similar to the organization observed within the mouse genome (Figure 2). In contrast, FISH and sequence analysis of various monkey and great ape species (see Text S1) confirmed a single copy in each of these species. Based on the estimated divergence of strepsirrhine and platyrrhine primate lineages, we conclude the IRGM gene cluster contracted to a single truncated copy 40–50 million years ago within the anthropoid lineage of evolution. We next compared the structure of the IRGM gene in various primate species. One of the three mouse lemur IRGM genes ( IRGM9 ) preserves a complete ORF based on the mouse model and shows the greatest homology to mouse Irgm1 . The ORF encodes a putative 47 kD protein including a classical N-terminal region as well as classical motifs at the end of the carboxyl- terminus associated with most functional murine IRGM loci [8,9] (see Text S1). The second mouse lemur gene, IRGM8 , is likely a pseudogene because of a mutation generating a stop codon within the G domain and a frameshift mutation at the C terminus. The third mouse lemur gene, IRGM7 , is atypical because it has substitutions in the G domain that disrupt the G1 motif that interacts with the nucleotide phosphates and is highly conserved in P-loop GTPases [10] (Figure S1 and Text S1). In contrast to mouse and prosimian species, all anthropoid primate lineages show the presence of an AluS repeat immediately after the splicing acceptor that disrupts the ORF of the sole remaining IRGM gene (Figure 1 and S2). 
Sequencing of the IRGM locus in four New World monkey species revealed the presence of the same two stop codons disrupting the ORF of IRGM in all species. We similarly identified a common frameshift mutation resulting in premature stop codons within the IRGM locus in eleven diverse Old World monkey species suggesting that IRGM had become pseudogenized before the radiation of these species. Sequencing of the gene in multiple individuals in the same species (five unrelated Rhesus macaque and baboon) suggested that the frameshift mutations were fixed (Figure S3 and Text S1). In total, these data argue that the IRGM locus has been nonfunctional since the divergence of the New World and Old World monkey lineages (35–40 million years ago) likely as a result of an Alu repeat integration event that disrupted the ORF of the gene in the anthropoid ancestor (Figure 1). In contrast to New World and Old World monkeys, sequencing of the IRGM locus in humans and African great ape species reveals a restored, albeit truncated, ORF of , 20 kD in length. This is consistent with an antiserum raised against peptides from the human IRGM protein that detected a specific signal at , 20 kD by Western blot [11]. In contrast to humans and the African great apes, analysis of the orangutan genome assembly predicted a nonfunctional protein (C to T transition at nucleotide position 150 with respect to the start codon resulting in a premature shared stop codon in the ORF (Figure 1 and Text S1). This is the same substitution identified among all Old World monkey genomes suggesting that ancestral ape species carried a pseudogene. We resequenced the IRGM gene in twelve different orangutans and five different gibbon species. Six of the twelve individuals from orangutan and one of the five species from gibbon are heterozygous for the C to T substitution. 
In addition, we noted that all ape IRGM copies also shared a new translation initiation codon with a preferred Kozak sequence immediately after the Alu integration. These data indicate that the gene can exist as either a pseudogene or as a complete 20 kD ORF among these Asian ape lineages as a result of either balancing selection or recurrent mutational events. It will be necessary to examine a larger number of individuals within each species to establish the evolutionary history of this locus among the Asian apes. We noticed an important structural difference in the gene organization for species that regained putative IRGM function when compared to those primates with a pseudogenized version. In the common ancestor of humans and great apes, an ERV9 retroviral element integrated within the 5′ end of the IRGM gene (Figure 1). We reasoned that this structural difference may have conferred expression differences and analyzed the RT-PCR expression profile of IRGM in human, macaque and marmoset. Full-length cDNA sequencing and 5′ RACE revealed that the human transcription start signal mapped specifically within the ERV9 repeat element (Figure 1 and Figure S4), resulting in the addition of a novel 5′ UTR exon and an alternative splice form. Although there are five distinct, alternative splice forms of human IRGM, all human copies share this first intron. In humans, we observe constitutive levels of expression of IRGM in all tissues examined, with the highest expression of IRGM in the testis (Figure 3A) [8]. Although IRGM does not encode a functional protein in marmoset and macaque, we find evidence of low levels of expression, albeit in a more restricted manner (Figure 3B). Macaque and marmoset, for example, show no expression in the kidney, with marmoset IRGM expression restricted to testis and lung.
Furthermore, we find no evidence in macaque of splicing of the first intron based on the human IRGM gene model (Figure 3C) but rather evidence that the first intron remains as a continuous unspliced transcript. We also failed to confirm 3′ downstream splicing events of macaque IRGM, suggesting that even if stop codons were reverted, a full-length cDNA (comparable to human) could no longer be produced. These data strongly suggest that ERV9 integration significantly reshaped the expression and splicing pattern of IRGM in the common ancestor of humans and apes (Figure 3). We note that structural changes of the human IRGM locus continue to occur within the human lineage, with a 20.1 kb LTR-rich deletion polymorphism, recently identified and sequenced, located 2.82 kb upstream of the ERV9 promoter region [12]. Our preliminary data suggest that this deletion polymorphism alters the relative proportion of alternative splicing of IRGM transcripts (Figures S5 and S6). We tested for natural selection on IRGM coding sequence using maximum likelihood models to estimate evolutionary rates for individual branches in the phylogeny as well as specific codon changes [13,14]. Based on the structural differences in IRGM organization, we first divided our species into three groups: Group 1 consists of species that carry a single copy of IRGM with the ERV9 element (human (Hs), chimpanzee (Ptr), gorilla (Ggo) and orangutan (Ppy)); Group 2 consists of species that carry a single copy of IRGM but lack the ERV9 element (macaque (Rh), baboon (Pha) and marmoset (Cja)); while Group 3 was formed by species (dog and mouse lemur) that had multiple copies in a tandem orientation (Figure 4). Phylogenetic branch estimates of dN/dS revealed striking differences between Group 2 (ω = 0.9254) and Group 3 (ω = 0.3866), with an intermediate value for Group 1 (ω = 0.6073).
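Branch-level ω estimates of this kind are commonly compared against neutrality (ω = 1) with a likelihood ratio test. The Python sketch below illustrates that calculation under stated assumptions: one degree of freedom (ω fixed at 1 versus ω estimated freely) and made-up log-likelihood values. The function name `lrt_p_value` is hypothetical, and the sketch is not the authors' actual analysis pipeline or numbers.

```python
import math

def lrt_p_value(lnl_null, lnl_alt):
    """Likelihood ratio test with 1 degree of freedom.

    lnl_null: log-likelihood of the constrained model (e.g. omega fixed at 1)
    lnl_alt:  log-likelihood of the model with omega estimated freely
    Returns (test statistic, p-value).
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # For a chi-square variable with 1 degree of freedom,
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods, for illustration only.
stat, p_value = lrt_p_value(lnl_null=-1001.92, lnl_alt=-1000.0)
print(f"2*delta(lnL) = {stat:.2f}, P = {p_value:.4f}")
```

A small P-value rejects the neutral model for that branch or group, while a large one means the data cannot distinguish the estimated ω from 1.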
Group 3 was found to be under constrained evolution (ω = 0.3866) and it was significantly different (P = 6.09E−12) from a model of neutral evolution. In contrast, Group 1 and 2 gene evolutions were indistinguishable from a model of neutral evolution (see Text S1). There are two possible interpretations of our results. First, the IRGM gene is not functional in humans, having lost its role in intracellular parasite resistance ~40 million years ago when the gene family experienced a contraction from a set of three tandem genes to a sole, unique member whose ORF was disrupted by an AluSc repeat in the anthropoid primate ancestor. In light of the detailed functional studies [11] and the recent associations of this gene with Crohn’s disease [6,7], we feel that this interpretation is unlikely. For example, McCarroll and colleagues recently demonstrated that a 20.1 kb deletion upstream of IRGM associates with Crohn’s disease as well as the most strongly associated SNP and that the deletion haplotype showed a distinct pattern of IRGM gene expression consistent with its putative role in autophagy and Crohn’s disease. An alternate scenario is that the IRGM gene became nonfunctional ~40 million years ago (leading to pseudogene copies in Old World and New World monkeys) but was resurrected ~20 million years ago in the common ancestor of humans and apes (Figure 5). In addition to the genetic and functional data, several lines of evidence support this seemingly unusual scenario. First, we find evidence of a restored ORF in humans and African great apes. Second, this change coincided with the integration of the ERV9 element that serves as the functional promoter for the human IRGM gene. Such retroposon-induced alterations of gene expression are not without precedent in mammalian species [15,16].
Third, we find that ape/human codon evolution is consistent with a model of nucleotide constraint resulting in depressed dN/dS ratios in the hominid branch (Figure 4) when compared to the Old World and New World species. It is intriguing that the orangutan and gibbon populations possess both a functional and nonfunctional copy of IRGM, which would open the possibility to long-term balancing selection or recurrent mutations (see Text S1). The inactivating stop codon is shared with all Old World monkey species, suggesting an ancestral event. Moreover, we and others [17] find that the structure of the locus is continuing to ...
Context 5
... Related GTPases (IRG), a family of genes induced by interferons, are one of the strongest resistance systems to intracellular pathogens [1–4]. The IRGM gene has been shown to have a role in the autophagy-targeted destruction of Mycobacterium bovis BCG [5]. Recently, whole genome association studies have shown that specific IRGM haplotypes associate with increased risk for Crohn’s disease [6,7]. The IRG gene family exists as multiple copies (3–21) in most mammalian species but has been reduced to two copies, IRGC and a truncated gene IRGM, in humans [8]. Analysis of mammalian genomes (dog, rat and mouse) has shown that all IRG genes except IRGC are organized in tandem gene clusters mapping to mouse chromosomes 11 and 18 (both syntenic to human chromosome 5) [8]. A comparison of the mouse and human genomes identified 21 genes in mouse but only a single syntenic truncated IRGM copy and IRGC in human [8]. We investigated the copy number and sequence organization of the IRG gene family in multiple nonhuman primate species in order to reconstruct the evolutionary history of this locus. Sequence analysis of two different prosimian species (Microcebus murinus and Lemur catta) confirmed the mammalian archetypical organization with three IRGM paralogs in each species (Figure 1). FISH analysis showed that genes in these species are organized as part of a tandem gene family similar to the organization observed within the mouse genome (Figure 2). In contrast, FISH and sequence analysis of various monkey and great ape species (see Text S1) confirmed a single copy in each of these species. Based on the estimated divergence of strepsirrhine and platyrrhine primate lineages, we conclude the IRGM gene cluster contracted to a single truncated copy 40–50 million years ago within the anthropoid lineage of evolution. We next compared the structure of the IRGM gene in various primate species.

Citations

... Since this devolution occurred without the necessity to fold, we might expect pseudogenized genes to encode poorly folding proteins. Nevertheless, some pseudogenized genes occasionally regain their function (4,5), suggesting they sometimes could yield robust protein folding landscapes despite their multiple sequence alterations. Noncoding genomic regions could then serve as reservoirs of protein diversity. ...
Protein evolution is guided by structural, functional, and dynamical constraints ensuring organismal viability. Pseudogenes are genomic sequences identified in many eukaryotes that lack translational activity due to sequence degradation and thus over time have undergone “devolution.” Previously pseudogenized genes sometimes regain their protein-coding function, suggesting they may still encode robust folding energy landscapes despite multiple mutations. We study both the physical folding landscapes of protein sequences corresponding to human pseudogenes using the Associative Memory, Water Mediated, Structure and Energy Model, and the evolutionary energy landscapes obtained using direct coupling analysis (DCA) on their parent protein families. We found that generally mutations that have occurred in pseudogene sequences have disrupted their native global network of stabilizing residue interactions, making it harder for them to fold if they were translated. In some cases, however, energetic frustration has apparently decreased when the functional constraints were removed. We analyzed this unexpected situation for Cyclophilin A, Profilin-1, and Small Ubiquitin-like Modifier 2 Protein. Our analysis reveals that when such mutations in the pseudogene ultimately stabilize folding, at the same time, they likely alter the pseudogenes’ former biological activity, as estimated by DCA. We localize most of these stabilizing mutations generally to normally frustrated regions required for binding to other partners.
... It should be noted that the relative contribution of the TAG system in IFNG-mediated control of MNV was more pronounced in murine cells than in human cells (e.g., Fig. 3E and 1D, respectively). This may reflect that human cells have an evolutionarily contracted IRG system (8,51). How exactly GBP1 disrupts the MNV RC is a question for future investigation. ...
Replication complexes (RCs), formed by positive-strand (+) RNA viruses through rearrangements of host endomembranes, protect their replicating RNA from host innate immune defenses. We have shown that two evolutionarily conserved defense systems, autophagy and interferon (IFN), target viral RCs and inhibit viral replication collaboratively. However, the mechanism by which autophagy proteins target viral RCs and the role of IFN-inducible GTPases in the disruption of RCs remains poorly understood. Here, using murine norovirus (MNV) as a model (+) RNA virus, we show that the guanylate binding protein 1 (GBP1) is the human GTPase responsible for inhibiting RCs. Furthermore, we found that ATG16L1 mediates the LC3 targeting of MNV RC by binding to WIPI2B and CAPRIN1, and that IFN gamma-mediated control of MNV replication was dependent on CAPRIN1. Collectively, this study identifies a novel mechanism for the autophagy machinery-mediated recognition and inhibition of viral RCs, a hallmark of (+) RNA virus replication. IMPORTANCE Replication complexes provide a microenvironment important for (+) RNA virus replication and shield it from host immune response. Previously we have shown that interferon gamma (IFNG) disrupts the RC of MNV via evolutionarily conserved autophagy proteins and IFN-inducible GTPases. Elucidating the mechanism of targeting of viral RC by ATG16L1 and IFN-induced GTPase will pave the way for development of therapeutics targeting the viral replication complexes. Here, we have identified GBP1 as the sole GBP targeting viral RC and uncovered the novel role of CAPRIN1 in recruiting ATG16L1 to the viral RC.
... autoimmunity or immune responses to infection via its intersection with the autophagy pathway (5 to 9). What factors have driven the differential expansion and deletion of IRG genes between mice and humans, and what the relative fitness costs or benefits of retaining or losing the IRG system are, remain intriguing questions (10). Studies that expand our understanding of how the IRG system functions in mice during infection with diverse pathogens simultaneously offer useful points of comparison for examining the function of IRGM in humans. ...
Mycobacterium tuberculosis (Mtb) is a bacterium that exclusively resides in human hosts and remains a dominant cause of morbidity and mortality among infectious diseases worldwide. Host protection against Mtb infection is dependent on the function of immunity-related GTPase clade M (IRGM) proteins. Polymorphisms in human IRGM associate with altered susceptibility to mycobacterial disease, and human IRGM promotes the delivery of Mtb into degradative autolysosomes. Among the three murine IRGM orthologs, Irgm1 has been singled out as essential for host protection during Mtb infections in cultured macrophages and in vivo. However, whether the paralogous murine Irgm genes, Irgm2 and Irgm3, play roles in host defense against Mtb or exhibit functional relationships with Irgm1 during Mtb infection remains undetermined. Here, we report that Irgm1-/- mice are indeed acutely susceptible to aerosol infection with Mtb, yet the additional deletion of the paralogous Irgm3 gene restores protective immunity to Mtb infections in Irgm1-deficient animals. Mice lacking all three Irgm genes (panIrgm-/-) are characterized by shifted lung cytokine profiles at 5 and 24 weeks postinfection, but control disease until the very late stages of the infection, when panIrgm-/- mice display increased mortality compared to wild-type mice. Collectively, our data demonstrate that disruptions in the balance between Irgm isoforms are more detrimental to the Mtb-infected host than total loss of Irgm-mediated host defense, a concept that also needs to be considered in the context of human Mtb susceptibility linked to IRGM polymorphisms.
... IRGM, which is located on chromosome 5q33.1, is the mammalian ortholog of murine Irgm1 and has a role in immunity, providing protection against intracellular pathogens [12]. Bekpen et al. [13] identified the process of ancestral Irgm1 pseudogenization and subsequent reactivation via insertion of the endogenous retroviral element 9 (ERV9) in human lineages. Some murine Irgm1 autophagy-related functions are performed similarly by IRGM. ...
... In addition, human IRGM functions upstream of autophagic initiation and throughout the autophagic process. Importantly, human IRGM is not IFN-γ-dependent, lacking a γ-activated sequence (GAS) [5]; however, recent evidence suggests it does act as a master negative regulator of cellular interferon responses [13]. There are four IRGM isoforms (IRGMa, IRGMb, IRGMc, and IRGMd) with distinct functions. ...
The human immunity-related GTPase M (IRGM) is a GTP-binding protein that regulates selective autophagy including xenophagy and mitophagy. IRGM impacts autophagy by (1) affecting mitochondrial fusion and fission, (2) promoting the co-assembly of ULK1 and Beclin 1, (3) enhancing Beclin 1 interacting partners (AMBRA1, ATG14L1, and UVRAG), (4) interacting with other key proteins (ATG16L1, p62, NOD2, cGAS, TLR3, and RIG-I), and (5) regulating lysosomal biogenesis. IRGM also negatively regulates NLRP3 inflammasome formation and therefore, maturation of the important pro-inflammatory cytokine IL-1β, impacting inflammation and pyroptosis. Ultimately, this affords protection against chronic inflammatory diseases. Importantly, ten IRGM polymorphisms (rs4859843, rs4859846, rs4958842, rs4958847, rs1000113, rs10051924, rs10065172, rs11747270, rs13361189, and rs72553867) have been associated with human inflammatory disorders including cancer, which suggests that these genetic variants are functionally relevant to the autophagic and inflammatory responses. The current review contextualizes IRGM, its modulation of autophagy, and inflammation, and emphasizes the role of IRGM as a cross point of immunity and tumorigenesis.
... An autophagy-related gene, the human immunity-related GTPase M (IRGM), has been highlighted as an important candidate in the evolutionary history of autophagy and pathogenesis [28]. IRGM has been assigned a critical role in combating various diseases caused by viruses, bacteria, and other parasites in humans [29][30][31][32][33]. Involvement of IRGM in controlling the intracellular pathogen burden of M.tb is also shown by Singh et al. [34]. ...
Single nucleotide polymorphisms (SNPs) in IRGM are reported to affect the Mycobacterium tuberculosis (M.tb) degradation pathway. Here, we aim to screen promoter-region regulatory SNPs of IRGM in a Pakistani population. DNA extracted from the blood of a cohort containing 70 TB patients (TB) and 30 control subjects (Ctrl) was amplified for the IRGM promoter region, followed by DNA sequencing. Group-specific variations were found in allelic frequencies at four loci. Allele T (p-value = 0.03) at −1161T/C, allele G (p-value = 0.027) at −1133G/A, allele C (p-value = 0.029) at −1049C/T, and allele G (p-value = 0.02) at −708G/A showed higher associations with TB susceptibility in our cohort. These SNPs display strong linkage disequilibrium (LD) in the Pakistani population. Haplotype analysis showed a significant association of haplotype −1161T/−1133G/−1049C/−708G (p-value = 0.007) with TB. This 4-SNP haplotype also represents an expression quantitative trait locus (eQTL), associated with Crohn's disease and chronic inflammatory diseases. Our findings show that variants −1161T/C, −1133G/A, −1049C/T, and −708G/A are associated with IRGM expression and susceptibility to TB in a Pakistani population.
... Therefore, ATG16L mutations contribute to exacerbated inflammation. Another gene related to autophagy and the development of IBD is IRGM which codes the M protein (immunity-related GTPase family M protein, IRGM) [32]. This protein is responsible for the maturation of autophagosomes and participates in pathogen elimination from mammalian cells. ...
Article
Autophagy is a conserved process involving the lysosomal digestion of damaged cellular organelles, pathogens, and nonfunctional proteins, which maintains cellular homeostasis. This process provides an alternative source of energy for the cell under stress conditions induced by starvation, chemical agents, or hypoxia. Interest in autophagy has grown in recent years, and its dysfunction is considered one of the factors contributing to the development of a variety of diseases. The likelihood of diseases such as cancer, cardiovascular diseases, and neurodegenerative diseases increases with age, and autophagy is inhibited in the aging organism, which further points to the involvement of impaired autophagy in the pathogenesis of many diseases. Consequently, interventions aimed at modifying autophagy-related pathways are proposed as a potential therapeutic tool. In this review, we present selected diseases whose causes are attributed to disturbed autophagy, indicate potential therapeutic options, and highlight the dichotomous role of autophagy, particularly in tumorigenesis.
... Furthermore, IRGB10 and GBPs, along with IRGM proteins, play roles in the murine IFN response to Chlamydia [6,10,15,28]. The expression of different IFN-inducible GTPases is highly variable between mammalian species, particularly in the case of members of the IRG family [3,29]. In addition, loss and restoration of the expression of different members of this family is dynamic over time in response to evolutionary pressure [29]. ...
... Although IRGs play numerous roles in the defense of rodents against infectious microbes, the expression of IRGs across species is highly variable and dynamic within a species over time [3,29,59]. The variability in the number of IRGs expressed between species and loss and restoration of members of this family due to evolutionary pressure, along with differences in their regulation by IFNs, suggests that this family of proteins may have an evolutionary cost in response to some selective pressure [3,29,59]. ...
Article
The upregulation of interferon (IFN)-inducible GTPases in response to pathogenic insults is vital to host defense against many bacterial, fungal, and viral pathogens. Several IFN-inducible GTPases play key roles in mediating inflammasome activation and providing host protection after bacterial or fungal infections, though their role in inflammasome activation after viral infection is less clear. Among the IFN-inducible GTPases, the expression of immunity-related GTPases (IRGs) varies widely across species for unknown reasons. Here, we report that IRGB10, but not IRGM1, IRGM2, or IRGM3, is required for NLRP3 inflammasome activation in response to influenza A virus (IAV) infection. While IRGB10 functions to release inflammasome ligands in the context of bacterial and fungal infections, we found that IRGB10 facilitates endosomal maturation and nuclear translocation and viral replication of IAV. Corresponding with our in vitro results, we found that Irgb10–/– mice were more resistant to IAV-induced mortality than wild-type mice. The results of our study demonstrate a detrimental role of IRGB10 in host immunity in response to IAV and a novel function of IRGB10, but not IRGMs, in promoting viral translocation into the nucleus.
... In some cases, however, ERVs have also been coopted as the primary promoters and/or transcription start sites (TSS) for immunity genes. One example is an ERV9 element harboring the TSS for the immunity-related GTPase family M protein (IRGM) (105). This large IFN-inducible GTPase eliminates mycobacteria by inducing autophagy (106). ...
... This large IFN-inducible GTPase eliminates mycobacteria by inducing autophagy (106). In this case, the insertion of an Alu retrotransposon initially disrupted the open reading frame of IRGM in the common anthropoid ancestor, but the gene was subsequently resurrected by the insertion of the ERV9 transcription start site (TSS) (105). Interestingly, expression of two additional members of the family of large IFN-inducible GTPases is also regulated by ERV9 promoters. ...
Article
Long disregarded as junk DNA or genomic dark matter, endogenous retroviruses (ERVs) have turned out to be important components of the antiviral immune response. These remnants of once-infectious retroviruses not only regulate cellular immune activation, but may even directly target invading viral pathogens. In this review, we summarize mechanisms by which retroviral fossils protect us from viral infections. One focus will be on recent advances in the role of ERVs as regulators of antiviral gene expression.
... IRGM and its murine orthologue Irgm1 (refs. 8,9) bridge the immune system (ref. 10) and the core ATG machinery to control autophagy in mammalian cells (refs. 4,11-14). ...
Article
Macroautophagy/autophagy delivers cytoplasmic cargo to lysosomes for degradation. In yeast, the single Atg8 protein plays a role in the formation of autophagosomes whereas in mammalian cells there are five to seven paralogs, referred to as mammalian Atg8s (mAtg8s: GABARAP, GABARAPL1, GABARAPL2, LC3A, LC3B, LC3B2 and LC3C) with incompletely defined functions. Here we show that a subset of mAtg8s directly control lysosomal biogenesis. This occurs at the level of TFEB, the principal regulator of the lysosomal transcriptional program. mAtg8s promote TFEB’s nuclear translocation in response to stimuli such as starvation. GABARAP interacts directly with TFEB, whereas RNA-Seq analyses reveal that knockout of six genes encoding mAtg8s, or a triple knockout of the genes encoding all GABARAPs, diminishes the TFEB transcriptional program. We furthermore show that GABARAPs in cooperation with other proteins, IRGM, a factor implicated in tuberculosis and Crohn disease, and STX17, are required during starvation for optimal inhibition of MTOR, an upstream kinase of TFEB, and activation of the PPP3/calcineurin phosphatase that dephosphorylates TFEB, thus promoting its nuclear translocation. In conclusion, mAtg8s, IRGM and STX17 control lysosomal biogenesis by their combined or individual effects on MTOR, TFEB, and PPP3/calcineurin, independently of their roles in the formation of autophagosomal membranes. Abbreviations: AMPK: AMP-activated protein kinase; IRGM: immunity related GTPase M; mAtg8s: mammalian Atg8 proteins; MTOR: mechanistic target of rapamycin kinase; PPP3CB: protein phosphatase 3 catalytic subunit beta; RRAGA: Ras related GTP binding A; STX17: syntaxin 17; ULK1: unc-51 like autophagy activating kinase 1
... IRGM and its murine orthologue Irgm1 (refs. 8,9) bridge the immune system (ref. 10) and the core ATG machinery to control autophagy in mammalian cells (refs. 4,11-14). ...
Article
Autophagy is a homeostatic process with multiple functions in mammalian cells. Here, we show that mammalian Atg8 proteins (mAtg8s) and the autophagy regulator IRGM control TFEB, a transcriptional activator of the lysosomal system. IRGM directly interacted with TFEB and promoted the nuclear translocation of TFEB. An mAtg8 partner of IRGM, GABARAP, interacted with TFEB. Deletion of all mAtg8s or GABARAPs affected the global transcriptional response to starvation and downregulated subsets of TFEB targets. IRGM and GABARAPs countered the action of mTOR as a negative regulator of TFEB. This was suppressed by constitutively active RagB, an activator of mTOR. Infection of macrophages with the membrane-permeabilizing microbe Mycobacterium tuberculosis or infection of target cells by HIV elicited TFEB activation in an IRGM-dependent manner. Thus, IRGM and its interactors mAtg8s close a loop between the autophagosomal pathway and the control of lysosomal biogenesis by TFEB, thus ensuring coordinated activation of the two systems that eventually merge during autophagy.