Figure 3 - uploaded by Ilya Belalov
Plots of ENC vs. sequence content in 29 RNA viruses indicated in Table 1 (black dots) and in 1000 simulated sequences with random third-position nucleotide content (gray circles). Plot of ENC against GC content (A), variance of third-position GC content (B), variance of third-position nucleotide frequencies (C) and variance of dinucleotide frequencies at codon positions 2-3 (D). Solid line in panel (A) indicates theoretical prediction of ENC as a function of GC content bias [31]. doi:10.1371/journal.pone.0056642.g003
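
The caption describes the solid line in the first panel as a theoretical prediction of ENC from GC content. A widely used curve of this kind is Wright's approximation, N_c = 2 + s + 29 / (s^2 + (1 - s)^2), where s is the GC fraction at synonymous third codon positions. Whether reference [31] uses exactly this expression is an assumption here, and the function name `expected_enc` is hypothetical. A minimal sketch:

```python
def expected_enc(gc3: float) -> float:
    """Expected ENC when codon choice depends only on third-position GC content.

    Assumes Wright's approximation; gc3 is the GC fraction (0..1) at
    synonymous third codon positions.
    """
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a fraction between 0 and 1")
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)


if __name__ == "__main__":
    # Tabulate the curve across the full GC3 range.
    for gc3 in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"GC3 = {gc3:.2f}  expected ENC = {expected_enc(gc3):.2f}")
```

Under this approximation the curve peaks near 61 at balanced composition (GC3 = 0.5) and falls toward roughly 31 to 32 at the extremes, so points lying well below the curve indicate codon bias that GC content alone does not explain.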

Source publication
Article
Choice of synonymous codons depends on nucleotide/dinucleotide composition of the genome (termed mutational pressure) and relative abundance of tRNAs in a cell (translational pressure). Mutational pressure is commonly simplified to genomic GC content; however mononucleotide and dinucleotide frequencies in different genomes or mRNAs may vary signifi...

Citations

... Starting at the 3' end, the number of mismatches increased in consistent waves of three nucleotides for all primer schemes except SARS-CoV-2. We hypothesized that this might be due to synonymous codon usage caused by variations in the second and third codon position [52,53]. Manual inspection of the primers' locations in the MSAs indeed confirmed that our 3' penalty system preferentially selected primers if their 3' ends are located at the first and not at the second or third position of a codon. ...
Preprint
Time- and cost-saving surveillance of viral pathogens is achieved by tiled sequencing, in which a viral genome is amplified in overlapping PCR amplicons, and by qPCR. However, designing pan-specific primers for viral pathogens that have high genomic variability represents a major challenge. Here, we present a bioinformatics command-line tool, called varVAMP (variable virus amplicons). It relies on multiple sequence alignments of highly variable virus sequences and enables automatic pan-specific primer design for qPCR or tiled amplicon whole genome sequencing. The varVAMP software guarantees pan-specificity by two means: it designs primers in regions with minimal variability and introduces degenerate nucleotides into primer sequences to compensate for common sequence variations. We demonstrate varVAMP’s utility by designing and evaluating novel pan-specific primer schemes suitable for sequencing the genomes of SARS-CoV-2, Hepatitis E virus, rat Hepatitis E virus, Hepatitis A virus, Borna-disease-virus-1, and Poliovirus. Moreover, we established highly sensitive and specific Poliovirus qPCR assays that could potentially simplify current Poliovirus surveillance. Importantly, wet-lab and bioinformatic techniques established for SARS-CoV-2 tiled amplicon sequencing were readily transferable to these new primer schemes and will allow sequencing laboratories to extend their established methodology to other human pathogens.
... It is a generally agreed parameter for quantifying the degree of resemblance between codon usage of a gene and a reference dataset to investigate synonymous codon usage. Also, an estimate of gene expressivity, an examination of the factors governing synonymous codon usage, and an investigation of horizontal gene transfer have been performed with the assistance of CAI values [53,58-62]. Therefore, we employed CAI to investigate the codon usage preferences of NeVs in relation to seven host species, viz. ...
Article
Neboviruses (NeVs) from the Caliciviridae family have been linked to enteric diseases in bovines and have been detected worldwide. As viruses rely entirely on the cellular machinery of the host for replication, their ability to thrive in a specific host is greatly impacted by the specific codon usage preferences. Here, we systematically analyzed the codon usage bias in NeVs to explore the genetic and evolutionary patterns. Relative Synonymous Codon Usage and Effective Number of Codons analyses indicated a marginally lower codon usage bias in NeVs, predominantly influenced by the nucleotide compositional constraints. Nonetheless, NeVs showed a higher codon usage bias for codons containing G/C at the third codon position. The neutrality plot analysis revealed natural selection as the primary factor that shaped the codon usage bias in both the VP1 (82%) and VP2 (57%) genes of NeVs. Furthermore, the NeVs showed a highly comparable codon usage pattern to bovines, as reflected through Codon Adaptation Index and Relative Codon Deoptimization Index analyses. Notably, yak NeVs showed considerably different nucleotide compositional constraints and mutational pressure compared to bovine NeVs, which appear to be predominantly host-driven. This study sheds light on the genetic mechanism driving NeVs’ adaptability, evolution, and fitness to their host species.
... CUB in virus genomes is being studied from different points of view, such as adaptation to their hosts (Tian et al., 2018), the extent of respiratory virulence (Chen and Yang, 2022), and the compositional difference between conserved and variable amino acid residues (Klitting et al., 2016). Genome composition in RNA viruses reflects codon usage and therefore CUB is considered to be mainly under mutational pressure (Jenkins and Holmes, 2003;Belalov and Lukashev, 2013;Yao et al., 2020). Apart from the genome composition, limited evidence of host-specific translational selection pressure also has been reported in human RNA viruses (Jitobaom et al., 2020), in the MERS-CoV (Hussain et al., 2020) and influenza virus H1N1 (Wong et al., 2010). ...
... In this study, the ENC was calculated by CodonW v1.4.2 package and was employed to accurately assess the extent of CUB in the PNRSV CP sequences. It is represented by a numerical value that can vary between 20 and 61 [52]. The numerical value of 20 serves as an indicator of a significant level of bias, that is, the gene only uses one of each set of synonymous codons, and 61 indicates that each codon is used. ...
Article
Prunus necrotic ringspot virus (PNRSV) is a significant virus of ornamental plants and fruit trees. It is essential to study this virus due to its impact on the horticultural industry. Several studies on PNRSV diversity and phytosanitary detection technology have been reported, but the codon usage bias (CUB), dinucleotide preference and codon pair bias (CPB) of PNRSV remain poorly characterized. We performed comprehensive analyses on a dataset consisting of 359 coat protein (CP) gene sequences in PNRSV to examine the characteristics of CUB, dinucleotide composition, and CPB. The CUB analysis of PNRSV CP sequences showed that it was affected by both natural selection and mutation, with natural selection playing the more significant role as the driving force. The dinucleotide composition analysis showed an over-representation of the CpC/GpA dinucleotides and an under-representation of the UpA/GpC dinucleotides. The dinucleotide composition of the PNRSV CP gene showed a weak association with the viral lineages and hosts, but a strong association with viral codon positions. Furthermore, the CPB of the PNRSV CP gene is low and is related to dinucleotide preference and codon usage patterns. This research provides a reference for future research on PNRSV genetic diversity and gene evolution mechanisms.
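
The ENC scale quoted for this study (20 when only one codon is used per amino acid, 61 when all sense codons are used) can be illustrated with a minimal sketch of Wright's homozygosity-based estimator. This is not CodonW's implementation: the function name `effective_number_of_codons`, the use of the standard genetic code, and the choice to raise an error when a degeneracy class is missing are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Number of amino acids in each synonymous degeneracy class.
CLASS_SIZES = {2: 9, 3: 1, 4: 5, 6: 3}


def effective_number_of_codons(seq: str) -> float:
    """Sketch of Wright's ENC for an in-frame coding sequence."""
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    # Count sense codons only; stops and ambiguous codons are ignored.
    counts = Counter(c for c in codons if GENETIC_CODE.get(c, "*") != "*")

    # Group codon counts by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in GENETIC_CODE.items():
        if aa != "*":
            families[aa].append(counts[codon])

    # Homozygosity F per amino acid, collected by degeneracy class.
    homozygosity = defaultdict(list)
    for codon_counts in families.values():
        k, n = len(codon_counts), sum(codon_counts)
        if k == 1 or n < 2:
            continue  # Met/Trp, or too few observations
        f = (n * sum((c / n) ** 2 for c in codon_counts) - 1) / (n - 1)
        if f > 0:
            homozygosity[k].append(f)

    nc = 2.0  # Met and Trp each contribute exactly one codon
    for k, size in CLASS_SIZES.items():
        values = homozygosity[k]
        if not values:
            raise ValueError(f"no usable amino acids with {k} synonymous codons")
        nc += size / (sum(values) / len(values))
    return min(nc, 61.0)  # cap at the number of sense codons
```

A sequence that uses a single codon for every amino acid gives 20, and a sequence that uses every sense codon equally often is capped at 61, matching the bounds described in the citing text.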
... Codon usage bias is a common phenomenon in organisms (Feng et al., 2022;Rahman et al., 2021), influenced by natural selection, mutational pressure, and gene expression levels (Belalov and Lukashev, 2013;Nyayanit et al., 2021;Peng et al., 2022). Host factors also affect the pattern of viral codon usage, thereby impacting host adaptation, evolution, and immune evasion (He et al., 2019b;Li et al., 2018). ...
Article
Over the past 20 years, the Seneca Valley virus (SVV) has emerged in various countries and regions around the world. Infected pigs display symptoms similar to foot-and-mouth disease and other vesicular diseases, causing severe economic losses to affected countries. In recent years, the number of SVV infections has been increasing in Brazil, China, and the United States. In this study, we comprehensively analyzed SVV genomic sequence data from the perspectives of evolutionary dynamics, phylogeography, and codon usage bias. We aimed to gain further insights into SVV's genetic diversity, spatiotemporal distribution patterns, and evolutionary adaptations. Phylogenetic analysis revealed that SVV has evolved into eight distinct lineages. Based on the results of phylogeographic analysis, it is speculated that the United States might have been the source of SVV, from where it subsequently spread to different countries and regions. Moreover, our analysis of positive selection sites in SVV capsid proteins suggests their potential importance in the process of receptor recognition. Finally, codon preference analysis indicates that natural selection has been a primary evolutionary driver influencing SVV codon usage bias. In conclusion, our in-depth investigation into SVV's origin, dissemination, evolution, and adaptation emphasizes the significance of SVV surveillance and control measures.
... constraint, it was evident that natural selection represents the most crucial determinant shaping the CUPs of virus genomes of both GoAstV-1 and GoAstV-2 based on our ENC-GC3s plot and neutrality plot analysis (Figure 2). Natural selection involves a selection acting on factors including secondary RNA structure, regulatory, structural RNA elements, viral RNA packaging compatibility, immunological escape, and most importantly, translational fitness (Jian et al., 1999; Jenkins and Holmes, 2003; Simmonds et al., 2004; Greenbaum et al., 2008; Belalov and Lukashev, 2013; Cooper et al., 2015). Specifically, selection acting on codons that match the most abundant tRNA was supposed to contribute to translational efficiency and accuracy of virus genomes (dos Reis et al., 2004; Stoletzki and Eyre-Walker, 2007; Ran and Higgs, 2012; Chen et al., 2020). ...
Article
Goose astroviruses (GoAstVs) are causative agents that account for fatal infection of goslings characterized by visceral urate deposition, resulting in severe economic losses in major goose-producing regions in China since 2017. In this study, we sought to unravel the intrinsic properties associated with adaptation and evolution in the host environment of GoAstVs. Consistent results from phylogenetic analysis and correspondence analysis performed on the codon usage patterns (CUPs) reveal 2 clusters of GoAstVs, namely, GoAstV-1 and GoAstV-2. However, multiple similar compositional characteristics were found, despite the high divergence between GoAstV-1 and GoAstV-2. Studies on the base composition of GoAstVs reveal an A/U bias, indicating a compositional constraint, while natural selection prevailed in determining the CUPs in the virus genome based on our neutrality plot analysis, reflecting high adaptive pressure to fit the host environment. Codon adaptation index (CAI) analysis revealed a higher degree of fitness to the CUPs of the corresponding host for GoAstVs than avian influenza virus and betacoronaviruses, which may be a favorable factor contributing to the high pathogenicity and wide distribution of GoAstVs in goslings. In addition, GoAstVs were less adapted to ducks and chickens, with significantly lower CAI values than to geese, which may be a reason for the different prevalence of GoAstVs among these species. Extensive investigations on dinucleotide distribution revealed a significant suppression of the CpG and UpA motifs in the virus genome, which may facilitate adaptation to the host's innate immune system by evading surveillance. In addition, our study reported the trends of increasing fitness to the host's microenvironment for GoAstVs through increasing adaptation to host CUPs and ongoing reduction of CpG motifs in the virus genome. 
The present analysis deepens our understanding of the basic biology, pathogenesis, adaptation and evolutionary pattern of GoAstVs, and contributes to the development of novel antiviral strategies.
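
The CpG and UpA suppression described in this abstract is usually quantified as an observed/expected odds ratio for each dinucleotide. The sketch below assumes the common definition rho_xy = f_xy / (f_x * f_y), where f denotes a frequency in the sequence; the function name `dinucleotide_odds_ratios` is hypothetical, and the cited study may have used a different variant of the statistic.

```python
from collections import Counter


def dinucleotide_odds_ratios(seq: str) -> dict:
    """Observed/expected ratio for every dinucleotide present in seq.

    Ratios well below 1 suggest under-representation and ratios well
    above 1 suggest over-representation, relative to base composition.
    """
    seq = seq.upper().replace("T", "U")  # report in RNA alphabet
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two nucleotides")

    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono, n_di = len(seq), len(seq) - 1

    ratios = {}
    for pair, count in di.items():
        expected = (mono[pair[0]] / n_mono) * (mono[pair[1]] / n_mono)
        ratios[pair] = (count / n_di) / expected
    return ratios


if __name__ == "__main__":
    # Toy sequence for illustration only, not real viral data.
    for pair, rho in sorted(dinucleotide_odds_ratios("AUGGCUAACGAUUACGGA").items()):
        print(pair, round(rho, 2))
```

Dinucleotides that never occur are simply absent from the returned dictionary, so a caller checking for suppression should treat a missing key as a ratio of zero.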
... CBI values range from 0 (uniform use of synonymous codons) to 1 (maximum codon bias). An ENC value < 40 is commonly treated as evidence of a strong codon usage bias [36]. ...
Article
Background: ETRAMP11.2 (PVX_003565) is a well-characterized protein with antigenic potential. It is considered to be a serological marker for diagnostic tools, and it has been suggested as a potential vaccine candidate. Despite its immunological relevance, the polymorphism of the P. vivax ETRAMP11.2 gene (pvetramp11.2) remains undefined. The genetic variability of an antigen may limit the effectiveness of its application as a serological surveillance tool and in vaccine development and, therefore, the aim of this study was to investigate the genetic diversity of pvetramp11.2 in parasite populations from Amazonian regions and worldwide. We also evaluated amino acid polymorphism on predicted B-cell epitopes. The low variability of the sequence encoding PvETRAMP11.2 protein suggests that it would be a suitable marker in prospective serodiagnostic assays for surveillance strategies or in vaccine design against P. vivax malaria.
Methods: The pvetramp11.2 of P. vivax isolates collected from Brazil (n = 68) and Peru (n = 36) were sequenced and analyzed to assess nucleotide polymorphisms, allele distributions, population differentiation, genetic diversity and signature of selection. In addition, sequences (n = 104) of seven populations from different geographical regions were retrieved from the PlasmoDB database and included in the analysis to study the worldwide allele distribution. Potential linear B-cell epitopes and their polymorphisms were also explored.
Results: The multiple alignments of 208 pvetramp11.2 sequences revealed a low polymorphism and a marked geographical variation in allele diversity. Seven polymorphic sites and 11 alleles were identified. All of the alleles were detected in isolates from the Latin American region and five alleles were detected in isolates from the Southeast Asia/Papua New Guinea (SEA/PNG) region. Three alleles were shared by all Latin American populations (H1, H6 and H7). The H1 allele (reference allele from Salvador-1 strain), which was absent in the SEA/PNG populations, was the most represented allele in populations from Brazil (54%) and was also detected at high frequencies in populations from all other Latin America countries (range: 13.0% to 33.3%). The H2 allele was the major allele in SEA/PNG populations, but was poorly represented in Latin America populations (only in Brazil: 7.3%). Plasmodium vivax populations from Latin America showed a marked inter-population genetic differentiation (fixation index [Fst]) in contrast to SEA/PNG populations. Codon bias measures (effective number of codons [ENC] and Codon bias index [CBI]) indicated preferential use of synonymous codons, suggesting selective pressure at the translation level. Only three amino acid substitutions, located in the C-terminus, were detected. Linear B-cell epitope mapping predicted two epitopes in the Sal-1 PvETRAMP11.2 protein, one of which was fully conserved in all of the parasite populations analyzed.
Conclusions: We provide an overview of the allele distribution and genetic differentiation of ETRAMP11.2 antigen in P. vivax populations from different endemic areas of the world. The reduced polymorphism and the high degree of protein conservation supports the application of PvETRAMP11.2 protein as a reliable antigen for application in serological assays or vaccine design. Our findings provide useful information that can be used to inform future study designs.
... A slew of studies discovered that mutational pressure, rather than selection, is the primary factor determining the codon usage bias (35). Additionally, mutational pressure cannot be considered as the main driving force in the case of different RNA or DNA viruses (36). Viral genomes differ from the genomes of prokaryotes and eukaryotes in certain aspects. ...
Article
Hemorrhagic fever with renal syndrome (HFRS) is an acute viral zoonosis carried and transmitted by infected rodents through urine, droppings, or saliva. The etiology of HFRS is complex due to the involvement of viral factors and host immune and genetic factors which hinder the development of potential therapeutic solutions for HFRS. Hantaan virus (HTNV), Dobrava-Belgrade virus (DOBV), Seoul virus (SEOV), and Puumala virus (PUUV) are predominantly found in hantaviral species that cause HFRS in patients. Despite ongoing prevention and control efforts, HFRS remains a serious economic burden worldwide. Furthermore, recent studies reported that the hantavirus nucleocapsid protein is a multi-functional protein and plays a major role in the replication cycle of the hantavirus. However, the precise mechanism of the nucleoproteins in viral pathogenesis is not completely understood. In the framework of the current study, various in silico approaches were employed to identify the factors influencing the codon usage pattern of hantaviral nucleoproteins. Based on the relative synonymous codon usage (RSCU) values, a comparative analysis was performed between HFRS-causing hantavirus and their hosts, suggesting that HTNV, DOBV, SEOV, and PUUV, were inclined to evolve their codon usage patterns that were comparable to those of their hosts. The results indicated that most of the overrepresented codons had AU-endings, which revealed that mutational pressure is the major force shaping codon usage patterns. However, the influence of natural selection and geographical factors cannot be ignored on viral codon usage bias. Further analysis also demonstrated that HFRS causing hantaviruses adapted host-specific codon usage patterns to sustain successful replication and transmission chains within hosts. To our knowledge, no study to date reported the factors influencing the codon usage pattern within hantaviral nucleoproteins. 
Thus, the proposed computational scheme can help in understanding the underlying mechanism of codon usage patterns in HFRS-causing hantaviruses which lend a helping hand in designing effective anti-HFRS treatments in future. This study, although comprehensive, relies on in silico methods and thus necessitates experimental validation for more solid outcomes. Beyond the identified factors influencing viral behavior, there could be other yet undiscovered influences. These potential factors should be targets for further research to improve HFRS therapeutic strategies.
... Thus, genes that utilize optimal codons are predicted to be efficiently expressed. Although viruses depend on the translation machinery of host cells, the nucleotide compositions of viral genomes were observed to be different from those of their hosts (Belalov and Lukashev, 2013). A-rich genomes are found in the RNA genomes of retroviruses and influenza viruses. ...
Article
Schlafen (SLFN) proteins are a subset of interferon-stimulated early response genes with antiviral properties. An antiviral mechanism of SLFN11 was previously demonstrated in human immunodeficiency virus type 1 (HIV-1)-infected cells, and it was shown that SLFN11 inhibited HIV-1 virus production in a codon usage-specific manner. The codon usage patterns of many viruses are vastly different from those of their hosts. The codon usage-specific inhibition of HIV-1 expression by SLFN11 suggests that SLFN11 may be able to inhibit other viruses with a suboptimal codon usage pattern. However, the effect of SLFN11 on the replication of influenza A virus (IAV) has never been reported. The induction of SLFN11 expression was observed upon IAV infection. The reduction of SLFN11 expression also promotes influenza virus replication. Moreover, we found that overexpression of SLFN11 could reduce the expression of a reporter gene with a viral codon usage pattern, and the inhibition of viral hemagglutinin (HA) gene was codon-specific as the expression of codon optimized HA was not affected. These results indicate that SLFN11 inhibits the influenza A virus in a codon-specific manner and that SLFN11 may contribute to innate defense against influenza A viruses.
... Generally, potyvirids, such as PVY [21], TuMV [40] and PPV [26], cause great losses to the agricultural economy worldwide. The four nucleotides (adenine, cytosine, guanine, and uracil) are generally not random in the genomes of viruses and the hosts they infect [41-47]. Many animal RNA viruses possess A-rich genome coding sequences accompanied by a depletion of C-rich sequences [45]. ...
Article
Patatavirales is the largest order of plant RNA viruses and exclusively contains the family Potyviridae, accounting for 30% of all known plant viruses. The composition bias of animal RNA viruses and several plant RNA viruses has been determined. However, the comprehensive nucleic acid composition, codon pair usage patterns, dinucleotide preference and codon pair preference of plant RNA viruses have not been investigated to date. In this study, integrated analysis and discussion of the nucleic acid composition, codon usage patterns, dinucleotide composition and codon pair bias of potyvirids were performed using 3732 complete genome coding sequences. The nucleic acid composition of potyvirids was significantly enriched in A/U. Interestingly, the A/U-rich nucleotide composition of Patatavirales is essential for determining the preferred A-ended and U-ended codons and the overexpression of UpG and CpA dinucleotides. The codon usage patterns and codon pair bias of potyvirids were significantly correlated with their nucleic acid composition. Additionally, the codon usage pattern, dinucleotide composition and codon-pair bias of potyvirids are more dependent on the classification of the virus compared with their hosts. Our analysis provides a better understanding of future research on the origin and evolution patterns of the order Patatavirales.