Literature Review

Haplotype-based breeding: A new insight in crop improvement

May 2024
Plant Science 346(1):112129

DOI:10.1016/j.plantsci.2024.112129

Authors:

Ranjani Raja

Tamil Nadu Agricultural University

(7 authors in total)

Identification of Superior Haplotypes and Haplotype Combinations for Grain Size- and Weight-Related Genes for Breeding Applications in Rice (Oryza sativa L.)

Dec 2023

The identification of superior haplotypes and haplotype combinations is essential for haplotype-based breeding (HBB), which provides selection targets for genomics-assisted breeding. In this study, genotypes of 42 functional genes in rice were analyzed by targeted capture sequencing in a panel of 180 Indica rice accessions. In total, 69 SNPs/Indels in seven genes were detected to be associated with grain length (GL), grain width (GW), ratio of grain length–width (L/W) and thousand-grain weight (TGW) using candidate gene-based association analysis, including BG1 and GS3 for GL, GW5 for GW, BG1 and GW5 for L/W, and AET1, SNAC1, qTGW3, DHD1 and GW5 for TGW. Furthermore, two haplotypes were identified for each of the seven genes according to these associated SNPs/Indels, and the amount of genetic variation explained by different haplotypes ranged from 3.24% to 27.66%. Additionally, three, three and eight haplotype combinations for GL, L/W and TGW explained 25.38%, 5.5% and 22.49% of the total genetic variation for each trait, respectively. Further analysis showed that Minghui63 had the superior haplotype combination Haplotype Combination 4 (HC4) for TGW. The most interesting finding was that some widely used restorer lines derived from Minghui63 also have the superior haplotype combination HC4, and our breeding varieties and lines using the haplotype-specific marker panel also confirmed that the TGW of the lines was much higher than that of their sister lines without HC4, suggesting that TGW-HC4 is the superior haplotype combination for TGW and can be utilized in rice breeding.

Rice (Oryza sativa L.) Grain Size, Shape, and Weight-Related QTLs Identified Using GWAS with Multiple GAPIT Models and High-Density SNP Chip DNA Markers

Nov 2023

This study investigated novel quantitative trait loci (QTLs) associated with the control of grain shape and size as well as grain weight in rice. We employed a joint strategy combining multiple GAPIT (Genome Association and Prediction Integrated Tool) models [Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK), Fixed and random model Circulating Probability Unification (FarmCPU), Settlement of MLM Under Progressively Exclusive Relationship (SUPER), and General Linear Model (GLM)] with high-density SNP chip DNA markers (60,461) to conduct a genome-wide association study (GWAS). GWAS was performed using genotype and grain-related phenotypes of 143 recombinant inbred lines (RILs). The parental lines (Ilpum and Tung Tin Wan Hein 1, TTWH1; Oryza sativa L. ssp. japonica and indica, respectively) exhibited divergent phenotypes for all analyzed grain traits, which was reflected in their derived population. GWAS results revealed associations between seven SNP chip markers and QTLs for grain length, co-detected by all GAPIT models on chromosomes (Chr) 1–3, 5, 7, and 11; among these, qGL1-1BFSG (AX-95918134, Chr1: 3,820,526 bp) explains 65.2–72.5% of the phenotypic variance (PVE). In addition, qGW1-1BFSG (AX-273945773, Chr1: 5,623,288 bp) for grain width explains 15.5–18.9% of PVE. Furthermore, BLINK or FarmCPU independently identified three QTLs for grain thickness, explaining 74.9% (qGT1Blink, AX-279261704, Chr1: 18,023,142 bp) and 54.9% (qGT2-1Farm, AX-154787777, Chr2: 2,118,477 bp) of the observed PVE. For the grain length-to-width ratio (LWR), qLWR2BFSG (AX-274833045, Chr2: 10,000,097 bp) explains about 15.2–32% of the observed PVE. Likewise, the major QTL for thousand-grain weight (TGW) was detected on Chr6 (qTGW6BFSG, AX-115737727, 28,484,619 bp) and explains 32.8–54% of PVE. The qTGW6BFSG QTL coincides with qGW6-1Blink for grain width, which explains 32.8–54% of PVE. Putative candidate genes pooled from major QTLs for each grain trait have annotated functions that warrant functional studies to elucidate their roles in the control of grain size, shape, or weight in rice. Genomic selection analysis proposed markers useful for downstream marker-assisted selection based on the genetic merit of RILs.

Genome‐wide dissection and haplotype analysis identified candidate loci for nitrogen use efficiency under drought conditions in winter wheat

Oct 2023

Climate change causes extreme conditions like prolonged drought, which results in yield reductions due to its effects on nutrient balances such as nitrogen uptake and utilization by plants. Nitrogen (N) is a crucial nutrient element for plant growth and productivity. Understanding the mechanistic basis of nitrogen use efficiency (NUE) under drought conditions is essential to improve wheat (Triticum aestivum L.) yield. Here, we evaluated the genetic variation of NUE‐related traits and photosynthesis response in a diversity panel of 200 wheat genotypes under drought and nitrogen stress conditions to uncover the inherent genetic variation and identify quantitative trait loci (QTLs) underlying these traits. The results revealed significant genetic variations among the genotypes in response to drought stress and nitrogen deprivation. Drought impacted plant performance more than N deprivation due to its effect on water and nutrient uptake. GWAS identified a total of 27 QTLs with a significant main effect on the drought‐related traits, while 10 QTLs were strongly associated with the NUE traits. Haplotype analysis revealed two different haplotype blocks within the associated region on chromosomes 1B and 5A. The two haplotypes showed contrasting effects on N uptake and use efficiency traits. The in silico and transcript analyses implicated candidate gene coding for cold shock protein. This gene was the most highly expressed gene under several stress conditions, including drought stress. Upon validation, these QTLs on 1B and 5A could be used as a diagnostic marker for NUE and drought tolerance screening in wheat.

Exploitation of Allelic Variation and Superior Haplotypes for OsMIT3 Regulating Tiller Number in Rice

Sep 2023

Rice (Oryza sativa L.) is the staple food for more than 60% of the global population and is consumed in various forms. Increased crop yield under variable climatic conditions is necessary in light of the world population's rapid growth, yield plateaus, resource depletion, and climate change. To overcome these obstacles, novel genes and alleles in the rice gene pool must be found, and unique features like C4 photosynthesis must be modified. The negative effects of climate change, stagnated yields, and diminishing agricultural resources are major obstacles. Tillering is one of the important traits to be considered for increased rice crop production and productivity. The target gene OsMIT3 is associated with higher tillering caused by strigolactone deficiency, which is directly linked to the carotenoid biosynthesis pathway and to strigolactone signaling genes acting on rice tillering. In this study, 100 diverse accessions from the Rice 3K-RG panel were selected and analyzed, revealing three significant SNPs that grouped the accessions into four haplotype groups with the allelic combinations TAT (H1), ACA (H2), TAA (H3) and ACT (H4). Among the four haplogroups, H3 showed higher mean values for both total number of tillers and productive tillers, and can be further considered as a source for breeding strategies to improve crop yield.
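
The haplotype grouping described in this abstract (accessions grouped by their allele combination at three SNPs, then compared on a trait mean) can be illustrated with a minimal sketch. The accession names, tiller counts, and the helper `haplotype_means` are hypothetical and for illustration only; they are not data or code from the study, and a real analysis would also test whether the group differences are statistically significant.

```python
from collections import defaultdict
from statistics import mean


def haplotype_means(records):
    """Group accessions by their allele string and average the phenotype.

    records: iterable of (accession, alleles, phenotype), where `alleles`
    is a tuple of the alleles observed at each SNP, in a fixed SNP order.
    """
    groups = defaultdict(list)
    for _accession, alleles, phenotype in records:
        groups["".join(alleles)].append(phenotype)
    return {hap: mean(values) for hap, values in groups.items()}


# Hypothetical toy data: (accession, alleles at 3 SNPs, tiller number).
records = [
    ("acc1", ("T", "A", "T"), 10.0),
    ("acc2", ("T", "A", "T"), 12.0),
    ("acc3", ("A", "C", "A"), 9.0),
    ("acc4", ("T", "A", "A"), 15.0),
    ("acc5", ("T", "A", "A"), 17.0),
    ("acc6", ("A", "C", "T"), 8.0),
]

means = haplotype_means(records)
# Haplotype with the highest mean phenotype in this toy data set.
best = max(means, key=means.get)
print(means, best)
```

In this toy data the allele string TAA has the highest mean, mirroring the kind of comparison used to nominate a candidate superior haplotype.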

Meta-QTL and haplo-pheno analysis reveal superior haplotype combinations associated with low grain chalkiness under high temperature in rice

Mar 2023

Chalk, an undesirable grain quality trait in rice, is primarily formed due to high temperatures during the grain-filling process. Owing to the disordered starch granule structure, air spaces and low amylose content, chalky grains are easily breakable during milling thereby lowering head rice recovery and its market price. Availability of multiple QTLs associated with grain chalkiness and associated attributes, provided us an opportunity to perform a meta-analysis and identify candidate genes and their alleles contributing to enhanced grain quality. From the 403 previously reported QTLs, 64 Meta-QTLs encompassing 5262 non-redundant genes were identified. MQTL analysis reduced the genetic and physical intervals and nearly 73% meta-QTLs were narrower than 5cM and 2Mb, revealing the hotspot genomic regions. By investigating expression profiles of 5262 genes in previously published datasets, 49 candidate genes were shortlisted on the basis of their differential regulation in at least two of the datasets. We identified non-synonymous allelic variations and haplotypes in 39 candidate genes across the 3K rice genome panel. Further, we phenotyped a subset panel of 60 rice accessions by exposing them to high temperature stress under natural field conditions over two Rabi cropping seasons. Haplo-pheno analysis uncovered haplotype combinations of two starch synthesis genes, GBSSI and SSIIa, significantly contributing towards the formation of grain chalk in rice. We, therefore, report not only markers and pre-breeding material, but also propose superior haplotype combinations which can be introduced using either marker-assisted breeding or CRISPR-Cas based prime editing to generate elite rice varieties with low grain chalkiness and high HRY traits.

Photoperiod Genes Contribute to Daylength-Sensing and Breeding in Rice

Feb 2023

Rice (Oryza sativa L.), one of the most important food crops worldwide, is a facultative short-day (SD) plant in which flowering is modulated by seasonal and temperature cues. The photoperiodic molecular network is the core network for regulating flowering in rice, and is composed of photoreceptors, a circadian clock, a photoperiodic flowering core module, and florigen genes. The Hd1-DTH8-Ghd7-PRR37 module, a photoperiodic flowering core module, improves latitude adaptation by mediating multiple daylength-sensing processes in rice. However, how the other photoperiod-related genes regulate daylength-sensing and latitude adaptation remains largely unknown. Here, we determined that mutations in the photoreceptor and circadian clock genes can generate different daylength-sensing processes. Furthermore, we measured the yield-related traits in various mutants, including the main panicle length, grains per panicle, seed-setting rate, hundred-grain weight, and yield per panicle. Our results showed that the prr37, elf3-1 and ehd1 mutants can change the daylength-sensing processes and exhibit longer main panicle lengths and more grains per panicle. Hence, the PRR37, ELF3-1 and Ehd1 loci have excellent potential for latitude adaptation and production improvement in rice breeding. In summary, this study systematically explored how vital elements of the photoperiod network regulate daylength sensing and yield traits, providing critical information for their breeding applications.

Polymorphism analysis of the chloroplast and mitochondrial genomes in soybean

Jan 2023
BMC PLANT BIOL

Background: Soybean is an important protein- and oil-rich crop throughout the world. Much attention has been paid to its nuclear genome, which is bi-parentally inherited and associated with many important agronomical traits. However, less is known about the genomes of the semi-autonomous and essential organelles, chloroplasts and mitochondria, of soybean. Results: Here, through analyzing the polymorphisms of these organelles in 2580 soybean accessions including 107 wild soybeans, we found that the chloroplast genome is more variable than the mitochondrial genome in terms of variant density. Consistent with this, more haplotypes were found in the chloroplast genome (44 haplotypes) than the mitochondrial genome (30 haplotypes). These haplotypes were distributed extremely unevenly with the top two haplotypes (CT1 and CT2 for chloroplasts, MT1 and MT2 for mitochondria) accounting for nearly 70 and 18% of cultivated soybean accessions. Wild soybeans also exhibited more diversity in organelle genomes, harboring 32 chloroplast haplotypes and 19 mitochondrial haplotypes. However, only a small percentage of cultivated soybeans shared cytoplasm with wild soybeans. In particular, the two most frequent types of cytoplasm (CT1/MT1, CT2/MT2) were missing in wild soybeans, indicating that wild soybean cytoplasm has been poorly exploited during breeding. Consistent with the hypothesis that soybean originated in China, we found that China harbors the highest cytoplasmic diversity in the world. The geographical distributions of CT1–CT3 and MT1–MT3 in Northeast China were not significantly different from those in Middle and South China. Two mitochondrial polymorphism sites, p.457333 (T > C) and p.457550 (G > A), were found to be heterozygous in most soybeans, and heterozygosity appeared to be associated with the domestication of cultivated soybeans from wild soybeans, the improvement of landraces to generate elite cultivated soybeans, and the geographic adaptation of soybean. Conclusions: The haplotypes of thousands of soybean cultivars should be helpful in evaluating the impact of cytoplasm on soybean performance and in breeding cultivars with the desired cytoplasm. Mitochondrial heterozygosity might be related to soybean adaptation, and this hypothesis needs to be further investigated.

Superior Haplotypes for Early Root Vigor Traits in Rice Under Dry Direct Seeded Low Nitrogen Condition Through Genome Wide Association Mapping

Jul 2022

Water and land resources have been aggressively exploited in recent decades to meet the growing demands for food. The changing climate has prompted rice scientists and farmers of the tropics and subtropics to adopt the direct seeded rice (DSR) system. The DSR system of rice cultivation significantly reduces freshwater consumption and labor requirements, while increasing system productivity and resource use efficiency and reducing greenhouse gas emissions. Early root vigor is an essential trait required in an ideal DSR system of rice cultivation to ensure a good crop stand, adequate uptake of water and nutrients, and competitiveness against weeds. The aus subpopulation, which is adapted for DSR, was evaluated to understand the biology of early root growth under limited nitrogen conditions over two seasons at two time points (14 and 28 days). The correlation study identified a positive association between shoot dry weight and root dry weight. The genome-wide association study was conducted on root traits at 14 and 28 days with 2 million single-nucleotide polymorphisms (SNPs) using an efficient mixed model. QTLs over a significance threshold of p < 0.0001 and a 10% false discovery rate were selected to identify genes involved in root growth related to root architecture and nutrient acquisition from 97 QTLs. Candidate genes under these QTLs were explored. On chromosome 4, around 30 Mbp, are two important peptide transporters (PTR5 and PTR6) involved in mobilizing nitrogen in the root during the early vegetative stage. In addition, several P transporters and expansin genes with superior haplotypes are discussed. A novel QTL from 21.12 to 21.46 Mb on chromosome 7, with two linkage disequilibrium (LD) blocks governing root length at 14 days, was identified. The QTLs/candidate genes with superior haplotypes for early root vigor reported here could be explored further to develop genotypes for DSR conditions.

Combining Genome-Wide Association Study and Gene-Based Haplotype Analysis to Identify Candidate Genes for Alkali Tolerance at the Germination Stage in Rice

Apr 2022

Salinity–alkalinity stress is one of the main abiotic factors limiting rice production worldwide. With the widespread use of rice direct seeding technology, it has become increasingly important to improve the tolerance to salinity–alkalinity of rice varieties at the germination stage. Although we have a more comprehensive understanding of salt tolerance in rice, the genetic basis of alkali tolerance in rice is still poorly understood. In this study, we measured seven germination-related traits under alkali stress and control conditions using 428 diverse rice accessions. The alkali tolerance levels of rice germplasms varied considerably during germination. Xian/indica accessions had generally higher tolerance to alkali stress than Geng/japonica accessions at the germination stage. Using genome-wide association analysis, 90 loci were identified as significantly associated with alkali tolerance. Eight genes (LOC_Os01g12000, LOC_Os03g60240, LOC_Os03g08960, LOC_Os04g41410, LOC_Os09g25060, LOC_Os11g35350, LOC_Os12g09350, and LOC_Os12g13300) were selected as important candidate genes for alkali tolerance based on the gene functional annotation and gene-CDS-haplotype analysis. According to the expression levels of LOC_Os09g25060 (OsWRKY76), it is likely to play a negative regulatory role in alkali tolerance during rice germination. An effective strategy for improving rice alkali tolerance may be to pyramid alkali-tolerant haplotypes of multiple candidate genes to obtain the optimal haplotype combination. Our findings may provide valuable genetic information and expand the use of alkali tolerance germplasm resources in rice molecular breeding to improve the alkali tolerance at the germination stage.

Haplotype Analysis of Chloroplast Genomes for Jujube Breeding

Mar 2022

Jujube (family Rhamnaceae) is an important economic fruit tree in China. In this study, we reported 26 chloroplast (cp) sequences of jujube using Illumina paired-end sequencing. The sequence length of the cp genome was 161,367–161,849 bp, composed of a large single-copy region (89,053–89,437 bp) and a small single-copy region (19,356–19,362 bp) separated by a pair of inverted repeat regions (26,478–26,533 bp). Each cp genome encodes the same 130 genes, including 112 unique genes, being quite conserved in genome structure and gene sequence. A total of 118 single base substitutions (SNPs) and 130 InDels were detected in 65 jujube accessions. Phylogenetic and haplotype network construction methods were used to analyze the origin and evolution of jujube and its sour-tasting relatives. We detected 32 effective haplotypes, consisting of 20 unique jujube haplotypes and 9 unique sour-jujube haplotypes. Compared with sour-jujube, jujube showed greater haplotype diversity at the chloroplast DNA level. To cultivate crisp and sweet fruit varieties featuring strong resistance, by combining the characteristics of sour-jujube and cultivated jujube, three hybrid combinations were suggested for reciprocal crosses: “Dongzao” × “Jingzao39,” “Dongzao” × “Jingzao60,” “Dongzao” × “Jingzao28.” This study provides the basis for jujube species’ identification and breeding, and lays the foundation for future research.

Multiple haplotype-based analyses provide genetic and evolutionary insights into tomato fruit weight and composition

Jan 2022

Improving fruit quality traits such as metabolic composition remains a challenge for tomato breeders. To better understand the genetic architecture of these traits and decipher the demographic history of the loci controlling tomato quality traits, we applied an innovative approach using multiple haplotype-based analyses, aiming to test the potentials of haplotype based study in association and genomic prediction studies. We performed and compared haplotype vs SNP-based associations (hapQTL) with multi-locus mixed model (MLMM), focusing on tomato fruit weight and metabolite contents (i.e. sugars, organic acids and amino acids). Using a panel of 163 tomato accessions genotyped with 5995 SNPs, we detected a total of 784 haplotype blocks, with an average size of haplotype blocks ~58 kb. A total of 108 significant associations for 26 traits were detected thanks to Haplotype/SNP-based Bayes models. Haplotype-based Bayes model (97 associations) outperformed SNP-based Bayes model (50 associations) and MLMM (53 associations) in identifying marker-trait associations as well as in genomic prediction (especially for those traits with moderate to low heritability). To decipher the demographic history, we identified 24 positive selective sweeps using the integrated haplotype score (iHS). Most of the significant associations for tomato quality traits were located within selective sweeps (54.63% and 71.7% in hapQTL and MLMM models, respectively). Promising candidate genes were identified controlling tomato fruit weight and metabolite contents. We thus demonstrated the benefits of using haplotypes for evolutionary and genetic studies, providing novel insights into tomato quality improvement and breeding history.

Advances in gene editing without residual transgenes in plants

Dec 2021

Transgene residuals in edited plants affect genetic analysis, pose off-target risks, and cause regulatory concerns. Several strategies have been developed to efficiently edit target genes without leaving any transgenes in plants. Some approaches directly address this issue by editing plant genomes with DNA-free reagents. On the other hand, DNA-based techniques require another step for ensuring plants are transgene-free. Fluorescent markers, pigments, and chemical treatments have all been employed as tools to distinguish transgenic plants from transgene-free plants quickly and easily. Moreover, suicide genes have been used to trigger self-elimination of transgenic plants, greatly improving the efficiency of isolating the desired transgene-free plants. Transgenes can also be excised from plant genomes using site-specific recombination, transposition or gene editing nucleases, providing a strategy for editing asexually produced plants. Finally, haploid induction coupled with gene editing may make it feasible to edit plants that are recalcitrant to transformation. Here, we evaluate the strengths and weaknesses of recently developed approaches for obtaining edited plants without transgene residuals.

Features and applications of haplotypes in crop breeding

Nov 2021

Climate change with altered pest-disease dynamics and rising abiotic stresses threatens resource-constrained agricultural production systems worldwide. Genomics-assisted breeding (GAB) approaches have greatly contributed to enhancing crop breeding efficiency and delivering better varieties. Fast-growing capacity and affordability of DNA sequencing has motivated large-scale germplasm sequencing projects, thus opening exciting avenues for mining haplotypes for breeding applications. This review article highlights ways to mine haplotypes and apply them for complex trait dissection and in GAB approaches including haplotype-GWAS, haplotype-based breeding, haplotypes-assisted genomic selection. Improvement strategies that efficiently deploy superior haplotypes to hasten breeding progress will be key to safeguarding global food security.

ZmCCT regulates photoperiod-dependent flowering and response to stresses in maize

Oct 2021
BMC PLANT BIOL

Background: Appropriate flowering time is very important to the success of modern agriculture. Maize (Zea mays L.) is a major cereal crop that originated in tropical areas and is photoperiod sensitive, which is an important obstacle to the utilization of tropical/subtropical germplasm resources in temperate regions. However, the study of the regulation mechanism of photoperiod sensitivity in maize is still at an early stage. Although it has been previously reported that ZmCCT is involved in the photoperiod response and delays maize flowering time under long-day (LD) conditions, the underlying mechanism remains unclear. Results: Here, we showed that ZmCCT overexpression delays flowering time and confers maize drought tolerance under LD conditions. The Gal4-LexA/UAS system showed that ZmCCT has a transcriptional inhibitory activity, while the yeast system showed that ZmCCT has a transcriptional activation activity. DAP-Seq analysis and EMSA indicated that ZmCCT mainly binds to promoters containing the novel motifs CAAAAATC and AAATGGTC. DAP-Seq and RNA-Seq analysis showed that ZmCCT could directly repress the expression of ZmPRR5 and ZmCOL9, and promote the expression of ZmRVE6 to delay flowering under long-day conditions. Moreover, we also demonstrated that ZmCCT directly binds to the promoters of ZmHY5, ZmMPK3, ZmVOZ1 and ZmARR16 and promotes the expression of ZmHY5 and ZmMPK3, but represses ZmVOZ1 and ZmARR16 to enhance stress resistance. Additionally, ZmCCT regulates a set of genes associated with plant development. Conclusions: ZmCCT has dual functions in regulating maize flowering time and stress response under LD conditions. ZmCCT negatively regulates flowering time and enhances maize drought tolerance under LD conditions. ZmCCT represses most flowering time genes to delay flowering while promoting most stress response genes to enhance stress tolerance. Our data contribute to a comprehensive understanding of the regulatory mechanism of ZmCCT in controlling maize flowering time and stress response.

Allele mining for the grain number gene An-1 in rice (Oryza sativa L.)

Oct 2021

Rice yield has reached a plateau, and enhancing grain yield is therefore indispensable to feed the growing population; this could be achieved by identifying superior alleles in the existing germplasm. Variation in the pleiotropic yield gene An-1 leads to enhanced grain number and grain size in rice. Hence, the gene was chosen for analyzing allelic diversity/haplotype variation in 150 lines of the 3K RG panel, which revealed that An-1 has 20 single nucleotide polymorphisms (SNPs) and 10 InDels encompassing both intronic and exonic regions. The genotypes were divided into four haplotypes based on the combination of seven SNPs, with the maximum number of genotypes in the first haplotype and the least number of genotypes in the fourth haplotype. From the study, H1 was identified as a superior haplotype. The haplo-pheno analysis identified the superior donors, viz., SIGARDIS, GENIT and DAMNOEUB KAUN KHMOM, harbouring superior haplotype combinations, which may be further utilized in haplotype-based breeding for the development of high-yielding rice varieties.

GAPIT Version 3: Boosting Power and Accuracy for Genomic Association and Prediction

Sep 2021

Genome-wide association study (GWAS) and genomic prediction/selection (GP/GS) are the two essential enterprises in genomic research. Due to the great magnitude and complexity of genomic and phenotypic data, analytical methods and their associated software packages are frequently advanced. GAPIT is a widely-used genomic association and prediction integrated tool as an R package. The first version was released to the public in 2012 with the implementation of the general linear model (GLM), mixed linear model (MLM), compressed MLM (CMLM), and genomic best linear unbiased prediction (gBLUP). The second version was released in 2016 with several new implementations, including enriched CMLM (ECMLM) and settlement of MLMs under progressively exclusive relationship (SUPER). All the GWAS methods are based on the single-locus test. For the first time, in the current release of GAPIT, version 3 implemented three multi-locus test methods, including multiple loci mixed model (MLMM), fixed and random model circulating probability unification (FarmCPU), and Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK). Additionally, two GP/GS methods were implemented based on CMLM (named compressed BLUP; cBLUP) and SUPER (named SUPER BLUP; sBLUP). These new implementations not only boost statistical power for GWAS and prediction accuracy for GP/GS, but also improve computing speed and increase the capacity to analyze big genomic data. Here, we document the current upgrade of GAPIT by describing the selection of the recently developed methods, their implementations, and potential impact. All documents, including source code, user manual, demo data, and tutorials, are freely available at the GAPIT website (http://zzlab.net/GAPIT).

Genome-Wide Analysis of Potassium Channel Genes in Rice: Expression of the OsAKT and OsKAT Genes under Salt Stress

May 2021

Potassium (K+), as a vital element, is involved in regulating important cellular processes such as enzyme activity, cell turgor, and nutrient movement in plant cells, which affects plant growth and production. Potassium channels are involved in the transport and release of potassium in plant cells. In the current study, three OsKAT genes and two OsAKT genes, along with 11 nonredundant putative potassium channel genes in the rice genome, were characterized based on their physiochemical properties, protein structure, evolution, duplication, in silico gene expression, and protein–protein interactions. In addition, the expression patterns of OsAKTs and OsKATs were studied in root and shoot tissues under salt stress using real-time PCR in three rice cultivars. K+ channel genes were found to have diverse functions and structures, and OsKATs showed high genetic divergence from other K+ channel genes. Furthermore, the Ka/Ks ratios of duplicated gene pairs from the K+ channel gene family in rice suggested that these genes underwent purifying selection. Among the studied K+ channel proteins, OsKAT1 and OsAKT1 were identified as proteins with high potential N-glycosylation and phosphorylation sites, and LEU, VAL, SER, PRO, HIS, GLY, LYS, TYR, CYC, and ARG amino acids were predicted as the binding residues in the ligand-binding sites of K+ channel proteins. Regarding the coexpression network and KEGG ontology results, several metabolic pathways, including sugar metabolism, purine metabolism, carbon metabolism, glycerophospholipid metabolism, monoterpenoid biosynthesis, and folate biosynthesis, were recognized in the coexpression network of K+ channel proteins. Based on the available RNA-seq data, the K+ channel genes showed differential expression levels in rice tissues in response to biotic and abiotic stresses. In addition, the real-time PCR results revealed that OsAKTs and OsKATs are induced by salt stress in root and shoot tissues of rice cultivars, and OsKAT1 was identified as a key gene involved in the rice response to salt stress. In the present study, we found that the repression of OsAKTs, OsKAT2, and OsKAT2 in roots was related to salinity tolerance in rice. Our findings provide valuable insights for further structural and functional assays of K+ channel genes in rice.

Superior haplotypes towards development of low glycemic index rice with preferred grain and cooking quality

May 2021

Increasing trends in the occurrence of diabetes underline the need to develop low glycemic index (GI) rice with preferred grain quality. In the current study, a diverse 3K sub-panel of rice consisting of 150 accessions was evaluated for resistant starch (RS) and predicted glycemic index (PGI), along with nine other quality traits, under transplanted conditions. Significant variations were noticed among the accessions for the traits evaluated. Trait associations showed that amylose content has significant positive and negative associations with resistant starch and predicted glycemic index, respectively. Genome-wide association studies with 500 K SNPs based on the MLM model resulted in a total of 41 marker-trait associations (MTAs), which were further confirmed and validated with the mrMLM multi-locus model. We have also determined the allelic effect of identified MTAs for 11 targeted traits and found favorable SNPs for 8 traits. A total of 11 genes were selected for haplo-pheno analysis to identify the superior haplotypes for the target traits, where the number of haplotypes ranges from 2 (Os10g0469000-GC) to 15 (Os06g18720-AC). Superior haplotypes for RS and PGI were identified in the candidate genes Os06g11100 (H4, 3.28% for high RS) and Os08g12590 (H13, 62.52 as intermediate PGI). The identified superior donors possessing superior haplotype combinations may be utilized in haplotype-based breeding to develop next-generation tailor-made, high-quality, healthier rice varieties suiting consumer preference and market demand.

Designing Future Crops: Genomics-Assisted Breeding Comes of Age

Article

Apr 2021
TRENDS PLANT SCI

Over the past decade, genomics-assisted breeding (GAB) has been instrumental in harnessing the potential of modern genome resources and characterizing and exploiting allelic variation for germplasm enhancement and cultivar development. Sustaining GAB in the future (GAB 2.0) will rely upon a suite of new approaches that fast-track targeted manipulation of allelic variation for creating novel diversity and facilitate their rapid and efficient incorporation in crop improvement programs. Genomic breeding strategies that optimize crop genomes with accumulation of beneficial alleles and purging of deleterious alleles will be indispensable for designing future crops. In coming decades, GAB 2.0 is expected to play a crucial role in breeding more climate-smart crop cultivars with higher nutritional value in a cost-effective and timely manner.

Computational methods for chromosome-scale haplotype reconstruction

Article

Apr 2021
GENOME BIOL

Shilpa Garg

High-quality chromosome-scale haplotype sequences of diploid genomes, polyploid genomes, and metagenomes provide important insights into genetic variation associated with disease and biodiversity. However, whole-genome short read sequencing does not yield haplotype information spanning whole chromosomes directly. Computational assembly of shorter haplotype fragments is required for haplotype reconstruction, which can be challenging owing to limited fragment lengths and high haplotype and repeat variability across genomes. Recent advancements in long-read and chromosome-scale sequencing technologies, alongside computational innovations, are improving the reconstruction of haplotypes at the level of whole chromosomes. Here, we review and discuss recent methodological progress and perspectives in these areas.

Status and prospects of genome‐wide association studies in plants

Article

Jan 2021

Genome‐wide association studies (GWAS) have developed into a powerful and ubiquitous tool for the investigation of complex traits. In large part, this was fueled by advances in genomic technology, enabling us to examine genome‐wide genetic variants across diverse genetic materials. The development of the mixed model framework for GWAS dramatically reduced the number of false positives compared with naïve methods. Building on this foundation, many methods have since been developed to increase computational speed or improve statistical power in GWAS. These methods have allowed the detection of genomic variants associated with either traditional agronomic phenotypes or biochemical and molecular phenotypes. In turn, these associations enable applications in gene cloning and in accelerated crop breeding through marker assisted selection or genetic engineering. Current topics of investigation include rare‐variant analysis, synthetic associations, optimizing the choice of GWAS model, and utilizing GWAS results to advance knowledge of biological processes. Ongoing research in these areas will facilitate further advances in GWAS methods and their applications.

Haplotype analysis of data from UAV imagery of rice MAGIC population for the trait dissection of biomass and plant architecture

Article

Dec 2020

Unmanned aerial vehicles (UAVs) are popular tools for high-throughput phenotyping of crops in the field. However, their use for evaluation of individual lines is limited in crop breeding because research on what the UAV image data represent is still developing. Here, we investigated the connection between shoot biomass of rice plants and the vegetation fraction (VF) estimated from high-resolution orthomosaic images taken by a UAV 10 m above a field during the vegetative stage. Haplotype-based genome-wide association studies of multi-parental advanced generation inter-cross (MAGIC) lines revealed four QTL for VF. VF was correlated with shoot biomass, but the haplotype effect on VF was better correlated with that on shoot biomass at these QTL. Further genetic characterization revealed the relationships between these QTL and plant spreading habit, final shoot biomass and panicle weight. Thus, genetic analysis using high-throughput phenotyping data derived from low-altitude, high-resolution UAV images during early stage of rice in the field provides insight into plant growth, architecture, final biomass and yield.

A Meta-Analysis of Quantitative Trait Loci Associated with Multiple Disease Resistance in Rice (Oryza sativa L.)

Article

Nov 2020

Rice blast, sheath blight and bacterial leaf blight are major rice diseases found worldwide. The development of resistant cultivars is generally perceived as the most effective way to combat these diseases. Plant disease resistance is a polygenic trait shaped by the combined effect of major and minor genes. To locate the source of this trait, various quantitative trait loci (QTL) mapping studies have been performed in the past two decades. However, investigating the congruency between the reported QTLs is a daunting task due to the heterogeneity amongst the QTLs studied. Hence, the aim of our study is to integrate the reported QTLs for resistance against rice blast, sheath blight and bacterial leaf blight and objectively analyze and consolidate the location of QTL clusters in the chromosomes, reducing the QTL intervals and thus identifying candidate genes within the selected meta-QTL. A total of twenty-seven studies for resistance QTLs to rice blast (8), sheath blight (15) and bacterial leaf blight (4) were compiled for QTL projection and analyses. Cumulatively, 333 QTLs associated with rice blast (114), sheath blight (151) and bacterial leaf blight (68) resistance were compiled, where 303 QTLs could be projected onto a consensus map saturated with 7633 loci. Meta-QTL analysis on 294 QTLs yielded 48 meta-QTLs, where QTLs with membership probability lower than 60% were excluded, reducing the number of QTLs within the meta-QTL to 274. Further, three meta-QTL regions (MQTL2.5, MQTL8.1 and MQTL9.1) were selected for functional analysis on the basis that MQTL2.5 harbors the highest number of QTLs; meanwhile, MQTL8.1 and MQTL9.1 have QTLs associated with all three diseases mentioned above. The functional analysis allows for determination of enriched gene ontology and resistance gene analogs (RGAs) and other defense-related genes.
To summarize, MQTL2.5, MQTL8.1 and MQTL9.1 have a considerable number of R-genes that account for 10.21%, 4.08% and 6.42% of the total genes found in these meta-QTLs, respectively. Defense genes constitute around 3.70%, 8.16% and 6.42% of the total number of genes in MQTL2.5, MQTL8.1 and MQTL9.1, respectively. This frequency is higher than the total frequency of defense genes in the rice genome, which is about 0.97% (167 defense genes/17,272 total genes). The integration of the QTLs facilitates the identification of QTL hotspots for rice blast, sheath blight and bacterial blight resistance with reduced intervals, which helps to reduce linkage drag in breeding. The candidate genes within the promising regions could be utilized for improvement through genetic engineering.
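
The enrichment comparison in this abstract rests on a single ratio, so it can be checked directly. The following is a minimal Python sketch, not part of the cited study, that recomputes the genome-wide defense-gene frequency from the two counts quoted above (167 and 17,272) and compares it with the meta-QTL percentages quoted above. The function and variable names are illustrative assumptions. Note that 167/17,272 is a proportion of roughly 0.0097, which corresponds to just under 1% when expressed as a percentage.

```python
def frequency_percent(count: int, total: int) -> float:
    """Return count/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Counts quoted in the abstract: defense genes / total genes in the rice genome.
genome_wide = frequency_percent(167, 17272)

# Defense-gene percentages quoted in the abstract for the three meta-QTLs.
meta_qtl_defense = {"MQTL2.5": 3.70, "MQTL8.1": 8.16, "MQTL9.1": 6.42}

# Fold enrichment of each meta-QTL relative to the genome-wide frequency.
fold_enrichment = {
    name: pct / genome_wide for name, pct in meta_qtl_defense.items()
}

if __name__ == "__main__":
    print(f"genome-wide defense-gene frequency: {genome_wide:.2f}%")
    for name, fold in fold_enrichment.items():
        print(f"{name}: {fold:.1f}x the genome-wide frequency")
```

All three meta-QTL regions come out several-fold above the genome-wide background, which is the point the abstract makes.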

Discovery of beneficial haplotypes for complex traits in maize landraces

Article

Oct 2020

Genetic variation is of crucial importance for crop improvement. Landraces are valuable sources of diversity, but for quantitative traits efficient strategies for their targeted utilization are lacking. Here, we map haplotype-trait associations at high resolution in ~1000 doubled-haploid lines derived from three maize landraces to make their native diversity for early development traits accessible for elite germplasm improvement. A comparative genomic analysis of the discovered haplotypes in the landrace-derived lines and a panel of 65 breeding lines, both genotyped with 600k SNPs, points to untapped beneficial variation for target traits in the landraces. The superior phenotypic performance of lines carrying favorable landrace haplotypes as compared to breeding lines with alternative haplotypes confirms these findings. Stability of haplotype effects across populations and environments as well as their limited effects on undesired traits indicate that our strategy has high potential for harnessing beneficial haplotype variation for quantitative traits from genetic resources.

CRISPR-Mediated Engineering across the Central Dogma in Plant Biology for Basic Research and Crop Improvement

Article

Jan 2021
MOL PLANT

The central dogma (CD) of molecular biology constitutes the transfer of genetic information from DNA to RNA to protein. Major CD processes governing genetic flow include the cell cycle, DNA replication, chromosome packaging, epigenetic changes, transcription, posttranscriptional alterations, translation, and posttranslational modifications. The CD processes are tightly regulated in plants to maintain their genetic integrity throughout the life cycle as well as to pass the genetic material to the next generation. Engineering of various CD processes involved in gene regulation will accelerate crop improvement to feed the growing world population. CRISPR technology enables programmable editing of CD processes to alter DNA, RNA, or protein, which would have been impossible in the past. Here, an overview of recent advancements in CRISPR tool development and CRISPR-based CD modulations that expedite basic and applied plant research is provided. Furthermore, CRISPR applications in major thriving areas, such as gene discovery (allele mining and cryptic gene activation), introgression (de novo domestication and haploid induction), and application of desired traits beneficial to farmers or consumers (biotic/abiotic stress-resilient crops, plant cell factories, and delayed senescence), are described. Finally, the global regulatory policies, challenges, and prospects for CRISPR-mediated crop improvement are summarized.

Alterations in Stomatal Response to Fluctuating Light Increase Biomass and Yield of Rice under Drought Conditions

Article

Oct 2020
PLANT J

The acceleration of stomatal closure upon high to low light transition could improve plant water use efficiency and drought tolerance. Herein, using a genome-wide association study, we showed that genetic variation in OsNHX1 was strongly associated with changes in τcl, the time constant of stomatal closure, in 206 rice accessions. OsNHX1 overexpression in rice resulted in a decrease in τcl and an increase in biomass and grain yield under drought. Conversely, OsNHX1 knockout by CRISPR/Cas9 showed opposite trends for these traits. We further found three haplotypes spanning the OsNHX1 promoter and CDS regions. Two of them, HapII and HapIII, were associated with a high and a low τcl, respectively. A near-isogenic line (NIL, S464) was developed by replacing the genomic region harboring HapII (~10 kb) in the MH63 (recipient) rice cultivar with the same-sized genomic region containing HapIII from 02428 (donor). Compared with MH63, S464 shows a 35% reduction in τcl and a 40% increase in grain yield under drought. However, under normal conditions, S464 maintains a grain yield closely similar to that of MH63. The global distribution of the two OsNHX1 haplotypes is associated with local precipitation. Taken together, the natural variation in OsNHX1 could be utilized to manipulate stomatal dynamics for improved rice drought tolerance.

qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots

Preprint

May 2014

Stephen D. Turner

Genome-wide association studies (GWAS) have identified thousands of human trait-associated single nucleotide polymorphisms. Here, I describe a freely available R package for visualizing GWAS results using Q-Q and manhattan plots. The qqman package enables the flexible creation of manhattan plots, both genome-wide and for single chromosomes, with optional highlighting of SNPs of interest. Availability: qqman is released under the GNU General Public License, and is freely available on the Comprehensive R Archive Network (http://cran.r-project.org/package=qqman). The source code is available on GitHub (https://github.com/stephenturner/qqman).
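
To make concrete what a Q-Q plot of GWAS results compares, here is a minimal sketch in Python (not R, and not qqman's own code). It pairs each observed p-value with the value expected under the null hypothesis of uniformly distributed p-values, both on the -log10 scale. The function name `qq_points` and the plotting position (rank - 0.5)/n are assumptions chosen for illustration; other plotting positions are also in common use.

```python
import math


def qq_points(pvalues):
    """Return (expected, observed) -log10 p-value pairs for a Q-Q plot.

    Values outside (0, 1] are dropped because their logarithm is not usable.
    """
    clean = sorted(p for p in pvalues if 0.0 < p <= 1.0)
    n = len(clean)
    points = []
    for rank, p in enumerate(clean, start=1):
        # Expected quantile of a uniform(0, 1) sample at this rank.
        expected = -math.log10((rank - 0.5) / n)
        observed = -math.log10(p)
        points.append((expected, observed))
    return points


if __name__ == "__main__":
    for expected, observed in qq_points([0.5, 0.05, 0.005]):
        print(f"expected={expected:.3f} observed={observed:.3f}")
```

Points lying well above the diagonal (observed much larger than expected) are the signals a Q-Q plot is meant to expose.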

Superior haplotypes for haplotype‐based breeding for drought tolerance in pigeonpea (Cajanus cajan L.)

Article

Jun 2020
PLANT BIOTECHNOL J

Haplotype‐based breeding, a recent and promising breeding approach for developing tailor‐made crop varieties, deals with the identification of superior haplotypes and their deployment in breeding programs. In this context, whole genome re‐sequencing data of 292 genotypes from the pigeonpea reference set were mined to identify the superior haplotypes for 10 drought‐responsive candidate genes. A total of 83, 132 and 60 haplotypes were identified in breeding lines, landraces and wild species, respectively. Candidate gene‐based association analysis of these 10 genes on a subset of 137 accessions of the pigeonpea reference set revealed 23 strong marker‐trait associations (MTAs) in five genes influencing seven drought‐responsive component traits. Haplo‐pheno analysis for the strongly associated genes resulted in the identification of the most promising haplotypes for three genes regulating five component drought traits. C.cajan_23080‐H2 was found to be the most superior haplotype for plant weight (PW), fresh weight (FW) and turgid weight (TW), C.cajan_30211‐H6 for PW, FW, TW and dry weight (DW), C.cajan_26230‐H11 for FW and DW, and C.cajan_26230‐H5 for relative water content (RWC) under drought stress conditions. The identified donor genotypes possessing superior haplotype combinations may be utilized in haplotype‐based breeding to develop next‐generation tailor‐made, better drought‐responsive pigeonpea cultivars.
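
The core idea of a haplo-pheno analysis can be sketched in a few lines of Python: group accessions by the haplotype they carry at a gene, summarize the trait per group, and pick the group with the best value. This is only an illustrative sketch under stated assumptions, not the procedure used in the cited study. The data are hypothetical, the function names are invented for this example, and a real analysis would add a statistical test of the differences between haplotype groups rather than comparing means alone.

```python
from collections import defaultdict
from statistics import mean


def haplotype_means(records, min_size=2):
    """Average a trait per haplotype, skipping haplotypes with few accessions."""
    groups = defaultdict(list)
    for haplotype, value in records:
        groups[haplotype].append(value)
    return {h: mean(v) for h, v in groups.items() if len(v) >= min_size}


def superior_haplotype(records, min_size=2):
    """Return the haplotype with the highest mean trait value."""
    means = haplotype_means(records, min_size)
    if not means:
        raise ValueError("no haplotype has enough accessions")
    return max(means, key=means.get)


# Hypothetical (haplotype, trait value) pairs, e.g. relative water content.
example = [("H1", 61.0), ("H1", 63.0), ("H2", 70.0), ("H2", 74.0), ("H3", 90.0)]
best = superior_haplotype(example)

if __name__ == "__main__":
    print(haplotype_means(example))
    print("superior haplotype:", best)
```

The `min_size` filter reflects a common design choice: a haplotype carried by a single accession (H3 here) is excluded because one observation cannot support a claim of superiority.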

Transcriptomics of Mature Rice (Oryza Sativa L. Koshihikari) Seed under Hot Conditions by DNA Microarray Analyses

Article

May 2020

Bakku Ranjith Kumar

Higher temperature conditions during the final stages of rice seed development (seed filling and maturation) are known to cause damage to both rice yield and rice kernel quality. The western and central parts of Japan especially have seen record high temperatures during the past decade, resulting in the decrease of rice kernel quality. In this study, we looked at the rice harvested from a town in the central Kanto-plains (Japan) in 2010. The daytime temperatures were above the critical limits ranging from 34 to 38 °C at the final stages of seed development and maturity allowing us to investigate high-temperature effects in the actual field condition. Three sets of dry mature rice seeds (commercial), each with specific quality standards, were obtained from Japan Agriculture (JA Zen-Noh) branch in Ami-town of Ibaraki Prefecture in September 2010: grade 1 (top quality, labeled as Y1), grade 2 (medium quality, labeled as Y2), and grade 3 (out-of-grade or low quality, labeled as Y3). The research objective was to examine particular alterations in genome-wide gene expression in grade 2 (Y2) and grade 3 (Y3) seeds compared to grade 1 (Y1). We followed the high-temperature spike using a high-throughput omics-approach DNA microarray (Agilent 4 × 44 K rice oligo DNA chip) in conjunction with MapMan bioinformatics analysis. As expected, rice seed quality analysis revealed low quality in Y3 > Y2 over Y1 in taste, amylose, protein, and fatty acid degree, but not in water content. Differentially expressed gene (DEG) analysis from the transcriptomic profiling data revealed that there are more than one hundred upregulated (124 and 373) and downregulated (106 and 129) genes in Y2 (grade 2 rice seed) and Y3 (grade 3 rice seed), respectively. Bioinformatic analysis of DEGs selected as highly regulated differentially expressed (HRDE) genes revealed changes in function of genes related to metabolism, defense/stress response, fatty acid biosynthesis, and hormones. 
This research provides, for the first time, the seed transcriptome profile for the classified low grades (grade 2, and out-of-grade; i.e., grade 3) of rice under high-temperature stress conditions.

QTL mapping and haplotype analysis revealed candidate genes for grain thickness in rice (Oryza sativa L.)

Article

May 2020
MOL BREEDING

Grain size and shape are important for grain yield and quality in rice (Oryza sativa L.). Grain thickness (GT) is one of the constituent characteristics of grain shape and has a strong influence on grain weight in rice. In this study, a japonica cultivar (02428) with a thick grain phenotype was crossed with an indica cultivar (YZX) with a thin grain phenotype to construct a set of 192 recombinant inbred lines (RILs). Quantitative trait locus (QTL) analysis was performed with a high-density genetic map harbouring 2711 bin markers. We identified 6 QTLs for GT in three environments and two stable QTLs with high heritability residing on the known grain width (GW) genes GW5 and GL7/GW7, verifying the effectiveness and reliability of our study. We detected a novel QTL on chromosome 9 with strong association signals for both GW and GT. Using transcript profiles and quantitative RT-PCR, we identified a candidate gene, Os09g0535500, which encodes a zinc finger C3HC4-type domain, that showed the highest expression level and significant differential expression between parents in the development of young panicle and grain. Further haplotype-phenotype association analysis revealed that Os09g0535500 was strongly correlated with GW and GT. Thus, our results indicated that Os09g0535500 was the most promising candidate gene in WTG9 and is likely important for rice grain development, which lays a foundation for further functional validation and breeding utilization.

Role of New Plant Breeding Technologies for Food Security and Sustainable Agricultural Development

Article

Apr 2020

Matin Qaim

New plant breeding technologies (NPBTs), including genetically modified and gene-edited crops, offer large potentials for sustainable agricultural development and food security while addressing shortcomings of the Green Revolution. This article reviews potentials, risks, and actually observed impacts of NPBTs. Regulatory aspects are also discussed. While the science is exciting and some clear benefits are already observable, overregulation and public misperceptions may obstruct efficient development and use of NPBTs. Overregulation is particularly observed in Europe, but also affects developing countries in Africa and Asia, which could benefit the most from NPBTs. Regulatory reforms and a more science-based public debate are required.

A platinum standard pan-genome resource that represents the population structure of Asian rice

Article

Apr 2020

As the human population grows from 7.8 billion to 10 billion over the next 30 years, breeders must do everything possible to create crops that are highly productive and nutritious, while simultaneously having less of an environmental footprint. Rice will play a critical role in meeting this demand and thus, knowledge of the full repertoire of genetic diversity that exists in germplasm banks across the globe is required. To meet this demand, we describe the generation, validation and preliminary analyses of transposable element and long-range structural variation content of 12 near-gap-free reference genome sequences (RefSeqs) from representatives of 12 of 15 subpopulations of cultivated Asian rice. When combined with 4 existing RefSeqs, that represent the 3 remaining rice subpopulations and the largest admixed population, this collection of 16 Platinum Standard RefSeqs (PSRefSeq) can be used as a template to map resequencing data to detect virtually all standing natural variation that exists in the pan-genome of cultivated Asian rice.

ShinyGO: a graphical enrichment tool for animals and plants

Article

Dec 2019
BIOINFORMATICS

Motivation: Gene lists are routinely produced from various genome-wide studies. Enrichment analysis can link these gene lists with underlying molecular pathways and functional categories such as gene ontology (GO) and other databases. Results: To complement existing tools, we developed ShinyGO based on a large annotation database derived from Ensembl and STRING-db for 59 plant, 256 animal, 115 archaeal, and 1678 bacterial species. ShinyGO's novel features include graphical visualization of enrichment results and gene characteristics, and application program interface (API) access to KEGG and STRING for the retrieval of pathway diagrams and protein-protein interaction networks. ShinyGO is an intuitive, graphical web application that can help researchers gain actionable insights from gene lists. Availability: http://ge-lab.org/go/. Supplementary information: Supplementary data are available at Bioinformatics online.

Rice Stress-Resistant SNP Database

Article

Dec 2019

Background: Rice (Oryza sativa L.) yield is limited inherently by environmental stresses, including biotic and abiotic stresses. Thus, it is of great importance to perform in-depth explorations on the genes that are closely associated with the stress-resistant traits in rice. The existing rice SNP databases have made considerable contributions to rice genomic variation information, but none of them has a particular focus on integrating stress-resistant variation and related phenotype data into one web resource. Results: The Rice Stress-Resistant SNP database (http://bioinformatics.fafu.edu.cn/RSRS) mainly focuses on SNPs specific to biotic and abiotic stress-resistant ability in rice, and presents them in a unified web resource platform. The Rice Stress-Resistant SNP (RSRS) database contains over 9.5 million stress-resistant SNPs and 797 stress-resistant candidate genes in rice, which were detected from more than 400 stress-resistant rice varieties. We incorporated SNP function, genome annotation and phenotype information into this database. In addition, the database has a user-friendly web interface for users to query, browse and visualize a specific SNP efficiently. The RSRS database allows users to query the SNP information and relevant annotations for one or more varieties. The search results can be visualized graphically in a genome browser or displayed in formatted tables. Users can also align SNPs between two or more rice accessions. Conclusion: The RSRS database shows great utility for scientists to further characterize the function of variants related to environmental stress-resistant ability in rice.

Haplotype block analysis of an Argentinean hexaploid wheat collection and GWAS for yield components and adaptation

Article

Dec 2019
BMC PLANT BIOL

Background: Increasing wheat (Triticum aestivum L.) production is required to feed a growing human population. In order to accomplish this task a deeper understanding of the genetic structure of cultivated wheats and the detection of genomic regions significantly associated with the regulation of important agronomic traits are necessary steps. To better understand the genetic basis and relationships of adaptation and yield related traits, we used a collection of 102 Argentinean hexaploid wheat cultivars genotyped with the 35k SNPs array, grown from two to six years in three different locations. Based on SNPs data and gene-related molecular markers, we performed a haplotype block characterization of the germplasm and a genome-wide association study (GWAS). Results: The genetic structure of the collection revealed four subpopulations, reflecting the origin of the germplasm used by the main breeding programs in Argentina. The haplotype block characterization showed 1268 blocks of different sizes spread along the genome, including highly conserved regions like the 1BS chromosome arm where the 1BL/1RS wheat/rye translocation is located. Based on GWAS we identified ninety-seven chromosome regions associated with heading date, plant height, thousand grain weight, grain number per spike and fruiting efficiency at harvest (FEh). In particular FEh stands out as a promising trait to raise yield potential in Argentinean wheats; we detected fifteen haplotypes/markers associated with increased FEh values, eleven of which showed significant effects in all three evaluated locations. In the case of adaptation, the Ppd-D1 gene is consolidated as the main determinant of the life cycle of Argentinean wheat cultivars. Conclusion: This work reveals the genetic structure of the Argentinean hexaploid wheat germplasm using a wide set of molecular markers anchored to the Ref Seq v1.0. 
Additionally, GWAS detects chromosomal regions (haplotypes) associated with important yield and adaptation components that will allow improvement of these traits through marker-assisted selection.

Super-Pangenome by Integrating the Wild Side of a Species for Accelerated Crop Improvement

Article

Nov 2019
TRENDS PLANT SCI

The pangenome provides genomic variations in the cultivated gene pool for a given species. However, as the crop's gene pool comprises many species, especially wild relatives with diverse genetic stock, here we suggest using accessions from all available species of a given genus for the development of a more comprehensive and complete pangenome, which we refer to as a super-pangenome. The super-pangenome provides a complete genomic variation repertoire of a genus and offers unprecedented opportunities for crop improvement. This opinion article focuses on recent developments in crop pangenomics, the need for a super-pangenome that should include wild species, and its application for crop improvement.

Accurate, scalable and integrative haplotype estimation

Article

Nov 2019

The number of human genomes being genotyped or sequenced increases exponentially and efficient haplotype estimation methods able to handle this amount of data are now required. Here we present a method, SHAPEIT4, which substantially improves upon other methods to process large genotype and high coverage sequencing datasets. It notably exhibits sub-linear running times with sample size, provides highly accurate haplotypes and allows integrating external phasing information such as large reference panels of haplotypes, collections of pre-phased variants and long sequencing reads. We provide SHAPEIT4 in an open source format and demonstrate its performance in terms of accuracy and running times on two gold standard datasets: the UK Biobank data and the Genome In A Bottle. Haplotype information inferred by phasing is useful in genetic and genomic analysis. Here, the authors develop SHAPEIT4, a phasing method that exhibits sub-linear running time, provides accurate haplotypes and enables integration of external phasing information.

A resource-efficient tool for mixed model association analysis of large-scale data

Article

Dec 2019
Nat Genet

The genome-wide association study (GWAS) has been widely used as an experimental design to detect associations between genetic variants and a phenotype. Two major confounding factors, population stratification and relatedness, could potentially lead to inflated GWAS test statistics and hence to spurious associations. Mixed linear model (MLM)-based approaches can be used to account for sample structure. However, genome-wide association (GWA) analyses in biobank samples such as the UK Biobank (UKB) often exceed the capability of most existing MLM-based tools especially if the number of traits is large. Here, we develop an MLM-based tool (fastGWA) that controls for population stratification by principal components and for relatedness by a sparse genetic relationship matrix for GWA analyses of biobank-scale data. We demonstrate by extensive simulations that fastGWA is reliable, robust and highly resource-efficient. We then apply fastGWA to 2,173 traits on array-genotyped and imputed samples from 456,422 individuals and to 2,048 traits on whole-exome-sequenced samples from 46,191 individuals in the UKB. fastGWA is a mixed linear model–based approach for performing genome-wide association analyses at biobank scale, while controlling for population stratification and relatedness.

Improved polygenic prediction by Bayesian multiple regression on summary statistics

Article

Nov 2019

Accurate prediction of an individual’s phenotype from their DNA sequence is one of the great promises of genomics and precision medicine. We extend a powerful individual-level data Bayesian multiple regression model (BayesR) to one that utilises summary statistics from genome-wide association studies (GWAS), SBayesR. In simulation and cross-validation using 12 real traits and 1.1 million variants on 350,000 individuals from the UK Biobank, SBayesR improves prediction accuracy relative to commonly used state-of-the-art summary statistics methods at a fraction of the computational resources. Furthermore, using summary statistics for variants from the largest GWAS meta-analysis (n ≈ 700,000) on height and BMI, we show that, on average across traits and two independent data sets, SBayesR improves prediction R2 by 5.2% relative to LDpred and by 26.5% relative to clumping and p value thresholding. Various approaches are being used for polygenic prediction including Bayesian multiple regression methods that require access to individual-level genotype data. Here, the authors extend BayesR to utilise GWAS summary statistics (SBayesR) and show that it outperforms other summary statistic-based methods.

Exploring effective approaches for haplotype block phasing

Article

Oct 2019
BMC BIOINFORMATICS

Background: Knowledge of phase, the specific allele sequence on each copy of homologous chromosomes, is increasingly recognized as critical for detecting certain classes of disease-associated mutations. One approach for detecting such mutations is through phased haplotype association analysis. While the accuracy of methods for phasing genotype data has been widely explored, there has been little attention given to phasing accuracy at haplotype block scale. Understanding the combined impact of the accuracy of phasing tool and the method used to determine haplotype blocks on the error rate within the determined blocks is essential to conduct accurate haplotype analyses. Results: We present a systematic study exploring the relationship between seven widely used phasing methods and two common methods for determining haplotype blocks. The evaluation focuses on the number of haplotype blocks that are incorrectly phased. Insights from these results are used to develop a haplotype estimator based on a consensus of three tools. The consensus estimator achieved the most accurate phasing in all applied tests. Individually, EAGLE2, BEAGLE and SHAPEIT2 alternate in being the best performing tool in different scenarios. Determining haplotype blocks based on linkage disequilibrium leads to more correctly phased blocks compared to a sliding window approach. We find that there is little difference between phasing sections of a genome (e.g. a gene) compared to phasing entire chromosomes. Finally, we show that the location of phasing error varies when the tools are applied to the same data several times, a finding that could be important for downstream analyses. Conclusions: The choice of phasing and block determination algorithms and their interaction impacts the accuracy of phased haplotype blocks. This work provides guidance and evidence for the different design choices needed for analyses using haplotype blocks. 
The study highlights a number of issues that may have limited the replicability of previous haplotype analysis.
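
One way to picture a consensus of three phasing tools is a per-site majority vote. The Python sketch below is an assumption made for illustration and is not the estimator described in the cited study. Each tool's call for one individual in one block is represented as a string of 0/1 alleles on the first haplotype at the heterozygous sites (the second haplotype is the complement). Because the labelling of the two haplotypes is arbitrary, each call is first flipped so that it starts with 0, and the vote is then taken site by site. The function names are hypothetical.

```python
from collections import Counter


def normalize(hap: str) -> str:
    """Flip a 0/1 haplotype so it starts with '0' (phase labels are arbitrary)."""
    if hap and hap[0] == "1":
        return "".join("1" if allele == "0" else "0" for allele in hap)
    return hap


def consensus_phase(calls):
    """Majority vote per heterozygous site across an odd number of tools."""
    if len(calls) % 2 == 0:
        raise ValueError("use an odd number of tools to avoid ties")
    normalized = [normalize(call) for call in calls]
    if len({len(call) for call in normalized}) != 1:
        raise ValueError("all calls must cover the same sites")
    # zip(*normalized) yields one tuple of votes per site.
    return "".join(
        Counter(site).most_common(1)[0][0] for site in zip(*normalized)
    )


if __name__ == "__main__":
    # Three hypothetical tool outputs; the second uses the opposite labelling.
    print(consensus_phase(["0110", "1001", "0100"]))
```

In this example the first two calls describe the same phasing under opposite labels, so the third tool's disagreement at one site is outvoted.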

Enhancing Genetic Gain through Genomic Selection: From Livestock to Plants

Article

Oct 2019

Although long-term genetic gain has been achieved through increasing use of modern breeding methods and technologies, the rate of genetic gain needs to be accelerated to meet humanity's demand for agricultural products. In this regard, genomic selection (GS) has been considered most promising for genetic improvement of the complex traits controlled by many genes each with minor effects. Livestock scientists pioneered GS application largely due to livestock's significantly higher individual values and the greater reduction in generation interval that can be achieved in GS. Large-scale application of GS in plants can be achieved by refining field management to improve heritability estimation and prediction accuracy and developing optimum GS models with the consideration of genotype-by-environment interaction and non-additive effects, along with significant cost reduction. Moreover, it would be more effective to integrate GS with other breeding tools and platforms for accelerating the breeding process and thereby further enhancing genetic gain. In addition, establishing an open-source breeding network and developing transdisciplinary approaches would be essential in enhancing breeding efficiency for small- and medium-sized enterprises and agricultural research systems in developing countries. New strategies centered on GS for enhancing genetic gain need to be developed.
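
As a rough illustration of how genomic selection predicts breeding values from many small marker effects, the sketch below fits a ridge regression of phenotype on marker dosages and sums the estimated effects for a new genotype. It is an assumption-laden toy, not a method from the review above: it uses only additive effects, a fixed shrinkage parameter `lam`, invented data, and hypothetical function names, and it ignores genotype-by-environment interaction and non-additive effects.

```python
def ridge_marker_effects(genotypes, phenotypes, lam=1.0):
    """Estimate additive marker effects by ridge regression.

    genotypes: list of rows of allele dosages (0/1/2), one row per individual.
    Solves (X'X + lam*I) beta = X'y on centered data by Gaussian elimination.
    """
    n, m = len(genotypes), len(genotypes[0])
    col_means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    y_mean = sum(phenotypes) / n
    X = [[row[j] - col_means[j] for j in range(m)] for row in genotypes]
    y = [v - y_mean for v in phenotypes]

    A = [[sum(X[i][j] * X[i][k] for i in range(n)) + (lam if j == k else 0.0)
          for k in range(m)] for j in range(m)]
    b = [sum(X[i][j] * y[i] for i in range(n)) for j in range(m)]

    # Forward elimination with partial pivoting (A is positive definite for lam > 0).
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = A[r][col] / A[col][col]
            for k in range(col, m):
                A[r][k] -= factor * A[col][k]
            b[r] -= factor * b[col]

    # Back substitution.
    beta = [0.0] * m
    for j in reversed(range(m)):
        tail = sum(A[j][k] * beta[k] for k in range(j + 1, m))
        beta[j] = (b[j] - tail) / A[j][j]
    return beta, col_means, y_mean


def predict_gebv(genotype, beta, col_means, y_mean):
    """Genomic estimated breeding value: mean plus summed marker effects."""
    return y_mean + sum((g - c) * e for g, c, e in zip(genotype, col_means, beta))


# Invented training set: phenotype = 10 + 2*marker0 - 1*marker1, marker2 is neutral.
train_genotypes = [[0, 0, 1], [1, 0, 2], [2, 1, 0], [0, 2, 1], [1, 1, 2], [2, 2, 0]]
train_phenotypes = [10, 12, 13, 8, 11, 12]
effects, means, intercept = ridge_marker_effects(train_genotypes, train_phenotypes, lam=0.01)
candidate_scores = {
    "candidate_1": predict_gebv([2, 0, 1], effects, means, intercept),
    "candidate_2": predict_gebv([0, 2, 1], effects, means, intercept),
}
```

In practice the number of markers far exceeds the number of phenotyped individuals, which is why shrinkage is needed; a dedicated mixed-model or linear-algebra library would replace the hand-written solver used here to keep the sketch dependency-free.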

RiceRelativesGD: A genomic database of rice relatives for rice research

Article

Jan 2019

Rice (Oryza sativa L.) is one of the most important crops worldwide. Its relatives, including phylogenetically related species of rice and paddy weeds with a similar ecological niche, can provide crucial genetic resources (such as resistance to biotic and abiotic stresses and high photosynthetic efficiency) for rice research. Although many rice genomic databases have been constructed, a database providing large-scale curated genomic data from rice relatives and offering specific gene resources is still lacking. Here, we present RiceRelativesGD, a user-friendly genomic database of rice relatives. RiceRelativesGD integrates large-scale genomic resources from 2 cultivated rice and 11 rice relatives, including 208 321 specific genes and 13 643 genes related to photosynthesis and responsive to external stimuli. Diverse bioinformatics tools are embedded in the database, which allow users to search, visualize and download the information of interest. To our knowledge, this is the first genomic database providing a centralized genetic resource of rice relatives. RiceRelativesGD will serve as a significant and comprehensive knowledgebase for the rice community.

Comparative analysis of chloroplast genomes of two Chinese local citrus varieties and haplotype analysis with other citrus species

Article

Nov 2023
S AFR J BOT

Unraveling the contribution of OsSOS2 in conferring salinity and drought tolerance in a high‐yielding rice

Article

Jan 2022

Abiotic stresses are emerging as a potential threat to sustainable agriculture worldwide. Soil salinity and drought will be the major limiting factors for rice productivity in years to come. The Salt Overly Sensitive (SOS) pathway plays a key role in salinity tolerance by maintaining the cellular ion homeostasis, with SOS2, a S/T kinase, being a vital component. The present study investigated the role of OsSOS2, a SOS2 homolog from rice, in improving salinity and drought tolerance. Transgenic plants with either overexpression (OE) or knockdown (KD) of OsSOS2 were raised in one of the high‐yielding cultivars of rice – IR64. Using a combined approach based on physiological, biochemical, anatomical, microscopic, molecular, and agronomic assessment, the evidence presented in this study advocates the role of OsSOS2 in improving salinity and drought tolerance in rice. The OE plants were found to have favourable ion and redox homeostasis when grown in the presence of salinity, while the KD plants showed the reverse pattern. Several key stress‐responsive genes were found to work in an orchestrated manner to contribute to this phenotype. Notably, the OE plants showed tolerance to stress at both the seedling and the reproductive stages, addressing the two most sensitive stages of the plant. Keeping in mind the importance of developing crop plants with tolerance to multiple stresses, the present study established the potential of OsSOS2 for biotechnological applications to improve salinity and drought stress tolerance in diverse cultivars of rice.

Gramene: A Resource for Comparative Analysis of Plants Genomes and Pathways

Chapter

Jan 2022

Gramene is an integrated bioinformatics resource for accessing, visualizing, and comparing plant genomes and biological pathways. Originally targeting grasses, Gramene has grown to host annotations for over 90 plant genomes including agronomically important cereals (e.g., maize, sorghum, wheat, teff), fruits and vegetables (e.g., apple, watermelon, clementine, tomato, cassava), specialty crops (e.g., coffee, olive tree, pistachio, almond), and plants of special or emerging interest (e.g., cotton, tobacco, cannabis, or hemp). For some species, the resource includes multiple varieties of the same species, which has paved the road for the creation of species-specific pan-genome browsers. The resource also features plant research models, including Arabidopsis and C4 warm-season grasses and brassicas, as well as other species that fill phylogenetic gaps for plant evolution studies. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. This chapter outlines system requirements for end-users and database hosting, data types and basic navigation within Gramene, and provides examples of how to (1) explore Gramene’s search results, (2) explore gene-centric comparative genomics data visualizations in Gramene, and (3) explore genetic variation associated with a gene locus. This is the first publication describing in detail Gramene’s integrated search interface—intended to provide a simplified entry portal for the resource’s main data categories (genomic location, phylogeny, gene expression, pathways, and external references) to the most complete and up-to-date set of plant genome and pathway annotations.

Haplotype reconstruction in connected tetraploid F1 populations

Article

Jul 2021
GENETICS

In diploid species, many multiparental populations have been developed to increase genetic diversity and quantitative trait loci (QTL) mapping resolution. In these populations, haplotype reconstruction has been used as a standard practice to increase the power of QTL detection in comparison with the marker-based association analysis. However, such software tools for polyploid species are few and limited to a single biparental F1 population. In this study, a statistical framework for haplotype reconstruction has been developed and implemented in the software PolyOrigin for connected tetraploid F1 populations with shared parents, regardless of the number of parents or mating design. Given a genetic or physical map of markers, PolyOrigin first phases parental genotypes, then refines the input marker map, and finally reconstructs offspring haplotypes. PolyOrigin can utilize single nucleotide polymorphism (SNP) data coming from arrays or from sequence-based genotyping; in the latter case, bi-allelic read counts can be used (and are preferred) as input data to minimize the influence of genotype calling errors at low depth. With extensive simulation we show that PolyOrigin is robust to the errors in the input genotypic data and marker map. It works well for various population designs with ≥30 offspring per parent and for sequences with read depth as low as 10x. PolyOrigin was further evaluated using an autotetraploid potato dataset with a 3 × 3 half-diallel mating design. In conclusion, PolyOrigin opens up exciting new possibilities for haplotype analysis in tetraploid breeding populations.

Genome-wide association studies

Article

Dec 2021

Genome-wide association studies (GWAS) test hundreds of thousands of genetic variants across many genomes to find those statistically associated with a specific trait or disease. This methodology has generated a myriad of robust associations for a range of traits and diseases, and the number of associated variants is expected to grow steadily as GWAS sample sizes increase. GWAS results have a range of applications, such as gaining insight into a phenotype’s underlying biology, estimating its heritability, calculating genetic correlations, making clinical risk predictions, informing drug development programmes and inferring potential causal relationships between risk factors and health outcomes. In this Primer, we provide the reader with an introduction to GWAS, explaining their statistical basis and how they are conducted, describe state-of-the-art approaches and discuss limitations and challenges, concluding with an overview of the current and future applications for GWAS results. Uffelmann et al. describe the key considerations and best practices for conducting genome-wide association studies (GWAS), techniques for deriving functional inferences from the results and applications of GWAS in understanding disease risk and trait architecture. The Primer also provides information on the best practices for data sharing and discusses important ethical considerations when considering GWAS populations and data.
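
The statistical core of a GWAS, testing each variant separately for association with a trait, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the workflow from the Primer: it regresses a quantitative phenotype on allele dosage one marker at a time, approximates the test statistic with a standard normal distribution (reasonable only for large samples), applies a Bonferroni threshold, and omits the corrections for population structure and relatedness that real analyses need. Marker names and data are invented.

```python
import math
from statistics import NormalDist, mean


def single_marker_test(dosages, phenotypes):
    """Regress phenotype on allele dosage; return (slope, two-sided p-value)."""
    n = len(dosages)
    g_mean, y_mean = mean(dosages), mean(phenotypes)
    sxx = sum((g - g_mean) ** 2 for g in dosages)
    if sxx == 0:
        return None  # monomorphic marker, nothing to test
    sxy = sum((g - g_mean) * (y - y_mean) for g, y in zip(dosages, phenotypes))
    slope = sxy / sxx
    rss = sum((y - y_mean - slope * (g - g_mean)) ** 2
              for g, y in zip(dosages, phenotypes))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    if std_err == 0:
        return slope, 0.0  # perfect fit
    z = slope / std_err
    # Normal approximation to the t statistic (an assumption of this sketch).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return slope, p_value


def gwas_scan(genotypes, phenotypes, alpha=0.05):
    """Test every marker and flag those passing a Bonferroni threshold."""
    tested = {}
    for marker, dosages in genotypes.items():
        result = single_marker_test(dosages, phenotypes)
        if result is not None:
            tested[marker] = result
    threshold = alpha / max(len(tested), 1)
    return {marker: (slope, p, p < threshold)
            for marker, (slope, p) in tested.items()}


# Invented toy data: eight individuals, two markers.
toy_genotypes = {
    "snp_a": [0, 0, 1, 1, 1, 2, 2, 2],
    "snp_b": [0, 2, 1, 0, 2, 1, 0, 2],
}
toy_phenotypes = [5.1, 4.9, 6.2, 5.8, 6.1, 7.0, 7.2, 6.9]
scan = gwas_scan(toy_genotypes, toy_phenotypes)
```

Each entry of `scan` holds the estimated effect per allele copy, the p-value, and whether the marker passes the multiple-testing threshold.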

Pyramiding superior haplotypes and epistatic alleles to accelerate wood quality and yield improvement in poplar breeding

Article

Nov 2021
IND CROP PROD

Genetic improvement of woody bioenergy crops is essential to maximize the genetic gain of industrially useful products. Conventional selective breeding to improve complex traits is laborious and time-consuming. A molecular marker-assisted selection approach based on multi-omics genetic dissection and pyramiding gene module utilization has not been developed for woody industrial crops. We initially identified 80 correlated genes enriched in co-expression modules that sustainably participate in lignocellulosic biosynthesis during various developmental periods in Populus. Using an adult germplasm population (15 years old, 435 accessions) of Populus tomentosa, we integrated association mapping, expression quantitative trait loci, and epistasis analyses to reveal the pleiotropy of causative genes within the core co-expression modules, such as ALG14, GSL8, SMT1, and IRX15-L.2, that drive natural variations in gene expression and harvested wood traits. We further pyramided two superior haplotypes and one desirable epistatic allele (Hap_01PtoSMT1+Hap_01PtoALG14+SNP2PtoIRX15-L.2) to alleviate linkage drag in multi-trait selection and improve industrial pulpwood products. Early-period selection study and genetic interpretation of the allelic differences of candidate elite trees in a juvenile germplasm population (5 years old, 435 accessions) demonstrated the accuracy and efficiency of our haplotype-based selection method relative to phenotype-based selection. This study provides novel insights into the improvement of industrial wood traits and also facilitates the rational breeding of perennial trees.
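
Selecting for a pyramid of superior haplotypes amounts to keeping the accessions that carry the target haplotype at every locus of interest. The Python sketch below shows that filter under loud assumptions: the locus names follow the abstract above, but the accession IDs, the haplotype calls, and the label `SNP2_desirable` for the favourable IRX15-L.2 allele are hypothetical placeholders, not data from the study.

```python
# Target combination; the allele label for PtoIRX15-L.2 is a placeholder.
TARGET_PYRAMID = {
    "PtoSMT1": "Hap_01",
    "PtoALG14": "Hap_01",
    "PtoIRX15-L.2": "SNP2_desirable",
}


def carries_pyramid(calls, targets):
    """True if an accession carries the target haplotype at every locus."""
    return all(calls.get(locus) == hap for locus, hap in targets.items())


def select_candidates(panel, targets):
    """Return the sorted IDs of accessions carrying the full combination."""
    return sorted(acc for acc, calls in panel.items()
                  if carries_pyramid(calls, targets))


# Invented haplotype calls for three hypothetical accessions.
panel = {
    "ACC_001": {"PtoSMT1": "Hap_01", "PtoALG14": "Hap_02",
                "PtoIRX15-L.2": "SNP2_desirable"},
    "ACC_002": {"PtoSMT1": "Hap_01", "PtoALG14": "Hap_01",
                "PtoIRX15-L.2": "SNP2_desirable"},
    "ACC_003": {"PtoSMT1": "Hap_03", "PtoALG14": "Hap_01",
                "PtoIRX15-L.2": "SNP2_other"},
}
elite = select_candidates(panel, TARGET_PYRAMID)  # -> ["ACC_002"]
```

Accessions with a missing call at any target locus are excluded, which is a conservative choice for this sketch.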

The Pigeon Pea CcCIPK14‐CcCBL1 Pair Positively Modulates Drought Tolerance by Enhancing Flavonoid Biosynthesis

Article

Mar 2021

Calcineurin B‐like (CBL)‐interacting protein kinases (CIPKs) play a central role in Ca²⁺ signalling and promote drought tolerance in plants. The CIPK gene family in pigeon pea (Cajanus cajan L.), a major food crop affected by drought, has not previously been characterised. Here, we identified 28 CIPK genes in the pigeon pea genome. Five CcCIPK genes were strongly upregulated in roots upon drought treatment and were selected for further characterisation. Overexpression of CcCIPK13 and CcCIPK14 increased survival rates by 2‐ to 3‐fold relative to controls after 14 days of drought. Furthermore, the three major flavonoids, genistin, genistein, and apigenin, were significantly upregulated in the same transgenic plants. Using CcCIPK14 as bait, we performed a yeast two‐hybrid screen and identified six interactors, including CcCBL1. CcCIPK14 exhibited autophosphorylation and phosphorylation of CcCBL1 in vitro. CcCBL1‐overexpressing plants displayed higher survival rates upon drought stress as well as higher expression of flavonoid biosynthetic genes and flavonoid content. CcCIPK14‐overexpressing plants in which CcCBL1 transcript levels were reduced by RNA interference had lower survival rates, indicating that CcCBL1 acts in the same pathway as CcCIPK14. Together, our results demonstrate a role for the CcCIPK14‐CcCBL1 complex in drought stress tolerance through the regulation of flavonoid biosynthesis in pigeon pea.

SbWRKY30 enhances the drought tolerance of plants and regulates a drought stress-responsive gene, SbRD19, in sorghum

Article

Feb 2020
J PLANT PHYSIOL

WRKY transcription factors have been suggested to play important roles in response and adaptation to drought stress. However, how sorghum WRKY transcription factors function in drought stress is still unclear. Here, we identify a WRKY transcription factor of sorghum, SbWRKY30, which is induced significantly by drought stress. SbWRKY30 is mainly expressed in sorghum taproot and leaf. SbWRKY30 has transcriptional activation activity and functions in the nucleus. Heterologous expression of SbWRKY30 confers tolerance to drought stress in Arabidopsis (Arabidopsis thaliana) and rice by affecting root architecture. In addition, SbWRKY30 transgenic Arabidopsis and rice plants have higher proline contents and SOD, POD, and CAT activities but lower MDA contents than wild-type plants after drought stress. As a homologous gene of the drought stress-responsive gene RD19 of Arabidopsis, SbRD19 overexpression in Arabidopsis improved the drought tolerance of plants relative to wild-type plants. Further analysis demonstrated that SbWRKY30 could induce SbRD19 expression through binding to the W-box element in the promoter of SbRD19. These results suggest that SbWRKY30 functions as a positive regulator in response to drought stress. Therefore, SbWRKY30 may serve as a promising candidate gene for molecular breeding to generate drought-tolerant crops.