Vysaul Nyirongo

About

70
Publications
6,947
Reads
1,336
Citations

Publications (70)
Article
Full-text available
Background: The -α3.7I-thalassaemia deletion is very common throughout Africa because it protects against malaria. When undertaking studies to investigate human genetic adaptations to malaria or other diseases, it is important to account for any confounding effects of α-thalassaemia to rule out spurious associations. Methods: In this study, we ha...
Article
Background: The -α3.7I-thalassaemia deletion is very common throughout Africa because it protects against malaria. When undertaking studies to investigate human genetic adaptations to malaria or other diseases, it is important to account for any confounding effects of α-thalassaemia to rule out spurious associations. Methods: In this study we hav...
Article
The malaria parasite Plasmodium falciparum invades human red blood cells via interactions between host and parasite surface proteins. By analyzing genome sequence data from human populations, including 1269 individuals from sub-Saharan Africa, we identify a diverse array of large copy number variants affecting the host invasion receptor genes GYPA...
Preprint
Full-text available
Plasmodium falciparum invades human red blood cells by a series of interactions between host and parasite surface proteins. Here we analyse whole genome sequence data from worldwide human populations, including 765 new genomes from across sub-Saharan Africa, and identify a diverse array of large copy number variants affecting the host invasion rece...
Data
Single SNP association test results with adjustment for additive effect of G6PD+202. DOI: http://dx.doi.org/10.7554/eLife.15085.007
Data
G6PDd score association test results. DOI: http://dx.doi.org/10.7554/eLife.15085.015
Data
(A) Summary of study designs of contributing partner studies to MalariaGEN Consortial Project 1 (CP1). (B) Genotyped sample distribution. (C) Summary of 65 SNPs selected for analysis and successfully genotyped. (D) G6PD+202 female association test results. (E) G6PD+202 male association test results. (F) G6PD+202 all individuals association test res...
Data
(A) SNP selection across G6PD region for genotyping. (B) SpectroDESIGNER assay design file for 135 G6PD locus SNPs in four multiplexes. (C) SpectroDESIGNER assay design file for 107 G6PD locus SNPs in four multiplexes. (D) SpectroDESIGNER assay design file for 68 G6PD locus SNPs in three multiplexes. DOI: http://dx.doi.org/10.7554/eLife.15085.020
Data
Single SNP association test results. DOI: http://dx.doi.org/10.7554/eLife.15085.006
Article
Full-text available
Glucose-6-phosphate dehydrogenase (G6PD) deficiency is believed to confer protection against Plasmodium falciparum malaria, but the precise nature of the protective effect has proved difficult to define as G6PD deficiency has multiple allelic variants with different effects in males and females, and it has heterogeneous effects on the clinical outco...
Article
Full-text available
eLife digest: Our genomes contain a record of historical events. This is because when groups of people are separated for generations, the DNA sequence in the two groups’ genomes will change in different ways. Looking at the differences in the genomes of people from the same population can help researchers to understand and reconstruct the historical...
Article
Similarity between two individuals in the combination of genetic markers along their chromosomes indicates shared ancestry and can be used to identify historical connections between different population groups due to admixture. We use a genome-wide, haplotype-based, analysis to characterise the structure of genetic diversity and gene-flow in a coll...
Preprint
Full-text available
Understanding patterns of genetic diversity is a crucial component of medical research in Africa. Here we use haplotype-based population genetics inference to describe gene-flow and admixture in a collection of 48 African groups with a focus on the major populations of the sub-Sahara. Our analysis presents a framework for interpreting haplotype div...
Article
Full-text available
The high prevalence of sickle haemoglobin in Africa shows that malaria has been a major force for human evolutionary selection, but surprisingly few other polymorphisms have been proven to confer resistance to malaria in large epidemiological studies. To address this problem, we conducted a multi-centre genome-wide association study (GWAS) of life-...
Article
Full-text available
Many human genetic associations with resistance to malaria have been reported, but few have been reliably replicated. We collected data on 11,890 cases of severe malaria due to Plasmodium falciparum and 17,441 controls from 12 locations in Africa, Asia and Oceania. We tested 55 SNPs in 27 loci previously reported to associate with severe malaria. T...
Article
Full-text available
Background: The vast majority of deaths in the Kilifi study area are not recorded through official systems of vital registration. As a result, few data are available regarding causes of death in this population. Objective: To describe the causes of death (CODs) among residents of all ages within the Kilifi Health and Demographic Surveillance System...
Article
Full-text available
Sickle cell disease (SCD) is common in many parts of sub-Saharan Africa (SSA), where it is associated with high early mortality. In the absence of newborn screening, most deaths among children with SCD go unrecognized and unrecorded. As a result, SCD does not receive the attention it deserves as a leading cause of death among children in SSA. In th...
Article
Full-text available
Combining data from genome-wide association studies (GWAS) conducted at different locations, using genotype imputation and fixed-effects meta-analysis, has been a powerful approach for dissecting complex disease genetics in populations of European ancestry. Here we investigate the feasibility of applying the same approach in Africa, where genetic d...
Data
Example of cluster plot from Malawi cohort with outlying sets of individuals. (TIF)
Data
Distribution of relatedness between most-related pairs. (TIF)
Data
Comparison of logistic regression (SNPTEST) and mixed model (MMM) P values. (TIF)
Data
SNPs showing highly divergent P values between logistic regression and mixed model scans. (TIF)
Data
Comparison of meta-analysis P values versus Bayes factors under the fixed-effect model. (TIF)
Data
Quantile-quantile plots of the region-based test in the three cohorts and in the meta-analysis. The genomic control inflation factor is given in the title of the plots. (TIF)
Data
Manhattan plot showing –log10 P values (thresholded at 10) for additive, dominant, heterozygote, recessive, and general models, and additive model conditional on the genotype at the sickle locus rs334, across all imputed SNPs. Meta-analysis P values for all three cohorts and for the East African cohorts are also shown for additive, dominant, recess...
Data
The distribution of ethnic groups in Kenyan samples that were imputed with higher or lower quality (as defined by the red line in Figure S12). The difference in the two distributions is highly significant (Fisher's exact test, P = 4×10⁻⁴), suggesting that ethnic differences contribute to the bimodal distribution of imputation quality seen in Figure...
Data
–log10(P values) for test of association using the mixed model. (TIF)
Data
The distribution of imputation quality (measured by type2 r2) across imputed Kenyan samples. The red line is at r2 = 0.909, and is the minimum between the two peaks. (TIF)
Data
Pre-imputation individual QC. (DOCX)
Data
Top: signal of association in the HBB region after conditioning on the genotype at the known causal locus rs334. Bottom: signal of association in the ABO region after conditioning on the genotype at rs8176719. (TIF)
Data
Example output from the imputation quality control pipeline for the Kenya imputation. a) per-SNP certainty (mean maximum posterior genotype call); b) per-SNP accuracy (type2 r2); c) per-individual type2 r2, averaged across segments; d) per-segment heterozygous call accuracy (proportion of true heterozygous calls that are correctly imputed with high...
Data
Population-specific PCA analysis of Kenyan samples. (TIF)
Data
a) Empirical distribution, across approximately 20,000 gene regions, of the maximum likelihood estimate of the eta parameter (see Text S2), for the region-based test. Overlaid (red line) is the assumed prior distribution under the alternative used to calculate Bayes factors in the region-based analysis. b) Scatter plot of the log10 combined Bayes F...
Data
Post-imputation sample exclusions. (DOCX)
Data
Genomic Inflation factors (λ) for logistic regression and mixed-model scans. (DOCX)
Data
Enrichment of low region based test P values in three previously defined sets of regions. Each P value in the table results from a one-sided binomial test for an enrichment in the number of regions with empirical P value below the given threshold. The bottom row gives a summary of the distribution of the number of SNPs in each region. Note that the...
Data
Manhattan plot showing –log10 P values (thresholded at 10) for additive, dominant, heterozygote, recessive, and general models, and additive model conditional on the genotype at the sickle locus rs334, across all non-excluded genotyped SNPs. Meta-analysis P values for all three cohorts and for the East African cohorts are also shown for additive, d...
Data
Population-specific PCA analysis of Gambian samples. (TIF)
Data
Population-specific PCA analysis of Malawian samples. (TIF)
Data
Comparison of fixed, structured, correlated and independent-effect models at the ABO and HBB loci. The height of each bar represents the posterior probability that the corresponding model is true, under the assumption that one of the models is true. (TIF)
Data
Details on the 3 study sites and genotyping platforms. (DOCX)
Data
P values for correlation between the first 5 PCs and case/control status. (DOCX)
Data
Supplementary statistical details. (PDF)
Data
ROC curve showing empirical true positive rate (y-axis) against false positive rate (x-axis) for each method used to detect regional association (regional test with Fisher meta-analysis, regional test with Bayesian meta-analysis, best single-SNP frequentist meta-analysis in region, best single-SNP Bayes factor for each of the four choices of correl...
Data
Pre-imputation SNP QC. (DOCX)
Data
Regions showing most association in single-SNP and regional association test analyses. (XLSX)
Data
Details of quality control. (DOCX)
Article
Full-text available
Malawi is one of the countries in sub-Saharan Africa with a high prevalence of HIV/AIDS. This paper analyzes socio-demographic effects using estimates and projections by the United Nations Population Division. It compares estimates and projections for both the short term (2005-2020) and the long term (1980-2050), with the reality of HIV/AIDS and w...
Chapter
This chapter considers the problem of matching configurations of biological macromolecules when both alignment and superposition transformations are unknown. Alignment denotes correspondence – a bijection or mapping – between points in different structures according to some objectives or constraints. Superposition denotes rigid-body transformations...
Article
One of the key ingredients in drug discovery is the derivation of conceptual templates called pharmacophores. A pharmacophore model characterizes the physicochemical properties common to all active molecules, called ligands, bound to a particular protein receptor, together with their relative spatial arrangement. Motivated by this important applica...
Article
Full-text available
Large-scale studies of genomic variation could assist efforts to eliminate malaria. But there are scientific, ethical and practical challenges to carrying out such studies in developing countries, where the burden of disease is greatest. The Malaria Genomic Epidemiology Network (MalariaGEN) is now working to overcome these obstacles, using a consor...
Article
We propose a simple procedure for generating virtual protein C(alpha) traces. One of the key ingredients of our method, to build a three-dimensional structure from a random sequence of amino acids, is to work directly on torsional angles of the chain which we sample from a von Mises distribution. With simple modeling of the hydrophobic effect in pr...
Article
Full-text available
In conducting and reporting medical research, there are some common pitfalls in using statistical methodology which may result in invalid inferences being made. This paper aims to highlight to inexperienced statisticians and non-statisticians some of the common statistical pitfalls encountered when using statistics to interpret data in medical...
Article
Full-text available
This paper deals with the problem of estimating fracture planes, given only the data at borehole intersections with fractures. We formulate an appropriate model for the problem and give a solution to fitting the planes using a Markov chain Monte Carlo (MCMC) implementation. The basics of MCMC are presented, with particular emphasis given to reversi...
Data
Full-text available
Case 1 Results. Results for alcohol dehydrogenase (1hdx_1) matching against its own SCOP family. Tables 1–2: Without amino acid property. Tables 3–4: With amino acid property
Data
Full-text available
Case 4 Results. Results for alcohol dehydrogenase and FAD/NAD(P)-binding domain. Tables 1–5: Without physico-chemistry. Tables 5–10: With physico-chemistry.
Data
Full-text available
Case 2 Results. Results for 17-β-hydroxysteroid dehydrogenase and family. Tables 1–5: Without amino acid property. Tables 6–10: With amino acid property.
Data
Full-text available
Case 3 Results. Results for alcohol dehydrogenase (1hdx_1) and superfamily. Tables 1–14: Without physico-chemistry. Tables 14–28: With physico-chemistry.
Article
Full-text available
Matching functional sites is a key problem for the understanding of protein function and evolution. The commonly used graph theoretic approach, and other related approaches, require adjustment of a matching distance threshold a priori according to the noise in atomic positions. This is difficult to pre-determine when matching sites related by varyi...
Chapter
Full-text available
The paper deals with a stochastic stereologic problem of estimating fracture lines, given only the data at boreholes. We formulate an appropriate model. The problem is challenging since neither the lines (slope, intercept) are known, nor their number. We give an MCMC implementation where all the parameters are allowed to vary. We examine sensitivit...
Article
The explosion in the volume of protein structural information prior to any knowledge of protein biochemical function has made the characterisation of protein functional sites an area of huge interest. Structural similarity of functional sites from proteins with unknown function to those with known functions can be used to infer the function of...
Article
Protein structure simulations are important for understanding and exploring properties of proteins and evaluating algorithms in bioinformatics. For example, computer-generated protein structures designed to mimic a real protein, called decoys, can be used to test the validity of a protein model. The model is considered correct only if it is able to identify...
Article
Thesis (Ph.D.) -- University of Leeds (Department of Statistics), 2006.