Nam-Phuong D. Nguyen

University of California, San Diego | UCSD · Department of Computer Science and Engineering (CSE)

PhD

About

52
Publications
22,820
Reads
6,128
Citations

Publications (52)
Article
Replication stress (RS) is a primary source of genomic instability, tumorigenesis, and cancer progression. RS is defined as an uncoupling of the replicative helicase and DNA polymerase, resulting in long stretches of fragile single stranded DNA (ssDNA) that is prone to damage. Excessive RS can result in replication catastrophe and cell death, which...
Article
Full-text available
DNA viruses are important infectious agents known to mediate a large number of human diseases, including cancer. Viral integration into the host genome and the formation of hybrid transcripts are also associated with increased pathogenicity. The high variability of viral genomes, however, requires the use of sensitive ensemble hidden Markov models t...
Article
Full-text available
Adaptive radiation is an important mechanism of organismal diversification and can be triggered by new ecological opportunities. Although poorly studied in this regard, parasites are an ideal group in which to study adaptive radiations because of their close associations with host species. Both experimental and comparative studies suggest that the...
Article
Full-text available
We investigated the prevalence of coronaviruses in 44 bats from four families in northeastern Eswatini using high-throughput sequencing of fecal samples. We found evidence of coronaviruses in 18% of the bats. We recovered full or near-full-length genomes from two bat species: Chaerephon pumilus and Afronycteris nana, as well as additional coronavi...
Conference Paper
The recent discovery and clinical validation of KRAS inhibitors (KRASi) has ushered in a new therapeutic approach to directly address the previously undruggable mutant KRAS-driven cancers. Unfortunately, as with other oncogene-directed therapies, acquired resistance to KRASi has been observed that is partially attributed to secondary mutations in K...
Article
Purpose: Human papillomavirus (HPV) plays a major role in oncogenesis and circular extrachromosomal DNA (ecDNA) is found in many cancers. However, the relationship between HPV and circular ecDNA in human cancer is not understood. Experimental design: Forty-four primary tumor tissue samples were obtained from a cohort of HPV-positive OPSCC patien...
Preprint
Full-text available
Adaptive radiation is an important mechanism of organismal diversification, and can be triggered by new ecological opportunities. Although poorly studied in this regard, parasites present an ideal system to study adaptive radiations because of their close associations with host species. Both experimental and comparative studies suggest that the ect...
Conference Paper
Introduction: Tumors with oncogene copy number amplification are aggressive, have poor prognosis and, to date, have been very difficult to treat. Computational analyses in a large pan-cancer study revealed that ecDNA comprises over 50% of highly amplified oncogenes. We sought to determine the underlying mechanisms that render tumors with amplified...
Article
Full-text available
Extrachromosomal DNA (ecDNA) amplification promotes intratumoral genetic heterogeneity and accelerated tumor evolution [1–3]; however, its frequency and clinical impact are unclear. Using computational analysis of whole-genome sequencing data from 3,212 cancer patients, we show that ecDNA amplification frequently occurs in most cancer types but not in...
Article
3123 Background: In the KEYNOTE-059 study, the anti-PD-1 checkpoint inhibitor pembrolizumab was shown to have a modest overall response of 11.6%. Common predictors of response, including high microsatellite instability (MSI-H), PD-L1 expression, tumor mutational burden (TMB) and tumor inflammation signature (TIS), were not individually sufficient f...
Preprint
Full-text available
Extrachromosomal DNA (ecDNA) amplification promotes high oncogene copy number, intratumoral genetic heterogeneity, and accelerated tumor evolution, but its frequency and clinical impact are not well understood. Here we show, using computational analysis of whole-genome sequencing data from 1,979 cancer patients, that ecDNA amplification occurs in a...
Article
Full-text available
Oncogenes are commonly amplified on particles of extrachromosomal DNA (ecDNA) in cancer [1,2], but our understanding of the structure of ecDNA and its effect on gene regulation is limited. Here, by integrating ultrastructural imaging, long-range optical mapping and computational analysis of whole-genome sequencing, we demonstrate the structure of circ...
Article
Full-text available
Green plants (Viridiplantae) include around 450,000–500,000 species [1,2] of great diversity and have important roles in terrestrial and aquatic ecosystems. Here, as part of the One Thousand Plant Transcriptomes Initiative, we sequenced the vegetative transcriptomes of 1,124 species that span the diversity of plants in a broad sense (Archaeplastida),...
Article
Green plants (Viridiplantae) include around 450,000–500,000 species [1,2] of great diversity and have important roles in terrestrial and aquatic ecosystems. Here, as part of the One Thousand Plant Transcriptomes Initiative, we sequenced the vegetative transcriptomes of 1,124 species that span the diversity of plants in a broad sense (Archaeplastida),...
Article
Full-text available
Microbiomes are vast communities of microorganisms and viruses that populate all natural ecosystems. Viruses have been considered to be the most variable component of microbiomes, as supported by virome surveys and examples of high genomic mosaicism. However, recent evidence suggests that the human gut virome is remarkably stable compared with that...
Article
Full-text available
Green plants (Viridiplantae) include around 450,000–500,000 species [1,2] of great diversity and have important roles in terrestrial and aquatic ecosystems. Here, as part of the One Thousand Plant Transcriptomes Initiative, we sequenced the vegetative transcriptomes of 1,124 species that span the diversity of plants in a broad sense (Archaeplastida),...
Preprint
Full-text available
Microbiomes are vast communities of microbes and viruses that populate all natural ecosystems. Viruses have been considered the most variable component of microbiomes, as supported by virome surveys and examples of high genomic mosaicism. However, recent evidence suggests that the human gut virome is remarkably stable compared to other environments...
Article
Full-text available
Aligning sequences for phylogenetic analysis (multiple sequence alignment; MSA) is an important, but increasingly computationally expensive step with the recent surge in DNA sequence data. Much of this sequence data is publicly available, but can be extremely fragmentary (i.e., a combination of full genomes and genomic fragments), which can compoun...
Article
Full-text available
The diversification of parasite groups often occurs at the same time as the diversification of their hosts. However, most studies demonstrating this concordance only examine single host–parasite groups. Multiple diverse lineages of ectoparasitic lice occur across both birds and mammals. Here, we describe the evolutionary history of lice based on an...
Article
Full-text available
Insects with restricted diets rely on symbiotic bacteria to provide essential metabolites missing in their diet. The blood-sucking lice are obligate, host-specific parasites of mammals and are themselves host to symbiotic bacteria. In human lice, these bacterial symbionts supply the lice with B-vitamins. Here we sequenced the genomes of symbiotic a...
Article
Full-text available
Novel sequencing technologies are rapidly expanding the size of datasets that can be applied to phylogenetic studies. Currently the most commonly used phylogenomic approaches involve some form of genome reduction. While these approaches make assembling phylogenomic datasets more economical for organisms with large genomes, they reduce the genomic c...
Article
Parasitic "wing lice" (Phthiraptera: Columbicola) and their dove and pigeon hosts are a well-recognized model system for coevolutionary studies at the intersection of micro- and macroevolution. Selection on lice in microevolutionary time occurs as pigeons and doves defend themselves against lice by preening. In turn, behavioral and morphological ad...
Article
Full-text available
Background: Given a new biological sequence, detecting membership in a known family is a basic step in many bioinformatics analyses, with applications to protein structure and function prediction and metagenomic taxon identification and abundance profiling, among others. Yet family identification of sequences that are distantly related to sequences...
Article
Full-text available
Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus norvegicus (R. norvegicus) is a known reservoir for impor...
Article
Full-text available
The standard pipeline for 16S amplicon analysis starts by clustering sequences within a percent sequence similarity threshold (typically 97%) into ‘Operational Taxonomic Units’ (OTUs). From each OTU, a single sequence is selected as a representative. This representative sequence is annotated, and that annotation is applied to all remaining sequence...
Conference Paper
Many biological questions rely upon multiple sequence alignments (MSAs) and phylogenetic trees of large datasets. However, accurate MSA estimation is difficult for large datasets, especially when the dataset evolved under high rates of evolution or contains fragmentary sequences.
Article
Full-text available
Many biological questions, including the estimation of deep evolutionary histories and the detection of remote homology between protein sequences, rely upon multiple sequence alignments (MSAs) and phylogenetic trees of large datasets. However, accurate large-scale multiple sequence alignment is very difficult, especially when the dataset contains f...
Article
Full-text available
We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improv...
Article
Motivation: Abundance profiling (also called 'phylogenetic profiling') is a crucial step in understanding the diversity of a metagenomic sample, and one of the basic techniques used for this is taxonomic identification of the metagenomic reads. Results: We present taxon identification and phylogenetic profiling (TIPP), a new marker-based taxon i...
Article
Full-text available
Reconstructing the origin and evolution of land plants and their algal relatives is a fundamental problem in plant phylogenetics, and is essential for understanding how critical adaptations arose, including the embryo, vascular tissue, seeds, and flowers. Despite advances in molecular systematics, some hypotheses of relationships remain weakly resol...
Article
Full-text available
Reconstructing the origin and evolution of land plants and their algal relatives is a fundamental problem in plant phylogenetics, and is essential for understanding how critical adaptations arose, including the embryo, vascular tissue, seeds, and flowers. Despite advances in molecular systematics, some hypotheses of relationships remain weakly reso...
Article
Full-text available
The 1,000 plants (1KP) project is an international multi-disciplinary consortium that has generated transcriptome data from over 1,000 plant species, with exemplars for all of the major lineages across the Viridiplantae (green plants) clade. Here, we describe how to access the data used in a phylogenomics analysis of the first 85 species, and how t...
Conference Paper
In this paper, we introduce a new and highly scalable algorithm, PASTA, for large-scale multiple sequence alignment estimation. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing...
Article
We address the problem of Phylogenetic Placement, in which the objective is to insert short molecular sequences (called query sequences) into an existing phylogenetic tree and alignment on full-length sequences for the same gene. Phylogenetic placement has the potential to provide information beyond pure “species identificatio...
Article
Full-text available
Supertree methods combine trees on subsets of the full taxon set together to produce a tree on the entire set of taxa. Of the many supertree methods, the most popular is MRP (Matrix Representation with Parsimony), a method that operates by first encoding the input set of source trees by a large matrix (the "MRP matrix") over {0, 1, ?}, and then runn...
Conference Paper
Electronic design automation (EDA) tools have facilitated the design of ever more complex integrated circuits each year. Synthetic biology would also benefit from the development of genetic design automation (GDA) tools. Existing GDA tools require biologists to design genetic circuits at the molecular level, roughly equivalent to designing electron...
Article
This paper presents results on the design and analysis of a robust genetic Muller C-element. The Muller C-element is a standard logic gate commonly used to synchronize independent processes in most asynchronous electronic circuits. Synthetic biological logic gates have been previously demonstrated, but there remain many open issues in the design of...
Article
Full-text available
The power of electronic computation is due in part to the development of modular gate structures that can be coupled to carry out sophisticated logical operations and whose performance can be readily modelled. However, the equivalences between electronic and biochemical operations are far from obvious. In order to help cross between these disciplin...
Article
Full-text available
iBioSim is a tool that supports learning of genetic circuit models, efficient abstraction-based analysis of these models and the design of synthetic genetic circuits. iBioSim includes project management features and a graphical user interface that facilitate the development and maintenance of genetic circuit models as well as both experimental and...
Conference Paper
Full-text available
Synthetic biology uses engineering principles to design circuits out of genetic materials that are inserted into bacteria to perform various tasks. While synthetic combinational Boolean logic gates have been constructed, there are many open issues in the design of sequential logic gates. One such gate common in most asynchronous circuits is the Mul...
Article
Full-text available
EDA tools have facilitated the design of ever more complex integrated circuits each year. Synthetic biology would also benefit from the development of genetic design automation (GDA) tools. Existing GDA tools require biologists to design genetic circuits at the molecular level, roughly equivalent to designing electronic circuits at the layout leve...
Article
Full-text available
Electronic Design Automation (EDA) tools have facilitated the design of ever more complex integrated circuits each year. Synthetic biology would also benefit from the development of Genetic Design Automation (GDA) tools. Existing GDA tools require biologists to design genetic circuits at the molecular level, roughly equivalent to designing electron...
