Daniel H Huson

University of Tuebingen | EKU Tübingen · Center for Bioinformatics and Department of Computer Science

Mathematics, Diploma 1986, PhD 1990, Habilitation 1997

Publications: 349
Reads: 127,501
Citations: 69,561
Additional affiliations
August 2015 - July 2017
National University of Singapore
Position
  • Professor
April 2011 - March 2015
Nanyang Technological University
Position
  • Professor

Publications (349)
Article
Full-text available
The concept of an autocatalytic network of reactions that can form and persist, starting from just an available food source, has been formalized by the notion of a reflexively autocatalytic and food-generated (RAF) set. The theory and algorithmic results concerning RAFs have been applied to a range of settings, from metabolic questions arising at t...
Article
Full-text available
DNA methylation is an epigenetic mechanism for regulating gene expression, and it plays an important role in many biological processes. While methylation sites can be identified using laboratory techniques, much work is being done on developing computational approaches using machine learning. Here, we present a deep-learning algorithm for determini...
Preprint
Full-text available
Medium-chain carboxylates are used in various industrial applications. These chemicals are typically extracted from palm oil, which is deemed not sustainable. Recent research has focused on microbial chain elongation using reactors to produce medium-chain carboxylates, such as n-caproate (C6) and n-caprylate (C8), from organic substrates such as wa...
Preprint
Full-text available
In a recent publication, we introduce the PGPT ontology and PGPT-db of bacterial plant growth-promotion traits and associated database of protein sequences, and provide several tools for bacterial genome analysis on the PLaBAse server. Here, we extend the scope of the PGPT ontology to perform PGPT analysis of metagenomic datasets. First, we introdu...
Article
A microbial community maintains its ecological dynamics via metabolite crosstalk. Hence, knowledge of the metabolome, alongside its populace, would help us understand the functionality of a community and also predict how it will change in atypical conditions. Methods that employ low-cost metagenomic sequencing data can predict the metabolic potenti...
Article
Full-text available
NeighborNet constructs phylogenetic networks to visualize distance data. It is a popular method used in a wide range of applications. While several studies have investigated its mathematical features, here we focus on computational aspects. The algorithm operates in three steps. We present a new simplified formulation of the first step, which aims...
Preprint
Full-text available
The concept of an autocatalytic network of reactions that can form and persist, starting from just an available food source, has been formalised by the notion of a Reflexively-Autocatalytic and Food generated (RAF) set. The theory and algorithmic results concerning RAFs have been applied to a range of settings, from metabolic questions arising at t...
Article
Full-text available
Methanogenesis allows methanogenic archaea to generate cellular energy for their growth while producing methane. Thermophilic hydrogenotrophic species of the genus Methanothermobacter have been recognized as robust biocatalysts for a circular carbon economy and are already applied in power-to-gas technology with biomethanation, which is a platform...
Article
Full-text available
Transformer-based language models are successfully used to address massive text-related tasks. DNA methylation is an important epigenetic mechanism, and its analysis provides valuable insights into gene regulation and biomarker identification. Several deep learning–based methods have been proposed to identify DNA methylation, and each seeks to stri...
Preprint
Full-text available
The process of DNA 5-methylcytosine modification has been widely studied in mammals and plays an important role in epigenetics. Several computational approaches have been developed to aid the identification of methylation sites. In this study, we introduce a novel deep-learning framework MR-DNR that aims at predicting specific methylation sites...
Chapter
Metagenomics is the study of microbiomes using DNA sequencing technologies. Basic computational tasks are to determine the taxonomic composition (who is out there?), the functional composition (what can they do?), and also to correlate changes of composition to changes in external parameters (how do they compare?). One approach to address these iss...
Article
Full-text available
Phylogenetic analysis frequently leads to the creation of many phylogenetic trees, either from using multiple genes or methods, or through bootstrapping or Bayesian analysis. A consensus tree is often used to summarize what the trees have in common. Consensus networks were introduced to also allow the visualization of the main incompatibilities amo...
Preprint
Full-text available
A microbial community maintains its ecological dynamics via metabolite crosstalk. Hence, knowledge of the metabolome, alongside its populace, would help us understand the functionality of that community and also predict how it alters in atypical conditions. The metabolic potential of a community from low-cost metagenomic sequencing data signifies the...
Article
Full-text available
Motivation: Metagenomic projects often involve large numbers of large sequencing datasets (totaling hundreds of gigabytes of data). Thus, computational preprocessing and analysis are usually performed on a server. The results of such analyses are then usually explored interactively. One approach is to use MEGAN, an interactive program that allows...
Preprint
Full-text available
Antibiotics have been an essential part of modern medicine since their initial discovery. The continuous exploration of new antibiotic candidates remains a necessity given the increasing emergence of resistance to antimicrobial compounds among pathogens. An important group of last-resort antibiotics, the glycopeptide antibiotics (GPAs), have been s...
Chapter
Third-generation sequencing technologies are being increasingly used in microbiome research and this has given rise to new challenges in computational microbiome analysis. Oxford Nanopore's MinION is a portable sequencer that streams data that can be basecalled on-the-fly. Here we give an introduction to the MAIRA software, which is designed to ana...
Preprint
Full-text available
Transformer-based language models are successfully used to address massive text-related tasks. DNA methylation is an important epigenetic mechanism and its analysis provides valuable insights into gene regulation and biomarker identification. Several deep learning-based methods have been proposed to identify DNA methylation and each seeks to strike...
Preprint
Full-text available
Methanogenesis allows methanogenic archaea (methanogens) to generate cellular energy for their growth while producing methane. Hydrogenotrophic methanogens thrive on carbon dioxide and molecular hydrogen as sole carbon and energy sources. Thermophilic and hydrogenotrophic Methanothermobacter spp. have been recognized as robust biocatalysts for a ci...
Preprint
Full-text available
Motivation: Metagenomic projects often involve large sequencing datasets (totaling hundreds of gigabytes of data). Thus, computational preprocessing and analysis are usually performed on a server rather than on a personal computer. The results of such analyses are then usually explored interactively. One approach is to use MEGAN, an interactive program that al...
Article
Full-text available
In microbiome analysis, functional profiling is based on assigning reads or contigs to terms or nodes in a functional classification system. There are a number of large, general-purpose functional classifications that are in use, such as eggNOG, KEGG, InterPro and SEED. Smaller, special-purpose classifications include CARD, EC, MetaCyc and VFDB. He...
Article
Motivation: Metagenomics is the study of microbiomes using DNA sequencing. A microbiome consists of an assemblage of microbes that is associated with a "theater of activity" (ToA). An important question is, to what degree does the taxonomic and functional content of the former depend on the (details of the) latter? Here we investigate a related te...
Article
Full-text available
During the last two decades, yeast has been used as a biological tool to produce various small molecules, biofuels, etc., using an inexpensive bioprocess. The application of Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-CRISPR-associated protein (Cas) techniques in yeast genetic and metabolic engineering has made a paradigm shi...
Article
Full-text available
Agroindustrial waste, such as fruit residues, is a renewable, abundant, low-cost, commonly used carbon source. Biosurfactants are molecules of increasing interest due to their multifunctional properties, biodegradable nature and low toxicity, in comparison to synthetic surfactants. A better understanding of the associated microbial communities wil...
Preprint
Full-text available
Motivation: Metagenomics is the study of microbiomes using DNA sequencing. A microbiome consists of an assemblage of microbes that is associated with a “theater of activity” (ToA). To what degree does the taxonomic and functional content of the former depend on the (details of the) latter? More technically, given a taxonomic and/or functional profil...
Article
Full-text available
This mini-review aims at raising the interest in contractile phage tail-like particles (CPTPs) of bacteria as an efficient and pest-specific alternative to conventional chemical pesticides in agriculture, horticulture and forestry. CPTPs are used by various bacteria in diverse environments for interbacterial competition or for manipulation of eukar...
Article
Full-text available
In microbiome analysis, one main approach is to align metagenomic sequencing reads against a protein reference database, such as NCBI-nr, and then to perform taxonomic and functional binning based on the alignments. This approach is embodied, for example, in the standard DIAMOND+MEGAN analysis pipeline, which first aligns reads against NCBI-nr usin...
Article
Full-text available
New long read sequencing technologies offer huge potential for effective recovery of complete, closed genomes from complex microbial communities. Using long read data (ONT MinION) obtained from an ensemble of activated sludge enrichment bioreactors we recover 22 closed or complete genomes of community members, including several species known to pla...
Preprint
Full-text available
In microbiome analysis, one main approach is to align metagenomic sequencing reads against a protein-reference database such as NCBI-nr, and then to perform taxonomic and functional binning based on the alignments. This approach is embodied, for example, in the standard DIAMOND+MEGAN analysis pipeline, which first aligns reads against NCBI-nr using...
Preprint
Full-text available
Microbial biosurfactants are of major interest due to their multifunctional properties, biodegradable nature and low toxicity. Agroindustrial waste, such as fruit waste, can be used as substrates for producing bacteria. In this study, six samples of fruit waste, from oranges, mangoes and mixed fruits, were self-fermented, and then subjected to shor...
Article
Full-text available
Microbial studies typically involve the sequencing and assembly of draft genomes for individual microbes or whole microbiomes. Given a draft genome, one first task is to determine its phylogenetic context, that is, to place it relative to the set of related reference genomes. We provide a new interactive graphical tool that addresses this task usin...
Article
Periodic tilings play a role in the decorative arts, in construction and in crystal structures. Combinatorial tiling theory allows the systematic generation, visualization and exploration of such tilings of the plane, sphere and hyperbolic plane, using advanced algorithms and software. Here we present a “galaxy” of tilings that consists of the set...
Article
Full-text available
The Sulfolobaceae family, comprising diverse thermoacidophilic and aerobic sulfur-metabolizing Archaea from various geographical locations, offers an ideal opportunity to infer the evolutionary dynamics across the members of this family. Comparative pan-genomics coupled with evolutionary analyses has revealed asymmetric genome evolution within the Sulf...
Preprint
Full-text available
Microbial studies typically involve the sequencing and assembly of draft genomes for individual microbes or whole microbiomes. Given a draft genome, one first task is to determine its phylogenetic context, that is, to place it relative to the set of related reference genomes. We provide a new interactive graphical tool that addresses this...
Article
Rooted phylogenetic networks provide a way to describe species’ relationships when evolution departs from the simple model of a tree. However, networks inferred from genomic data can be highly tangled, making it difficult to discern the main reticulation signals present. In this paper, we describe a natural way to transform any rooted phylogenetic...
Article
Full-text available
One main approach to computational analysis of microbiome sequences is to first align against a reference database of annotated protein sequences (NCBI‐nr) and then perform taxonomic and functional binning of the sequences based on the resulting alignments. For both short and long reads (or assembled contigs), alignment is performed using DIAMOND,...
Article
Full-text available
Bulk production of medium-chain carboxylates (MCCs) with 6–12 carbon atoms is of great interest to biotechnology. Open cultures (e.g., reactor microbiomes) have been utilized to generate MCCs in bioreactors. When in-line MCC extraction and prevention of product inhibition is required, the bioreactors have been operated at mildly acidic pH (5.0–5.5)...
Article
Full-text available
The current COVID-19 pandemic, caused by the rapid worldwide spread of the SARS-CoV-2 virus, is having severe consequences for human health and the world economy. The virus affects different individuals differently, with many infected patients showing only mild symptoms, and others showing critical illness. To lessen the impact of the epidemic, one p...
Article
Full-text available
Metabolism across all known living systems combines two key features. First, all of the molecules that are required are either available in the environment or can be built up from available resources via other reactions within the system. Second, the reactions proceed in a fast and synchronized fashion via catalysts that are also produced within th...
Article
Full-text available
Background: Advances in mobile sequencing devices and laptop performance make metagenomic sequencing and analysis in the field a technologically feasible prospect. However, metagenomic analysis pipelines are usually designed to run on servers and in the cloud. Results: MAIRA is a new standalone program for interactive taxonomic and functional an...
Preprint
Rooted phylogenetic networks provide a way to describe species' relationships when evolution departs from the simple model of a tree. However, networks inferred from genomic data can be highly tangled, making it difficult to discern the main reticulation signals present. In this paper, we describe a natural way to transform any rooted phylogenetic...
Preprint
Full-text available
Periodic tilings play a role in the decorative arts, in construction and in crystal structures. Combinatorial tiling theory allows the systematic generation, visualization and exploration of such tilings of the plane, sphere and hyperbolic plane, using advanced algorithms and software. Here we present a "galaxy" of tilings that consists of the set o...
Preprint
Full-text available
Background: Bulk production of medium-chain carboxylates (MCCs) with 6-12 carbon atoms is of great interest to biotechnology. Open cultures (e.g., reactor microbiomes) have been utilized to generate MCCs in bioreactors. When in-line MCC extraction and prevention of product inhibition is required, the bioreactors have been operated at mildly acid...
Preprint
Full-text available
The current COVID-19 pandemic, caused by the rapid world-wide spread of the SARS-CoV-2 virus, is having severe consequences for human health and the world economy. The virus affects individuals quite differently, with many infected patients showing only mild symptoms, and others showing critical illness. To lessen the impact of the pandemic, one im...
Preprint
Full-text available
Metabolism across all known living systems combines two key features. First, all of the molecules that are required are either available in the environment or can be built up from available resources via other reactions within the system. Second, the reactions proceed in a fast and synchronised fashion via catalysts that are also produced within th...
Conference Paper
Full-text available
8930: Impact of selective digestive and oropharyngeal decontamination on the gut microbiome and resistome in intensive care patients. Background: Selective decontamination of the digestive tract (SDD) and of the oropharynx (SOD) are prophylactic interventions to reduce mortality and infectious complications in intensive care unit (ICU) patients. Thes...
Preprint
Full-text available
New long read sequencing technologies offer huge potential for effective recovery of complete, closed genomes from complex microbial communities. Using long read data (MinION) obtained from an ensemble of activated sludge enrichment bioreactors, we 1) describe new methods for validating long read assembled genomes using their counterpart short read meta...
Article
Full-text available
Several studies have demonstrated that the viral genome can be methylated by the host cell during progression from persistent infection to cervical cancer. The aim of this study was to investigate whether methylation at a specific site could predict the development of viral persistence and whether viral load shows a correlation with specific methyl...
Article
Full-text available
Recent findings suggest an implication of the gut microbiome in Parkinson’s disease (PD) patients. PD onset and progression has also been linked with various environmental factors such as physical activity, exposure to pesticides, head injury, nicotine, and dietary factors. In this study, we used a mouse model, overexpressing the complete human SNC...
Preprint
Motivation: Antibiotic resistance is widely recognized as a severe threat to current medical practice. Each antibiotic therapy drives the emergence and subsequent retention of antibiotic resistance genes within the human gut microbiome. However, the details on how the resistance spreads between bacteria within the human gut remain unknown, as does...
Preprint
Full-text available
In phylogenetics, a set of gene trees is often summarized by a consensus tree, such as the majority consensus, which is based on the set of all splits that are present in more than 50% of the input trees. A “consensus network” is obtained by lowering the threshold and considering all splits that are contained in 10% of the trees, say, and...
Chapter
Full-text available
Metagenomics has become a part of the standard toolkit for scientists interested in studying microbes in the environment. Compared to 16S rDNA sequencing, which allows coarse taxonomic profiling of samples, shotgun metagenomic sequencing provides a more detailed analysis of the taxonomic and functional content of samples. Long read technologies, su...
Article
Objectives: The aim of the study was to measure the impact of antibiotic exposure on the acquisition of colonization with extended-spectrum β-lactamase-producing Gram-negative bacteria (ESBL-GNB) accounting for individual- and group-level confounding using machine-learning methods. Methods: Patients hospitalized between September 2010 and June 2013...
Preprint
Full-text available
Background: Recent findings suggest an implication of the gut microbiome in Parkinson’s disease patients. Parkinson’s disease onset and progression has also been linked with various environmental factors such as physical activity, exposure to pesticides, head injury, nicotine, and dietary factors. Objectives: In this study, we used a transgenic mous...
Article
Full-text available
Background: Short-read sequencing technologies have long been the work-horse of microbiome analysis. Continuing technological advances are making the application of long-read sequencing to metagenomic samples increasingly feasible. Results: We demonstrate that whole bacterial chromosomes can be obtained from an enriched community, by application of...
Article
Host genetic variation influences microbiome composition. While studies have focused on associations between the gut microbiome and specific alleles, gene copy number (CN) also varies. We relate microbiome diversity to CN variation of the AMY1 locus, which encodes salivary amylase, facilitating starch digestion. After imputing AMY1-CN for ∼1,000 su...
Article
The microbiota and the gastrointestinal mucus layer play a pivotal role in protection against non-typhoidal Salmonella enterica serovar Typhimurium (S. Tm) colitis. Here, we analyzed the course of Salmonella colitis in mice lacking a functional mucus layer in the gut. Unexpectedly, in contrast to mucus-proficient littermates, genetically deficient...
Article
Full-text available
Type VI secretion systems and tailocins, two bacterial phage tail-like particles, have been reported to foster interbacterial competition. Both nanostructures enable their producer to kill other bacteria competing for the same ecological niche. Previously, type VI secretion systems and particularly R-type tailocins were considered highly specific,...
Preprint
Full-text available
Background: Short-read sequencing technologies have long been the work-horse of microbiome analysis. Continuing technological advances are making the application of long-read sequencing to metagenomic samples increasingly feasible. Results: We demonstrate that whole bacterial chromosomes can be obtained from a complex community, by application of Mi...
Preprint
Full-text available
New long read sequencing technologies offer huge potential for effective recovery of complete, closed genomes. While much progress has been made on cultured isolates, the ability of these methods to recover genomes of member taxa in complex microbial communities is less clear. Here we examine the ability of long read data to recover genomes from en...
Preprint
Full-text available
Host genetic variation influences the composition of the human microbiome. While studies have focused on associations between the microbiome and single nucleotide polymorphisms in genes, their copy number (CN) can also vary. Here, in a study of human subjects including a 2-week standard diet, we relate oral and gut microbiome to CN at the AMY1 locu...
Article
Full-text available
Background: There are numerous computational tools for taxonomic or functional analysis of microbiome samples, optimized to run on hundreds of millions of short, high quality sequencing reads. Programs such as MEGAN allow the user to interactively navigate these large datasets. Long read sequencing technologies continue to improve and produce incr...
Article
Bacterial viruses contribute to the dynamics of microbiome communities, as they are involved in horizontal gene transfer. Previously, we studied changes in the gut microbiome of two healthy individuals over the course of a 6-day antibiotic treatment and subsequent 28-day recovery time (Willmann et al., 2015). Now, from the same sample...
Article
Full-text available
Indigenous populations of the Americas experienced high mortality rates during the early contact period as a result of infectious diseases, many of which were introduced by Europeans. Most of the pathogenic agents that caused these outbreaks remain unknown. Through the introduction of a new metagenomic analysis tool called MALT, applied here to sea...
Article
Full-text available
The larvae of the greater wax moth, Galleria mellonella, are pests of active beehives. In infection biology, these larvae are playing a more and more attractive role as an invertebrate host model. Here, we report on the first genome sequence of Galleria mellonella.
Preprint
Full-text available
Background: There are numerous computational tools for taxonomic or functional analysis of microbiome samples, optimized to run on hundreds of millions of short, high quality sequencing reads. Programs such as MEGAN allow the user to interactively navigate these large datasets. Long read sequencing technologies continue to improve and produce increa...
Chapter
Early microbiome studies focused on estimating the taxonomic composition of an assemblage of microbes using amplicon sequencing. With improved throughput and decreased cost of sequencing, whole genome shotgun (WGS) sequencing of environmental samples has become a standard procedure in microbial studies. This allows a more detailed analysis of the t...
Article
Full-text available
Microbial nitrogen transformation processes such as denitrification represent major sources of the potent greenhouse gas nitrous oxide (N2O). Soil biochar amendment has been shown to significantly decrease N2O emissions in various soils. However, the effect of biochar on the structure and function of microbial communities that actively perform nitr...
Article
Full-text available
With the rise of multi-drug resistant pathogens and the decline in number of potential new antibiotics in development there is a fervent need to reinvigorate the natural products discovery pipeline. Most antibiotics are derived from secondary metabolites produced by microorganisms and plants. To avoid suicide, an antibiotic producer harbors resista...
Article
Full-text available
Background: A key step in microbiome sequencing analysis is read assignment to taxonomic units. This is often performed using one of four taxonomic classifications, namely SILVA, RDP, Greengenes or NCBI. It is unclear how similar these are and how to compare analysis results that are based on different taxonomies. Results: We provide a method and softw...
Article
Full-text available
Recent genomic data have revealed multiple interactions between Neanderthals and modern humans, but there is currently little genetic evidence regarding Neanderthal behaviour, diet, or disease. Here we describe the shotgun-sequencing of ancient DNA from five specimens of Neanderthal calcified dental plaque (calculus) and the characterization of reg...
Article
Indigenous populations of the Americas experienced high mortality rates during the early contact period as a result of infectious diseases, many of which were introduced by Europeans. Most of the pathogenic agents that caused these outbreaks remain unknown. Using a metagenomic tool called MALT to search for traces of ancient pathogen DNA, we were a...
Preprint
Full-text available
Indigenous populations of the Americas experienced high mortality rates during the early contact period as a result of infectious diseases, many of which were introduced by Europeans. Most of the pathogenic agents that caused these outbreaks remain unknown. Using a metagenomic tool called MALT to search for traces of ancient pathogen DNA, we were a...
Article
Full-text available
Background: Microbiome sequencing projects typically collect tens of millions of short reads per sample. Depending on the goals of the project, the short reads can either be subjected to direct sequence analysis or be assembled into longer contigs. The assembly of whole genomes from metagenomic sequencing reads is a very difficult problem. However,...
Article
Full-text available
Background: Taxonomic profiling of microbial communities is often performed using small subunit ribosomal RNA (SSU) amplicon sequencing (16S or 18S), while environmental shotgun sequencing is often focused on functional analysis. Large shotgun datasets contain a significant number of SSU sequences and these can be exploited to perform an unbiased SS...
Article
Full-text available
In this work we report a detailed analysis of the topology and phylogenetics of family 2 glycoside hydrolases (GH2). We distinguish five topologies or domain architectures based on the presence and distribution of protein domains defined in Pfam and Interpro databases. All of them share a central TIM barrel (catalytic module) with two β-sandwich do...
Data
Phylogenetic analysis of the GH2C domain. The tree was generated as described in the legend of Fig 3, but including a tag for each sequence. The tag corresponds to the GI number and a descriptive legend of the corresponding domain architecture. In most cases, where the GH2N-GH2d-GH2C tandem is conserved, only the composition of the C-terminal domai...
Data
Structural superposition of the β-galactosidases from Thermotoga maritima (TmLac) and Kluyveromyces lactis (KlLac). The structural model of TmLac (orange) was aligned with one of the subunits of KlLac (green, PDB code 3OB8). Residues contributing to the catalytic pocket are highlighted. (TIF)
Data
Domain architectures of all GH2 sequences analyzed in this study. Sequences are classified according to DA type. Each sequence is identified by the GI number and Accession number from Genbank, EMBL or DDBJ databases. The specific domain architecture, organism of origin (source) and Genbank definition is shown. Biochemical characterization, as recor...
Data
Cluster and subcluster classification of DA type 5 proteins with unidentified C-terminal extensions. (DOCX)
Data
Pipeline followed for the analysis of GH2 domain architectures and phylogenetic tree construction. (TIF)
Data
Summary of Genbank annotations and biochemical characterization, as recorded in the CAZy database, of enzyme activity for each DA type. (DOCX)
Data
Cluster and subcluster classification of DA type 4 proteins. (DOCX)
Data
Binary vectors assigned to each sequence to describe the different domain architectures. Sheet S1.1 specifies the Pfam domain assigned to each position in the binary vector. Each Pfam code is followed by an underscore symbol and a number to indicate the presence of tandem repeats when necessary. Sheet S1.2 indicates the binary vector that correspon...
