Daniel H Huson
University of Tuebingen | EKU Tübingen · Center for Bioinformatics and Department of Computer Science
Mathematics, Diploma 1986, PhD 1990, Habilitation 1997
349 Publications
127,501 Reads
69,561 Citations
Additional affiliations
August 2015 - July 2017
April 2011 - March 2015
Publications (349)
The concept of an autocatalytic network of reactions that can form and persist, starting from just an available food source, has been formalized by the notion of a reflexively autocatalytic and food-generated (RAF) set. The theory and algorithmic results concerning RAFs have been applied to a range of settings, from metabolic questions arising at t...
DNA methylation is an epigenetic mechanism for regulating gene expression, and it plays an important role in many biological processes. While methylation sites can be identified using laboratory techniques, much work is being done on developing computational approaches using machine learning. Here, we present a deep-learning algorithm for determini...
Medium-chain carboxylates are used in various industrial applications. These chemicals are typically extracted from palm oil, which is deemed not sustainable. Recent research has focused on microbial chain elongation using reactors to produce medium-chain carboxylates, such as n-caproate (C6) and n-caprylate (C8), from organic substrates such as wa...
In a recent publication, we introduce the PGPT ontology and PGPT-db of bacterial plant growth-promotion traits and associated database of protein sequences, and provide several tools for bacterial genome analysis on the PLaBAse server. Here, we extend the scope of the PGPT ontology to perform PGPT analysis of metagenomic datasets. First, we introdu...
A microbial community maintains its ecological dynamics via metabolite crosstalk. Hence, knowledge of the metabolome, alongside its populace, would help us understand the functionality of a community and also predict how it will change in atypical conditions. Methods that employ low-cost metagenomic sequencing data can predict the metabolic potenti...
NeighborNet constructs phylogenetic networks to visualize distance data. It is a popular method used in a wide range of applications. While several studies have investigated its mathematical features, here we focus on computational aspects. The algorithm operates in three steps. We present a new simplified formulation of the first step, which aims...
The concept of an autocatalytic network of reactions that can form and persist, starting from just an available food source, has been formalised by the notion of a Reflexively-Autocatalytic and Food-generated (RAF) set. The theory and algorithmic results concerning RAFs have been applied to a range of settings, from metabolic questions arising at t...
Methanogenesis allows methanogenic archaea to generate cellular energy for their growth while producing methane. Thermophilic hydrogenotrophic species of the genus Methanothermobacter have been recognized as robust biocatalysts for a circular carbon economy and are already applied in power-to-gas technology with biomethanation, which is a platform...
Transformer-based language models are successfully used to address massive text-related tasks. DNA methylation is an important epigenetic mechanism, and its analysis provides valuable insights into gene regulation and biomarker identification. Several deep learning–based methods have been proposed to identify DNA methylation, and each seeks to stri...
The process of DNA 5-methylcytosine modification has been widely studied in mammals and plays an important role in epigenetics. Several computational approaches have been developed to aid the identification of methylation sites. In this study, we introduce a novel deep-learning framework MR-DNR that aims at predicting specific methylation sites...
Metagenomics is the study of microbiomes using DNA sequencing technologies. Basic computational tasks are to determine the taxonomic composition (who is out there?), the functional composition (what can they do?), and also to correlate changes of composition to changes in external parameters (how do they compare?). One approach to address these iss...
Phylogenetic analysis frequently leads to the creation of many phylogenetic trees, either from using multiple genes or methods, or through bootstrapping or Bayesian analysis. A consensus tree is often used to summarize what the trees have in common. Consensus networks were introduced to also allow the visualization of the main incompatibilities amo...
A microbial community maintains its ecological dynamics via metabolite crosstalk. Hence, knowledge of the metabolome, alongside its populace, would help us understand the functionality of that community and also predict how it alters in atypical conditions. The metabolic potential of a community from low-cost metagenomic sequencing data signifies the...
Motivation:
Metagenomic projects often involve large numbers of large sequencing datasets (totaling hundreds of gigabytes of data). Thus, computational preprocessing and analysis are usually performed on a server. The results of such analyses are then usually explored interactively. One approach is to use MEGAN, an interactive program that allows...
Antibiotics have been an essential part of modern medicine since their initial discovery. The continuous exploration of new antibiotic candidates remains a necessity given the increasing emergence of resistance to antimicrobial compounds among pathogens. An important group of last-resort antibiotics, the glycopeptide antibiotics (GPAs), have been s...
Third-generation sequencing technologies are being increasingly used in microbiome research and this has given rise to new challenges in computational microbiome analysis. Oxford Nanopore's MinION is a portable sequencer that streams data that can be basecalled on-the-fly. Here we give an introduction to the MAIRA software, which is designed to ana...
Transformer-based language models are successfully used to address massive text-related tasks. DNA methylation is an important epigenetic mechanism and its analysis provides valuable insights into gene regulation and biomarker identification. Several deep learning-based methods have been proposed to identify DNA methylation, and each seeks to strike...
Methanogenesis allows methanogenic archaea (methanogens) to generate cellular energy for their growth while producing methane. Hydrogenotrophic methanogens thrive on carbon dioxide and molecular hydrogen as sole carbon and energy sources. Thermophilic and hydrogenotrophic Methanothermobacter spp. have been recognized as robust biocatalysts for a ci...
Motivation
Metagenomic projects often involve large numbers of large sequencing datasets (totaling hundreds of gigabytes of data). Thus, computational preprocessing and analysis are usually performed on a server rather than on a personal computer. The results of such analyses are then usually explored interactively. One approach is to use MEGAN, an interactive program that al...
In microbiome analysis, functional profiling is based on assigning reads or contigs to terms or nodes in a functional classification system. There are a number of large, general-purpose functional classifications that are in use, such as eggNOG, KEGG, InterPro and SEED. Smaller, special-purpose classifications include CARD, EC, MetaCyc and VFDB. He...
Motivation:
Metagenomics is the study of microbiomes using DNA sequencing. A microbiome consists of an assemblage of microbes that is associated with a "theater of activity" (ToA). An important question is, to what degree does the taxonomic and functional content of the former depend on the (details of the) latter? Here we investigate a related te...
During the last two decades, yeast has been used as a biological tool to produce various small molecules, biofuels, etc., using an inexpensive bioprocess. The application of Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-CRISPR-associated protein (Cas) techniques in yeast genetic and metabolic engineering has made a paradigm shi...
Agroindustrial waste, such as fruit residues, is a renewable, abundant, low-cost, commonly used carbon source. Biosurfactants are molecules of increasing interest due to their multifunctional properties, biodegradable nature and low toxicity, in comparison to synthetic surfactants. A better understanding of the associated microbial communities wil...
Motivation
Metagenomics is the study of microbiomes using DNA sequencing. A microbiome consists of an assemblage of microbes that is associated with a “theater of activity” (ToA). To what degree does the taxonomic and functional content of the former depend on the (details of the) latter? More technically, given a taxonomic and/or functional profil...
This mini-review aims at raising the interest in contractile phage tail-like particles (CPTPs) of bacteria as an efficient and pest-specific alternative to conventional chemical pesticides in agriculture, horticulture and forestry. CPTPs are used by various bacteria in diverse environments for interbacterial competition or for manipulation of eukar...
In microbiome analysis, one main approach is to align metagenomic sequencing reads against a protein reference database, such as NCBI-nr, and then to perform taxonomic and functional binning based on the alignments. This approach is embodied, for example, in the standard DIAMOND+MEGAN analysis pipeline, which first aligns reads against NCBI-nr usin...
New long read sequencing technologies offer huge potential for effective recovery of complete, closed genomes from complex microbial communities. Using long read data (ONT MinION) obtained from an ensemble of activated sludge enrichment bioreactors we recover 22 closed or complete genomes of community members, including several species known to pla...
In microbiome analysis, one main approach is to align metagenomic sequencing reads against a protein-reference database such as NCBI-nr, and then to perform taxonomic and functional binning based on the alignments. This approach is embodied, for example, in the standard DIAMOND+MEGAN analysis pipeline, which first aligns reads against NCBI-nr using...
Microbial biosurfactants are of major interest due to their multifunctional properties, biodegradable nature and low toxicity. Agroindustrial waste, such as fruit waste, can be used as substrates for producing bacteria. In this study, six samples of fruit waste, from oranges, mangoes and mixed fruits, were self-fermented, and then subjected to shor...
Microbial studies typically involve the sequencing and assembly of draft genomes for individual microbes or whole microbiomes. Given a draft genome, one first task is to determine its phylogenetic context, that is, to place it relative to the set of related reference genomes. We provide a new interactive graphical tool that addresses this task usin...
Periodic tilings play a role in the decorative arts, in construction and in crystal structures. Combinatorial tiling theory allows the systematic generation, visualization and exploration of such tilings of the plane, sphere and hyperbolic plane, using advanced algorithms and software. Here we present a “galaxy” of tilings that consists of the set...
The Sulfolobaceae family, comprising diverse thermoacidophilic and aerobic sulfur-metabolizing Archaea from various geographical locations, offers an ideal opportunity to infer the evolutionary dynamics across the members of this family. Comparative pan-genomics coupled with evolutionary analyses has revealed asymmetric genome evolution within the Sulf...
Abstract
Microbial studies typically involve the sequencing and assembly of draft genomes for individual microbes or whole microbiomes. Given a draft genome, one first task is to determine its phylogenetic context, that is, to place it relative to the set of related reference genomes. We provide a new interactive graphical tool that addresses this...
Rooted phylogenetic networks provide a way to describe species’ relationships when evolution departs from the simple model of a tree. However, networks inferred from genomic data can be highly tangled, making it difficult to discern the main reticulation signals present. In this paper, we describe a natural way to transform any rooted phylogenetic...
One main approach to computational analysis of microbiome sequences is to first align against a reference database of annotated protein sequences (NCBI‐nr) and then perform taxonomic and functional binning of the sequences based on the resulting alignments. For both short and long reads (or assembled contigs), alignment is performed using DIAMOND,...
Bulk production of medium-chain carboxylates (MCCs) with 6–12 carbon atoms is of great interest to biotechnology. Open cultures (e.g., reactor microbiomes) have been utilized to generate MCCs in bioreactors. When in-line MCC extraction and prevention of product inhibition is required, the bioreactors have been operated at mildly acidic pH (5.0–5.5)...
The current COVID-19 pandemic, caused by the rapid worldwide spread of the SARS-CoV-2 virus, is having severe consequences for human health and the world economy. The virus affects different individuals differently, with many infected patients showing only mild symptoms, and others showing critical illness. To lessen the impact of the epidemic, one p...
Metabolism across all known living systems combines two key features. First, all of the molecules that are required are either available in the environment or can be built up from available resources via other reactions within the system. Second, the reactions proceed in a fast and synchronized fashion via catalysts that are also produced within th...
Background:
Advances in mobile sequencing devices and laptop performance make metagenomic sequencing and analysis in the field a technologically feasible prospect. However, metagenomic analysis pipelines are usually designed to run on servers and in the cloud.
Results:
MAIRA is a new standalone program for interactive taxonomic and functional an...
Rooted phylogenetic networks provide a way to describe species' relationships when evolution departs from the simple model of a tree. However, networks inferred from genomic data can be highly tangled, making it difficult to discern the main reticulation signals present. In this paper, we describe a natural way to transform any rooted phylogenetic...
Periodic tilings play a role in the decorative arts, in construction and in crystal structures. Combinatorial tiling theory allows the systematic generation, visualization and exploration of such tilings of the plane, sphere and hyperbolic plane, using advanced algorithms and software. Here we present a "galaxy" of tilings that consists of the set o...
Background
Bulk production of medium-chain carboxylates (MCCs) with 6-12 carbon atoms is of great interest to biotechnology. Open cultures (e.g., reactor microbiomes) have been utilized to generate MCCs in bioreactors. When in-line MCC extraction and prevention of product inhibition is required, the bioreactors have been operated at mildly acid...
The current COVID-19 pandemic, caused by the rapid world-wide spread of the SARS-CoV-2 virus, is having severe consequences for human health and the world economy. The virus affects individuals quite differently, with many infected patients showing only mild symptoms, and others showing critical illness. To lessen the impact of the pandemic, one im...
Metabolism across all known living systems combines two key features. First, all of the molecules that are required are either available in the environment or can be built up from available resources via other reactions within the system. Second, the reactions proceed in a fast and synchronised fashion via catalysts that are also produced within th...
8930 Impact of selective digestive and oropharyngeal decontamination on the gut microbiome and resistome in intensive care patients Background: Selective decontamination of the digestive tract (SDD) and of the oropharynx (SOD) are prophylactic interventions to reduce mortality and infectious complications in intensive care unit (ICU) patients. Thes...
New long read sequencing technologies offer huge potential for effective recovery of complete, closed genomes from complex microbial communities. Using long read data (MinION) obtained from an ensemble of activated sludge enrichment bioreactors, we 1) describe new methods for validating long read assembled genomes using their counterpart short read meta...
Several studies have demonstrated that the viral genome can be methylated by the host cell during progression from persistent infection to cervical cancer. The aim of this study was to investigate whether methylation at a specific site could predict the development of viral persistence and whether viral load shows a correlation with specific methyl...
Recent findings suggest an implication of the gut microbiome in Parkinson’s disease (PD) patients. PD onset and progression have also been linked with various environmental factors such as physical activity, exposure to pesticides, head injury, nicotine, and dietary factors. In this study, we used a mouse model, overexpressing the complete human SNC...
Motivation
Antibiotic resistance is widely recognized as a severe threat to current medical practice. Each antibiotic therapy drives the emergence and subsequent retention of antibiotic resistance genes within the human gut microbiome. However, the details on how the resistance spreads between bacteria within the human gut remain unknown, as does...
Abstract
In phylogenetics, a set of gene trees is often summarized by a consensus tree, such as the majority consensus, which is based on the set of all splits that are present in more than 50% of the input trees. A “consensus network” is obtained by lowering the threshold and considering all splits that are contained in 10% of the trees, say, and...
Metagenomics has become a part of the standard toolkit for scientists interested in studying microbes in the environment. Compared to 16S rDNA sequencing, which allows coarse taxonomic profiling of samples, shotgun metagenomic sequencing provides a more detailed analysis of the taxonomic and functional content of samples. Long read technologies, su...
Objectives
The aim of the study was to measure the impact of antibiotic exposure on the acquisition of colonization with extended-spectrum β-lactamase-producing Gram-negative bacteria (ESBL-GNB) accounting for individual- and group-level confounding using machine-learning methods.
Methods
Patients hospitalized between September 2010 and June 2013...
Background
Recent findings suggest an implication of the gut microbiome in Parkinson’s disease patients. Parkinson’s disease onset and progression have also been linked with various environmental factors such as physical activity, exposure to pesticides, head injury, nicotine, and dietary factors.
Objectives
In this study, we used a transgenic mous...
Background
Short-read sequencing technologies have long been the work-horse of microbiome analysis. Continuing technological advances are making the application of long-read sequencing to metagenomic samples increasingly feasible.
Results
We demonstrate that whole bacterial chromosomes can be obtained from an enriched community, by application of...
Host genetic variation influences microbiome composition. While studies have focused on associations between the gut microbiome and specific alleles, gene copy number (CN) also varies. We relate microbiome diversity to CN variation of the AMY1 locus, which encodes salivary amylase, facilitating starch digestion. After imputing AMY1-CN for ∼1,000 su...
The microbiota and the gastrointestinal mucus layer play a pivotal role in protection against non-typhoidal Salmonella enterica serovar Typhimurium (S. Tm) colitis. Here, we analyzed the course of Salmonella colitis in mice lacking a functional mucus layer in the gut. Unexpectedly, in contrast to mucus-proficient littermates, genetically deficient...
Type VI secretion systems and tailocins, two bacterial phage tail-like particles, have been reported to foster interbacterial competition. Both nanostructures enable their producer to kill other bacteria competing for the same ecological niche. Previously, type VI secretion systems and particularly R-type tailocins were considered highly specific,...
Background
Short-read sequencing technologies have long been the work-horse of microbiome analysis. Continuing technological advances are making the application of long-read sequencing to metagenomic samples increasingly feasible.
Results
We demonstrate that whole bacterial chromosomes can be obtained from a complex community, by application of Mi...
New long read sequencing technologies offer huge potential for effective recovery of complete, closed genomes. While much progress has been made on cultured isolates, the ability of these methods to recover genomes of member taxa in complex microbial communities is less clear. Here we examine the ability of long read data to recover genomes from en...
Host genetic variation influences the composition of the human microbiome. While studies have focused on associations between the microbiome and single nucleotide polymorphisms in genes, their copy number (CN) can also vary. Here, in a study of human subjects including a 2-week standard diet, we relate oral and gut microbiome to CN at the AMY1 locu...
Background:
There are numerous computational tools for taxonomic or functional analysis of microbiome samples, optimized to run on hundreds of millions of short, high quality sequencing reads. Programs such as MEGAN allow the user to interactively navigate these large datasets. Long read sequencing technologies continue to improve and produce incr...
Bacterial viruses contribute to the dynamics of microbiome communities, as they are involved in horizontal gene transfer. Previously, we studied changes in the gut microbiome of two healthy individuals over the course of a 6-day antibiotic treatment and a subsequent 28-day recovery period (Willmann et al., 2015). Now, from the same sample...
Indigenous populations of the Americas experienced high mortality rates during the early contact period as a result of infectious diseases, many of which were introduced by Europeans. Most of the pathogenic agents that caused these outbreaks remain unknown. Through the introduction of a new metagenomic analysis tool called MALT, applied here to sea...
The larvae of the greater wax moth, Galleria mellonella, are pests of active beehives. In infection biology, these larvae are becoming increasingly attractive as an invertebrate host model. Here, we report on the first genome sequence of Galleria mellonella.
Background
There are numerous computational tools for taxonomic or functional analysis of microbiome samples, optimized to run on hundreds of millions of short, high quality sequencing reads. Programs such as MEGAN allow the user to interactively navigate these large datasets. Long read sequencing technologies continue to improve and produce increa...
Early microbiome studies focused on estimating the taxonomic composition of an assemblage of microbes using amplicon sequencing. With improved throughput and decreased cost of sequencing, whole genome shotgun (WGS) sequencing of environmental samples has become a standard procedure in microbial studies. This allows a more detailed analysis of the t...
Microbial nitrogen transformation processes such as denitrification represent major sources of the potent greenhouse gas nitrous oxide (N2O). Soil biochar amendment has been shown to significantly decrease N2O emissions in various soils. However, the effect of biochar on the structure and function of microbial communities that actively perform nitr...
With the rise of multi-drug resistant pathogens and the decline in number of potential new antibiotics in development there is a fervent need to reinvigorate the natural products discovery pipeline. Most antibiotics are derived from secondary metabolites produced by microorganisms and plants. To avoid suicide, an antibiotic producer harbors resista...
Background
A key step in microbiome sequencing analysis is read assignment to taxonomic units. This is often performed using one of four taxonomic classifications, namely SILVA, RDP, Greengenes or NCBI. It is unclear how similar these are and how to compare analysis results that are based on different taxonomies.
Results
We provide a method and softw...
Recent genomic data have revealed multiple interactions between Neanderthals and modern humans, but there is currently little genetic evidence regarding Neanderthal behaviour, diet, or disease. Here we describe the shotgun-sequencing of ancient DNA from five specimens of Neanderthal calcified dental plaque (calculus) and the characterization of reg...
Indigenous populations of the Americas experienced high mortality rates during the early contact period as a result of infectious diseases, many of which were introduced by Europeans. Most of the pathogenic agents that caused these outbreaks remain unknown. Using a metagenomic tool called MALT to search for traces of ancient pathogen DNA, we were a...
Background
Microbiome sequencing projects typically collect tens of millions of short reads per sample. Depending on the goals of the project, the short reads can either be subjected to direct sequence analysis or be assembled into longer contigs. The assembly of whole genomes from metagenomic sequencing reads is a very difficult problem. However,...
Background
Taxonomic profiling of microbial communities is often performed using small subunit ribosomal RNA (SSU) amplicon sequencing (16S or 18S), while environmental shotgun sequencing is often focused on functional analysis. Large shotgun datasets contain a significant number of SSU sequences and these can be exploited to perform an unbiased SS...
In this work we report a detailed analysis of the topology and phylogenetics of family 2 glycoside hydrolases (GH2). We distinguish five topologies or domain architectures based on the presence and distribution of protein domains defined in the Pfam and InterPro databases. All of them share a central TIM barrel (catalytic module) with two β-sandwich do...
Phylogenetic analysis of the GH2C domain.
The tree was generated as described in the legend of Fig 3, but including a tag for each sequence. The tag corresponds to the GI number and a descriptive legend of the corresponding domain architecture. In most cases, where the GH2N-GH2d-GH2C tandem is conserved, only the composition of the C-terminal domai...
Structural superposition of the β-galactosidases from Thermotoga maritima (TmLac) and Kluyveromyces lactis (KlLac).
The structural model of TmLac (orange) was aligned with one of the subunits of KlLac (green, PDB code 3OB8). Residues contributing to the catalytic pocket are highlighted.
(TIF)
Domain architectures of all GH2 sequences analyzed in this study.
Sequences are classified according to DA type. Each sequence is identified by the GI number and accession number from the GenBank, EMBL or DDBJ databases. The specific domain architecture, organism of origin (source) and GenBank definition are shown. Biochemical characterization, as recor...
Cluster and subcluster classification of DA type 5 proteins with unidentified C-terminal extensions.
(DOCX)
Pipeline followed for the analysis of GH2 domain architectures and phylogenetic tree construction.
(TIF)
Summary of GenBank annotations and biochemical characterization, as recorded in the CAZy database, of enzyme activity for each DA type.
(DOCX)
Cluster and subcluster classification of DA type 4 proteins.
(DOCX)
Binary vectors assigned to each sequence to describe the different domain architectures.
Sheet S1.1 specifies the Pfam domain assigned to each position in the binary vector. Each Pfam code is followed by an underscore symbol and a number to indicate the presence of tandem repeats when necessary. Sheet S1.2 indicates the binary vector that correspon...