An example of the transformation to Balanced Subgraph. An arbitrary tanglegram of the input trees (upper left) is transformed into a bipartite graph (lower left). Continuous lines denote =-edges, dashed lines ≠-edges. This instance can be solved by breaking one edge, e.g. $\{u_2, v_2\}$, which leads to a valid 2-coloring of the vertices (lower right). The vertices of one color, here $u_1$ and $u_2$, are switched to obtain an optimal tanglegram (upper right).
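
The balancing step described in the caption can be illustrated with a small sketch. The instance below is hypothetical (the figure's actual graph is not reproduced here), the edge labels are written as `"="` and `"!="`, and the search is a plain brute force over edge subsets, not the fixed-parameter algorithm of the source publication. It only shows what "breaking edges until a valid 2-coloring exists" means.

```python
from collections import deque
from itertools import combinations


def two_color(vertices, edges):
    """Return a 2-coloring satisfying every edge label, or None if none exists.

    edges: list of (a, b, sign); "=" forces equal colors, "!=" forces different ones.
    """
    adj = {v: [] for v in vertices}
    for a, b, sign in edges:
        adj[a].append((b, sign))
        adj[b].append((a, sign))
    color = {}
    for start in vertices:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            a = queue.popleft()
            for b, sign in adj[a]:
                want = color[a] if sign == "=" else 1 - color[a]
                if b not in color:
                    color[b] = want
                    queue.append(b)
                elif color[b] != want:
                    return None  # contradictory cycle: the graph is not balanced
    return color


def balanced_subgraph(vertices, edges):
    """Brute force: a smallest set of edges whose removal allows a valid 2-coloring."""
    for k in range(len(edges) + 1):
        for removed in combinations(range(len(edges)), k):
            kept = [e for i, e in enumerate(edges) if i not in removed]
            coloring = two_color(vertices, kept)
            if coloring is not None:
                return [edges[i] for i in removed], coloring


# Hypothetical instance (an assumption, not the graph from the figure).
vertices = ["u1", "u2", "v1", "v2"]
edges = [
    ("u2", "v2", "="),
    ("u1", "v1", "!="),
    ("u1", "v2", "!="),
    ("u2", "v1", "!="),
]

removed, coloring = balanced_subgraph(vertices, edges)
print(removed)   # [('u2', 'v2', '=')]
print(coloring)  # {'u1': 0, 'v1': 1, 'v2': 1, 'u2': 0}
```

In this toy instance one broken edge suffices, and `u1` and `u2` end up with the same color, mirroring the situation the caption describes.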

Source publication

A Faster Fixed-Parameter Approach to Drawing Binary Tanglegrams

Conference Paper

Sep 2009

Given two binary phylogenetic trees covering the same n species, it is useful to compare them by drawing them with leaves arranged side-by-side. To facilitate comparison, we would like to arrange the trees to minimize the number of crossings k induced by connecting pairs of identical species. This is the NP-hard Tanglegram Layout problem. By provid...

The Largest Crossing Number of Tanglegrams

Article

Mar 2024
ELECTRON J COMB

A tanglegram $\mathcal{T}$ consists of two rooted binary trees with the same number of leaves, and a perfect matching between the two leaf sets. In a layout, the tanglegram is drawn with the leaves on two parallel lines, the trees on either side of the strip created by these lines are drawn as plane trees, and the perfect matching is drawn as straight line segments inside the strip. The tanglegram crossing number ${\rm cr}({\mathcal{T}})$ of $\mathcal{T}$ is the smallest number of crossings of pairs of matching edges, over all possible layouts of $\mathcal{T}$. The size of the tanglegram is the number of matching edges, say $n$. An earlier paper showed that the maximum of the tanglegram crossing number of size $n$ tanglegrams is less than $\frac{1}{2}\binom{n}{2}$, but is at least $\frac{1}{2}\binom{n}{2}-\frac{n^{3/2}-n}{2}$ for infinitely many $n$. We improve both bounds: the maximum crossing number of a size $n$ tanglegram is at most $\frac{1}{2}\binom{n}{2}-\frac{n}{4}$, but for infinitely many $n$, at least $\frac{1}{2}\binom{n}{2}-\frac{n\log_2 n}{4}$. The problem is analogous to the Unbalancing Lights Problem of Gale and Berlekamp.
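
To make the definition concrete, here is a minimal sketch that counts crossings in one layout and then minimizes over all layouts by exhaustive search. It assumes trees encoded as nested 2-tuples and a matching given as a dictionary from left leaves to right leaves; these encodings and the function names are illustrative choices, not taken from the paper, and the search is exponential, so it is only usable for very small $n$.

```python
from itertools import combinations, product
from math import comb


def layouts(tree):
    """Yield every leaf order obtainable by swapping children at internal nodes."""
    if not isinstance(tree, tuple):
        yield (tree,)
        return
    left, right = tree
    for a, b in product(list(layouts(left)), list(layouts(right))):
        yield a + b
        yield b + a


def crossings(left_order, right_order, matching):
    """Count pairs of matching edges that cross in the given layout.

    Two straight matching edges cross exactly when their endpoints appear in
    opposite relative order on the two lines, so this is an inversion count.
    """
    pos = {leaf: i for i, leaf in enumerate(right_order)}
    ys = [pos[matching[leaf]] for leaf in left_order]
    return sum(1 for i, j in combinations(range(len(ys)), 2) if ys[i] > ys[j])


def crossing_number(left_tree, right_tree, matching):
    """Minimum number of crossings over all layouts (brute force)."""
    return min(
        crossings(lo, ro, matching)
        for lo in layouts(left_tree)
        for ro in layouts(right_tree)
    )


def upper_bound(n):
    """The upper bound (1/2) * C(n, 2) - n / 4 stated in the abstract."""
    return comb(n, 2) / 2 - n / 4


# Hypothetical size-4 tanglegram (an assumption for illustration).
left_tree = (("a", "b"), ("c", "d"))
right_tree = (("A", "C"), ("B", "D"))
matching = {"a": "A", "b": "B", "c": "C", "d": "D"}

print(crossing_number(left_tree, right_tree, matching))  # 1
print(upper_bound(4))                                    # 2.0
```

Here no layout is crossing-free, because the left tree always keeps `a` and `b` adjacent while the right tree always keeps `A` and `C` adjacent.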

Block Crossings in One-Sided Tanglegrams

Preprint

May 2023

Tanglegrams are drawings of two rooted binary phylogenetic trees and a matching between their leaf sets. The trees are drawn crossing-free on opposite sides with their leaf sets facing each other on two vertical lines. Instead of minimizing the number of pairwise edge crossings, we consider the problem of minimizing the number of block crossings, that is, two bundles of lines crossing each other locally. With one tree fixed, the leaves of the second tree can be permuted according to its tree structure. We give a complete picture of the algorithmic complexity of minimizing block crossings in one-sided tanglegrams by showing NP-completeness, constant-factor approximations, and a fixed-parameter algorithm. We also state first results for non-binary trees.

The Rosetta Stone Hypothesis-Based Interaction of the Tumor Suppressor Proteins Nit1 and Fhit

Article

Jan 2023

In previous studies, we have identified the tumor suppressor proteins Fhit (fragile histidine triad) and Nit1 (Nitrilase1) as interaction partners of β-catenin both acting as repressors of the canonical Wnt pathway. Interestingly, in D. melanogaster and C. elegans these proteins are expressed as NitFhit fusion proteins. According to the Rosetta Stone hypothesis, if proteins are expressed as fusion proteins in one organism and as single proteins in others, the latter should interact physically and show common signaling function. Here, we tested this hypothesis and provide the first biochemical evidence for a direct association between Nit1 and Fhit. In addition, size exclusion chromatography of purified recombinant human Nit1 showed a tetrameric structure as also previously observed for the NitFhit Rosetta Stone fusion protein Nft-1 in C. elegans. Finally, in line with the Rosetta Stone hypothesis we identified Hsp60 and Ubc9 as other common interaction partners of Nit1 and Fhit. The interaction of Nit1 and Fhit may affect their enzymatic activities as well as interaction with other binding partners.

Visualizing Co-phylogenetic reconciliations

Article

Jan 2020
THEOR COMPUT SCI

We introduce a hybrid metaphor for the visualization of the reconciliations of co-phylogenetic trees, which are mappings among the nodes of two trees with constraints on the leaves. The typical application is the visualization of the co-evolution of hosts and parasites in biology. Our strategy combines a space-filling and a node-link approach. Unlike traditional methods, it guarantees an unambiguous and downward representation whenever the reconciliation is time-consistent (i.e., biologically feasible). We address the problem of minimizing the number of crossings in the representation by giving a characterization of planar instances and by establishing the complexity of the problem. Finally, we propose heuristics for computing representations with few crossings.

Tanglegrams: A Reduction Tool for Mathematical Phylogenetics

Article

Jul 2015
IEEE ACM T COMPUT BI

Many discrete mathematics problems in phylogenetics are defined in terms of the relative labeling of pairs of leaf-labeled trees. These relative labelings are naturally formalized as tanglegrams, which have previously been an object of study in coevolutionary analysis. Although there has been considerable work on planar drawings of tanglegrams, they have not been fully explored as combinatorial objects until recently. In this paper, we describe how many discrete mathematical questions on trees "factor" through a problem on tanglegrams, and how understanding that factoring can simplify analysis. Depending on the problem, it may be useful to consider an unordered version of tanglegrams, and/or their unrooted counterparts. For all of these definitions, we show how the isomorphism types of tanglegrams can be understood in terms of double cosets of the symmetric group, and we investigate their automorphisms. Understanding tanglegrams better will isolate the distinct problems on leaf-labeled pairs of trees and reveal natural symmetries of spaces associated with such problems.

Exact methods for nonlinear combinatorial optimization

Article

Sep 2014

Frank Baumann

We consider combinatorial optimization problems with nonlinear objective functions. Solution approaches for this class of problems proposed so far are either highly problem-specific or they apply generic algorithms for constrained nonlinear optimization, which often does not yield satisfactory results in practice. Our aim is to develop, implement and experimentally evaluate exact algorithms that address the nonlinearity of the objective function and at the same time exploit the underlying combinatorial structure of the problem. To this end we follow two approaches. The first combines good polyhedral descriptions of the objective function and the feasible set in a branch-and-cut algorithm. The second approach is based on Lagrangean decomposition. By decomposing the original problem into an unconstrained nonlinear problem and a linear combinatorial problem, we are able to compute strong dual bounds for the optimal value. The computation of lower bounds is then embedded into a branch-and-bound algorithm. For many applications there already exist efficient algorithms for the combinatorial subproblem, thus an important aspect of this thesis is the study of the corresponding unconstrained nonlinear subproblems. Both approaches have the advantage that they can easily be adapted to a wide range of nonlinear combinatorial problems. We devise both polyhedral and decomposition-based algorithms for submodular applications from wireless network design and portfolio optimization and evaluate their performance experimentally. Exploiting the equivalence between unconstrained binary quadratic optimization and the maximum cut problem gives rise to a branch-and-cut algorithm for quadratic combinatorial problems, which we use to compute optimal layouts of tanglegrams, an application from computational biology. Additionally we study the effect of quadratic reformulation of linear constraints, both theoretically and experimentally.
The last class of nonlinear combinatorial problems we consider are two-scenario problems. Here we propose a new technique to compute lower bounds in the unconstrained subproblem of the decomposition. Our computational study of the two-scenario minimum spanning tree problem shows that the new Lagrangean decomposition-based algorithm is able to solve significantly larger instances than the standard linearization approach.

Social Choice Meets Graph Drawing: How to Get Subexponential Time Algorithms for Ranking and Drawing Problems

Article

Aug 2014

We analyze a common feature of p-Kemeny AGGregation (p-KAGG) and p-One-Sided Crossing Minimization (p-OSCM) to provide new insights and findings of interest to both the graph drawing community and the social choice community. We obtain parameterized subexponential-time algorithms for p-KAGG (a problem in social choice theory) and for p-OSCM (a problem in graph drawing). These algorithms run in time $O^*(2^{O(\sqrt{k}\log k)})$, where $k$ is the parameter, and significantly improve the previous best algorithms with running times $O^*(1.403^k)$ and $O^*(1.4656^k)$, respectively. We also study natural “above-guarantee” versions of these problems and show them to be fixed-parameter tractable. In fact, we show that the above-guarantee versions of these problems are equivalent to a weighted variant of p-directed feedback arc set. Our results for the above-guarantee version of p-KAGG reveal an interesting contrast. We show that when the number of “votes” in the input to p-KAGG is odd, the above-guarantee version can still be solved in time $O^*(2^{O(\sqrt{l}\log k)})$, while if it is even then the problem cannot have a subexponential-time algorithm unless the exponential time hypothesis fails (equivalently, unless FPT = M[1]).

4-Hydroxyphenylglycine biosynthesis in Herpetosiphon aurantiacus: A case of gene duplication and catalytic divergence

Article

Feb 2012
ARCH MICROBIOL

The nonproteinogenic amino acid 4-hydroxyphenylglycine (HPG) arises from the diversion of the tyrosine degradation pathway into secondary metabolism, and its biosynthesis requires a set of three enzymes. The gene cassette for HPG biosynthesis is widely spread in actinomycete bacteria, which incorporate the amino acid as a building block into various peptide antibiotics, but it has never been reported from another taxonomic group of eubacteria. A genome mining study has now revealed a putative HPG pathway in the predatory bacterium Herpetosiphon aurantiacus, which is phylogenetically distinct from Actinomycetes. Anomalies in the active center of one annotated key enzyme raised questions about the true product of this pathway, prompting an in vitro reconstitution attempt. This study confirmed the capability of H. aurantiacus for HPG production. Sequence analysis of the aberrant 4-hydroxymandelate synthase refines the existing model on the catalytic differentiation of iron(II)-dependent dioxygenases. Furthermore, we report a comprehensive analysis on the phylogeny of these enzymes, which sheds light on the evolution of paralogous gene sets and the ensuing metabolic diversity in a barely studied bacterium.

Compression via Matroids: A Randomized Polynomial Kernel for Odd Cycle Transversal

Article

Jul 2011
ACM Trans Algorithm

The Odd Cycle Transversal problem (OCT) asks whether a given graph can be made bipartite by deleting at most $k$ of its vertices. In a breakthrough result Reed, Smith, and Vetta (Operations Research Letters, 2004) gave an $O(4^k kmn)$-time algorithm for it, the first algorithm with polynomial runtime of uniform degree for every fixed $k$. It is known that this implies a polynomial-time compression algorithm that turns OCT instances into equivalent instances of size at most $O(4^k)$, a so-called kernelization. Since then the existence of a polynomial kernel for OCT, i.e., a kernelization with size bounded polynomially in $k$, has turned into one of the main open questions in the study of kernelization. This work provides the first (randomized) polynomial kernelization for OCT. We introduce a novel kernelization approach based on matroid theory, where we encode all relevant information about a problem instance into a matroid with a representation of size polynomial in $k$. For OCT, the matroid is built to allow us to simulate the computation of the iterative compression step of the algorithm of Reed, Smith, and Vetta, applied (for only one round) to an approximate odd cycle transversal which it is aiming to shrink to size $k$. The process is randomized with one-sided error exponentially small in $k$, where the result can contain false positives but no false negatives, and the size guarantee is cubic in the size of the approximate solution. Combined with an $O(\sqrt{\log n})$-approximation (Agarwal et al., STOC 2005), we get a reduction of the instance to size $O(k^{4.5})$, implying a randomized polynomial kernelization.
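
As a reading aid for the problem statement only, the following sketch decides OCT by trying every vertex subset of size at most $k$ and testing bipartiteness with breadth-first search. It is an exponential brute force under an assumed edge-list encoding; it is neither the iterative-compression algorithm of Reed, Smith, and Vetta nor the matroid-based kernelization, and all names in it are illustrative.

```python
from collections import deque
from itertools import combinations


def is_bipartite(vertices, edges):
    """BFS 2-coloring of the subgraph induced by `vertices`."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        if a in adj and b in adj:  # ignore edges touching deleted vertices
            adj[a].add(b)
            adj[b].add(a)
    side = {}
    for start in vertices:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            a = queue.popleft()
            for b in adj[a]:
                if b not in side:
                    side[b] = 1 - side[a]
                    queue.append(b)
                elif side[b] == side[a]:
                    return False  # an odd cycle survives
    return True


def odd_cycle_transversal(vertices, edges, k):
    """Return a smallest vertex set of size <= k whose deletion leaves a
    bipartite graph, or None if no such set exists."""
    for size in range(k + 1):
        for deleted in combinations(vertices, size):
            rest = [v for v in vertices if v not in deleted]
            if is_bipartite(rest, edges):
                return set(deleted)
    return None


# Hypothetical instance: two triangles sharing vertex 0.
vertices = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 0)]

print(odd_cycle_transversal(vertices, edges, 0))  # None
print(odd_cycle_transversal(vertices, edges, 1))  # {0}
```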

Tanglegrams for rooted phylogenetic trees and networks

Article

Jul 2011
BIOINFORMATICS

In systematic biology, one is often faced with the task of comparing different phylogenetic trees, in particular in multi-gene analysis or cospeciation studies. One approach is to use a tanglegram in which two rooted phylogenetic trees are drawn opposite each other, using auxiliary lines to connect matching taxa. There is an increasing interest in using rooted phylogenetic networks to represent evolutionary history, so as to explicitly represent reticulate events, such as horizontal gene transfer, hybridization or reassortment. Thus, the question arises how to define and compute a tanglegram for such networks. In this article, we present the first formal definition of a tanglegram for rooted phylogenetic networks and present a heuristic approach for computing one, called the NN-tanglegram method. We compare the performance of our method with existing tree tanglegram algorithms and also show a typical application to real biological datasets. For maximum usability, the algorithm does not require that the trees or networks are bifurcating or bicombining, or that they are on identical taxon sets. The algorithm is implemented in our program Dendroscope 3, which is freely available from www.dendroscope.org. Contact: scornava@informatik.uni-tuebingen.de; huson@informatik.uni-tuebingen.de.

Citations