Fabio Pardi
French National Centre for Scientific Research | CNRS · Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier (LIRMM)

Doctor of Philosophy

About

Publications

5,271

Reads

5,970

Citations

Skills and Expertise

Computational Phylogenetics

Biomathematics

Algorithms

Theoretical Biology

Evolutionary Bioinformatics

Publications

Phylogenetic Inference: Distance-Based Methods

Chapter

Apr 2024

Fabio Pardi

EPIK: Precise and scalable evolutionary placement with informative k-mers

Article

Full-text available

Nov 2023

Motivation: Phylogenetic placement enables phylogenetic analysis of massive collections of newly sequenced DNA, when de novo tree inference is too unreliable or inefficient. Assuming that a high-quality reference tree is available, the idea is to seek the correct placement of the new sequences in that tree. Recently, alignment-free approaches to ph...

Computing Phylo-k-Mers

Article

May 2023

Finding the correct position of new sequences within an established phylogenetic tree is an increasingly relevant problem in evolutionary bioinformatics and metagenomics. Recently, alignment-free approaches for this task have been proposed. One such approach is based on the concept of phylogenetically-informative k-mers or phylo-k-mers fo...

Computing Phylo-k-mers

Preprint

Sep 2022

Phylogenetically informed k-mers, or phylo-k-mers for short, are k-mers that are predicted to appear within a given genomic region at predefined locations of a fixed phylogeny. Given a reference alignment for this genomic region and assuming a phylogenetic model of sequence evolution, we can compute a probability score for any given k-mer at any gi...

A phylogenetic approach for weighting genetic sequences

Article

Full-text available

May 2021

Background Many important applications in bioinformatics, including sequence alignment and protein family profiling, employ sequence weighting schemes to mitigate the effects of non-independence of homologous sequences and under- or over-representation of certain taxa in a dataset. These schemes aim to assign high weights to sequences that are ‘nov...

On the inference of complex phylogenetic networks by Markov Chain Monte-Carlo

Article

Full-text available

May 2021

For various species, high quality sequences and complete genomes are nowadays available for many individuals. This makes data analysis challenging, as methods need not only to be accurate, but also time efficient given the tremendous amount of data to process. In this article, we introduce an efficient method to infer the evolutionary history of in...

Rapid screening and detection of inter-type viral recombinants using phylo-k-mers

Article

Full-text available

Dec 2020

Motivation Novel recombinant viruses may have important medical and evolutionary significance, as they sometimes display new traits not present in the parental strains. This is particularly concerning when the new viruses combine fragments coming from phylogenetically-distinct viral types. Here, we consider the task of screening large collections o...

Phylogenetic Novelty Scores: a New Approach for Weighting Genetic Sequences

Preprint

Full-text available

Dec 2020

Computing the probability of gene trees concordant with the species tree in the multispecies coalescent

Article

Dec 2020

The multispecies coalescent process models the genealogical relationships of genes sampled from several species, enabling useful predictions about phenomena such as the discordance between a gene tree and the species phylogeny due to incomplete lineage sorting. Conversely, knowledge of large collections of gene trees can inform us about several asp...

PEWO: A collection of workflows to benchmark phylogenetic placement

Article

Full-text available

Jul 2020

Motivation: Phylogenetic placement (PP) is a process of taxonomic identification for which several tools are now available. However, it remains difficult to assess which tool is more adapted to particular genomic data or a particular reference taxonomy. We developed PEWO, the first benchmarking tool dedicated to PP assessment. Its automated workfl...

Rapid screening and detection of inter-type viral recombinants using phylo-k-mers

Preprint

Full-text available

Jun 2020

Motivation: Novel recombinant viruses may have important medical and evolutionary significance, as they sometimes display new traits not present in the parental strains. This is particularly concerning when the new viruses combine fragments coming from phylogenetically-distinct viral types. Here, we consider the task of screening large collections...

Computing the probability of gene trees concordant with the species tree in the multispecies coalescent

Preprint

Jan 2020

The multispecies coalescent process models the genealogical relationships of genes sampled from several species, enabling useful predictions about phenomena such as the discordance between the gene tree and the species phylogeny due to incomplete lineage sorting. Conversely, knowledge of large collections of gene trees can inform us about several a...

Cutting an alignment with Ockham's razor

Preprint

Oct 2019

In this article, we investigate different parsimony-based approaches towards finding recombination breakpoints in a multiple sequence alignment. This recombination detection task is crucial in order to avoid errors in evolutionary analyses caused by mixing together portions of sequences which had a different evolution history. Following an overview...

Rapid alignment-free phylogenetic identification of metagenomic sequences

Article

Full-text available

Jan 2019

Motivation Taxonomic classification is at the core of environmental DNA analysis. When a phylogenetic tree can be built as a prior hypothesis to such classification, phylogenetic placement (PP) provides the most informative type of classification because each query sequence is assigned to its putative origin in the tree. This is useful whenever pre...

Finding a most parsimonious or likely tree in a network with respect to an alignment

Article

Full-text available

Jan 2019

Phylogenetic networks are often constructed by merging multiple conflicting phylogenetic signals into a directed acyclic graph. It is interesting to explore whether a network constructed in this way induces biologically-relevant phylogenetic signals that were not present in the input. Here we show that, given a multiple alignment A for a set of tax...

Cutting an alignment with Ockham's razor

Article

Jan 2019

Rapid alignment-free phylogenetic identification of metagenomic sequences

Preprint

Full-text available

May 2018

Rearrangement Moves on Rooted Phylogenetic Networks

Article

Full-text available

Aug 2017

Phylogenetic tree reconstruction is usually done by local search heuristics that explore the space of the possible tree topologies via simple rearrangements of their structure. Tree rearrangement heuristics have been used in combination with practically all optimization criteria in use, from maximum likelihood and parsimony to distance-based princi...

S1 Text

Data

Aug 2017

Supporting information: Proofs omitted from the main text. This document provides the proofs of Lemmas 2, 3, 5, Theorem 4, and Proposition 2. It ends with a few remarks on the size of rNNI neighborhoods. (PDF)

Distance-Based Phylogenetic Inference

Article

Dec 2016

A popular approach in phylogenetics consists in estimating a matrix of evolutionary distances between pairs of taxa, and then using this information to reconstruct a phylogenetic tree for those taxa. In this article, we first explain how distances should be defined and estimated, and then focus on the task of inferring a phylogenetic tree that acco...

Do Branch Lengths Help to Locate a Tree in a Phylogenetic Network?

Article

Jul 2016

Phylogenetic networks are increasingly used in evolutionary biology to represent the history of species that have undergone reticulate events such as horizontal gene transfer, hybrid speciation and recombination. One of the most fundamental questions that arise in this context is whether the evolution of a gene with one copy in all species can be e...

Fast and accurate branch lengths estimation for phylogenomic trees

Article

Full-text available

Jan 2016

Branch lengths are an important attribute of phylogenetic trees, providing essential information for many studies in evolutionary biology. Yet, part of the current methodology to reconstruct a phylogeny from genomic information — namely supertree methods — focuses on the topology or structure of the phylogenetic tree, rather than the evolutionary d...

Distance-Based Phylogeny Reconstruction: Safety and Edge Radius

Chapter

Jan 2016

Distance-based methods in phylogenetics

Book

Jan 2016

Distance-Based Phylogeny Reconstruction: Safety and Edge Radius

Book

Full-text available

Jun 2015

A phylogeny is an evolutionary tree tracing the shared history, including common ancestors, of a set of extant species or “taxa”. Phylogenies are increasingly reconstructed on the basis of molecular data (DNA and protein sequences) using statistical techniques such as likelihood and Bayesian methods. Algorithmically, these techniques suffer from th...

Reconstructible Phylogenetic Networks: Do Not Distinguish the Indistinguishable

Article

Full-text available

Apr 2015

Phylogenetic networks represent the evolution of organisms that have undergone reticulate events, such as recombination, hybrid speciation or lateral gene transfer. An important way to interpret a phylogenetic network is in terms of the trees it displays, which represent all the possible histories of the characters carried by the organisms in the n...

Combinatorics of Distance-Based Tree Inference

Article

Full-text available

Sep 2012

Several popular methods for phylogenetic inference (or hierarchical clustering) are based on a matrix of pairwise distances between taxa (or any kind of objects): The objective is to construct a tree with branch lengths so that the distances between the leaves in that tree are as close as possible to the input distances. If we hold the structure (t...

Robustness of Phylogenetic Inference Based on Minimum Evolution

Article

Full-text available

May 2010

Minimum evolution is the guiding principle of an important class of distance-based phylogeny reconstruction methods, including neighbor-joining (NJ), which is the most cited tree inference algorithm to date. The minimum evolution principle involves searching for the tree with minimum length, where the length is estimated using various least-squares...

Approximate Maximum Parsimony and Ancestral Maximum Likelihood

Article

Full-text available

Jan 2010

We explore the maximum parsimony (MP) and ancestral maximum likelihood (AML) criteria in phylogenetic tree reconstruction. Both problems are NP-hard, so we seek approximate solutions. We formulate the two problems as Steiner tree problems under appropriate distances. The gist of our approach is the succinct characterization of Steiner trees for a s...

Budgeted Phylogenetic Diversity on Circular Split Systems

Article

Full-text available

Apr 2009

In the last 15 years, Phylogenetic Diversity (PD) has gained interest in the community of conservation biologists as a surrogate measure for assessing biodiversity. We have recently proposed two approaches to select taxa for maximizing PD, namely PD with budget constraints and PD on split systems. In this paper, we will unify these two strategies a...

Distribution of phylogenetic diversity under random extinction

Article

Apr 2008

Phylogenetic diversity is a measure for describing how much of an evolutionary tree is spanned by a subset of species. If one applies this to the unknown subset of current species that will still be present at some future time, then this 'future phylogenetic diversity' provides a measure of the impact of various extinction scenarios in biodiversity...

Determination and validation of principal gene products

Article

Full-text available

Feb 2008

Motivation: Alternative splicing has the potential to generate a wide range of protein isoforms. For many computational applications and for experimental research, it is important to be able to concentrate on the isoform that retains the core biological function. For many genes this is far from clear. Results: We have combined five methods into...

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project

Article

Full-text available

Jul 2007

We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowle...

Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome

Article

Full-text available

Jul 2007

A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; com...

Resource-Aware Taxon Selection for Maximizing Phylogenetic Diversity

Article

Jul 2007

Phylogenetic diversity (PD) is a useful metric for selecting taxa in a range of biological applications, for example, bioconservation and genomics, where the selection is usually constrained by the limited availability of resources. We formalize taxon selection as a conceptually simple optimization problem, aiming to maximize PD subject to resource...

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project

Article

Jan 2007

Species Choice for Comparative Genomics: Being Greedy Works

Article

Full-text available

Jan 2006

Synopsis What would happen if sequencing centres around the world were to choose genomes without consulting each other and without devising long-term strategies? When several parties are involved in decisions with interacting consequences, experience teaches that cooperation and planning are usually necessary to guarantee the best result. Similarly...

GSMA: Software implementation of the genome search meta-analysis method

Article

Full-text available

Jan 2006

Meta-analysis can be used to pool results of genome-wide linkage scans. This is of great value in complex diseases, where replication of linked regions occurs infrequently. The genome search meta-analysis (GSMA) method is widely used for this analysis, and a computer program is now available to implement the GSMA. Availability: Author Webpage Contac...

SNP Selection for Association Studies: Maximizing Power across SNP Choice and Study Size

Article

Dec 2005

Selection of single nucleotide polymorphisms (SNPs) is a problem of primary importance in association studies and several approaches have been proposed. However, none provides a satisfying answer to the problem of how many SNPs should be selected, and how this should depend on the pattern of linkage disequilibrium (LD) in the region under considera...

Meta-analysis of genome scans of age-related macular degeneration

Article

Full-text available

Sep 2005

A genetic contribution to the development of age-related macular degeneration (AMD) is well established. Several genome-wide linkage studies have identified a number of putative susceptibility loci for AMD but only a few of these regions have been replicated in independent studies. Here, we perform a meta-analysis of six AMD genome screens using th...

On the Structural Differences Between Markers and Genomic AC Microsatellites

Article

Jun 2005

AC microsatellites have proved particularly useful as genetic markers. For some purposes, such as in population biology, the inferences drawn depend on the quantitative values of their mutation rates. This, together with intrinsic biological interest, has led to widespread study of microsatellite mutational mechanisms. Now, however, inconsistencies...

Species choice for comparative genomics: No need for cooperation

Article