|
Jean-Christophe Aude,
Yolande Diaz-Lazcoz,
Jean-Jacques Codani and
Jean-Loup Risler. Applications of the Pyramidal Clustering Method to Biological Objects. In CC, Vol. 23(3-4):303-315, 1999. Keywords: from distances, phylogenetic network, phylogeny, Program Pyramids, pyramid, reconstruction, software, visualization. Note: http://dx.doi.org/10.1016/S0097-8485(99)00006-6.
|
|
|
Vineet Bafna and
Vikas Bansal. The Number of Recombination Events in a Sample History: Conflict Graph and Lower Bounds. In TCBB, Vol. 1(2):78-90, 2004. Keywords: ARG, bound, minimum number, phylogeny, recombination. Note: http://www-cse.ucsd.edu/users/vbafna/pub/tcbb04.pdf.
"We consider the following problem: Given a set of binary sequences, determine lower bounds on the minimum number of recombinations required to explain the history of the sample, under the infinite-sites model of mutation. The problem has implications for finding recombination hotspots and for the Ancestral Recombination Graph reconstruction problem. Hudson and Kaplan gave a lower bound based on the four-gamete test. In practice, their bound R_m often greatly underestimates the minimum number of recombinations. The problem was recently revisited by Myers and Griffiths, who introduced two new lower bounds R_h and R_s which are provably better, and also yield good bounds in practice. However, the worst-case complexities of their procedures for computing R_h and R_s are exponential and super-exponential, respectively. In this paper, we show that the number of nontrivial connected components, R_c, in the conflict graph for a given set of sequences, computable in time O(nm^2), is also a lower bound on the minimum number of recombination events. We show that in many cases, R_c is a better bound than R_h. The conflict graph was used by Gusfield et al. to obtain a polynomial time algorithm for the galled tree problem, which is a special case of the Ancestral Recombination Graph (ARG) reconstruction problem. Our results also offer some insight into the structural properties of this graph and are of interest for the general Ancestral Recombination Graph reconstruction problem."
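Illustrative sketch (not taken from the paper): the bound R_c described in the abstract can be approximated in a few lines, assuming that two sites are joined in the conflict graph whenever they fail the four-gamete test, i.e. all of 00, 01, 10 and 11 occur in that pair of columns. The function names `conflicts` and `nontrivial_components` are hypothetical, and the paper's exact definition of the conflict graph may differ.

```python
from itertools import combinations

def conflicts(rows, i, j):
    """Assumed conflict rule: sites i and j show all four gametes."""
    return len({(r[i], r[j]) for r in rows}) == 4

def nontrivial_components(rows):
    """Count conflict-graph components that contain at least one edge."""
    m = len(rows[0]) if rows else 0
    parent = list(range(m))          # union-find forest over the m sites

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    has_edge = [False] * m
    for i, j in combinations(range(m), 2):  # m^2 pairs, n rows each
        if conflicts(rows, i, j):
            has_edge[i] = has_edge[j] = True
            parent[find(i)] = find(j)
    return len({find(i) for i in range(m) if has_edge[i]})
```

Each pair of sites is scanned once over all n sequences, which is in line with the O(nm^2) running time quoted in the abstract.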
|
|
|
|
|
Hans-Jürgen Bandelt and
Andreas W. M. Dress. Weak hierarchies associated with similarity measures: an additive clustering technique. In BMB, Vol. 51:113-166, 1989. Keywords: abstract network, clustering, from distances, from trees, phylogenetic network, phylogeny, Program WeakHierarchies, reconstruction, weak hierarchy. Note: http://dx.doi.org/10.1007/BF02458841.
"A new and apparently rather useful and natural concept in cluster analysis is studied: given a similarity measure on a set of objects, a sub-set is regarded as a cluster if any two objects a, b inside this sub-set have greater similarity than any third object outside has to at least one of a, b. These clusters then form a closure system which can be described as a hypergraph without triangles. Conversely, given such a system, one may attach some weight to each cluster and then compose a similarity measure additively, by letting the similarity of a pair be the sum of weights of the clusters containing that particular pair. The original clusters can be reconstructed from the obtained similarity measure. This clustering model is thus located between the general additive clustering model of Shepard and Arabie (1979) and the standard hierarchical model. Potential applications include fitting dendrograms with few additional nonnested clusters and simultaneous representation of some families of multiple dendrograms (in particular, two-dendrogram solutions), as well as assisting the search for phylogenetic relationships by proposing a somewhat larger system of possibly relevant "family groups", from which an appropriate choice (based on additional insight or individual preferences) remains to be made. © 1989 Society for Mathematical Biology."
|
|
|
Hans-Jürgen Bandelt and
Andreas W. M. Dress. A canonical decomposition theory for metrics on a finite set. In Advances in Mathematics, Vol. 92(1):47-105, 1992. Keywords: abstract network, circular split system, from distances, split, split decomposition, split network, weak hierarchy, weakly compatible.
"We consider specific additive decompositions d = d1 + ... + dn of metrics, defined on a finite set X (where a metric may give distance zero to pairs of distinct points). The simplest building stones are the split metrics, associated to splits (i.e., bipartitions) of the given set X. While an additive decomposition of a Hamming metric into split metrics is in no way unique, we achieve uniqueness by restricting ourselves to coherent decompositions, that is, decompositions d = d1 + ... + dn such that for every map f: X → R with f(x) + f(y) ≥ d(x, y) for all x, y ∈ X there exist maps f1, ..., fn: X → R with f = f1 + ... + fn and fi(x) + fi(y) ≥ di(x, y) for all i = 1, ..., n and all x, y ∈ X. These coherent decompositions are closely related to a geometric decomposition of the injective hull of the given metric. A metric with a coherent decomposition into a (weighted) sum of split metrics will be called totally split-decomposable. Tree metrics (and more generally, the sum of two tree metrics) are particular instances of totally split-decomposable metrics. Our main result confirms that every metric admits a coherent decomposition into a totally split-decomposable metric and a split-prime residue, where all the split summands and hence the decomposition can be determined in polynomial time, and that a family of splits can occur this way if and only if it does not induce on any four-point subset all three splits with block size two. © 1992."
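Illustrative sketch (not taken from the paper): the four-point criterion stated at the end of the abstract can be tested directly. The code assumes each split is given by one of its two sides (the other side is the complement in `points`); the names `induced_pairing` and `is_weakly_compatible` are hypothetical.

```python
from itertools import combinations

def induced_pairing(side, quad):
    """Return the 2|2 split that `side` induces on the four points, or None."""
    inside = frozenset(quad) & side
    if len(inside) != 2:
        return None                      # induced split is trivial or 1|3
    return frozenset({inside, frozenset(quad) - inside})

def is_weakly_compatible(splits, points):
    """True if no four-point subset carries all three 2|2 splits."""
    sides = [frozenset(s) for s in splits]
    for quad in combinations(points, 4):
        pairings = {induced_pairing(s, quad) for s in sides}
        pairings.discard(None)
        if len(pairings) == 3:           # ab|cd, ac|bd and ad|bc all present
            return False
    return True
```

This brute-force check enumerates every four-point subset, so it is only meant for small examples.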
|
|
|
|
|
Hans-Jürgen Bandelt and
Andreas W. M. Dress. An order theoretic framework for overlapping clustering. In DM, Vol. 136(1-3):21-37, 1994.
"Cluster analysis deals with procedures which - given a finite collection X of objects together with some kind of local dissimilarity information - identify those subcollections C of objects from X, called clusters, which exhibit a comparatively low degree of internal dissimilarity. In this note we study arbitrary mappings φ which assign to each subcollection A ⊆ X of objects its internal degree of dissimilarity φ(A), subject only to the natural condition that A ⊆ B ⊆ X implies φ(A) ≤ φ(B), and we analyse on a rather abstract, purely order theoretic level how assumptions concerning the way such a mapping φ might be constructed from local data (that is, data involving only a few objects at a time) influence the degree of overlapping observed within the resulting family of clusters, and vice versa. Hence, unlike previous order theoretic approaches to cluster analysis, we do not restrict our attention to nonoverlapping, hierarchical clustering. Instead, we regard a dissimilarity function φ as an arbitrary isotone mapping from a finite partially ordered set I - e.g. the set P(X) of all subsets A of a finite set X - into a (partially) ordered set R - e.g. the nonnegative real numbers - and we study the correspondence between the two subsets C(φ) and D(φ) of I, formed by the elements whose images are inaccessible from above and from below, respectively. While D(φ) constitutes the local data structure from which φ can be built up, C(φ) embodies the family of clusters associated with φ. Our results imply that in case I := P(X) and R := ℝ≥0 one has #D ≤ n for all D ∈ D(φ) and some fixed n ∈ ℕ if and only if {A figure is presented} for all C0, ..., Cn ∈ C(φ) if and only if this holds for all subsets C0, ..., Cn ⊆ X, generalizing a well-known criterion for n-conformity of hypergraphs as well as corresponding results due to Batbedat, dealing with the case n = 2. © 1994."
|
|
|
Sergey Bereg and
Yuanyi Zhang. Phylogenetic Networks Based on the Molecular Clock Hypothesis. In TCBB, Vol. 3(4), 2006. Note: http://www.utdallas.edu/~yzhang/Homepage/Papers/prep-tcbb.pdf.
A classical result in phylogenetic trees is that a binary phylogenetic tree adhering to the molecular clock hypothesis exists if and only if the matrix of distances between taxa is ultrametric. The ultrametric condition is very restrictive. In this paper we study phylogenetic networks that can be constructed assuming the molecular clock hypothesis. We characterize distance matrices that admit such networks for 3 and 4 taxa. We also design two algorithms for constructing networks optimizing the least-squares fit.
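Illustrative sketch (not taken from the paper): the ultrametric condition mentioned in the abstract is commonly checked with the three-point condition, under which the two largest of the three pairwise distances among any three taxa must be equal. The function name `is_ultrametric` and the tolerance `tol` are assumptions made for this example.

```python
from itertools import combinations

def is_ultrametric(d, tol=1e-9):
    """Check the three-point condition on a symmetric distance matrix d."""
    n = len(d)
    for i, j, k in combinations(range(n), 3):
        # Sort the three pairwise distances; the two largest must coincide.
        _, middle, largest = sorted((d[i][j], d[i][k], d[j][k]))
        if largest - middle > tol:
            return False
    return True
```

A matrix that fails this check cannot be fitted exactly by a clock-like tree, which is the setting where the networks studied in this entry become relevant.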
|
|
|
|
|
|
|
|
|
|
|
Mihaela Baroni,
Stefan Grünewald,
Vincent Moulton and
Charles Semple. Bounding the number of hybridization events for a consistent evolutionary history. In JOMB, Vol. 51(2):171-182, 2005. Keywords: agreement forest, bound, explicit network, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BGMS05.pdf.
"Evolutionary processes such as hybridisation, lateral gene transfer, and recombination are all key factors in shaping the structure of genes and genomes. However, since such processes are not always best represented by trees, there is now considerable interest in using more general networks instead. For example, in recent studies it has been shown that networks can be used to provide lower bounds on the number of recombination events and also for the number of lateral gene transfers that took place in the evolutionary history of a set of molecular sequences. In this paper we describe the theoretical performance of some related bounds that result when merging pairs of trees into networks. © Springer-Verlag 2005."
|
|
|
Jaroslaw Byrka,
Pawel Gawrychowski,
Katharina Huber and
Steven Kelk. Worst-case optimal approximation algorithms for maximizing triplet consistency within phylogenetic networks. In Journal of Discrete Algorithms, Vol. 8(1):65-75, 2010. Keywords: approximation, explicit network, from triplets, galled tree, level k phylogenetic network, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/0710.3258.
"The study of phylogenetic networks is of great interest to computational evolutionary biology and numerous different types of such structures are known. This article addresses the following question concerning rooted versions of phylogenetic networks. What is the maximum value of p ∈ [0, 1] such that for every input set T of rooted triplets, there exists some network N such that at least p | T | of the triplets are consistent with N? We call an algorithm that computes such a network (where p is maximum) worst-case optimal. Here we prove that the set containing all triplets (the full triplet set) in some sense defines p. Moreover, given a network N that obtains a fraction p′ for the full triplet set (for any p′), we show how to efficiently modify N to obtain a fraction ≥ p′ for any given triplet set T. We demonstrate the power of this insight by presenting a worst-case optimal result for level-1 phylogenetic networks improving considerably upon the 5/12 fraction obtained recently by Jansson, Nguyen and Sung. For level-2 phylogenetic networks we show that p ≥ 0.61. We emphasize that, because we are taking | T | as a (trivial) upper bound on the size of an optimal solution for each specific input T, the results in this article do not exclude the existence of approximation algorithms that achieve approximation ratio better than p. Finally, we note that all the results in this article also apply to weighted triplet sets. © 2009 Elsevier B.V. All rights reserved."
|
|
|
Magnus Bordewich,
Simone Linz,
Katherine St. John and
Charles Semple. A reduction algorithm for computing the hybridization number of two trees. In EBIO, Vol. 3:86-98, 2007. Keywords: agreement forest, FPT, from rooted trees, hybridization, phylogenetic network, phylogeny, Program HybridNumber. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BLSS07.pdf.
|
|
|
Hans-Jürgen Bandelt,
Vincent Macaulay and
Martin Richards. Median networks: speedy construction and greedy reduction, one simulation, and two case studies from human mtDNA. In MPE, Vol. 16:8-28, 2000. Keywords: from sequences, from splits, median network, phylogenetic network, phylogeny, reconstruction. Note: http://www.stats.gla.ac.uk/~vincent/papers/speedy.pdf.
"Molecular data sets characterized by few phylogenetically informative characters with a broad spectrum of mutation rates, such as intraspecific control-region sequence variation of human mitochondrial DNA (mtDNA), can be usefully visualized in the form of median networks. Here we provide a step-by-step guide to the construction of such networks by hand. We improve upon a previously implemented algorithm by outlining an efficient parametrized strategy amenable to large data sets, greedy reduction, which makes it possible to reconstruct some of the confounding recurrent mutations. This entails some postprocessing as well, which assists in capturing more parsimonious solutions. To simplify the creation of the resulting network by hand, we describe a speedy approach to network construction, based on a careful planning of the processing order. A coalescent simulation tailored to human mtDNA variation in Eurasia testifies to the usefulness of reduced median networks, while highlighting notorious problems faced by all phylogenetic methods in this context. Finally, we discuss two case studies involving the comparison of characters in the two hypervariable segments of the human mtDNA control region in the light of the worldwide control-region sequence database, as well as additional restriction fragment length polymorphism information. We conclude that only a minority of the mutations that hit the second segment occur at sites that would have a mutation rate comparable to those at most sites in the first segment. Discarding the known 'noisy' sites of the second segment enhances the analysis. © 2000 Academic Press."
|
|
|
Magnus Bordewich and
Charles Semple. On the computational complexity of the rooted subtree prune and regraft distance. In ACOM, Vol. 8:409-423, 2005. Keywords: agreement forest, from rooted trees, NP complete, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS04.pdf.
"The graph-theoretic operation of rooted subtree prune and regraft is increasingly being used as a tool for understanding and modelling reticulation events in evolutionary biology. In this paper, we show that computing the rooted subtree prune and regraft distance between two rooted binary phylogenetic trees on the same label set is NP-hard. This resolves a longstanding open problem. Furthermore, we show that this distance is fixed parameter tractable when parameterised by the distance between the two trees."
|
|
|
Magnus Bordewich and
Charles Semple. Computing the minimum number of hybridization events for a consistent evolutionary history. In DAM, Vol. 155:914-918, 2007. Keywords: agreement forest, approximation, APX hard, explicit network, from rooted trees, hybridization, inapproximability, NP complete, phylogenetic network, phylogeny, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS06a.pdf.
|
|
|
|
|
|
|
David Bryant and
Vincent Moulton. NeighborNet: An Agglomerative Method for the Construction of Phylogenetic Networks. In MBE, Vol. 21(2):255-265, 2004. Keywords: phylogenetic network, phylogeny, Program SplitsTree, reconstruction, split network. Note: http://www.math.auckland.ac.nz/~bryant/Papers/04NeighborNet.pdf.
"We present Neighbor-Net, a distance based method for constructing phylogenetic networks that is based on the Neighbor-Joining (NJ) algorithm of Saitou and Nei. Neighbor-Net provides a snapshot of the data that can guide more detailed analysis. Unlike split decomposition, Neighbor-Net scales well and can quickly produce detailed and informative networks for several hundred taxa. We illustrate the method by reanalyzing three published data sets: a collection of 110 highly recombinant Salmonella multi-locus sequence typing sequences, the 135 "African Eve" human mitochondrial sequences published by Vigilant et al., and a collection of 12 Archaeal chaperonin sequences demonstrating strong evidence for gene conversion. Neighbor-Net is available as part of the SplitsTree4 software package."
|
|
|
Mihaela Baroni,
Charles Semple and
Mike Steel. A framework for representing reticulate evolution. In ACOM, Vol. 8:398-401, 2004. Keywords: explicit network, from clusters, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction, regular network, SPR distance. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BSS04.pdf.
"Acyclic directed graphs (ADGs) are increasingly being viewed as more appropriate for representing certain evolutionary relationships, particularly in biology, than rooted trees. In this paper, we develop a framework for the analysis of these graphs which we call hybrid phylogenies. We are particularly interested in the problem whereby one is given a set of phylogenetic trees and wishes to determine a hybrid phylogeny that 'embeds' each of these trees and which requires the smallest number of hybridisation events. We show that this quantity can be greatly reduced if additional species are involved, and investigate other combinatorial aspects of this and related questions."
|
|
|
Mihaela Baroni,
Charles Semple and
Mike Steel. Hybrids in Real Time. In Systematic Biology, Vol. 55(1):46-56, 2006. Keywords: agreement forest, from rooted trees, phylogenetic network, phylogeny, polynomial, reconstruction, time consistent network. Note: http://www.math.canterbury.ac.nz/~m.steel/Non_UC/files/research/hybrids.pdf.
"We describe some new and recent results that allow for the analysis and representation of reticulate evolution by nontree networks. In particular, we (1) present a simple result to show that, despite the presence of reticulation, there is always a well-defined underlying tree that corresponds to those parts of life that do not have a history of reticulation; (2) describe and apply new theory for determining the smallest number of hybridization events required to explain conflicting gene trees; and (3) present a new algorithm to determine whether an arbitrary rooted network can be realized by contemporaneous reticulation events. We illustrate these results with examples. Copyright © Society of Systematic Biologists."
|
|
|
Ho-Leung Chan,
Jesper Jansson,
Tak-Wah Lam and
Siu-Ming Yiu. Reconstructing an Ultrametric Galled Phylogenetic Network from a Distance Matrix. In JBCB, Vol. 4(4):807-832, 2006. Keywords: explicit network, from distances, galled tree, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.df.lth.se/~jj/Publications/dist_ugn7_JBCB2006.pdf.
"Given a distance matrix M that specifies the pairwise evolutionary distances between n species, the phylogenetic tree reconstruction problem asks for an edge-weighted phylogenetic tree that satisfies M, if one exists. We study some extensions of this problem to rooted phylogenetic networks. Our main result is an O(n^2 log n)-time algorithm for determining whether there is an ultrametric galled network that satisfies M, and if so, constructing one. In fact, if such an ultrametric galled network exists, our algorithm is guaranteed to construct one containing the minimum possible number of nodes with more than one parent (hybrid nodes). We also prove that finding a largest possible submatrix M′ of M such that there exists an ultrametric galled network that satisfies M′ is NP-hard. Furthermore, we show that given an incomplete distance matrix (i.e. where some matrix entries are missing), it is also NP-hard to determine whether there exists an ultrametric galled network which satisfies it. © 2006 Imperial College Press."
|
|
|
Charles Choy,
Jesper Jansson,
Kunihiko Sadakane and
Wing-Kin Sung. Computing the maximum agreement of phylogenetic networks. In TCS, Vol. 335(1):93-107, 2005. Keywords: dynamic programming, FPT, level k phylogenetic network, MASN, NP complete, phylogenetic network, phylogeny. Note: http://www.df.lth.se/~jj/Publications/masn8_TCS2005.pdf.
"We introduce the maximum agreement phylogenetic subnetwork problem (MASN) for finding branching structure shared by a set of phylogenetic networks. We prove that the problem is NP-hard even if restricted to three phylogenetic networks and give an O(n^2)-time algorithm for the special case of two level-1 phylogenetic networks, where n is the number of leaves in the input networks and where N is called a level-f phylogenetic network if every biconnected component in the underlying undirected graph induces a subgraph of N containing at most f nodes with indegree 2. We also show how to extend our technique to yield a polynomial-time algorithm for any two level-f phylogenetic networks N1, N2 satisfying f = O(log n); more precisely, its running time is O(|V(N1)|·|V(N2)|·2^(f1+f2)), where V(Ni) and fi denote the set of nodes in Ni and the level of Ni, respectively, for i ∈ {1, 2}. © 2005 Elsevier B.V. All rights reserved."
|
|
|
Mark Clement,
David Posada and
Keith A. Crandall. TCS: a computer program to estimate gene genealogies. In MOLE, Vol. 9:1657-1659, 2000. Keywords: from sequences, parsimony, phylogenetic network, phylogeny, Program TCS, reconstruction, software, statistical parsimony. Note: http://darwin.uvigo.es/download/papers/08.tcs00.pdf.
[No abstract available]
|
|
|
Gabriel Cardona,
Francesc Rosselló and
Gabriel Valiente. Tripartitions do not always discriminate phylogenetic networks. In MBIO, Vol. 211(2):356-370, 2008. Keywords: distance between networks, phylogenetic network, phylogeny, Program Bio PhyloNetwork, tree-child network, tripartition distance. Note: http://arxiv.org/abs/0707.2376, slides available at http://www.newton.cam.ac.uk/webseminars/pg+ws/2007/plg/plgw01/0904/valiente/.
"Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of non-treelike evolutionary events, like recombination, hybridization, or lateral gene transfer. In a recent series of papers devoted to the study of reconstructibility of phylogenetic networks, Moret, Nakhleh, Warnow and collaborators introduced the so-called tripartition metric for phylogenetic networks. In this paper we show that, in fact, this tripartition metric does not satisfy the separation axiom of distances (zero distance means isomorphism, or, in a more relaxed version, zero distance means indistinguishability in some specific sense) in any of the subclasses of phylogenetic networks where it is claimed to do so. We also present a subclass of phylogenetic networks whose members can be singled out by means of their sets of tripartitions (or even clusters), and hence where the latter can be used to define a meaningful metric. © 2007 Elsevier Inc. All rights reserved."
|
|
|
Gabriel Cardona,
Francesc Rosselló and
Gabriel Valiente. Comparison of tree-child phylogenetic networks. In TCBB, Vol. 6(4):552-569, 2009. Keywords: explicit network, phylogenetic network, phylogeny, Program Bio PhyloNetwork, Program PhyloNetwork, tree sibling network, tree-child network. Note: http://arxiv.org/abs/0708.3499.
"Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of nontreelike evolutionary events, like recombination, hybridization, or lateral gene transfer. While much progress has been made to find practical algorithms for reconstructing a phylogenetic network from a set of sequences, all attempts to endorse a class of phylogenetic networks (strictly extending the class of phylogenetic trees) with a well-founded distance measure have, to the best of our knowledge and with the only exception of the bipartition distance on regular networks, failed so far. In this paper, we present and study a new meaningful class of phylogenetic networks, called tree-child phylogenetic networks, and we provide an injective representation of these networks as multisets of vectors of natural numbers, their path multiplicity vectors. We then use this representation to define a distance on this class that extends the well-known Robinson-Foulds distance for phylogenetic trees and to give an alignment method for pairs of networks in this class. Simple polynomial algorithms for reconstructing a tree-child phylogenetic network from its path multiplicity vectors, for computing the distance between two tree-child phylogenetic networks and for aligning a pair of tree-child phylogenetic networks, are provided. They have been implemented as a Perl package and a Java applet, which can be found at http://bioinfo.uib.es/~recerca/phylonetworks/mudistance/. © 2009 IEEE."
|
|
|
|
|
Andreas W. M. Dress,
Daniel H. Huson and
Vincent Moulton. Analyzing and visualizing distance data using SplitsTree. In DAM, Vol. 71(1):95-109, 1996. Keywords: abstract network, from distances, phylogenetic network, phylogeny, Program SplitsTree, software, split network, visualization. Note: http://bibiserv.techfak.uni-bielefeld.de/splits/splits.pdf.
|
|
|
|
|
W. Ford Doolittle. Phylogenetic Classification and the Universal Tree. In Science, Vol. 284:2124-2128, 1999. Note: http://cas.bellarmine.edu/tietjen/Ecology/phylogenetic_classification_and_.htm.
"From comparative analyses of the nucleotide sequences of genes encoding ribosomal RNAs and several proteins, molecular phylogeneticists have constructed a 'universal tree of life,' taking it as the basis for a 'natural' hierarchical classification of all living things. Although confidence in some of the tree's early branches has recently been shaken, new approaches could still resolve many methodological uncertainties. More challenging is evidence that most archaeal and bacterial genomes (and the inferred ancestral eukaryotic nuclear genome) contain genes from multiple sources. If 'chimerism' or 'lateral gene transfer' cannot be dismissed as trivial in extent or limited to special categories of genes, then no hierarchical universal classification can be taken as natural. Molecular phylogeneticists will have failed to find the 'true tree,' not because their methods are inadequate or because they have chosen the wrong genes, but because the history of life cannot properly be represented as a tree. However, taxonomies based on molecular sequences will remain indispensable, and understanding of the evolutionary process will ultimately be enriched, not impoverished."
|
|
|
Andreas W. M. Dress and
Daniel H. Huson. Constructing splits graphs. In TCBB, Vol. 1(3):109-115, 2004. Keywords: abstract network, circular split system, from trees, phylogenetic network, phylogeny, Program SplitsTree, reconstruction, split network, visualization. Note: http://scilib.kiev.ua/ieee/tcbb/2004/03/n3/n0109.pdf.
"Phylogenetic trees correspond one-to-one to compatible systems of splits and so splits play an important role in theoretical and computational aspects of phylogeny. Whereas any tree reconstruction method can be thought of as producing a compatible system of splits, an increasing number of phylogenetic algorithms are available that compute split systems that are not necessarily compatible and, thus, cannot always be represented by a tree. Such methods include the split decomposition, Neighbor-Net, consensus networks, and the Z-closure method. A more general split system of this kind can be represented graphically by a so-called splits graph, which generalizes the concept of a phylogenetic tree. This paper addresses the problem of computing a splits graph for a given set of splits. We have implemented all presented algorithms in a new program called SplitsTree4. © 2004 IEEE."
|
|
|
|
|
Philippe Gambette and
Daniel H. Huson. Improved Layout of Phylogenetic Networks. In TCBB, Vol. 5(3):472-479, 2008. Keywords: abstract network, heuristic, phylogenetic network, phylogeny, Program SplitsTree, software, split network, visualization. Note: http://hal-lirmm.ccsd.cnrs.fr/lirmm-00309694/en/.
"Split networks are increasingly being used in phylogenetic analysis. Usually, a simple equal-angle algorithm is used to draw such networks, producing layouts that leave much room for improvement. Addressing the problem of producing better layouts of split networks, this paper presents an algorithm for maximizing the area covered by the network, describes an extension of the equal-daylight algorithm to networks, looks into using a spring embedder, and discusses how to construct rooted split networks. © 2008 IEEE."
|
|
|
Olivier Gauthier and
François-Joseph Lapointe. Hybrids and Phylogenetics Revisited: A Statistical Test of Hybridization Using Quartets. In Systematic Botany, Vol. 32(1):8-15, 2007. Keywords: explicit network, from quartets, hybridization, phylogenetic network, phylogeny, reconstruction, reticulogram, split decomposition. Note: http://dx.doi.org/10.1600/036364407780360238.
"The occurrence of reticulations in the evolutionary history of species poses serious challenges for all modern practitioners of phylogenetic analysis. Such events, including hybridization, introgression, and lateral gene transfer, lead to evolutionary histories that cannot be adequately represented in the form of phylogenetic trees. Although numerous methods that allow for the reconstruction of phylogenetic networks have been proposed in recent years, the detection of reticulations still remains problematic. In this paper we present a Hybrid Detection Criterion (HDC) along with a statistical procedure that allows for the identification of hybrid taxa. The test assesses whether a putative hybrid is consistently intermediate between its postulated parents, with respect to the other taxa. The performance of the statistical method is evaluated using known hybrids of the genus Aphelandra (Acanthaceae) using two network methods: reticulograms and split decomposition graphs. Our results indicate that the HDC test is reliable when used jointly with split decomposition. On the other hand, the test lacks power and provides misleading results when using reticulograms. We then show how the procedure can be used as a tool to identify putative hybrids. © Copyright 2007 by the American Society of Plant Taxonomists."
|
|
|
|
|
Dan Gusfield,
Satish Eddhu and
Charles Langley. Optimal, Efficient Reconstruction of Phylogenetic Networks with Constrained Recombination. In JBCB, Vol. 2(1):173-213, 2004. Keywords: explicit network, from sequences, galled tree, phylogenetic network, phylogeny, recombination, reconstruction. Note: http://wwwcsif.cs.ucdavis.edu/~gusfield/exfinalrec.pdf.
"A phylogenetic network is a generalization of a phylogenetic tree, allowing structural properties that are not tree-like. In a seminal paper, Wang et al. [1] studied the problem of constructing a phylogenetic network, allowing recombination between sequences, with the constraint that the resulting cycles must be disjoint. We call such a phylogenetic network a "galled-tree". They gave a polynomial-time algorithm that was intended to determine whether or not a set of sequences could be generated on a galled-tree. Unfortunately, the algorithm by Wang et al. [1] is incomplete and does not constitute a necessary test for the existence of a galled-tree for the data. In this paper, we completely solve the problem. Moreover, we prove that if there is a galled-tree, then the one produced by our algorithm minimizes the number of recombinations over all phylogenetic networks for the data, even allowing multiple-crossover recombinations. We also prove that when there is a galled-tree for the data, the galled-tree minimizing the number of recombinations is "essentially unique". We also note two additional results: first, any set of sequences that can be derived on a galled tree can be derived on a true tree (without recombination cycles), where at most one back mutation per site is allowed; second, the site compatibility problem (which is NP-hard in general) can be solved in polynomial time for any set of sequences that can be derived on a galled tree. Perhaps more important than the specific results about galled-trees, we introduce an approach that can be used to study recombination in general phylogenetic networks. This paper greatly extends the conference version that appears in an earlier work [8]. PowerPoint slides of the conference talk can be found at our website. © Imperial College Press."
|
|
|
Dan Gusfield,
Satish Eddhu and
Charles Langley. The fine structure of galls in phylogenetic networks. In INCOMP, Vol. 16(4):459-469, 2004. Keywords: explicit network, from sequences, galled tree, phylogenetic network, phylogeny, reconstruction. Note: http://wwwcsif.cs.ucdavis.edu/~gusfield/informs.pdf.
"A phylogenetic network is a generalization of a phylogenetic tree, allowing properties that are not tree-like. With the growth of genomic data, much of which does not fit ideal tree models, there is greater need to understand the algorithmics and combinatorics of phylogenetic networks (Posada and Crandall 2001, Schierup and Hein 2000). Wang et al. (2001) studied the problem of constructing a phylogenetic network for a set of n binary sequences derived from the all-zero ancestral sequence, when each site in the sequence can mutate from zero to one at most once in the network, and recombination between sequences is allowed. They showed that the problem of minimizing the number of recombinations in such networks is NP-hard, but introduced a special case of the problem, i.e., to determine whether the sequences could be derived on a phylogenetic network where the recombination cycles are node-disjoint. Wang et al. (2001) provide a sufficient, but not a necessary test, for such solutions. Gusfield et al. (2003, 2004) gave a polynomial-time algorithm that is both a necessary and sufficient test. In this paper, we study in much more detail the fine combinatorial structure of node-disjoint cycles in phylogenetic networks, both for purposes of insight into phylogenetic networks and to speed up parts of the previous algorithm. We explicitly characterize all the ways in which mutations can be arranged on a disjoint cycle, and prove a strong necessary condition for a set of mutations to be on a disjoint cycle. The main contribution here is to show how structure in the phylogenetic network is reflected in the structure of an efficiently-computable graph, called the conflict graph. The success of this approach suggests that additional insight into the structure of phylogenetic networks can be obtained by exploring structural properties of the conflict graph."
|
|
|
Stefan Grünewald,
Kristoffer Forslund,
Andreas W. M. Dress and
Vincent Moulton. QNet: An agglomerative method for the construction of phylogenetic networks from weighted quartets. In MBE, Vol. 24(2):532-538, 2007. Keywords: abstract network, circular split system, from quartets, phylogenetic network, phylogeny, Program QNet, reconstruction, software. Note: http://mbe.oxfordjournals.org/cgi/content/abstract/24/2/532.
"We present QNet, a method for constructing split networks from weighted quartet trees. QNet can be viewed as a quartet analogue of the distance-based Neighbor-Net (NNet) method for network construction. Just as NNet, QNet works by agglomeratively computing a collection of circular weighted splits of the taxa set which is subsequently represented by a planar split network. To illustrate the applicability of QNet, we apply it to a previously published Salmonella data set. We conclude that QNet can provide a useful alternative to NNet if distance data are not available or a character-based approach is preferred. Moreover, it can be used as an aid for determining when a quartet-based tree-building method may or may not be appropriate for a given data set. QNet is freely available for download. © The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved."
|
|
|
Dan Gusfield,
Dean Hickerson and
Satish Eddhu. An efficiently computed lower bound on the number of recombinations in phylogenetic networks: Theory and empirical study. In DAM, Vol. 155(6-7):806-830, 2007. Note: http://wwwcsif.cs.ucdavis.edu/~gusfield/cclowerbound.pdf.
"Phylogenetic networks are models of sequence evolution that go beyond trees, allowing biological operations that are not tree-like. One of the most important biological operations is recombination between two sequences. An established problem [J. Hein, Reconstructing evolution of sequences subject to recombination using parsimony, Math. Biosci. 98 (1990) 185-200; J. Hein, A heuristic method to reconstruct the history of sequences subject to recombination, J. Molecular Evoluation 36 (1993) 396-405; Y. Song, J. Hein, Parsimonious reconstruction of sequence evolution and haplotype blocks: finding the minimum number of recombination events, in: Proceedings of 2003 Workshop on Algorithms in Bioinformatics, Berlin, Germany, 2003, Lecture Notes in Computer Science, Springer, Berlin; Y. Song, J. Hein, On the minimum number of recombination events in the evolutionary history of DNA sequences, J. Math. Biol. 48 (2003) 160-186; L. Wang, K. Zhang, L. Zhang, Perfect phylogenetic networks with recombination, J. Comput. Biol. 8 (2001) 69-78; S.R. Myers, R.C. Griffiths, Bounds on the minimum number of recombination events in a sample history, Genetics 163 (2003) 375-394; V. Bafna, V. Bansal, Improved recombination lower bounds for haplotype data, in: Proceedings of RECOMB, 2005; Y. Song, Y. Wu, D. Gusfield, Efficient computation of close lower and upper bounds on the minimum number of needed recombinations in the evolution of biological sequences, Bioinformatics 21 (2005) i413-i422. Bioinformatics (Suppl. 1), Proceedings of ISMB, 2005, D. Gusfield, S. Eddhu, C. Langley, Optimal, efficient reconstruction of phylogenetic networks with constrained recombination, J. Bioinform. Comput. Biol. 2(1) (2004) 173-213; D. Gusfield, Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained and structured recombination, J. Comput. Systems Sci. 
70 (2005) 381-398] is to find a phylogenetic network that derives an input set of sequences, minimizing the number of recombinations used. No efficient, general algorithm is known for this problem. Several papers consider the problem of computing a lower bound on the number of recombinations needed. In this paper we establish a new, efficiently computed lower bound. This result is useful in methods to estimate the number of needed recombinations, and also to prove the optimality of algorithms for constructing phylogenetic networks under certain conditions [D. Gusfield, S. Eddhu, C. Langley, Optimal, efficient reconstruction of phylogenetic networks with constrained recombination, J. Bioinform. Comput. Biol. 2(1) (2004) 173-213; D. Gusfield, Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained and structured recombination, J. Comput. Systems Sci. 70 (2005) 381-398; D. Gusfield, Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained recombination, Technical Report, Department of Computer Science, University of California, Davis, CA, 2004]. The lower bound is based on a structural, combinatorial insight, using only the site conflicts and incompatibilities, and hence it is fundamental and applicable to many biological phenomena other than recombination, for example, when gene conversions or recurrent or back mutations or cross-species hybridizations cause the phylogenetic history to deviate from a tree structure. In addition to establishing the bound, we examine its use in more complex lower bound methods, and compare the bounds obtained to those obtained by other established lower bound methods. © 2006 Elsevier B.V. All rights reserved."
|
|
|
Stefan Grünewald,
Katharina Huber and
Qiong Wu. Two novel closure rules for constructing phylogenetic super-networks. In BMB, Vol. 70(7):1906-1924, 2008. Keywords: abstract network, from splits, from unrooted trees, phylogenetic network, phylogeny, Program MY CLOSURE, reconstruction, supernetwork. Note: http://arxiv.org/abs/0709.0283, slides available at http://www.newton.cam.ac.uk/webseminars/pg+ws/2007/plg/plgw01/0904/huber/.
"A contemporary and fundamental problem faced by many evolutionary biologists is how to puzzle together a collection P of partial trees (leaf-labeled trees whose leaves are bijectively labeled by species or, more generally, taxa, each supported by, e.g., a gene) into an overall parental structure that displays all trees in P. This already difficult problem is complicated by the fact that the trees in P regularly support conflicting phylogenetic relationships and are not on the same but only overlapping taxa sets. A desirable requirement on the sought after parental structure, therefore, is that it can accommodate the observed conflicts. Phylogenetic networks are a popular tool capable of doing precisely this. However, not much is known about how to construct such networks from partial trees, a notable exception being the Z-closure super-network approach, which is based on the Z-closure rule, and the Q-imputation approach. Although attractive approaches, they both suffer from the fact that the generated networks tend to be multidimensional making it necessary to apply some kind of filter to reduce their complexity. To avoid having to resort to a filter, we follow a different line of attack in this paper and develop closure rules for generating circular phylogenetic networks which have the attractive property that they can be represented in the plane. In particular, we introduce the novel Y-(closure) rule and show that this rule on its own or in combination with one of Meacham's closure rules (which we call the M-rule) has some very desirable theoretical properties. In addition, we present a case study based on Rivera et al. "ring of life" to explore the reconstructive power of the M- and Y-rule and also reanalyze an Arabidopsis thaliana data set. © 2008 Society for Mathematical Biology."
|
|
|
Stefan Grünewald,
Vincent Moulton and
Andreas Spillner. Consistency of the QNet algorithm for generating planar split networks from weighted quartets. In DAM, Vol. 157(10):2325-2334, 2009. Keywords: abstract network, consistency, from quartets, phylogenetic network, phylogeny, Program QNet, reconstruction, software. Note: http://dx.doi.org/10.1016/j.dam.2008.06.038.
"Phylogenetic networks are a generalization of evolutionary or phylogenetic trees that allow the representation of conflicting signals or alternative evolutionary histories in a single diagram. Recently the Quartet-Net or "QNet" method was introduced, a method for computing a special kind of phylogenetic network called a split network from a collection of weighted quartet trees (i.e. phylogenetic trees with 4 leaves). This can be viewed as a quartet analogue of the distance-based Neighbor-Net (NNet) method for constructing outer-labeled planar split networks. In this paper, we prove that QNet is a consistent method, that is, we prove that if QNet is applied to a collection of weighted quartets arising from a circular split weight function, then it will return precisely this function. This key property of QNet not only ensures that it is guaranteed to produce a tree if the input corresponds to a tree, and an outer-labeled planar split network if the input corresponds to such a network, but also provides the main guiding principle for the design of the method. © 2008 Elsevier B.V. All rights reserved."
|
|
|
|
|
Barbara R. Holland,
Glenn Conner,
Katharina Huber and
Vincent Moulton. Imputing Supertrees and Supernetworks from Quartets. In Systematic Biology, Vol. 56(1):57-67, 2007. Keywords: abstract network, from unrooted trees, phylogenetic network, phylogeny, Program Quartet, reconstruction, split network, supernetwork. Note: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.99.3215.
"Inferring species phylogenies is an important part of understanding molecular evolution. Even so, it is well known that an accurate phylogenetic tree reconstruction for a single gene does not always necessarily correspond to the species phylogeny. One commonly accepted strategy to cope with this problem is to sequence many genes; the way in which to analyze the resulting collection of genes is somewhat more contentious. Supermatrix and supertree methods can be used, although these can suppress conflicts arising from true differences in the gene trees caused by processes such as lineage sorting, horizontal gene transfer, or gene duplication and loss. In 2004, Huson et al. (IEEE/ACM Trans. Comput. Biol. Bioinformatics 1:151-158) presented the Z-closure method that can circumvent this problem by generating a supernetwork as opposed to a supertree. Here we present an alternative way for generating supernetworks called Q-imputation. In particular, we describe a method that uses quartet information to add missing taxa into gene trees. The resulting trees are subsequently used to generate consensus networks, networks that generalize strict and majority-rule consensus trees. Through simulations and application to real data sets, we compare Q-imputation to the matrix representation with parsimony (MRP) supertree method and Z-closure, and demonstrate that it provides a useful complementary tool. Copyright © Society of Systematic Biologists."
|
|
|
Daniel H. Huson,
Tobias Dezulian,
Tobias Kloepper and
Mike Steel. Phylogenetic Super-Networks from Partial Trees. In TCBB, Vol. 1(4):151-158, 2004. Keywords: abstract network, from unrooted trees, phylogenetic network, phylogeny, Program SplitsTree, reconstruction, supernetwork. Note: http://hdl.handle.net/10092/3177.
"In practice, one is often faced with incomplete phylogenetic data, such as a collection of partial trees or partial splits. This paper poses the problem of Inferring a phylogenetic super-network from such data and provides an efficient algorithm for doing so, called the Z-closure method. Additionally, the questions of assigning lengths to the edges of the network and how to restrict the "dimensionality" of the network are addressed. Applications to a set of five published partial gene trees relating different fungal species and to six published partial gene trees relating different grasses illustrate the usefulness of the method and an experimental study confirms Its potential. The method Is implemented as a plug-in for the program SplitsTree4. © 2004 IEEE."
|
|
|
Barbara R. Holland,
Frédéric Delsuc and
Vincent Moulton. Visualizing Conflicting Evolutionary Hypotheses in Large Collections of Trees: Using Consensus Networks to Study the Origins of Placentals and Hexapods. In Systematic Biology, Vol. 54(1):66-76, 2005. Keywords: consensus. Note: http://hal-sde.archives-ouvertes.fr/halsde-00193050/fr/.
"Many phylogenetic methods produce large collections of trees as opposed to a single tree, which allows the exploration of support for various evolutionary hypotheses. However, to be useful, the information contained in large collections of trees should be summarized; frequently this is achieved by constructing a consensus tree. Consensus trees display only those signals that are present in a large proportion of the trees. However, by their very nature consensus trees require that any conflicts between the trees are necessarily disregarded. We present a method that extends the notion of consensus trees to allow the visualization of conflicting hypotheses in a consensus network. We demonstrate the utility of this method in highlighting differences amongst maximum likelihood bootstrap values and Bayesian posterior probabilities in the placental mammal phylogeny, and also in comparing the phylogenetic signal contained in amino acid versus nucleotide characters for hexapod monophyly. Copyright © Society of Systematic Biologists."
|
|
|
Jotun Hein. Reconstructing evolution of sequences subject to recombination using parsimony. In MBIO, Vol. 98(2):185-200, 1990. Note: http://dx.doi.org/10.1016/0025-5564(90)90123-G.
"The parsimony principle states that a history of a set of sequences that minimizes the amount of evolution is a good approximation to the real evolutionary history of the sequences. This principle is applied to the reconstruction of the evolution of homologous sequences where recombinations or horizontal transfer can occur. First it is demonstrated that the appropriate structure to represent the evolution of sequences with recombinations is a family of trees each describing the evolution of a segment of the sequence. Two trees for neighboring segments will differ by exactly the transfer of a subtree within the whole tree. This leads to a metric between trees based on the smallest number of such operations needed to convert one tree into the other. An algorithm is presented that calculates this metric. This metric is used to formulate a dynamic programming algorithm that finds the most parsimonious history that fits a given set of sequences. The algorithm is potentially very practical, since many groups of sequences defy analysis by methods that ignore recombinations. These methods give ambiguous or contradictory results because the sequence history cannot be described by one phylogeny, but only a family of phylogenies that each describe the history of a segment of the sequences. The generalization of the algorithm to reconstruct gene conversions and the possibility for heuristic versions of the algorithm for larger data sets are discussed. © 1990."
|
|
|
Jotun Hein. A heuristic method to reconstruct the history of sequences subject to recombination. In JME, Vol. 36(4):396-405, 1993. Keywords: explicit network, from sequences, heuristic, parsimony, phylogenetic network, phylogeny, Program RecPars, recombination, recombination detection, software. Note: http://dx.doi.org/10.1007/BF00182187.
|
|
|
Ying-Jun He,
Trinh N. D. Huynh,
Jesper Jansson and
Wing-Kin Sung. Inferring Phylogenetic Relationships Avoiding Forbidden Rooted Triplets. In JBCB, Vol. 4(1):59-74, 2006. Note: http://www.df.lth.se/~jj/Publications/forb_triplets7_JBCB2006.pdf.
"To construct a phylogenetic tree or phylogenetic network for describing the evolutionary history of a set of species is a well-studied problem in computational biology. One previously proposed method to infer a phylogenetic tree/network for a large set of species is by merging a collection of known smaller phylogenetic trees on overlapping sets of species so that no (or as little as possible) branching information is lost. However, little work has been done so far on inferring a phylogenetic tree/network from a specified set of trees when in addition, certain evolutionary relationships among the species are known to be highly unlikely. In this paper, we consider the problem of constructing a phylogenetic tree/network which is consistent with all of the rooted triplets in a given set C and none of the rooted triplets in another given set F. Although NP-hard in the general case, we provide some efficient exact and approximation algorithms for a number of biologically meaningful variants of the problem. © Imperial College Press."
|
|
|
|
|
Katharina Huber,
Michael Langton,
David Penny,
Vincent Moulton and
Mike Hendy. Spectronet: A package for computing spectra and median networks. In ABIO, Vol. 1(3):159-161, 2004. Keywords: from splits, median network, phylogenetic network, phylogeny, Program Spectronet, software, split, visualization. Note: http://citeseer.ist.psu.edu/631776.html.
"Spectronet is a package that uses various methods for exploring and visualising complex evolutionary signals. Given an alignment in NEXUS format, the package works by computing a collection of weighted splits or bipartitions of the taxa and then allows the user to interactively analyse the resulting collection using tools such as Lento-plots and median networks. The package is highly interactive and available for PCs."
|
|
|
Katharina Huber,
Vincent Moulton,
Peter J. Lockhart and
Andreas W. M. Dress. Pruned Median Networks: A Technique for Reducing the Complexity of Median Networks. In MPE, Vol. 19(2):302-310, 2001. Keywords: abstract network, median network, phylogenetic network, phylogeny, split. Note: http://dx.doi.org/10.1006/mpev.2001.0935.
"Observations from molecular marker studies on recently diverged species indicate that substitution patterns in DNA sequences can often be complex and poorly described by tree-like bifurcating evolutionary models. These observations might result from processes of species diversification and/or processes of sequence evolution that are not tree-like. In these cases, bifurcating tree representations provide poor visualization of phylogenetic signals in sequence data. In this paper, we use median networks to study DNA sequence substitution patterns in plant nuclear and chloroplast markers. We describe how to prune median networks to obtain so called pruned median networks. These simpler networks may help to provide a useful framework for investigating the phylogenetic complexity of recently diverged taxa with hybrid origins. © 2001 Academic Press."
|
|
|
Katharina Huber,
Vincent Moulton and
Charles Semple. Replacing cliques by stars in quasi-median graphs. In DAM, Vol. 143(1-3), 2004. Note: http://dx.doi.org/10.1016/j.dam.2004.03.002.
"For a multi-set Σ of splits (bipartitions) of a finite set X, we introduce the multi-split graph G(Σ). This graph is a natural extension of the Buneman graph, Indeed, it is shown that several results pertaining to the Buneman graph extend to the multi-split graph. In addition, in case Σ is derived from a set ℛ of partitions of X by taking parts together with their complements, we show that the extremal instances where ℛ is either strongly compatible or strongly incompatible are equivalent to G(Σ) being either a tree or a Cartesian product of star trees, respectively. © 2004 Elsevier B.V. All rights reserved."
|
|
|
Katharina Huber,
Bengt Oxelman,
Martin Lott and
Vincent Moulton. Reconstructing the Evolutionary History of Polyploids from Multilabeled Trees. In MBE, Vol. 23(9):1784-1791, 2006. Keywords: duplication, explicit network, from multilabeled tree, from trees, phylogenetic network, phylogeny, Program PADRE, reconstruction, software. Note: http://mbe.oxfordjournals.org/cgi/content/full/23/9/1784.
"In recent studies, phylogenetic networks have been derived from so-called multilabeled trees in order to understand the origins of certain polyploids. Although the trees used in these studies were constructed using sophisticated techniques in phylogenetic analysis, the presented networks were inferred using ad hoc arguments that cannot be easily extended to larger, more complicated examples. In this paper, we present a general method for constructing such networks, which takes as input a multilabeled phylogenetic tree and outputs a phylogenetic network with certain desirable properties. To illustrate the applicability of our method, we discuss its use in reconstructing the evolutionary history of plant allopolyploids. We conclude with a discussion concerning possible future directions. The network construction method has been implemented and is freely available for use from http://www.uea.ac.uk/ ∼a043878/padre.html. © The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved."
|
|
|
Vincent Moulton and
Katharina Huber. Phylogenetic networks from multi-labelled trees. In JOMB, Vol. 52(5):613-632, 2006. Keywords: duplication, explicit network, from multilabeled tree, phylogenetic network, phylogeny, Program PADRE, reconstruction. Note: http://www.uea.ac.uk/~a043878/jmb.pdf.
"It is now quite well accepted that the evolutionary past of certain species is better represented by phylogenetic networks as opposed to trees. For example, polyploids are typically thought to have resulted through hybridization and duplication, processes that are probably not best represented as bifurcating speciation events. Based on the knowledge of a multi-labelled tree relating collection of polyploids, we present a canonical construction of a phylogenetic network that exhibits the tree. In addition, we prove that the resulting network is in some well-defined sense a minimal network having this property. © Springer-Verlag 2006."
|
|
|
Richard R. Hudson. Properties of the neutral allele model with intragenic recombination. In TPP, Vol. 23:183-201, 1983. Keywords: coalescent. Note: http://dx.doi.org/10.1016/0040-5809(83)90013-8, see also http://www.brics.dk/~compbio/coalescent/hudson_animator.html.
"An infinite-site neutral allele model with crossing-over possible at any of an infinite number of sites is studied. A formula for the variance of the number of segregating sites in a sample of gametes is obtained. An approximate expression for the expected homozygosity is also derived. Simulation results are presented to indicate the accuracy of the approximations. The results concerning the number of segregating sites and the expected homozygosity indicate that a two-locus model and the infinite-site model behave similarly for 4Nu ≤ 2 and r ≤ 5u, where N is the population size, u is the neutral mutation rate, and r is the recombination rate. Simulations of a two-locus model and a four-locus model were also carried out to determine the effect of intragenic recombination on the homozygosity test ofWatterson (Genetics 85, 789-814; 88, 405-417) and on the number of unique alleles in a sample. The results indicate that for 4Nu ≤ 2 and r ≤ 10u, the effect of recombination is quite small. © 1983."
|
|
|
|
|
|
|
Zhi-Zhong Chen and
Lusheng Wang. HybridNET: a tool for constructing hybridization networks. In BIO, Vol. 26(22):2912-2913, 2010. Keywords: agreement forest, FPT, from rooted trees, hybridization, phylogenetic network, phylogeny, Program HybridNET, software. Note: http://rnc.r.dendai.ac.jp/~chen/papers/note2.pdf.
"Motivations: When reticulation events occur, the evolutionary history of a set of existing species can be represented by a hybridization network instead of an evolutionary tree. When studying the evolutionary history of a set of existing species, one can obtain a phylogenetic tree of the set of species with high confidence by looking at a segment of sequences or a set of genes. When looking at another segment of sequences, a different phylogenetic tree can be obtained with high confidence too. This indicates that reticulation events may occur. Thus, we have the following problem: given two rooted phylogenetic trees on a set of species that correctly represent the tree-like evolution of different parts of their genomes, what is the hybridization network with the smallest number of reticulation events to explain the evolution of the set of species under consideration? Results: We develop a program, named HybridNet, for constructing a hybridization network with the minimum number of reticulate vertices from two input trees. We first implement the O(3dn)-time algorithm by Whidden et al. for computing a maximum (acyclic) agreement forest. Our program can output all the maximum (acyclic) agreement forests. We then augment the program so that it can construct an optimal hybridization network for each given maximum acyclic agreement forest. To our knowledge, this is the first time that optimal hybridization networks can be rapidly constructed. © The Author 2010. Published by Oxford University Press. All rights reserved."
|
|
|
Daniel H. Huson and
David Bryant. Application of Phylogenetic Networks in Evolutionary Studies. In MBE, Vol. 23(2):254-267, 2006. Keywords: abstract network, phylogenetic network, phylogeny, Program SplitsTree, software, survey. Note: http://dx.doi.org/10.1093/molbev/msj030, software available from www.splitstree.org.
"The evolutionary history of a set of taxa is usually represented by a phylogenetic tree, and this model has greatly facilitated the discussion and testing of hypotheses. However, it is well known that more complex evolutionary scenarios are poorly described by such models. Further, even when evolution proceeds in a tree-like manner, analysis of the data may not be best served by using methods that enforce a tree structure but rather by a richer visualization of the data to evaluate its properties, at least as an essential first step. Thus, phylogenetic networks should be employed when reticulate events such as hybridization, horizontal gene transfer, recombination, or gene duplication and loss are believed to be involved, and, even in the absence of such events, phylogenetic networks have a useful role to play. This article reviews the terminology used for phylogenetic networks and covers both split networks and reticulate networks, how they are defined, and how they can be interpreted. Additionally, the article outlines the beginnings of a comprehensive statistical framework for applying split network methods. We show how split networks can represent confidence sets of trees and introduce a conservative statistical test for whether the conflicting signal in a network is treelike. Finally, this article describes a new program, SplitsTree4, an interactive and comprehensive tool for inferring different types of phylogenetic networks from sequences, distances, and trees. © The Author 2005. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved."
|
|
|
Jesper Jansson and
Wing-Kin Sung. Inferring a level-1 phylogenetic network from a dense set of rooted triplets. In TCS, Vol. 363(1):60-68, 2006. Keywords: explicit network, from triplets, galled tree, level k phylogenetic network, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.df.lth.se/~jj/Publications/ipnrt8_TCS2006.pdf.
"We consider the following problem: Given a set T of rooted triplets with leaf set L, determine whether there exists a phylogenetic network consistent with T, and if so, construct one. We show that if no restrictions are placed on the hybrid nodes in the solution, the problem is trivially solved in polynomial time by a simple sorting network-based construction. For the more interesting (and biologically more motivated) case where the solution is required to be a level-1 phylogenetic network, we present an algorithm solving the problem in O (| T |2) time when T is dense, i.e., when T contains at least one rooted triplet for each cardinality three subset of L. We also give an O (| T |5 / 3)-time algorithm for finding the set of all phylogenetic networks having a single hybrid node attached to exactly one leaf (and having no other hybrid nodes) that are consistent with a given dense set of rooted triplets. © 2006 Elsevier B.V. All rights reserved."
|
|
|
Jesper Jansson,
Nguyen Bao Nguyen and
Wing-Kin Sung. Algorithms for Combining Rooted Triplets into a Galled Phylogenetic Network. In SICOMP, Vol. 35(5):1098-1121, 2006. Keywords: approximation, explicit network, from triplets, galled tree, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.df.lth.se/~jj/Publications/triplets_to_gn7_SICOMP2006.pdf.
"This paper considers the problem of determining whether a given set Τ of rooted triplets can be merged without conflicts into a galled phylogenetic network and, if so, constructing such a network. When the input Τ is dense, we solve the problem in O(|Τ|) time, which is optimal since the size of the input is Θ(|Τ|). In comparison, the previously fastest algorithm for this problem runs in O(|Τ|2) time. We also develop an optimal O(|Τ|)-time algorithm for enumerating all simple phylogenetic networks leaf-labeled by L that are consistent with Τ, where L is the set of leaf labels in Τ, which is used by our main algorithm. Next, we prove that the problem becomes NP-hard if extended to nondense inputs, even for the special case of simple phylogenetic networks. We also show that for every positive integer n, there exists some set Τ of rooted triplets on n leaves such that any galled network can be consistent with at most 0.4883 ·|Τ| of the rooted triplets in Τ. On the other hand, we provide a polynomial-time approximation algorithm that always outputs a galled network consistent with at least a factor of 5/12 (> 0.4166) of the rooted triplets in Τ. © 2006 Society for Industrial and Applied Mathematics."
|
|
|
Guohua Jin,
Luay Nakhleh,
Sagi Snir and
Tamir Tuller. Maximum Likelihood of Phylogenetic Networks. In BIO, Vol. 22(21):2604-2611, 2006. Keywords: explicit network, likelihood, phylogenetic network, phylogeny, Program Nepal, reconstruction. Note: http://www.cs.rice.edu/~nakhleh/Papers/NetworksML06.pdf, supplementary material: http://www.cs.rice.edu/~nakhleh/Papers/Supp-ML.pdf.
|
|
|
|
|
|
|
Victor Kunin,
Leon Goldovsky,
Nikos Darzentas and
Christos A. Ouzounis. The net of life: Reconstructing the microbial phylogenetic network. In GR, Vol. 15:954-959, 2005. Note: http://dx.doi.org/10.1101/gr.3666505.
"It has previously been suggested that the phylogeny of microbial species might be better described as a network containing vertical and horizontal gene transfer (HGT) events. Yet, all phylogenetic reconstructions so far have presented microbial trees rather than networks. Here, we present a first attempt to reconstruct such an evolutionary network, which we term the "net of life." We use available tree reconstruction methods to infer vertical inheritance, and use an ancestral state inference algorithm to map HGT events on the tree. We also describe a weighting scheme used to estimate the number of genes exchanged between pairs of organisms. We demonstrate that vertical inheritance constitutes the bulk of gene transfer on the tree of life. We term the bulk of horizontal gene flow between tree nodes as "vines," and demonstrate that multiple but mostly tiny vines interconnect the tree. Our results strongly suggest that the HGT network is a scale-free graph, a finding with important implications for genome evolution. We propose that genes might propagate extremely rapidly across microbial species through the HGT network, using certain organisms as hubs. ©2005 by Cold Spring Harbor Laboratory Press."
|
|
|
Martyn Kennedy,
Barbara R. Holland,
Russel D. Gray and
Hamish G. Spencer. Untangling Long Branches: Identifying Conflicting Phylogenetic Signals Using Spectral Analysis, Neighbor-Net, and Consensus Networks. In Systematic Biology, Vol. 54(4):620-633, 2005. Keywords: abstract network, consensus, NeighborNet, phylogenetic network, phylogeny. Note: http://awcmee.massey.ac.nz/people/bholland/pdf/Kennedy_etal_2005.pdf.
|
|
|
|
|
François-Joseph Lapointe. How to account for reticulation events in phylogenetic analysis: A review of distance-based methods. In Journal of Classification, Vol. 17:175-184, 2000. Keywords: abstract network, evaluation, from distances, phylogenetic network, Program Pyramids, Program SplitsTree, Program T REX, pyramid, reconstruction, reticulogram, split network, survey, weak hierarchy. Note: http://dx.doi.org/10.1007/s003570000016.
|
|
|
Bernard M. E. Moret,
Luay Nakhleh,
Tandy Warnow,
C. Randal Linder,
Anna Tholse,
Anneke Padolina,
Jerry Sun and
Ruth Timme. Phylogenetic Networks: Modeling, Reconstructibility, and Accuracy. In TCBB, Vol. 1(1):13-23, 2004. Keywords: distance between networks, evaluation, phylogenetic network, phylogeny, time consistent network, tripartition distance. Note: http://www.cs.rice.edu/~nakhleh/Papers/tcbb04.pdf.
|
|
|
Monique M. Morin and
Bernard M. E. Moret. NetGen: generating phylogenetic networks with diploid hybrids. In BIO, Vol. 22(15):1921-1923, 2006. Keywords: generation, hybridization, Program NetGen, software. Note: http://dx.doi.org/10.1093/bioinformatics/btl191.
"Summary: NetGen is an event-driven simulator that creates phylogenetic networks by extending the birth-death model to include diploid hybridizations. DNA sequences are evolved in conjunction with the topology, enabling hybridization decisions to be based on contemporary evolutionary distances. NetGen supports variable rate lineages, root sequence specification, outgroup generation and many other options. This note describes the NetGen application and proposes an extension of the Newick format to accommodate phylogenetic networks. © 2006 Oxford University Press."
|
|
|
David A. Morrison. Networks in phylogenetic analysis: new tools for population biology. In IJP, Vol. 35:567-582, 2005. Keywords: median network, NeighborNet, phylogenetic network, phylogeny, population genetics, Program Network, Program Spectronet, Program SplitsTree, Program T REX, Program TCS, reconstruction, reticulogram, split decomposition, survey. Note: http://hem.fyristorg.com/acacia/papers/networks.pdf.
|
|
|
|
|
Cam Thach Nguyen,
Nguyen Bao Nguyen and
Wing-Kin Sung. Fast Algorithms for computing the Tripartition-based Distance between Phylogenetic Networks. In JCO, Vol. 13(3), 2007. Keywords: distance between networks, phylogenetic network, phylogeny, tripartition distance. Note: http://dx.doi.org/10.1007/s10878-006-9025-5.
"Consider two phylogenetic networks N and N′ of size n. The tripartition-based distance finds the proportion of tripartitions which are not shared by N and N′. This distance is proposed by Moret et al. (2004) and is a generalization of Robinson-Foulds distance, which is originally used to compare two phylogenetic trees. This paper gives an O(min{kn log n, n log n + hn})-time algorithm to compute this distance, where h is the number of hybrid nodes in N and N′ while k is the maximum number of hybrid nodes among all biconnected components in N and N′. Note that k ≪ h ≪ n in a phylogenetic network. In addition, we propose algorithms for comparing galled-trees, which are an important, biologically meaningful special case of phylogenetic network. We give an O(n)-time algorithm for comparing two galled-trees. We also give an O(n + kh)-time algorithm for comparing a galled-tree with another general network, where h and k are the number of hybrid nodes in the latter network and its biggest biconnected component respectively. © Springer Science+Business Media, LLC 2007."
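Illustrative note: the abstract above presents the tripartition-based distance as a generalization of the Robinson-Foulds distance on trees. The following is a minimal sketch of the simpler tree case only, not the paper's algorithm. It assumes rooted trees encoded as nested Python tuples with string leaf labels; the function names `clusters` and `rf_distance` are hypothetical and introduced here purely for illustration.

```python
def clusters(tree):
    """Return (leaf_set, clusters) for a rooted tree.

    Assumed encoding: a leaf is a string, an internal node is a tuple of
    subtrees. A cluster is the set of leaves below an internal node.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found |= child_clusters
    found.add(leaves)  # cluster induced by this internal node
    return leaves, found


def rf_distance(t1, t2):
    """Count clusters present in exactly one of the two rooted trees."""
    _, c1 = clusters(t1)
    _, c2 = clusters(t2)
    return len(c1 ^ c2)


t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))
print(rf_distance(t1, t2))
```

Dividing the count by the total number of clusters in both trees would give a proportion, in the spirit of the "proportion of tripartitions which are not shared" mentioned in the abstract.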
|
|
|
Cam Thach Nguyen,
Nguyen Bao Nguyen,
Wing-Kin Sung and
Louxin Zhang. Reconstructing Recombination Network from Sequence Data: The Small Parsimony Problem. In TCBB, Vol. 4(3):394-402, 2007. Keywords: explicit network, from sequences, labeling, NP complete, parsimony, phylogenetic network, phylogeny. Note: http://www.cs.washington.edu/homes/ncthach/Papers/TCBB2007.pdf.
|
|
|
|
|
Luay Nakhleh,
Tandy Warnow,
C. Randal Linder and
Katherine St. John. Reconstructing reticulate evolution in species - theory and practice. In JCB, Vol. 12(6):796-811, 2005. Keywords: from rooted trees, galled tree, phylogenetic network, phylogeny, polynomial, Program SPNet, reconstruction, software. Note: http://www.cs.rice.edu/~nakhleh/Papers/NWLSjcb.pdf.
|
|
|
David Posada,
Keith A. Crandall and
Edward C. Holmes. Recombination in Evolutionary Genomics. In ARG, Vol. 36:75-97, 2002. Keywords: phylogenetic network, phylogeny, recombination, recombination detection, survey. Note: http://dx.doi.org/10.1146/annurev.genet.36.040202.111115.
"Recombination can be a dominant force in shaping genomes and associated phenotypes. To better understand the impact of recombination on genomic evolution, we need to be able to identify recombination in aligned sequences. We review bioinformatic approaches for detecting recombination and measuring recombination rates. We also examine the impact of recombination on the reconstruction of evolutionary histories and the estimation of population genetic parameters. Finally, we review the role of recombination in the evolutionary history of bacteria, viruses, and human mitochondria. We conclude by highlighting a number of areas for future development of tools to help quantify the role of recombination in genomic evolution."
|
|
|
|
|
David Posada and
Keith A. Crandall. Intraspecific gene genealogies: trees grafting into networks. In TEE, Vol. 16(1):37-45, 2001. Keywords: likelihood, median network, netting, parsimony, phylogenetic network, phylogeny, Program Arlequin, Program SplitsTree, Program T REX, Program TCS, pyramid, reticulogram, split decomposition, statistical parsimony, survey. Note: http://darwin.uvigo.es/download/papers/09.networks01.pdf.
|
|
|
Yun S. Song and
Jotun Hein. On the Minimum Number of Recombination Events in the Evolutionary History of DNA Sequences. In JOMB, Vol. 48(2):160-186, 2004. Keywords: minimum number, recombination. Note: http://dx.doi.org/10.1007/s00285-003-0227-5.
"In representing the evolutionary history of a set of binary DNA sequences by a connected graph, a set theoretical approach is introduced for studying recombination events. We show that set theoretical constraints have direct implications on the number of recombination events. We define a new lower bound on the number of recombination events and demonstrate the usefulness of our new approach through several explicit examples. © Springer-Verlag 2003."
|
|
|
|
|
Andreas Spillner,
Binh T. Nguyen and
Vincent Moulton. Computing phylogenetic diversity for split systems. In TCBB, Vol. 5(2):235-244, 2008. Keywords: abstract network, diversity, phylogenetic network, phylogeny, split. Note: http://dx.doi.org/10.1109/TCBB.2007.70260, slides available at http://www.newton.cam.ac.uk/webseminars/pg+ws/2007/plg/plgw01/0906/spillner/.
"In conservation biology it is a central problem to measure, predict, and preserve biodiversity as species face extinction. In 1992 Faith proposed measuring the diversity of a collection of species in terms of their relationships on a phylogenetic tree, and to use this information to identify collections of species with high diversity. Here we are interested in some variants of the resulting optimization problem that arise when considering species whose evolution is better represented by a network rather than a tree. More specifically, we consider the problem of computing phylogenetic diversity relative to a split system on a collection of species of size $n$. We show that for general split systems this problem is NP-hard. In addition we provide some efficient algorithms for some special classes of split systems, in particular presenting an optimal $O(n)$ time algorithm for phylogenetic trees and an $O(n \log n + nk)$ time algorithm for choosing an optimal subset of size $k$ relative to a circular split system. © 2006 IEEE."
|
|
|
|
|
|
|
Alan R. Templeton,
Keith A. Crandall and
Charles F. Sing. A Cladistic Analysis of Phenotypic Associations With Haplotypes Inferred From Restriction Endonuclease Mapping and DNA Sequence Data. III. Cladogram Estimation. In GEN, Vol. 132:619-633, 1992. Keywords: from sequences, parsimony, phylogenetic network, phylogeny, Program TCS, recombination, reconstruction, statistical parsimony. Note: http://www.genetics.org/cgi/content/abstract/132/2/619.
|
|
|
Richard C. Winkworth,
David Bryant,
Peter J. Lockhart,
David Havell and
Vincent Moulton. Biogeographic Interpretation of Splits Graphs: Least Squares Optimization of Branch Lengths. In Systematic Biology, Vol. 54(1):56-65, 2005. Keywords: abstract network, from distances, from network, phylogenetic network, phylogeny, reconstruction, split, split network. Note: http://www.math.auckland.ac.nz/~bryant/Papers/05Biogeographic.pdf.
|
|
|
Daniel H. Huson and
Celine Scornavacca. A survey of combinatorial methods for phylogenetic networks. In Genome Biology and Evolution, Vol. 3:23-35, 2011. Keywords: phylogenetic network, survey. Note: http://dx.doi.org/10.1093/gbe/evq077.
"The evolutionary history of a set of species is usually described by a rooted phylogenetic tree. Although it is generally undisputed that bifurcating speciation events and descent with modifications are major forces of evolution, there is a growing belief that reticulate events also have a role to play. Phylogenetic networks provide an alternative to phylogenetic trees and may be more suitable for data sets where evolution involves significant amounts of reticulate events, such as hybridization, horizontal gene transfer, or recombination. In this article, we give an introduction to the topic of phylogenetic networks, very briefly describing the fundamental concepts and summarizing some of the most important combinatorial methods that are available for their computation. © 2010 The Author(s)."
|
|
|
|
|
|
|
David Bryant,
Vincent Moulton and
Andreas Spillner. Consistency of the Neighbor-Net Algorithm. In AMB, Vol. 2(8), 2007. Keywords: abstract network, consistency, from distances, NeighborNet. Note: http://dx.doi.org/10.1186/1748-7188-2-8.
"Background: Neighbor-Net is a novel method for phylogenetic analysis that is currently being widely used in areas such as virology, bacteriology, and plant evolution. Given an input distance matrix, Neighbor-Net produces a phylogenetic network, a generalization of an evolutionary or phylogenetic tree which allows the graphical representation of conflicting phylogenetic signals. Results: In general, any network construction method should not depict more conflict than is found in the data, and, when the data is fitted well by a tree, the method should return a network that is close to this tree. In this paper we provide a formal proof that Neighbor-Net satisfies both of these requirements so that, in particular, Neighbor-Net is statistically consistent on circular distances. © 2007 Bryant et al; licensee BioMed Central Ltd."
|
|
|
|
|
Bhaskar DasGupta,
Sergio Ferrarini,
Uthra Gopalakrishnan and
Nisha Raj Paryani. Inapproximability results for the lateral gene transfer problem. In JCO, Vol. 11(4):387-405, 2006. Keywords: approximation, from rooted trees, from species tree, inapproximability, lateral gene transfer, parsimony, phylogenetic network, phylogeny. Note: http://www.cs.uic.edu/~dasgupta/resume/publ/papers/t-scenario-3-reviewed-3.pdf.
|
|
|
Dave MacLeod,
Robert L. Charlebois,
W. Ford Doolittle and
Eric Bapteste. Deduction of probable events of lateral gene transfer through comparison of phylogenetic trees by recursive consolidation and rearrangement. In BMCEB, Vol. 5(27), 2005. Keywords: explicit network, from rooted trees, lateral gene transfer, phylogenetic network, phylogeny, Program HorizStory, reconstruction, software. Note: http://dx.doi.org/10.1186/1471-2148-5-27.
"Background: When organismal phylogenies based on sequences of single marker genes are poorly resolved, a logical approach is to add more markers, on the assumption that weak but congruent phylogenetic signal will be reinforced in such multigene trees. Such approaches are valid only when the several markers indeed have identical phylogenies, an issue which many multigene methods (such as the use of concatenated gene sequences or the assembly of supertrees) do not directly address. Indeed, even when the true history is a mixture of vertical descent for some genes and lateral gene transfer (LGT) for others, such methods produce unique topologies. Results: We have developed software that aims to extract evidence for vertical and lateral inheritance from a set of gene trees compared against an arbitrary reference tree. This evidence is then displayed as a synthesis showing support over the tree for vertical inheritance, overlaid with explicit lateral gene transfer (LGT) events inferred to have occurred over the history of the tree. Like splits-tree methods, one can thus identify nodes at which conflict occurs. Additionally one can make reasonable inferences about vertical and lateral signal, assigning putative donors and recipients. Conclusion: A tool such as ours can serve to explore the reticulated dimensionality of molecular evolution, by dissecting vertical and lateral inheritance at high resolution. By this, we mean that individual nodes can be examined not only for congruence, but also for coherence in light of LGT. We assert that our tools will facilitate the comparison of phylogenetic trees, and the interpretation of conflicting data. © 2005 MacLeod et al; licensee BioMed Central Ltd."
|
|
|
Robert G. Beiko and
Nicholas Hamilton. Phylogenetic identification of lateral genetic transfer events. In BMCEB, Vol. 6(15), 2006. Keywords: evaluation, from rooted trees, from unrooted trees, lateral gene transfer, Program EEEP, Program HorizStory, Program LatTrans, reconstruction, software, SPR distance. Note: http://dx.doi.org/10.1186/1471-2148-6-15.
"Background: Lateral genetic transfer can lead to disagreements among phylogenetic trees comprising sequences from the same set of taxa. Where topological discordance is thought to have arisen through genetic transfer events, tree comparisons can be used to identify the lineages that may have shared genetic information. An 'edit path' of one or more transfer events can be represented with a series of subtree prune and regraft (SPR) operations, but finding the optimal such set of operations is NP-hard for comparisons between rooted trees, and may be so for unrooted trees as well. Results: Efficient Evaluation of Edit Paths (EEEP) is a new tree comparison algorithm that uses evolutionarily reasonable constraints to identify and eliminate many unproductive search avenues, reducing the time required to solve many edit path problems. The performance of EEEP compares favourably to that of other algorithms when applied to strictly bifurcating trees with specified numbers of SPR operations. We also used EEEP to recover edit paths from over 19 000 unrooted, incompletely resolved protein trees containing up to 144 taxa as part of a large phylogenomic study. While inferred protein trees were far more similar to a reference supertree than random trees were to each other, the phylogenetic distance spanned by random versus inferred transfer events was similar, suggesting that real transfer events occur most frequently between closely related organisms, but can span large phylogenetic distances as well. While most of the protein trees examined here were very similar to the reference supertree, requiring zero or one edit operations for reconciliation, some trees implied up to 40 transfer events within a single orthologous set of proteins. Conclusion: Since sequence trees typically have no implied root and may contain unresolved or multifurcating nodes, the strategy implemented in EEEP is the most appropriate for phylogenomic analyses. The high degree of consistency among inferred protein trees shows that vertical inheritance is the dominant pattern of evolution, at least for the set of organisms considered here. However, the edit paths inferred using EEEP suggest an important role for genetic transfer in the evolution of microbial genomes as well. © 2006 Beiko and Hamilton; licensee BioMed Central Ltd."
|
|
|
Maria S. Poptsova and
J. Peter Gogarten. The power of phylogenetic approaches to detect horizontally transferred genes. In BMCEB, Vol. 7(45), 2007. Keywords: evaluation, from rooted trees, lateral gene transfer, Program EEEP. Note: http://dx.doi.org/10.1186/1471-2148-7-45.
"Background. Horizontal gene transfer plays an important role in evolution because it sometimes allows recipient lineages to adapt to new ecological niches. High gene transfer frequencies were inferred for prokaryotic and early eukaryotic evolution. Does horizontal gene transfer also impact phylogenetic reconstruction of the evolutionary history of genomes and organisms? The answer to this question depends at least in part on the actual gene transfer frequencies and on the ability to weed out transferred genes from further analyses. Are the detected transfers mainly false positives, or are they the tip of an iceberg of many transfer events most of which go undetected by current methods? Results. Phylogenetic detection methods appear to be the method of choice to infer gene transfers, especially for ancient transfers and those followed by orthologous replacement. Here we explore how well some of these methods perform using in silico transfers between the terminal branches of a gamma proteobacterial, genome based phylogeny. For the experiments performed here on average the AU test at a 5% significance level detects 90.3% of the transfers and 91% of the exchanges as significant. Using the Robinson-Foulds distance only 57.7% of the exchanges and 60% of the donations were identified as significant. Analyses using bipartition spectra appeared most successful in our test case. The power of detection was on average 97% using a 70% cut-off and 94.2% with 90% cut-off for identifying conflicting bipartitions, while the rate of false positives was below 4.2% and 2.1% for the two cut-offs, respectively. For all methods the detection rates improved when more intervening branches separated donor and recipient. Conclusion. Rates of detected transfers should not be mistaken for the actual transfer rates; most analyses of gene transfers remain anecdotal. The method and significance level to identify potential gene transfer events represent a trade-off between the frequency of erroneous identification (false positives) and the power to detect actual transfer events. © 2007 Poptsova and Gogarten; licensee BioMed Central Ltd."
|
|
|
Daniel H. Huson,
Daniel C. Richter,
Christian Rausch,
Tobias Dezulian,
Markus Franz and
Regula Rupp. Dendroscope: An interactive viewer for large phylogenetic trees. In BMCB, Vol. 8:460, 2007. Keywords: phylogeny, Program Dendroscope, software, visualization. Note: http://dx.doi.org/10.1186/1471-2105-8-460, slides available at http://www.newton.cam.ac.uk/webseminars/pg+ws/2007/plg/plgw01/0903/huson/, software freely available from http://www.dendroscope.org.
"Background: Research in evolution requires software for visualizing and editing phylogenetic trees, for increasingly very large datasets, such as arise in expression analysis or metagenomics, for example. It would be desirable to have a program that provides these services in an efficient and user-friendly way, and that can be easily installed and run on all major operating systems. Although a large number of tree visualization tools are freely available, some as a part of more comprehensive analysis packages, all have drawbacks in one or more domains. They either lack some of the standard tree visualization techniques or basic graphics and editing features, or they are restricted to small trees containing only tens of thousands of taxa. Moreover, many programs are difficult to install or are not available for all common operating systems. Results: We have developed a new program, Dendroscope, for the interactive visualization and navigation of phylogenetic trees. The program provides all standard tree visualizations and is optimized to run interactively on trees containing hundreds of thousands of taxa. The program provides tree editing and graphics export capabilities. To support the inspection of large trees, Dendroscope offers a magnification tool. The software is written in Java 1.4 and installers are provided for Linux/Unix, MacOS X and Windows XP. Conclusion: Dendroscope is a user-friendly program for visualizing and navigating phylogenetic trees, for both small and large datasets. © 2007 Huson et al; licensee BioMed Central Ltd."
|
|
|
Ulrik Brandes and
Sabine Cornelsen. Phylogenetic Graph Models Beyond Trees. In DAM, Vol. 157(10):2361-2369, 2009. Keywords: abstract network, cactus graph, from splits, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.inf.uni-konstanz.de/~cornelse/Papers/bc-pgmbt-07.pdf.
"A graph model for a set S of splits of a set X consists of a graph and a map from X to the vertices of the graph such that the inclusion-minimal cuts of the graph represent S. Phylogenetic trees are graph models in which the graph is a tree. We show that the model can be generalized to a cactus (i.e. a tree of edges and cycles) without losing computational efficiency. A cactus can represent a quadratic rather than linear number of splits in linear space. We show how to decide in linear time in the size of a succinct representation of S whether a set of splits has a cactus model, and if so construct it within the same time bounds. As a byproduct, we show how to construct the subset of all compatible splits and a maximal compatible set of splits in linear time. Note that it is NP-complete to find a compatible subset of maximum size. Finally, we briefly discuss further generalizations of tree models. © 2008 Elsevier B.V. All rights reserved."
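Illustrative note: the abstract above refers to compatible splits and to a maximal compatible set of splits. The sketch below shows the standard pairwise compatibility test (two splits A|B and C|D of the same set are compatible when at least one of the four intersections A∩C, A∩D, B∩C, B∩D is empty) together with a naive greedy selection of a maximal compatible subset. It is a quadratic-time illustration under assumed data structures (each split is a pair of `frozenset`s), not the linear-time construction of the paper; the function names are hypothetical.

```python
def compatible(s1, s2):
    """Return True if splits s1 = (A, B) and s2 = (C, D) are compatible."""
    (a, b), (c, d) = s1, s2
    # Compatible when some side of s1 is disjoint from some side of s2.
    return any(not (p & q) for p in (a, b) for q in (c, d))


def greedy_maximal_compatible(splits):
    """Scan splits in the given order, keeping each one that is compatible
    with everything kept so far. The result is maximal (no remaining split
    can be added) but not necessarily of maximum size."""
    chosen = []
    for s in splits:
        if all(compatible(s, t) for t in chosen):
            chosen.append(s)
    return chosen


def split(side, universe):
    """Hypothetical helper: build a split from one side and the full set."""
    side = frozenset(side)
    return (side, frozenset(universe) - side)


X = "abcd"
s1 = split("ab", X)  # ab | cd
s2 = split("ac", X)  # ac | bd, conflicts with s1
s3 = split("a", X)   # a | bcd, compatible with both
print(greedy_maximal_compatible([s1, s2, s3]) == [s1, s3])
```

The greedy result depends on the input order, which is consistent with the abstract's remark that finding a compatible subset of maximum size is hard while a maximal one is easy to obtain.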
|
|
|
Katharina Huber,
Elizabeth E. Watson and
Mike Hendy. An Algorithm for Constructing Local Regions in a Phylogenetic Network. In MPE, Vol. 19(1):1-8, 2000. Keywords: abstract network, median network, phylogenetic network, phylogeny, reconstruction, split. Note: http://dx.doi.org/10.1006/mpev.2000.0891.
"The groupings of taxa in a phylogenetic tree cannot represent all the conflicting signals that usually occur among site patterns in aligned homologous genetic sequences. Hence a tree-building program must compromise by reporting a subset of the patterns, using some discriminatory criterion. Thus, in the worst case, out of possibly a large number of equally good trees, only an arbitrarily chosen tree might be reported by the tree-building program as "The Tree." This tree might then be used as a basis for phylogenetic conclusions. One strategy to represent conflicting patterns in the data is to construct a network. The Buneman graph is a theoretically very attractive example of such a network. In particular, a characterization for when this network will be a tree is known. Also the Buneman graph contains each of the most parsimonious trees indicated by the data. In this paper we describe a new method for constructing the Buneman graph that can be used for a generalization of Hadamard conjugation to networks. This new method differs from previous methods by allowing us to focus on local regions of the graph without having to first construct the full graph. The construction is illustrated by an example. © 2001 Academic Press."
|
|
|
Dan Gusfield,
Vikas Bansal,
Vineet Bafna and
Yun S. Song. A Decomposition Theory for Phylogenetic Networks and Incompatible Characters. In JCB, Vol. 14(10):1247-1272, 2007. Keywords: explicit network, from sequences, galled tree, phylogenetic network, phylogeny, Program Beagle, Program GalledTree, recombination, reconstruction, software. Note: http://www.eecs.berkeley.edu/~yss/Pub/decomposition.pdf.
|
|
|
|
|
Gabriel Cardona,
Francesc Rosselló and
Gabriel Valiente. A Perl Package and an Alignment Tool for Phylogenetic Networks. In BMCB, Vol. 9:175, 2008. Keywords: distance between networks, phylogenetic network, phylogeny, Program Bio PhyloNetwork, tree sibling network, tree-child network. Note: http://dx.doi.org/10.1186/1471-2105-9-175.
"Background: Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of evolutionary events acting at the population level, like recombination between genes, hybridization between lineages, and lateral gene transfer. While most phylogenetics tools implement a wide range of algorithms on phylogenetic trees, there exist only a few applications to work with phylogenetic networks, none of which are open-source libraries, and they do not allow for the comparative analysis of phylogenetic networks by computing distances between them or aligning them. Results: In order to improve this situation, we have developed a Perl package that relies on the BioPerl bundle and implements many algorithms on phylogenetic networks. We have also developed a Java applet that makes use of the aforementioned Perl package and allows the user to make simple experiments with phylogenetic networks without having to develop a program or Perl script by him or herself. Conclusion: The Perl package is available as part of the BioPerl bundle, and can also be downloaded. A web-based application is also available (see availability and requirements). The Perl package includes full documentation of all its features. © 2008 Cardona et al; licensee BioMed Central Ltd."
|
|
|
Yun S. Song,
Zhihong Ding,
Dan Gusfield,
Charles Langley and
Yufeng Wu. Algorithms to Distinguish the Role of Gene-Conversion from Single-Crossover Recombination in the Derivation of SNP Sequences in Populations. In JCB, Vol. 14(10):1273-1286, 2007. Keywords: ARG, from sequences, phylogenetic network, phylogeny, Program SHRUB, reconstruction. Note: http://dx.doi.org/10.1089/cmb.2007.0096.
"Meiotic recombination is a fundamental biological event and one of the principal evolutionary forces responsible for shaping genetic variation within species. In addition to its fundamental role, recombination is central to several critical applied problems. The most important example is "association mapping" in populations, which is widely hoped to help find genes that influence genetic diseases (Carlson et al., 2004; Clark, 2003). Hence, a great deal of recent attention has focused on problems of inferring the historical derivation of sequences in populations when both mutations and recombinations have occurred. In the algorithms literature, most of that recent work has been directed to single-crossover recombination. However, gene-conversion is an important, and more common, form of (two-crossover) recombination which has been much less investigated in the algorithms literature. In this paper, we explicitly incorporate gene-conversion into discrete methods to study historical recombination. We are concerned with algorithms for identifying and locating the extent of historical crossing-over and gene-conversion (along with single-nucleotide mutation), and problems of constructing full putative histories of those events. The novel technical issues concern the incorporation of gene-conversion into recently developed discrete methods (Myers and Griffiths, 2003; Song et al., 2005) that compute lower and upper-bound information on the amount of needed recombination without gene-conversion. We first examine the most natural extension of the lower bound methods from Myers and Griffiths (2003), showing that the extension can be computed efficiently, but that this extension can only yield weak lower bounds. We then develop additional ideas that lead to higher lower bounds, and show how to solve, via integer-linear programming, a more biologically realistic version of the lower bound problem. We also show how to compute effective upper bounds on the number of needed single-crossovers and gene-conversions, along with explicit networks showing a putative history of mutations, single-crossovers and gene-conversions. Both lower and upper bound methods can handle data with missing entries, and the upper bound method can be used to infer missing entries with high accuracy. We validate the significance of these methods by showing that they can be effectively used to distinguish simulation-derived sequences generated without gene-conversion from sequences that were generated with gene-conversion. We apply the methods to recently studied sequences of Arabidopsis thaliana, identifying many more regions in the sequences than were previously identified (Plagnol et al., 2006), where gene-conversion may have played a significant role. Demonstration software is available at www.csif.cs.ucdavis.edu/~gusfield. © 2007 Mary Ann Liebert, Inc."
|
|
|
Stephen J. Willson. Reconstruction of certain phylogenetic networks from the genomes at their leaves. In JTB, Vol. 252(2):185-376, 2008. Keywords: labeling, polynomial. Note: http://www.public.iastate.edu/~swillson/ReconstructNormalHomopap6.pdf.
"A network N is a rooted acyclic digraph. A base-set X for N is a subset of vertices including the root (or outgroup), all leaves, and all vertices of outdegree 1. A simple model of evolution is considered in which all characters are binary and in which back-mutations occur only at hybrid vertices. It is assumed that the genome is known for each member of the base-set X. If the network is known and is assumed to be "normal," then it is proved that the genome of every vertex is uniquely determined and can be explicitly reconstructed. Under additional hypotheses involving time-consistency and separation of the hybrid vertices, the network itself can also be reconstructed from the genomes of all members of X. An explicit polynomial-time procedure is described for performing the reconstruction. © 2008 Elsevier Ltd. All rights reserved."
|
|
|
|
|
Patricia Buendia and
Giri Narasimhan. Serial NetEvolve: A flexible utility for generating serially-sampled sequences along a tree or recombinant network. In BIO, Vol. 18(22):2313-2314, 2006. Keywords: generation, phylogenetic network, phylogeny, Program Serial NetEvolve, Program Treevolve, recombination, software. Note: http://dx.doi.org/10.1093/bioinformatics/btl387.
"Summary: Serial NetEvolve is a flexible simulation program that generates DNA sequences evolved along a tree or recombinant network. It offers a user-friendly Windows graphical interface and a Windows or Linux simulator with a diverse selection of parameters to control the evolutionary model. Serial NetEvolve is a modification of the Treevolve program with the following additional features: simulation of serially-sampled data, the choice of either a clock-like or a variable rate model of sequence evolution, sampling from the internal nodes and the output of the randomly generated tree or network in our newly proposed NeTwick format. © 2006 Oxford University Press."
|
|
|
Patricia Buendia and
Giri Narasimhan. Sliding MinPD: Building evolutionary networks of serial samples via an automated recombination detection approach. In BIO, Vol. 23(22):2993-3000, 2007. Keywords: from sequences, phylogenetic network, phylogeny, Program Sliding MinPD, recombination, recombination detection, serial evolutionary networks, software. Note: http://dx.doi.org/10.1093/bioinformatics/btm413.
"Motivation: Traditional phylogenetic methods assume tree-like evolutionary models and are likely to perform poorly when provided with sequence data from fast-evolving, recombining viruses. Furthermore, these methods assume that all the sequence data are from contemporaneous taxa, which is not valid for serially-sampled data. A more general approach is proposed here, referred to as the Sliding MinPD method, that reconstructs evolutionary networks for serially-sampled sequences in the presence of recombination. Results: Sliding MinPD combines distance-based phylogenetic methods with automated recombination detection based on the best-known sliding window approaches to reconstruct serial evolutionary networks. Its performance was evaluated through comprehensive simulation studies and was also applied to a set of serially-sampled HIV sequences from a single patient. The resulting network organizations reveal unique patterns of viral evolution and may help explain the emergence of disease-associated mutants and drug-resistant strains with implications for patient prognosis and treatment strategies. © The Author 2007. Published by Oxford University Press. All rights reserved."
|
|
|
Insa Cassens,
Patrick Mardulyn and
Michel C. Milinkovitch. Evaluating Intraspecific Network Construction Methods Using Simulated Sequence Data: Do Existing Algorithms Outperform the Global Maximum Parsimony Approach? In Systematic Biology, Vol. 54(3):363-372, 2005. Keywords: abstract network, evaluation, from unrooted trees, haplotype network, parsimony, phylogenetic network, phylogeny, Program Arlequin, Program CombineTrees, Program Network, Program TCS, reconstruction, software. Note: http://www.lanevol.org/LANE/publications_files/Cassens_etal_SystBio_2005.pdf.
|
|
|
|
|
Ali Tofigh,
Mike Hallett and
Jens Lagergren. Simultaneous Identification of Duplications and Lateral Gene Transfers. In TCBB, Vol. 8(2):517-535, 2011. Keywords: duplication, explicit network, FPT, from rooted trees, from species tree, lateral gene transfer, loss, NP complete, phylogenetic network, phylogeny, reconstruction. Note: http://dx.doi.org/10.1109/TCBB.2010.14.
"The incongruency between a gene tree and a corresponding species tree can be attributed to evolutionary events such as gene duplication and gene loss. This paper describes a combinatorial model where so-called DTL-scenarios are used to explain the differences between a gene tree and a corresponding species tree taking into account gene duplications, gene losses, and lateral gene transfers (also known as horizontal gene transfers). The reasonable biological constraint that a lateral gene transfer may only occur between contemporary species leads to the notion of acyclic DTL-scenarios. Parsimony methods are introduced by defining appropriate optimization problems. We show that finding most parsimonious acyclic DTL-scenarios is NP-hard. However, by dropping the condition of acyclicity, the problem becomes tractable, and we provide a dynamic programming algorithm as well as a fixed-parameter tractable algorithm for finding most parsimonious DTL-scenarios. © 2011 IEEE."
|
|
|
Leo van Iersel,
Steven Kelk and
Matthias Mnich. Uniqueness, intractability and exact algorithms: reflections on level-k phylogenetic networks. In JBCB, Vol. 7(4):597-623, 2009. Keywords: explicit network, from triplets, galled tree, level k phylogenetic network, NP complete, phylogenetic network, phylogeny, reconstruction, uniqueness. Note: http://arxiv.org/pdf/0712.2932v2.
|
|
|
Daniel H. Huson. Drawing Rooted Phylogenetic Networks. In TCBB, Vol. 6(1):103-109, 2009. Keywords: explicit network, phylogenetic network, phylogeny, Program Dendroscope, Program SplitsTree, visualization. Note: http://dx.doi.org/10.1109/TCBB.2008.58.
"The evolutionary history of a collection of species is usually represented by a phylogenetic tree. Sometimes, phylogenetic networks are used as a means of representing reticulate evolution or of showing uncertainty and incompatibilities in evolutionary datasets. This is often done using unrooted phylogenetic networks such as split networks, due in part, to the availability of software (SplitsTree) for their computation and visualization. In this paper we discuss the problem of drawing rooted phylogenetic networks as cladograms or phylograms in a number of different views that are commonly used for rooted trees. Implementations of the algorithms are available in new releases of the Dendroscope and SplitsTree programs. © 2006 IEEE."
|
|
|
Andreas W. M. Dress,
Katharina Huber,
Jacobus Koolen and
Vincent Moulton. Compatible decompositions and block realizations of finite metrics. In EJC, Vol. 29(7):1617-1633, 2008. Keywords: abstract network, block realization, from distances, phylogenetic network, phylogeny, realization, reconstruction. Note: http://www.ims.nus.edu.sg/preprints/2007-21.pdf.
"Given a metric D defined on a finite set X, we define a finite collection $\mathcal{D}$ of metrics on X to be a compatible decomposition of D if any two distinct metrics in $\mathcal{D}$ are linearly independent (considered as vectors in $\mathbb{R}^{X \times X}$), $D = \sum_{d \in \mathcal{D}} d$ holds, and there exist points x, x′ ∈ X for any two distinct metrics d, d′ in $\mathcal{D}$ such that d(x, y) d′(x′, y) = 0 holds for every y ∈ X. In this paper, we show that such decompositions are in one-to-one correspondence with (isomorphism classes of) block realizations of D, that is, graph realizations G of D for which G is a block graph and for which every vertex in G not labelled by X has degree at least 3 and is a cut point of G. This generalizes a fundamental result in phylogenetic combinatorics that states that a metric D defined on X can be realized by a tree if and only if there exists a compatible decomposition $\mathcal{D}$ of D such that all metrics d ∈ $\mathcal{D}$ are split metrics, and lays the foundation for a more general theory of metric decompositions that will be explored in future papers. © 2007 Elsevier Ltd. All rights reserved."
|
|
|
Tobias Kloepper and
Daniel H. Huson. Drawing explicit phylogenetic networks and their integration into SplitsTree. In BMCEB, Vol. 8(22), 2008. Keywords: explicit network, phylogenetic network, phylogeny, Program SplitsTree, software, split network, visualization. Note: http://dx.doi.org/10.1186/1471-2148-8-22.
"Background. SplitsTree provides a framework for the calculation of phylogenetic trees and networks. It contains a wide variety of methods for the import/export, calculation and visualization of phylogenetic information. The software is developed in Java and implements a command line tool as well as a graphical user interface. Results. In this article, we present solutions to two important problems in the field of phylogenetic networks. The first problem is the visualization of explicit phylogenetic networks. To solve this, we present a modified version of the equal angle algorithm that naturally integrates reticulations into the layout process and thus leads to an appealing visualization of these networks. The second problem is the availability of explicit phylogenetic network methods for the general user. To advance the usage of explicit phylogenetic networks by biologists further, we present an extension to the SplitsTree framework that integrates these networks. By addressing these two problems, SplitsTree is among the first programs that incorporates implicit and explicit network methods together with standard phylogenetic tree methods in a graphical user interface environment. Conclusion. In this article, we presented an extension of SplitsTree 4 that incorporates explicit phylogenetic networks. The extension provides a set of core classes to handle explicit phylogenetic networks and a visualization of these networks. © 2008 Kloepper and Huson; licensee BioMed Central Ltd."
|
|
|
Changiz Eslahchi,
Mahnaz Habibi,
Reza Hassanzadeh and
Ehsan Mottaghi. MC-Net: a method for the construction of phylogenetic networks based on the Monte-Carlo method. In BMCEB, Vol. 10:254, 2010. Keywords: abstract network, circular split system, from distances, heuristic, phylogenetic network, Program MC-Net, Program SplitsTree, software, split, split network. Note: http://dx.doi.org/10.1186/1471-2148-10-254.
"Background. A phylogenetic network is a generalization of phylogenetic trees that allows the representation of conflicting signals or alternative evolutionary histories in a single diagram. There are several methods for constructing these networks. Some of these methods are based on distances among taxa. In practice, the methods which are based on distance perform faster in comparison with other methods. The Neighbor-Net (N-Net) is a distance-based method. The N-Net produces a circular ordering from a distance matrix, then constructs a collection of weighted splits using circular ordering. The SplitsTree which is a program using these weighted splits makes a phylogenetic network. In general, finding an optimal circular ordering is an NP-hard problem. The N-Net is a heuristic algorithm to find the optimal circular ordering which is based on neighbor-joining algorithm. Results. In this paper, we present a heuristic algorithm to find an optimal circular ordering based on the Monte-Carlo method, called MC-Net algorithm. In order to show that MC-Net performs better than N-Net, we apply both algorithms on different data sets. Then we draw phylogenetic networks corresponding to outputs of these algorithms using SplitsTree and compare the results. Conclusions. We find that the circular ordering produced by the MC-Net is closer to optimal circular ordering than the N-Net. Furthermore, the networks corresponding to outputs of MC-Net made by SplitsTree are simpler than N-Net. © 2010 Eslahchi et al; licensee BioMed Central Ltd."
|
|
|
Luay Nakhleh. A Metric on the Space of Reduced Phylogenetic Networks. In TCBB, Vol. 7(2), 2010. Keywords: distance between networks, phylogenetic network, phylogeny. Note: http://www.cs.rice.edu/~nakhleh/Papers/tcbb-Metric.pdf.
"Phylogenetic networks are leaf-labeled, rooted, acyclic, and directed graphs that are used to model reticulate evolutionary histories. Several measures for quantifying the topological dissimilarity between two phylogenetic networks have been devised, each of which was proven to be a metric on certain restricted classes of phylogenetic networks. A biologically motivated class of phylogenetic networks, namely, reduced phylogenetic networks, was recently introduced. None of the existing measures is a metric on the space of reduced phylogenetic networks. In this paper, we provide a metric on the space of reduced phylogenetic networks that is computable in time polynomial in the size of the networks. © 2006 IEEE."
|
|
|
Dan Levy and
Lior Pachter. The Neighbor-Net Algorithm. In Advances in Applied Mathematics, Vol. 47(2):240-258, 2011. Keywords: abstract network, circular split system, evaluation, from distances, NeighborNet, phylogenetic network, phylogeny, split network. Note: http://arxiv.org/abs/math/0702515.
"The neighbor-joining algorithm is a popular phylogenetics method for constructing trees from dissimilarity maps. The neighbor-net algorithm is an extension of the neighbor-joining algorithm and is used for constructing split networks. We begin by describing the output of neighbor-net in terms of the tessellation of $\overline{M}_0^n(\mathbb{R})$ by associahedra. This highlights the fact that neighbor-net outputs a tree in addition to a circular ordering and we explain when the neighbor-net tree is the neighbor-joining tree. A key observation is that the tree constructed in existing implementations of neighbor-net is not a neighbor-joining tree. Next, we show that neighbor-net is a greedy algorithm for finding circular split systems of minimal balanced length. This leads to an interpretation of neighbor-net as a greedy algorithm for the traveling salesman problem. The algorithm is optimal for Kalmanson matrices, from which it follows that neighbor-net is consistent and has optimal radius 1/2. We also provide a statistical interpretation for the balanced length for a circular split system as the length based on weighted least squares estimates of the splits. We conclude with applications of these results and demonstrate the implications of our theorems for a recently published comparison of Papuan and Austronesian languages. © 2010 Elsevier Inc. All rights reserved."
|
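The Levy and Pachter abstract above states that neighbor-net is optimal for Kalmanson matrices. As a minimal sketch, assuming the usual form of the Kalmanson conditions (for any four taxa i, j, k, l met in that order around the circle, d(i,j) + d(k,l) ≤ d(i,k) + d(j,l) and d(i,l) + d(j,k) ≤ d(i,k) + d(j,l)), the following Python function tests a distance matrix against a given circular ordering. The name `is_kalmanson`, the nested-list matrix, and the tolerance parameter are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def is_kalmanson(d, order, tol=1e-12):
    """Check the Kalmanson conditions for matrix d along a circular order.

    d is a symmetric matrix (nested lists) and order lists the taxon indices
    in circular order. Both names are hypothetical, chosen for this sketch.
    """
    # combinations() preserves the positions in `order`, so every 4-tuple
    # (i, j, k, l) is visited in circular order.
    for i, j, k, l in combinations(order, 4):
        crossing = d[i][k] + d[j][l]
        if d[i][j] + d[k][l] > crossing + tol:
            return False
        if d[i][l] + d[j][k] > crossing + tol:
            return False
    return True


# Tree metric of the quartet ab|cd with unit edge lengths (taxa 0..3 = a..d).
quartet = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]
print(is_kalmanson(quartet, [0, 1, 2, 3]))  # True
print(is_kalmanson(quartet, [0, 2, 1, 3]))  # False
```

The check enumerates all 4-subsets, so it takes O(n^4) time; it is meant only to make the condition concrete, not to find an ordering.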
|
|
Leo van Iersel,
Steven Kelk,
Nela Lekic and
Leen Stougie. Approximation algorithms for nonbinary agreement forests. In SIDMA, Vol. 28(1):49-66, 2014. Keywords: agreement forest, approximation, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1210.3211.
"Given two rooted phylogenetic trees on the same set of taxa X, the Maximum Agreement Forest (MAF) problem asks to find a forest that is, in a certain sense, common to both trees and has a minimum number of components. The Maximum Acyclic Agreement Forest (MAAF) problem has the additional restriction that the components of the forest cannot have conflicting ancestral relations in the input trees. There has been considerable interest in the special cases of these problems in which the input trees are required to be binary. However, in practice, phylogenetic trees are rarely binary, due to uncertainty about the precise order of speciation events. Here, we show that the general, nonbinary version of MAF has a polynomial-time 4-approximation and a fixed-parameter tractable (exact) algorithm that runs in $O(4^k \,\mathrm{poly}(n))$ time, where n = |X| and k is the number of components of the agreement forest minus one. Moreover, we show that a c-approximation algorithm for nonbinary MAF and a d-approximation algorithm for the classical problem Directed Feedback Vertex Set (DFVS) can be combined to yield a d(c+3)-approximation for nonbinary MAAF. The algorithms for MAF have been implemented and made publicly available. © 2014 Society for Industrial and Applied Mathematics."
|
|
|
Steven M. Woolley,
David Posada and
Keith A. Crandall. A Comparison of Phylogenetic Network Methods Using Computer Simulation. In PLoS ONE, Vol. 3(4):e1913, 2008. Keywords: abstract network, distance between networks, evaluation, median network, MedianJoining, minimum spanning network, NeighborNet, parsimony, phylogenetic network, phylogeny, Program Arlequin, Program CombineTrees, Program Network, Program SHRUB, Program SplitsTree, Program TCS, split decomposition. Note: http://dx.doi.org/10.1371/journal.pone.0001913.
"Background: We present a series of simulation studies that explore the relative performance of several phylogenetic network approaches (statistical parsimony, split decomposition, union of maximum parsimony trees, neighbor-net, simulated history recombination upper bound, median-joining, reduced median joining and minimum spanning network) compared to standard tree approaches (neighbor-joining and maximum parsimony) in the presence and absence of recombination. Principal Findings: In the absence of recombination, all methods recovered the correct topology and branch lengths nearly all of the time when the substitution rate was low, except for minimum spanning networks, which did considerably worse. At a higher substitution rate, maximum parsimony and union of maximum parsimony trees were the most accurate. With recombination, the ability to infer the correct topology was halved for all methods and no method could accurately estimate branch lengths. Conclusions: Our results highlight the need for more accurate phylogenetic network methods and the importance of detecting and accounting for recombination in phylogenetic studies. Furthermore, we provide useful information for choosing a network algorithm and a framework in which to evaluate improvements to existing methods and novel algorithms developed in the future. © 2008 Woolley et al."
|
|
|
|
|
|
|
Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. A Distance Metric for a Class of Tree-Sibling Phylogenetic Networks. In BIO, Vol. 24(13):1481-1488, 2008. Keywords: distance between networks, phylogenetic network, phylogeny, polynomial, tree sibling network. Note: http://dx.doi.org/10.1093/bioinformatics/btn231.
"Motivation: The presence of reticulate evolutionary events in phylogenies turns phylogenetic trees into phylogenetic networks. These events imply in particular that there may exist multiple evolutionary paths from a non-extant species to an extant one, and this multiplicity makes the comparison of phylogenetic networks much more difficult than the comparison of phylogenetic trees. In fact, all attempts to define a sound distance measure on the class of all phylogenetic networks have failed so far. Thus, the only practical solutions have been either the use of rough estimates of similarity (based on comparison of the trees embedded in the networks), or narrowing the class of phylogenetic networks to a certain class where such a distance is known and can be efficiently computed. The first approach has the problem that one may identify two networks as equivalent, when they are not; the second one has the drawback that there may not exist algorithms to reconstruct such networks from biological sequences. Results: We present in this article a distance measure on the class of semi-binary tree-sibling time consistent phylogenetic networks, which generalize tree-child time consistent phylogenetic networks, and thus also galled-trees. The practical interest of this distance measure is 2-fold: it can be computed in polynomial time by means of simple algorithms, and there also exist polynomial-time algorithms for reconstructing networks of this class from DNA sequence data. © 2008 The Author(s)."
|
|
|
James B. Whitfield,
Sydney A. Cameron,
Daniel H. Huson and
Mike Steel. Filtered Z-Closure Supernetworks for Extracting and Visualizing Recurrent Signal from Incongruent Gene Trees. In Systematic Biology, Vol. 57(6):939-947, 2008. Keywords: abstract network, from unrooted trees, phylogenetic network, phylogeny, Program SplitsTree, split, split network, supernetwork. Note: http://www.life.uiuc.edu/scameron/pdfs/Filtered%20Z-closure%20SystBiol.pdf.
|
|
|
|
|
Ingo Althöfer. On optimal realizations of finite metric spaces by graphs. In Discrete and Computational Geometry, Vol. 3(1):103-122, 1986. Keywords: NP complete, optimal realization, realization. Note: http://dx.doi.org/10.1007/BF02187901.
"Graph realizations of finite metric spaces have widespread applications, for example, in biology, economics, and information theory. The main results of this paper are: 1. Finding optimal realizations of integral metrics (which means all distances are integral) is NP-complete. 2. There exist metric spaces with a continuum of optimal realizations. Furthermore, two conditions necessary for a weighted graph to be an optimal realization are given and an extremal problem arising in connection with the realization problem is investigated. © 1988 Springer-Verlag New York Inc."
|
|
|
|
|
Joanna L. Davies,
Frantisek Simancík,
Rune Lyngsø,
Thomas Mailund and
Jotun Hein. On Recombination-Induced Multiple and Simultaneous Coalescent Events. In GEN, Vol. 177:2151-2160, 2007. Keywords: coalescent, phylogenetic network, phylogeny, recombination, statistical model. Note: http://dx.doi.org/10.1534/genetics.107.071126.
"Coalescent theory deals with the dynamics of how sampled genetic material has spread through a population from a single ancestor over many generations and is ubiquitous in contemporary molecular population genetics. Inherent in most applications is a continuous-time approximation that is derived under the assumption that sample size is small relative to the actual population size. In effect, this precludes multiple and simultaneous coalescent events that take place in the history of large samples. If sequences do not recombine, the number of sequences ancestral to a large sample is reduced sufficiently after relatively few generations such that use of the continuous-time approximation is justified. However, in tracing the history of large chromosomal segments, a large recombination rate per generation will consistently maintain a large number of ancestors. This can create a major disparity between discrete-time and continuous-time models and we analyze its importance, illustrated with model parameters typical of the human genome. The presence of gene conversion exacerbates the disparity and could seriously undermine applications of coalescent theory to complete genomes. However, we show that multiple and simultaneous coalescent events influence global quantities, such as total number of ancestors, but have negligible effect on local quantities, such as linkage disequilibrium. Reassuringly, most applications of the coalescent model with recombination (including association mapping) focus on local quantities. Copyright © 2007 by the Genetics Society of America."
|
|
|
Shlomo Moran,
Sagi Snir and
Wing-Kin Sung. Partial Convex Recolorings of Trees and Galled Networks: Tight Upper and Lower bounds. In ACM Transactions on Algorithms, Vol. 7(4), 2011. Keywords: evaluation, galled tree, phylogenetic network. Note: http://www.cs.technion.ac.il/~moran/r/PS/gnets-TOA-7Feb2007.pdf.
"A coloring of a graph is convex if the vertices that pertain to any color induce a connected subgraph; a partial coloring (which assigns colors to a subset of the vertices) is convex if it can be completed to a convex (total) coloring. Convex coloring has applications in fields such as phylogenetics, communication or transportation networks, etc. When a coloring of a graph is not convex, a natural question is how far it is from a convex one. This problem is denoted as convex recoloring (CR). While the initial works on CR defined and studied the problem on trees, recent efforts aim at either generalizing the underlying graphs or specializing the input colorings. In this work, we extend the underlying graph and the input coloring to partially colored galled networks. We show that although determining whether a coloring is convex on an arbitrary network is hard, it can be found efficiently on galled networks. We present a fixed parameter tractable algorithm that finds the recoloring distance of such a network whose running time is quadratic in the network size and exponential in that distance. This complexity is achieved by amortized analysis that uses a novel technique for contracting colored graphs that seems to be of independent interest. © 2011 ACM."
|
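The Moran, Snir and Sung abstract above defines a total coloring as convex when every color class induces a connected subgraph. A minimal Python sketch of that definition follows, for total colorings only; the partial-coloring case, which the abstract describes as hard on arbitrary networks, is deliberately not attempted. The function name `is_convex_coloring` and the adjacency-dictionary representation are assumptions made for this illustration and do not come from the paper.

```python
from collections import deque


def is_convex_coloring(adj, color):
    """Return True if every color class induces a connected subgraph.

    adj maps each vertex to an iterable of its neighbours (undirected graph);
    color maps every vertex to its color. Both are hypothetical inputs.
    """
    # Group the vertices by color.
    classes = {}
    for vertex, c in color.items():
        classes.setdefault(c, set()).add(vertex)

    # Breadth-first search restricted to each color class.
    for members in classes.values():
        start = next(iter(members))
        seen = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in members and w not in seen:
                    seen.add(w)
                    queue.append(w)
        if seen != members:
            return False
    return True


# Path graph 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(is_convex_coloring(path, {0: "r", 1: "r", 2: "b", 3: "b"}))  # True
print(is_convex_coloring(path, {0: "r", 1: "b", 2: "r", 3: "b"}))  # False
```

Each vertex and edge is visited a bounded number of times, so the check runs in time linear in the size of the graph.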
|
|
Sagi Snir and
Tamir Tuller. The NET-HMM approach: Phylogenetic Network Inference by Combining Maximum Likelihood and Hidden Markov Models. In JBCB, Vol. 7(4):625-644, 2009. Keywords: explicit network, from sequences, HMM, lateral gene transfer, likelihood, phylogenetic network, phylogeny, statistical model. Note: http://research.haifa.ac.il/~ssagi/published%20papers/Snir-NET-HMM-JBCB-2009.pdf.
"Horizontal gene transfer (HGT) is the event of transferring genetic material from one lineage in the evolutionary tree to a different lineage. HGT plays a major role in bacterial genome diversification and is a significant mechanism by which bacteria develop resistance to antibiotics. Although the prevailing assumption is of complete HGT, cases of partial HGT (which are also named chimeric HGT) where only part of a gene is horizontally transferred, have also been reported, albeit less frequently. In this work we suggest a new probabilistic model, the NET-HMM, for analyzing and modeling phylogenetic networks. This new model captures the biologically realistic assumption that neighboring sites of DNA or amino acid sequences are not independent, which increases the accuracy of the inference. The model describes the phylogenetic network as a Hidden Markov Model (HMM), where each hidden state is related to one of the network's trees. One of the advantages of the NET-HMM is its ability to infer partial HGT as well as complete HGT. We describe the properties of the NET-HMM, devise efficient algorithms for solving a set of problems related to it, and implement them in software. We also provide a novel complementary significance test for evaluating the fitness of a model (NET-HMM) to a given dataset. Using NET-HMM, we are able to answer interesting biological questions, such as inferring the length of partial HGT's and the affected nucleotides in the genomic sequences, as well as inferring the exact location of HGT events along the tree branches. These advantages are demonstrated through the analysis of synthetical inputs and three different biological inputs. © 2009 Imperial College Press."
|
|
|
Stefan Grünewald,
Katharina Huber,
Vincent Moulton,
Charles Semple and
Andreas Spillner. Characterizing weak compatibility in terms of weighted quartets. In Advances in Applied Mathematics, Vol. 42(3):329-341, 2009. Keywords: abstract network, characterization, from quartets, split network, weak hierarchy. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/GHMSS08.pdf, slides at http://www.lirmm.fr/miep08/slides/12_02_huber.pdf.
|
|
|
Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. Metrics for phylogenetic networks I: Generalizations of the Robinson-Foulds metric. In TCBB, Vol. 6(1):46-61, 2009. Keywords: distance between networks, explicit network, phylogenetic network, phylogeny, time consistent network, tree-child network, tripartition distance. Note: http://dx.doi.org/10.1109/TCBB.2008.70.
"The assessment of phylogenetic network reconstruction methods requires the ability to compare phylogenetic networks. This is the first in a series of papers devoted to the analysis and comparison of metrics for tree-child time consistent phylogenetic networks on the same set of taxa. In this paper, we study three metrics that have already been introduced in the literature: the Robinson-Foulds distance, the tripartitions distance and the $\mu$-distance. They generalize to networks the classical Robinson-Foulds or partition distance for phylogenetic trees. We analyze the behavior of these metrics by studying their least and largest values and when they achieve them. As a by-product of this study, we obtain tight bounds on the size of a tree-child time consistent phylogenetic network. © 2006 IEEE."
|
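The Cardona et al. abstract above refers to the classical Robinson-Foulds (partition) distance for phylogenetic trees, which the network metrics generalize. As a rough sketch of the tree case only, assuming rooted trees written as nested tuples of leaf labels, the distance can be computed as the size of the symmetric difference of the two trees' non-trivial cluster sets. The function names and the tuple encoding are illustrative assumptions; the value here is left unnormalized, whereas some authors divide it by 2.

```python
def clusters(tree):
    """Collect the non-trivial clusters of a rooted tree given as nested tuples.

    A cluster is the set of leaves below an internal node; single leaves and
    the full leaf set are excluded. The tuple encoding is hypothetical.
    """
    found = set()

    def leaves_below(node):
        if not isinstance(node, tuple):
            return frozenset([node])  # a leaf label
        below = frozenset().union(*(leaves_below(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves_below(tree)
    found.discard(all_leaves)  # the root cluster is shared by every tree
    return found


def robinson_foulds(tree1, tree2):
    """Unnormalized Robinson-Foulds distance between two rooted trees."""
    return len(clusters(tree1) ^ clusters(tree2))


balanced = (("a", "b"), ("c", "d"))
swapped = (("a", "c"), ("b", "d"))
caterpillar = ((("a", "b"), "c"), "d")

print(robinson_foulds(balanced, balanced))     # 0
print(robinson_foulds(balanced, swapped))      # 4
print(robinson_foulds(balanced, caterpillar))  # 2
```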
|
|
Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. Metrics for phylogenetic networks II: Nodal and triplets metrics. In TCBB, Vol. 6(3):454-469, 2009. Keywords: distance between networks, phylogenetic network, phylogeny. Note: http://dx.doi.org/10.1109/TCBB.2008.127.
"The assessment of phylogenetic network reconstruction methods requires the ability to compare phylogenetic networks. This is the second in a series of papers devoted to the analysis and comparison of metrics for tree-child time consistent phylogenetic networks on the same set of taxa. In this paper, we generalize to phylogenetic networks two metrics that have already been introduced in the literature for phylogenetic trees: the nodal distance and the triplets distance. We prove that they are metrics on any class of tree-child time consistent phylogenetic networks on the same set of taxa, as well as some basic properties for them. To prove these results, we introduce a reduction/expansion procedure that can be used not only to establish properties of tree-child time consistent phylogenetic networks by induction, but also to generate all tree-child time consistent phylogenetic networks with a given number of leaves. © 2009 IEEE."
|
|
|
Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. Path lengths in tree-child time consistent hybridization networks. In Information Sciences, Vol. 180(3):366-383, 2010. Keywords: distance between networks, phylogenetic network, phylogeny, time consistent network, tree-child network. Note: http://arxiv.org/abs/0807.0087?context=cs.CE.
"Hybridization networks are representations of evolutionary histories that allow for the inclusion of reticulate events like recombinations, hybridizations, or lateral gene transfers. The recent growth in the number of hybridization network reconstruction algorithms has led to an increasing interest in the definition of metrics for their comparison that can be used to assess the accuracy or robustness of these methods. In this paper we establish some basic results that make possible the generalization to tree-child time consistent (TCTC) hybridization networks of some of the oldest known metrics for phylogenetic trees: those based on the comparison of the vectors of path lengths between leaves. More specifically, we associate to each hybridization network a suitably defined vector of 'splitted' path lengths between its leaves, and we prove that if two TCTC hybridization networks have the same such vectors, then they must be isomorphic. Thus, comparing these vectors by means of a metric for real-valued vectors defines a metric for TCTC hybridization networks. We also consider the case of fully resolved hybridization networks, where we prove that simpler, 'non-splitted' vectors can be used. © 2009 Elsevier Inc. All rights reserved."
|
|
|
Cuong Than,
Derek Ruths and
Luay Nakhleh. PhyloNet: A Software Package for Analyzing and Reconstructing Reticulate Evolutionary Relationships. In BMCB, Vol. 9(322), 2008. Keywords: Program PhyloNet, software. Note: http://dx.doi.org/10.1186/1471-2105-9-322.
"Background: Phylogenies, i.e., the evolutionary histories of groups of taxa, play a major role in representing the interrelationships among biological entities. Many software tools for reconstructing and evaluating such phylogenies have been proposed, almost all of which assume the underlying evolutionary history to be a tree. While trees give a satisfactory first-order approximation for many families of organisms, other families exhibit evolutionary mechanisms that cannot be represented by trees. Processes such as horizontal gene transfer (HGT), hybrid speciation, and interspecific recombination, collectively referred to as reticulate evolutionary events, result in networks, rather than trees, of relationships. Various software tools have been recently developed to analyze reticulate evolutionary relationships, which include SplitsTree4, LatTrans, EEEP, HorizStory, and T-REX. Results: In this paper, we report on the PhyloNet software package, which is a suite of tools for analyzing reticulate evolutionary relationships, or evolutionary networks, which are rooted, directed, acyclic graphs, leaf-labeled by a set of taxa. These tools can be classified into four categories: (1) evolutionary network representation: reading/writing evolutionary networks in a newly devised compact form; (2) evolutionary network characterization: analyzing evolutionary networks in terms of three basic building blocks - trees, clusters, and tripartitions; (3) evolutionary network comparison: comparing two evolutionary networks in terms of topological dissimilarities, as well as fitness to sequence evolution under a maximum parsimony criterion; and (4) evolutionary network reconstruction: reconstructing an evolutionary network from a species tree and a set of gene trees. Conclusion: The software package, PhyloNet, offers an array of utilities to allow for efficient and accurate analysis of evolutionary networks. 
The software package will help significantly in analyzing large data sets, as well as in studying the performance of evolutionary network reconstruction methods. Further, the software package supports the proposed eNewick format for compact representation of evolutionary networks, a feature that allows for efficient interoperability of evolutionary network software tools. Currently, all utilities in PhyloNet are invoked on the command line. © 2008 Than et al; licensee BioMed Central Ltd."
|
|
|
Iyad A. Kanj,
Luay Nakhleh,
Cuong Than and
Ge Xia. Seeing the Trees and Their Branches in the Network is Hard. In TCS, Vol. 401:153-164, 2008. Keywords: evaluation, from network, from rooted trees, NP complete, phylogenetic network, phylogeny, tree containment. Note: http://www.cs.rice.edu/~nakhleh/Papers/tcs08.pdf.
|
|
|
Barbara R. Holland,
Steffi Benthin,
Peter J. Lockhart,
Vincent Moulton and
Katharina Huber. Using supernetworks to distinguish hybridization from lineage-sorting. In BMCEB, Vol. 8(202), 2008. Keywords: explicit network, from unrooted trees, hybridization, lineage sorting, phylogenetic network, phylogeny, reconstruction, supernetwork. Note: http://dx.doi.org/10.1186/1471-2148-8-202.
"Background. A simple and widely used approach for detecting hybridization in phylogenies is to reconstruct gene trees from independent gene loci, and to look for gene tree incongruence. However, this approach may be confounded by factors such as poor taxon-sampling and/or incomplete lineage-sorting. Results. Using coalescent simulations, we investigated the potential of supernetwork methods to differentiate between gene tree incongruence arising from taxon sampling and incomplete lineage-sorting as opposed to hybridization. For few hybridization events, a large number of independent loci, and well-sampled taxa across these loci, we found that it was possible to distinguish incomplete lineage-sorting from hybridization using the filtered Z-closure and Q-imputation supernetwork methods. Moreover, we found that the choice of supernetwork method was less important than the choice of filtering, and that count-based filtering was the most effective filtering technique. Conclusion. Filtered supernetworks provide a tool for detecting and identifying hybridization events in phylogenies, a tool that should become increasingly useful in light of current genome sequencing initiatives and the ease with which large numbers of independent gene loci can be determined using new generation sequencing technologies. © 2008 Holland et al; licensee BioMed Central Ltd."
|
|
|
Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. On Nakhleh's metric for reduced phylogenetic networks. In TCBB, Vol. 6(4):629-638, 2009. Keywords: distance between networks, phylogenetic network, phylogeny. Note: Preliminary versions: http://arxiv.org/abs/0809.0110 and http://arxiv.org/abs/0801.2354v1.
"We prove that Nakhleh's metric for reduced phylogenetic networks is also a metric on the classes of tree-child phylogenetic networks, semibinary tree-sibling time consistent phylogenetic networks, and multilabeled phylogenetic trees. We also prove that it separates distinguishable phylogenetic networks. In this way, it becomes the strongest dissimilarity measure for phylogenetic networks available so far. Furthermore, we propose a generalization of that metric that separates arbitrary phylogenetic networks. © 2009 IEEE."
|
|
|
Miguel Arenas,
Gabriel Valiente and
David Posada. Characterization of reticulate networks based on the coalescent with recombination. In MBE, Vol. 25(12):2517-2520, 2008. Keywords: coalescent, evaluation, explicit network, galled tree, phylogenetic network, phylogeny, Program Recodon, regular network, simulation, tree sibling network, tree-child network. Note: http://dx.doi.org/10.1093/molbev/msn219.
"Phylogenetic networks aim to represent the evolutionary history of taxa. Within these, reticulate networks are explicitly able to accommodate evolutionary events like recombination, hybridization, or lateral gene transfer. Although several metrics exist to compare phylogenetic networks, they make several assumptions regarding the nature of the networks that are not likely to be fulfilled by the evolutionary process. In order to characterize the potential disagreement between the algorithms and the biology, we have used the coalescent with recombination to build the type of networks produced by reticulate evolution and classified them as regular, tree sibling, tree child, or galled trees. We show that, as expected, the complexity of these reticulate networks is a function of the population recombination rate. At small recombination rates, most of the networks produced are already more complex than regular or tree sibling networks, whereas with moderate and large recombination rates, no network fit into any of the standard classes. We conclude that new metrics still need to be devised in order to properly compare two phylogenetic networks that have arisen from reticulating evolutionary process. © 2008 The Authors."
|
|
|
|
|
Gabriel Cardona,
Francesc Rosselló and
Gabriel Valiente. Extended Newick: It is Time for a Standard Representation. In BMCB, Vol. 9:532, 2008. Keywords: evaluation, explicit network, phylogenetic network, Program Bio PhyloNetwork, Program Dendroscope, Program NetGen, Program PhyloNet, Program SplitsTree, Program TCS, visualization. Note: http://bioinfo.uib.es/media/uploaded/bmc-2008-enewick-sub.pdf.
|
|
|
Supriya Munshaw and
Thomas B. Kepler. An Information-Theoretic Method for the Treatment of Plural Ancestry in Phylogenetics. In MBE, Vol. 25(6):1199-1208, 2008. Keywords: explicit network, from sequences, heuristic, phylogenetic network, reconstruction, simulated annealing, software. Note: http://dx.doi.org/10.1093/molbev/msn066.
"In the presence of recombination and gene conversion, a given genomic segment may inherit information from 2 distinct immediate ancestors. The importance of this type of molecular inheritance has become increasingly clear over the years, and the potential for erroneous inference when it is not accounted for in the statistical model is well documented. Yet, the inclusion of plural ancestry (PA) in phylogenetic analysis is still not routine. This omission is due to the greater difficulty of phylogenetic inference on general acyclic graphs compared with that on trees and the accompanying computational burden. We have developed a technique for phylogenetic inference in the presence of PA based on the principle of minimum description length, which assigns a cost - the description length - to each network topology given the observed sequence data. The description length combines the cost of poor data fit and model complexity in terms of information. This device allows us to search through network topologies to minimize the total description length. By comparing the best models obtained with and without PA, one can determine whether or not recombination has played an active role in the evolution of the genes under investigation, identify those genes that appear to be mosaic, and infer the phylogenetic network that best represents the history of the alignment. We show that the method performs well on simulated data and demonstrate its application on HIV env gene sequence data from 8 human subjects. The software implementation of the method is available upon request. © The Author 2008. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved."
|
|
|
Miguel Arenas and
David Posada. Recodon: Coalescent simulation of coding DNA sequences with recombination, migration and demography. In BMCB, Vol. 8(458), 2007. Keywords: coalescent, generation, Program Recodon, software. Note: http://dx.doi.org/10.1186/1471-2105-8-458.
"Background: Coalescent simulations have proven very useful in many population genetics studies. In order to arrive to meaningful conclusions, it is important that these simulations resemble the process of molecular evolution as much as possible. To date, no single coalescent program is able to simulate codon sequences sampled from populations with recombination, migration and growth. Results: We introduce a new coalescent program, called Recodon, which is able to simulate samples of coding DNA sequences under complex scenarios in which several evolutionary forces can interact simultaneously (namely, recombination, migration and demography). The basic codon model implemented is an extension to the general time-reversible model of nucleotide substitution with a proportion of invariable sites and among-site rate variation. In addition, the program implements non-reversible processes and mixtures of different codon models. Conclusion: Recodon is a flexible tool for the simulation of coding DNA sequences under realistic evolutionary models. These simulations can be used to build parameter distributions for testing evolutionary hypotheses using experimental data. Recodon is written in C, can run in parallel, and is freely available from http://darwin.uvigo.es/. © 2007 Arenas and Posada; licensee BioMed Central Ltd."
|
|
|
|
|
Alain Guénoche. Graphical Representation of a Boolean Array. In Computers and the Humanities, Vol. 20(4):277-281, 1986. Keywords: from splits, median network, reconstruction. Note: http://dx.doi.org/10.1007/BF02400118.
"In this paper, we represent a boolean array of data with a median connected graph. Vertices are the different lines of the array plus virtual monomials, and an edge links two vertices that are different for only one variable. We describe an algorithm to compute this graph, that is an exact representation of the symmetrical difference distance between lines, and we show an application to Bronze age pins. © 1986 Paradigm Press, Inc."
|
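As a rough illustration of the edge rule stated in the Guénoche abstract above (an edge links two vertices that differ in exactly one variable), the following Python sketch builds that adjacency over the observed rows only. The function name `hamming_one_edges` and the sample array are assumptions made for illustration; the virtual vertices that the paper adds to the graph are not computed here.

```python
from itertools import combinations

def hamming_one_edges(rows):
    """Return index pairs (i, j) of rows that differ in exactly one column."""
    edges = []
    for (i, r), (j, s) in combinations(enumerate(rows), 2):
        # Count the columns in which the two boolean rows disagree.
        if sum(x != y for x, y in zip(r, s)) == 1:
            edges.append((i, j))
    return edges

# Hypothetical 4 x 3 boolean array (not taken from the paper).
rows = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 1)]
print(hamming_one_edges(rows))
```

In this toy array the last row has no neighbour at distance one, so a graph built from observed rows alone is disconnected.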
|
|
Galina Glazko,
Vladimir Makarenkov,
Jing Liu and
Arcady Mushegian. Evolutionary history of bacteriophages with double-stranded DNA genomes. In Biology Direct, Vol. 2(36), 2007. Keywords: explicit network, from sequences, phylogenetic network, phylogeny, Program T REX. Note: http://dx.doi.org/10.1186/1745-6150-2-36.
"Background: Reconstruction of evolutionary history of bacteriophages is a difficult problem because of fast sequence drift and lack of omnipresent genes in phage genomes. Moreover, losses and recombinational exchanges of genes are so pervasive in phages that the plausibility of phylogenetic inference in phage kingdom has been questioned. Results: We compiled the profiles of presence and absence of 803 orthologous genes in 158 completely sequenced phages with double-stranded DNA genomes and used these gene content vectors to infer the evolutionary history of phages. There were 18 well-supported clades, mostly corresponding to accepted genera, but in some cases appearing to define new taxonomic groups. Conflicts between this phylogeny and trees constructed from sequence alignments of phage proteins were exploited to infer 294 specific acts of intergenome gene transfer. Conclusion: A notoriously reticulate evolutionary history of fast-evolving phages can be reconstructed in considerable detail by quantitative comparative genomics. © 2007 Glazko et al; licensee BioMed Central Ltd."
|
|
|
Roderic D.M. Page and
Michael A. Charleston. Trees within trees: phylogeny and historical associations. In TEE, Vol. 13(9):356-359, 1998. Keywords: duplication, explicit network, from rooted trees, from species tree, lateral gene transfer, phylogenetic network, phylogeny, reconstruction, survey. Note: http://taxonomy.zoology.gla.ac.uk/rod/papers/tree.pdf.
|
|
|
Cuong Than,
Derek Ruths,
Hideki Innan and
Luay Nakhleh. Confounding Factors in HGT Detection: Statistical Error, Coalescent Effects, and Multiple Solutions. In JCB, Vol. 14(4):517-535, 2007. Keywords: counting, explicit network, from rooted trees, from species tree, lateral gene transfer, phylogenetic network, phylogeny, Program LatTrans, Program PhyloNet. Note: http://www.cs.rice.edu/~nakhleh/Papers/recombcg06-jcb.pdf.
|
|
|
|
|
Bin Ma,
Lusheng Wang and
Ming Li. Fixed topology alignment with recombination. In DAM, Vol. 104:281-300, 2000. Keywords: approximation, explicit network, from network, from sequences, galled tree, inapproximability, phylogenetic network, phylogeny, recombination. Note: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.40.7759.
"Background: Reticulate events play an important role in determining evolutionary relationships. The problem of computing the minimum number of such events to explain discordance between two phylogenetic trees is a hard computational problem. Even for binary trees, exact solvers struggle to solve instances with reticulation number larger than 40-50. Results: Here we present CycleKiller and NonbinaryCycleKiller, the first methods to produce solutions verifiably close to optimality for instances with hundreds or even thousands of reticulations. Conclusions: Using simulations, we demonstrate that these algorithms run quickly for large and difficult instances, producing solutions that are very close to optimality. As a spin-off from our simulations we also present TerminusEst, which is the fastest exact method currently available that can handle nonbinary trees: this is used to measure the accuracy of the NonbinaryCycleKiller algorithm. All three methods are based on extensions of previous theoretical work (SIDMA 26(4):1635-1656, TCBB 10(1):18-25, SIDMA 28(1):49-66) and are publicly available. We also apply our methods to real data. © 2014 van Iersel et al.; licensee BioMed Central Ltd."
|
|
|
Ran Libeskind-Hadas and
Michael A. Charleston. On the Computational Complexity of the Reticulate Cophylogeny Reconstruction Problem. In JCB, Vol. 16(1):105-117, 2009. Keywords: cophylogeny, heuristic, NP complete, parsimony, phylogenetic network, reconstruction. Note: http://dx.doi.org/10.1089/cmb.2008.0084.
"The cophylogeny reconstruction problem is that of finding minimal cost explanations of differences between evolutionary histories of ecologically linked groups of biological organisms. We present a proof that shows that the general problem of reconciling evolutionary histories is NP-complete and provide a sharp boundary where this intractability begins. We also show that a related problem, that of finding Pareto optimal solutions, is NP-hard. As a byproduct of our results, we give a framework by which meta-heuristics can be applied to find good solutions to this problem. © Mary Ann Liebert, Inc. 2009."
|
|
|
Johannes Fischer and
Daniel H. Huson. New Common Ancestor Problems in Trees and Directed Acyclic Graphs. In IPL, Vol. 110(8-9):331-335, 2010. Keywords: explicit network, phylogenetic network, polynomial. Note: http://www-ab.informatik.uni-tuebingen.de/people/fischer/lsa.pdf.
"We derive a new generalization of lowest common ancestors (LCAs) in dags, called the lowest single common ancestor (LSCA). We show how to preprocess a static dag in linear time such that subsequent LSCA-queries can be answered in constant time. The size is linear in the number of nodes. We also consider a "fuzzy" variant of LSCA that allows to compute a node that is only an LSCA of a given percentage of the query nodes. The space and construction time of our scheme for fuzzy LSCAs is linear, whereas the query time has a sub-logarithmic slow-down. This "fuzzy" algorithm is also applicable to LCAs in trees, with the same complexities. © 2010 Elsevier B.V. All rights reserved."
|
|
|
Stephen J. Willson. Regular Networks Can Be Uniquely Constructed from Their Trees. In TCBB, Vol. 8(3):785-796, 2010. Keywords: explicit network, from rooted trees, phylogenetic network, phylogeny, reconstruction, regular network. Note: http://www.public.iastate.edu/~swillson/RegularNetsFromTrees5.pdf.
"A rooted acyclic digraph N with labeled leaves displays a tree T when there exists a way to select a unique parent of each hybrid vertex resulting in the tree T. Let Tr(N) denote the set of all trees displayed by the network N. In general, there may be many other networks M, such that Tr(M) = Tr(N). A network is regular if it is isomorphic with its cover digraph. If N is regular and D is a collection of trees displayed by N, this paper studies some procedures to try to reconstruct N given D. If the input is D=Tr(N), one procedure is described, which will reconstruct N. Hence, if N and M are regular networks and Tr(N) = Tr(M), it follows that N = M, proving that a regular network is uniquely determined by its displayed trees. If D is a (usually very much smaller) collection of displayed trees that satisfies certain hypotheses, modifications of the procedure will still reconstruct N given D. © 2011 IEEE."
|
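The notion of a displayed tree in the Willson abstract above (select a unique parent for each hybrid vertex) can be sketched as a simple enumeration. This is only an illustrative sketch: the function name `parent_choices`, the dictionary encoding of the network, and the toy network are assumptions, and the sketch stops at the parent selection without suppressing degree-two vertices or merging choices that yield the same tree.

```python
from itertools import product

def parent_choices(parents):
    """Yield one parent map per way of keeping a single parent for each hybrid vertex.

    `parents` maps each non-root vertex to the list of its parents.
    """
    hybrids = [v for v, ps in parents.items() if len(ps) > 1]
    for choice in product(*(parents[h] for h in hybrids)):
        # Vertices with one parent keep it; hybrids get the selected parent.
        chosen = {v: ps[0] for v, ps in parents.items() if len(ps) == 1}
        chosen.update(zip(hybrids, choice))
        yield chosen

# Hypothetical network: root r, tree vertices a and b, hybrid h, leaves x, y, z.
network = {"a": ["r"], "b": ["r"], "h": ["a", "b"], "x": ["h"], "y": ["a"], "z": ["b"]}
maps = list(parent_choices(network))
print(len(maps))
```

With k hybrid vertices of indegree two, the enumeration visits 2^k selections, which is why Tr(N) can be large even for small networks.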
|
|
Martin Lott,
Andreas Spillner,
Katharina Huber and
Vincent Moulton. PADRE: A Package for Analyzing and Displaying Reticulate Evolution. In BIO, Vol. 25(9):1199-1200, 2009. Keywords: duplication, explicit network, from multilabeled tree, phylogenetic network, phylogeny, Program PADRE, reconstruction, software. Note: http://dx.doi.org/10.1093/bioinformatics/btp133.
"Recent advances in gene sequencing for polyploid species, coupled with standard phylogenetic tree reconstruction, leads to gene trees in which the same species can label several leaves. Such multi-labeled trees are then used to reconstruct the evolutionary history of the polyploid species in question. However, this reconstruction process requires new techniques that are not available in current phylogenetic software packages. Here, we describe the software package PADRE (Package for Analyzing and Displaying Reticulate Evolution) that implements such techniques, allowing the reconstruction of complex evolutionary histories for polyploids in the form of phylogenetic networks. © The Author 2009. Published by Oxford University Press. All rights reserved."
|
|
|
Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. The comparison of tree-sibling time consistent phylogenetic networks is graph-isomorphism complete. In The Scientific World Journal, Vol. 2014(254279):1-6, 2014. Keywords: abstract network, distance between networks, from network, isomorphism, phylogenetic network, tree sibling network. Note: http://arxiv.org/abs/0902.4640.
"Several polynomial time computable metrics on the class of semibinary tree-sibling time consistent phylogenetic networks are available in the literature; in particular, the problem of deciding if two networks of this kind are isomorphic is in P. In this paper, we show that if we remove the semibinarity condition, then the problem becomes much harder. More precisely, we prove that the isomorphism problem for generic tree-sibling time consistent phylogenetic networks is polynomially equivalent to the graph isomorphism problem. Since the latter is believed not to belong to P, the chances are that it is impossible to define a metric on the class of all tree-sibling time consistent phylogenetic networks that can be computed in polynomial time. © 2014 Gabriel Cardona et al."
|
|
|
|
|
Laxmi Parida,
Asif Javed,
Marta Melé,
Francesc Calafell,
Jaume Bertranpetit and
Genographic Consortium. Minimizing recombinations in consensus networks for phylogeographic studies. In BMCB, Vol. 10(Suppl 1):S72, 2009. Note: Selected papers from the Seventh Asia-Pacific Bioinformatics Conference (APBC 2009), http://dx.doi.org/10.1186/1471-2105-10-S1-S72.
"Background: We address the problem of studying recombinational variations in (human) populations. In this paper, our focus is on one computational aspect of the general task: Given two networks G1 and G2, with both mutation and recombination events, defined on overlapping sets of extant units the objective is to compute a consensus network G3 with minimum number of additional recombinations. We describe a polynomial time algorithm with a guarantee that the number of computed new recombination events is within ε = sz(G1, G2) (function sz is a well-behaved function of the sizes and topologies of G1 and G2) of the optimal number of recombinations. To date, this is the best known result for a network consensus problem. Results: Although the network consensus problem can be applied to a variety of domains, here we focus on structure of human populations. With our preliminary analysis on a segment of the human Chromosome X data we are able to infer ancient recombinations, population-specific recombinations and more, which also support the widely accepted 'Out of Africa' model. These results have been verified independently using traditional manual procedures. To the best of our knowledge, this is the first recombinations-based characterization of human populations. Conclusion: We show that our mathematical model identifies recombination spots in the individual haplotypes; the aggregate of these spots over a set of haplotypes defines a recombinational landscape that has enough signal to detect continental as well as population divide based on a short segment of Chromosome X. In particular, we are able to infer ancient recombinations, population-specific recombinations and more, which also support the widely accepted 'Out of Africa' model. The agreement with mutation-based analysis can be viewed as an indirect validation of our results and the model. Since the model in principle gives us more information embedded in the networks, in our future work, we plan to investigate more non-traditional questions via these structures computed by our methodology. © 2009 Parida et al; licensee BioMed Central Ltd."
|
|
|
Francesc Rosselló and
Gabriel Valiente. All that Glisters is not Galled. In MBIO, Vol. 221(1):54-59, 2009. Keywords: galled tree, phylogenetic network, phylogeny. Note: http://arxiv.org/abs/0904.2448.
"Galled trees, evolutionary networks with isolated reticulation cycles, have appeared under several slightly different definitions in the literature. In this paper, we establish the actual relationships between the main four such alternative definitions: namely, the original galled trees, level-1 networks, nested networks with nesting depth 1, and evolutionary networks with arc-disjoint reticulation cycles. © 2009 Elsevier Inc. All rights reserved."
|
|
|
Philippe Gambette and
Katharina Huber. On Encodings of Phylogenetic Networks of Bounded Level. In JOMB, Vol. 65(1):157-180, 2012. Keywords: characterization, explicit network, from clusters, from rooted trees, from triplets, galled tree, identifiability, level k phylogenetic network, phylogenetic network, uniqueness, weak hierarchy. Note: http://hal.archives-ouvertes.fr/hal-00609130/en/.
"Phylogenetic networks have now joined phylogenetic trees in the center of phylogenetics research. Like phylogenetic trees, such networks canonically induce collections of phylogenetic trees, clusters, and triplets, respectively. Thus it is not surprising that many network approaches aim to reconstruct a phylogenetic network from such collections. Related to the well-studied perfect phylogeny problem, the following question is of fundamental importance in this context: When does one of the above collections encode (i. e. uniquely describe) the network that induces it? For the large class of level-1 (phylogenetic) networks we characterize those level-1 networks for which an encoding in terms of one (or equivalently all) of the above collections exists. In addition, we show that three known distance measures for comparing phylogenetic networks are in fact metrics on the resulting subclass and give the diameter for two of them. Finally, we investigate the related concept of indistinguishability and also show that many properties enjoyed by level-1 networks are not satisfied by networks of higher level. © 2011 Springer-Verlag."
|
|
|
Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. Comparison of Galled Trees. In TCBB, Vol. 8(2):410-427, 2011. Note: http://arxiv.org/abs/0906.1166.
"Galled trees, directed acyclic graphs that model evolutionary histories with isolated hybridization events, have become very popular due to both their biological significance and the existence of polynomial-time algorithms for their reconstruction. In this paper, we establish to which extent several distance measures for the comparison of evolutionary networks are metrics for galled trees, and hence, when they can be safely used to evaluate galled tree reconstruction methods. © 2011 IEEE."
|
|
|
Sarah C. Ayling and
Terence A. Brown. Novel methodology for construction and pruning of quasi-median networks. In BMCB, Vol. 9:115, 2008. Keywords: abstract network, from sequences, median network, phylogenetic network, phylogeny, quasi-median network, reconstruction. Note: http://dx.doi.org/10.1186/1471-2105-9-115.
"BACKGROUND: Visualising the evolutionary history of a set of sequences is a challenge for molecular phylogenetics. One approach is to use undirected graphs, such as median networks, to visualise phylogenies where reticulate relationships such as recombination or homoplasy are displayed as cycles. Median networks contain binary representations of sequences as nodes, with edges connecting those sequences differing at one character; hypothetical ancestral nodes are invoked to generate a connected network which contains all most parsimonious trees. Quasi-median networks are a generalisation of median networks which are not restricted to binary data, although phylogenetic information contained within the multistate positions can be lost during the preprocessing of data. Where the history of a set of samples contain frequent homoplasies or recombination events quasi-median networks will have a complex topology. Graph reduction or pruning methods have been used to reduce network complexity but some of these methods are inapplicable to datasets in which recombination has occurred and others are procedurally complex and/or result in disconnected networks. RESULTS: We address the problems inherent in construction and reduction of quasi-median networks. We describe a novel method of generating quasi-median networks that uses all characters, both binary and multistate, without imposing an arbitrary ordering of the multistate partitions. We also describe a pruning mechanism which maintains at least one shortest path between observed sequences, displaying the underlying relations between all pairs of sequences while maintaining a connected graph. CONCLUSION: Application of this approach to 5S rDNA sequence data from sea beet produced a pruned network within which genetic isolation between populations by distance was evident, demonstrating the value of this approach for exploration of evolutionary relationships."
|
|
|
Hadas Birin,
Zohar Gal-Or,
Isaac Elias and
Tamir Tuller. Inferring horizontal transfers in the presence of rearrangements by the minimum evolution criterion. In BIO, Vol. 24(6):826-832, 2008. Note: http://dx.doi.org/10.1093/bioinformatics/btn024.
"Motivation: The evolution of viruses is very rapid and in addition to local point mutations (insertion, deletion, substitution) it also includes frequent recombinations, genome rearrangements and horizontal transfer of genetic materials (HGTs). Evolutionary analysis of viral sequences is therefore a complicated matter for two main reasons: First, due to HGTs and recombinations, the right model of evolution is a network and not a tree. Second, due to genome rearrangements, an alignment of the input sequences is not guaranteed. These facts encourage developing methods for inferring phylogenetic networks that do not require aligned sequences as input. Results: In this work, we present the first computational approach which deals with both genome rearrangements and horizontal gene transfers and does not require a multiple alignment as input. We formalize a new set of computational problems which involve analyzing such complex models of evolution. We investigate their computational complexity, and devise algorithms for solving them. Moreover, we demonstrate the viability of our methods on several synthetic datasets as well as four biological datasets. © The Author 2008. Published by Oxford University Press. All rights reserved."
|
|
|
Frederick A. Matsen. ConstNJ: an algorithm to reconstruct sets of phylogenetic trees satisfying pairwise topological constraints. In JCB, Vol. 17(6):799-818, 2010. Keywords: from distances, Program constNJ, reconstruction. Note: http://arxiv.org/abs/0901.1598v2.
"This article introduces constNJ (constrained neighbor-joining), an algorithm for phylogenetic reconstruction of sets of trees with constrained pairwise rooted subtree-prune-regraft (rSPR) distance. We are motivated by the problem of constructing sets of trees that must fit into a recombination, hybridization, or similar network. Rather than first finding a set of trees that are optimal according to a phylogenetic criterion (e.g., likelihood or parsimony) and then attempting to fit them into a network, constNJ estimates the trees while enforcing specified rSPR distance constraints. The primary input for constNJ is a collection of distance matrices derived from sequence blocks which are assumed to have evolved in a tree-like manner, such as blocks of an alignment which do not contain any recombination breakpoints. The other input is a set of rSPR constraint inequalities for any set of pairs of trees. constNJ is consistent and a strict generalization of the neighbor-joining algorithm; it uses the new notion of maximum agreement partitions (MAPs) to assure that the resulting trees satisfy the given rSPR distance constraints. Copyright 2010, Mary Ann Liebert, Inc."
|
|
|
Stefan Grünewald,
Jacobus Koolen and
Woo-Sun Lee. Quartets in maximal weakly compatible split systems. In Applied Mathematics Letters, Vol. 22(6):1604-1608, 2009. Note: http://dx.doi.org/10.1016/j.aml.2009.05.006.
"Weakly compatible split systems are a generalization of unrooted evolutionary trees and are commonly used to display reticulate evolution or ambiguity in biological data. They are collections of bipartitions of a finite set X of taxa (e.g. species) with the property that, for every four taxa, at least one of the three bipartitions into two pairs (quartets) is not induced by any of the X-splits. We characterize all split systems where exactly two quartets from every quadruple are induced by some split. On the other hand, we construct maximal weakly compatible split systems where the number of induced quartets per quadruple tends to 0 with the number of taxa going to infinity. © 2009."
|
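The weak compatibility condition in the Grünewald, Koolen and Lee abstract above (for every four taxa, at least one of the three quartets is not induced by any split) can be checked by brute force. The sketch below is a minimal illustration under stated assumptions: each split is given by one of its two sides as a Python set, and the names `induces` and `is_weakly_compatible` are made up for this example rather than taken from the paper.

```python
from itertools import combinations

def induces(side, pair1, pair2):
    """True if the split with the given side separates pair1 from pair2."""
    (a, b), (c, d) = pair1, pair2
    # Both members of each pair lie on the same side, and the pairs lie on opposite sides.
    return (a in side) == (b in side) and (c in side) == (d in side) and (a in side) != (c in side)

def is_weakly_compatible(taxa, splits):
    """Check that no four taxa have all three of their quartets induced by the splits."""
    for a, b, c, d in combinations(sorted(taxa), 4):
        quartets = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
        induced = sum(any(induces(s, p, q) for s in splits) for p, q in quartets)
        if induced == 3:
            return False
    return True

# Hypothetical example on four taxa.
print(is_weakly_compatible({1, 2, 3, 4}, [{1, 2}, {1, 3}]))
print(is_weakly_compatible({1, 2, 3, 4}, [{1, 2}, {1, 3}, {1, 4}]))
```

The check examines every 4-subset of taxa against every split, so it is meant for small examples rather than large split systems.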
|
|
Stephen J. Willson. Properties of normal phylogenetic networks. In BMB, Vol. 72(2):340-358, 2010. Keywords: normal network, phylogenetic network, phylogeny, regular network. Note: http://www.public.iastate.edu/~swillson/RestrictionsOnNetworkspap9.pdf, slides available at http://www.newton.cam.ac.uk/webseminars/pg+ws/2007/plg/plgw01/0904/willson/.
"A phylogenetic network is a rooted acyclic digraph with vertices corresponding to taxa. Let X denote a set of vertices containing the root, the leaves, and all vertices of outdegree 1. Regard X as the set of vertices on which measurements such as DNA can be made. A vertex is called normal if it has one parent, and hybrid if it has more than one parent. The network is called normal if it has no redundant arcs and also from every vertex there is a directed path to a member of X such that all vertices after the first are normal. This paper studies properties of normal networks. Under a simple model of inheritance that allows homoplasies only at hybrid vertices, there is essentially unique determination of the genomes at all vertices by the genomes at members of X if and only if the network is normal. This model is a limiting case of more standard models of inheritance when the substitution rate is sufficiently low. Various mathematical properties of normal networks are described. These properties include that the number of vertices grows at most quadratically with the number of leaves and that the number of hybrid vertices grows at most linearly with the number of leaves. © 2009 Society for Mathematical Biology."
|
|
|
Katharina Huber,
Leo van Iersel,
Steven Kelk and
Radoslaw Suchecki. A Practical Algorithm for Reconstructing Level-1 Phylogenetic Networks. In TCBB, Vol. 8(3):607-620, 2011. Keywords: explicit network, from triplets, galled tree, generation, heuristic, phylogenetic network, phylogeny, Program LEV1ATHAN, Program Lev1Generator, reconstruction, software. Note: http://arxiv.org/abs/0910.4067.
"Recently, much attention has been devoted to the construction of phylogenetic networks which generalize phylogenetic trees in order to accommodate complex evolutionary processes. Here, we present an efficient, practical algorithm for reconstructing level-1 phylogenetic networks-a type of network slightly more general than a phylogenetic tree-from triplets. Our algorithm has been made publicly available as the program Lev1athan. It combines ideas from several known theoretical algorithms for phylogenetic tree and network reconstruction with two novel subroutines. Namely, an exponential-time exact and a greedy algorithm both of which are of independent theoretical interest. Most importantly, Lev1athan runs in polynomial time and always constructs a level-1 network. If the data are consistent with a phylogenetic tree, then the algorithm constructs such a tree. Moreover, if the input triplet set is dense and, in addition, is fully consistent with some level-1 network, it will find such a network. The potential of Lev1athan is explored by means of an extensive simulation study and a biological data set. One of our conclusions is that Lev1athan is able to construct networks consistent with a high percentage of input triplets, even when these input triplets are affected by a low to moderate level of noise. © 2011 IEEE."
|
|
|
Leo van Iersel,
Charles Semple and
Mike Steel. Quantifying the Extent of Lateral Gene Transfer Required to Avert a 'Genome of Eden'. In BMB, Vol. 72:1783–1798, 2010. Note: http://www.win.tue.nl/~liersel/LGT.pdf.
"The complex pattern of presence and absence of many genes across different species provides tantalising clues as to how genes evolved through the processes of gene genesis, gene loss, and lateral gene transfer (LGT). The extent of LGT, particularly in prokaryotes, and its implications for creating a 'network of life' rather than a 'tree of life' is controversial. In this paper, we formally model the problem of quantifying LGT, and provide exact mathematical bounds, and new computational results. In particular, we investigate the computational complexity of quantifying the extent of LGT under the simple models of gene genesis, loss, and transfer on which a recent heuristic analysis of biological data relied. Our approach takes advantage of a relationship between LGT optimization and graph-theoretical concepts such as tree width and network flow. © 2010 Society for Mathematical Biology."
|
|
|
Josh Voorkamp né Collins,
Simone Linz and
Charles Semple. Quantifying hybridization in realistic time. In JCB, Vol. 18(10):1305-1318, 2011. Keywords: explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, software. Note: http://wwwcsif.cs.ucdavis.edu/~linzs/CLS10_interleave.pdf, software available at http://www.math.canterbury.ac.nz/~c.semple/software.shtml.
"Recently, numerous practical and theoretical studies in evolutionary biology aim at calculating the extent to which reticulation-for example, horizontal gene transfer, hybridization, or recombination-has influenced the evolution for a set of present-day species. It has been shown that inferring the minimum number of hybridization events that is needed to simultaneously explain the evolutionary history for a set of trees is an NP-hard and also fixed-parameter tractable problem. In this article, we give a new fixed-parameter algorithm for computing the minimum number of hybridization events for when two rooted binary phylogenetic trees are given. This newly developed algorithm is based on interleaving-a technique using repeated kernelization steps that are applied throughout the exhaustive search part of a fixed-parameter algorithm. To show that our algorithm runs efficiently to be applicable to a wide range of practical problem instances, we apply it to a grass data set and highlight the significant improvements in terms of running times in comparison to an algorithm that has previously been implemented. © 2011, Mary Ann Liebert, Inc."
|
|
|
Mark A. Ragan. Trees and networks before and after Darwin. In Biology Direct, Vol. 4(43), 2009. Keywords: abstract network, explicit network, phylogenetic network, phylogeny, survey, visualization. Note: http://dx.doi.org/10.1186/1745-6150-4-43.
"It is well-known that Charles Darwin sketched abstract trees of relationship in his 1837 notebook, and depicted a tree in the Origin of Species (1859). Here I attempt to place Darwin's trees in historical context. By the mid-Eighteenth century the Great Chain of Being was increasingly seen to be an inadequate description of order in nature, and by about 1780 it had been largely abandoned without a satisfactory alternative having been agreed upon. In 1750 Donati described aquatic and terrestrial organisms as forming a network, and a few years later Buffon depicted a network of genealogical relationships among breeds of dogs. In 1764 Bonnet asked whether the Chain might actually branch at certain points, and in 1766 Pallas proposed that the gradations among organisms resemble a tree with a compound trunk, perhaps not unlike the tree of animal life later depicted by Eichwald. Other trees were presented by Augier in 1801 and by Lamarck in 1809 and 1815, the latter two assuming a transmutation of species over time. Elaborate networks of affinities among plants and among animals were depicted in the late Eighteenth and very early Nineteenth centuries. In the two decades immediately prior to 1837, so-called affinities and/or analogies among organisms were represented by diverse geometric figures. Series of plant and animal fossils in successive geological strata were represented as trees in a popular textbook from 1840, while in 1858 Bronn presented a system of animals, as evidenced by the fossil record, in a form of a tree. Darwin's 1859 tree and its subsequent elaborations by Haeckel came to be accepted in many but not all areas of biological sciences, while network diagrams were used in others. Beginning in the early 1960s trees were inferred from protein and nucleic acid sequences, but networks were re-introduced in the mid-1990s to represent lateral genetic transfer, increasingly regarded as a fundamental mode of evolution at least for bacteria and archaea. 
In historical context, then, the Network of Life preceded the Tree of Life and might again supersede it. Reviewers: This article was reviewed by Eric Bapteste, Patrick Forterre and Dan Graur. © 2009 Ragan; licensee BioMed Central Ltd."
|
|
|
Joel Velasco and
Elliott Sober. Testing for Treeness: Lateral Gene Transfer, Phylogenetic Inference, and Model Selection. In Biology and Philosophy, Vol. 25(4):675-687, 2010. Keywords: explicit network, model selection, phylogenetic network, phylogeny, reconstruction, statistical model. Note: http://joelvelasco.net/Papers/velascosober-testingfortreeness.pdf.
"A phylogeny that allows for lateral gene transfer (LGT) can be thought of as a strictly branching tree (all of whose branches are vertical) to which lateral branches have been added. Given that the goal of phylogenetics is to depict evolutionary history, we should look for the best supported phylogenetic network and not restrict ourselves to considering trees. However, the obvious extensions of popular tree-based methods such as maximum parsimony and maximum likelihood face a serious problem-if we judge networks by fit to data alone, networks that have lateral branches will always fit the data at least as well as any network that restricts itself to vertical branches. This is analogous to the well-studied problem of overfitting data in the curve-fitting problem. Analogous problems often have analogous solutions and we propose to treat network inference as a case of model selection and use the Akaike Information Criterion (AIC). Strictly tree-like networks are more parsimonious than those that postulate lateral as well as vertical branches. This leads to the conclusion that we should not always infer LGT events whenever it would improve our fit-to-data, but should do so only when the improved fit is larger than the penalty for adding extra lateral branches. © 2010 Springer Science+Business Media B.V."
|
|
|
David A. Morrison. Using data-display networks for exploratory data analysis in phylogenetic studies. In MBE, Vol. 27(5):1044-1057, 2010. Keywords: abstract network, hybridization, NeighborNet, Program SplitsTree, recombination, split decomposition. Note: http://dx.doi.org/10.1093/molbev/msp309.
"Exploratory data analysis (EDA) is a frequently undervalued part of data analysis in biology. It involves evaluating the characteristics of the data "before" proceeding to the definitive analysis in relation to the scientific question at hand. For phylogenetic analyses, a useful tool for EDA is a data-display network. This type of network is designed to display any character (or tree) conflict in a data set, without prior assumptions about the causes of those conflicts. The conflicts might be caused by 1) methodological issues in data collection or analysis, 2) homoplasy, or 3) horizontal gene flow of some sort. Here, I explore 13 published data sets using splits networks, as examples of using data-display networks for EDA. In each case, I performed an original EDA on the data provided, to highlight the aspects of the resulting network that will be important for an interpretation of the phylogeny. In each case, there is at least one important point (possibly missed by the original authors) that might affect the phylogenetic analysis. I conclude that EDA should play a greater role in phylogenetic analyses than it has done. © 2010 The Author. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved."
|
|
|
Leo van Iersel and
Steven Kelk. Constructing the Simplest Possible Phylogenetic Network from Triplets. In ALG, Vol. 60(2):207-235, 2011. Keywords: explicit network, from triplets, galled tree, level k phylogenetic network, minimum number, phylogenetic network, phylogeny, polynomial, Program Marlon, Program Simplistic. Note: http://dx.doi.org/10.1007/s00453-009-9333-0.
"A phylogenetic network is a directed acyclic graph that visualizes an evolutionary history containing so-called reticulations such as recombinations, hybridizations or lateral gene transfers. Here we consider the construction of a simplest possible phylogenetic network consistent with an input set T, where T contains at least one phylogenetic tree on three leaves (a triplet) for each combination of three taxa. To quantify the complexity of a network we consider both the total number of reticulations and the number of reticulations per biconnected component, called the level of the network. We give polynomial-time algorithms for constructing a level-1 respectively a level-2 network that contains a minimum number of reticulations and is consistent with T (if such a network exists). In addition, we show that if T is precisely equal to the set of triplets consistent with some network, then we can construct such a network with smallest possible level in time O(|T| k+1), if k is a fixed upper bound on the level of the network. © 2009 The Author(s)."
|
|
|
Simone Linz,
Charles Semple and
Tanja Stadler. Analyzing and reconstructing reticulation networks under timing constraints. In JOMB, Vol. 61(5):715-737, 2010. Keywords: explicit network, from rooted trees, hybridization, lateral gene transfer, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://dx.doi.org/10.1007/s00285-009-0319-y.
"Reticulation networks are now frequently used to model the history of life for various groups of species whose evolutionary past is likely to include reticulation events such as horizontal gene transfer or hybridization. However, the reconstructed networks are rarely guaranteed to be temporal. If a reticulation network is temporal, then it satisfies the two biologically motivated timing constraints of instantaneously occurring reticulation events and successively occurring speciation events. On the other hand, if a reticulation network is not temporal, it is always possible to make it temporal by adding a number of additional unsampled or extinct taxa. In the first half of the paper, we show that deciding whether a given number of additional taxa is sufficient to transform a non-temporal reticulation network into a temporal one is an NP-complete problem. As one is often given a set of gene trees instead of a network in the context of hybridization, this motivates the second half of the paper which provides an algorithm, called TemporalHybrid, for reconstructing a temporal hybridization network that simultaneously explains the ancestral history of two trees or indicates that no such network exists. We further derive two methods to decide whether or not a temporal hybridization network exists for two given trees and illustrate one of the methods on a grass data set. © 2009 The Author(s)."
|
|
|
Martin Lott,
Andreas Spillner,
Katharina Huber,
Anna Petri,
Bengt Oxelman and
Vincent Moulton. Inferring polyploid phylogenies from multiply-labeled gene trees. In BMCEB, Vol. 9:216, 2009. Keywords: duplication, explicit network, from multilabeled tree, phylogenetic network, phylogeny, Program PADRE, reconstruction. Note: http://dx.doi.org/10.1186/1471-2148-9-216.
"Background : Gene trees that arise in the context of reconstructing the evolutionary history of polyploid species are often multiply-labeled, that is, the same leaf label can occur several times in a single tree. This property considerably complicates the task of forming a consensus of a collection of such trees compared to usual phylogenetic trees. Results. We present a method for computing a consensus tree of multiply-labeled trees. As with the well-known greedy consensus tree approach for phylogenetic trees, our method first breaks the given collection of gene trees into a set of clusters. It then aims to insert these clusters one at a time into a tree, starting with the clusters that are supported by most of the gene trees. As the problem to decide whether a cluster can be inserted into a multiply-labeled tree is computationally hard, we have developed a heuristic method for solving this problem. Conclusion. We illustrate the applicability of our method using two collections of trees for plants of the genus Silene, that involve several allopolyploids at different levels. © 2009 Lott et al; licensee BioMed Central Ltd."
|
|
|
Tal Dagan,
Yael Artzy-Randrup and
William Martin. Modular networks and cumulative impact of lateral transfer in prokaryote genome evolution. In PNAS, Vol. 105:10039-10044, 2008. Keywords: from sequences, from species tree, heuristic, lateral gene transfer, phylogenetic network, phylogeny, reconstruction. Note: http://dx.doi.org/10.1073/pnas.0800679105.
"Lateral gene transfer is an important mechanism of natural variation among prokaryotes, but the significance of its quantitative contribution to genome evolution is debated. Here, we report networks that capture both vertical and lateral components of evolutionary history among 539,723 genes distributed across 181 sequenced prokaryotic genomes. Partitioning of these networks by an eigenspectrum analysis identifies community structure in prokaryotic gene-sharing networks, the modules of which do not correspond to a strictly hierarchical prokaryotic classification. Our results indicate that, on average, at least 81 ± 15% of the genes in each genome studied were involved in lateral gene transfer at some point in their history, even though they can be vertically inherited after acquisition, uncovering a substantial cumulative effect of lateral gene transfer on longer evolutionary time scales. © 2008 by The National Academy of Sciences of the USA."
|
|
|
Hans-Jürgen Bandelt and
Arne Dür. Translating DNA data tables into quasi-median networks for parsimony analysis and error detection. In MPE, Vol. 42(1):256-271, 2007. Keywords: abstract network, from sequences, parsimony, phylogenetic network, phylogeny, quasi-median network, reconstruction. Note: http://dx.doi.org/10.1016/j.ympev.2006.07.013.
"Every DNA data table can be turned into a quasi-median network that faithfully represents the data. We show that for (weighted) condensed data tables the associated network harbors all most parsimonious reconstructions for any tree that connects the sampled haplotypes. Structural features of this network can be computed directly from the data table. The key principle repeatedly used is that the quasi-median network is uniquely determined by the sub-tables for pairs of characters. The translation of a table into a network enhances the understanding of the properties of the data in regard to homoplasy and potential artifacts. The total number of nodes of such a network measures the complexity of the data. In particular, networks that display the results of filter analyses by which hotspot mutations are removed help to detect data idiosyncrasies and thus pinpoint sequencing problems. A pertinent example drawn from human mtDNA illustrates these points. © 2006 Elsevier Inc. All rights reserved."
|
|
|
Leo van Iersel and
Steven Kelk. When two trees go to war. In JTB, Vol. 269(1):245-255, 2011. Keywords: APX hard, explicit network, from clusters, from rooted trees, from sequences, from triplets, level k phylogenetic network, minimum number, NP complete, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://arxiv.org/abs/1004.5332.
"Rooted phylogenetic networks are used to model non-treelike evolutionary histories. Such networks are often constructed by combining trees, clusters, triplets or characters into a single network that in some well-defined sense simultaneously represents them all. We review these four models and investigate how they are related. Motivated by the parsimony principle, one often aims to construct a network that contains as few reticulations (non-treelike evolutionary events) as possible. In general, the model chosen influences the minimum number of reticulation events required. However, when one obtains the input data from two binary (i.e. fully resolved) trees, we show that the minimum number of reticulations is independent of the model. The number of reticulations necessary to represent the trees, triplets, clusters (in the softwired sense) and characters (with unrestricted multiple crossover recombination) are all equal. Furthermore, we show that these results also hold when not the number of reticulations but the level of the constructed network is minimised. We use these unification results to settle several computational complexity questions that have been open in the field for some time. We also give explicit examples to show that already for data obtained from three binary trees the models begin to diverge. © 2010 Elsevier Ltd."
|
|
|
Robert G. Beiko. Gene sharing and genome evolution: networks in trees and trees in networks. In Biology and Philosophy, Vol. 25(4):659-673, 2010. Keywords: abstract network, explicit network, from rooted trees, galled network, phylogenetic network, phylogeny, Program Dendroscope, Program SplitsTree, reconstruction, split network, survey. Note: http://dx.doi.org/10.1007/s10539-010-9217-3.
"Frequent lateral genetic transfer undermines the existence of a unique "tree of life" that relates all organisms. Vertical inheritance is nonetheless of vital interest in the study of microbial evolution, and knowing the "tree of cells" can yield insights into ecological continuity, the rates of change of different cellular characters, and the evolutionary plasticity of genomes. Notwithstanding within-species recombination, the relationships most frequently recovered from genomic data at shallow to moderate taxonomic depths are likely to reflect cellular inheritance. At the same time, it is clear that several types of 'average signals' from whole genomes can be highly misleading, and the existence of a central tendency must not be taken as prima facie evidence of vertical descent. Phylogenetic networks offer an attractive solution, since they can be formulated in ways that mitigate the misleading aspects of hybrid evolutionary signals in genomes. But the connections in a network typically show genetic relatedness without distinguishing between vertical and lateral inheritance of genetic material. The solution may lie in a compromise between strict tree-thinking and network paradigms: build a phylogenetic network, but identify the set of connections in the network that are potentially due to vertical descent. Even if a single tree cannot be unambiguously identified, choosing a subnetwork of putative vertical connections can still lead to drastic reductions in the set of candidate vertical hypotheses. © 2010 Springer Science+Business Media B.V."
|
|
|
Miguel Arenas,
Mateus Patricio,
David Posada and
Gabriel Valiente. Characterization of Phylogenetic Networks with NetTest. In BMCB, Vol. 11:268, 2010. Keywords: explicit network, galled tree, phylogenetic network, Program NetTest, software, time consistent network, tree sibling network, tree-child network, visualization. Note: http://dx.doi.org/10.1186/1471-2105-11-268, software available at http://darwin.uvigo.es/software/nettest/.
"Background: Typical evolutionary events like recombination, hybridization or gene transfer make necessary the use of phylogenetic networks to properly depict the evolution of DNA and protein sequences. Although several theoretical classes have been proposed to characterize these networks, they make stringent assumptions that will likely not be met by the evolutionary process. We have recently shown that the complexity of simulated networks is a function of the population recombination rate, and that at moderate and large recombination rates the resulting networks cannot be categorized. However, we do not know whether these results extend to networks estimated from real data.Results: We introduce a web server for the categorization of explicit phylogenetic networks, including the most relevant theoretical classes developed so far. Using this tool, we analyzed statistical parsimony phylogenetic networks estimated from ~5,000 DNA alignments, obtained from the NCBI PopSet and Polymorphix databases. The level of characterization was correlated to nucleotide diversity, and a high proportion of the networks derived from these data sets could be formally characterized.Conclusions: We have developed a public web server, NetTest (freely available from the software section at http://darwin.uvigo.es), to formally characterize the complexity of phylogenetic networks. Using NetTest we found that most statistical parsimony networks estimated with the program TCS could be assigned to a known network class. The level of network characterization was correlated to nucleotide diversity and dependent upon the intra/interspecific levels, although no significant differences were detected among genes. More research on the properties of phylogenetic networks is clearly needed. © 2010 Arenas et al; licensee BioMed Central Ltd."
|
|
|
Stephen J. Willson. CSD Homomorphisms Between Phylogenetic Networks. In TCBB, Vol. 9(4), 2012. Keywords: explicit network, from network, from quartets, phylogenetic network. Note: http://www.public.iastate.edu/~swillson/Relationships11IEEE.pdf, preliminary version entitled Relationships Among Phylogenetic Networks.
"Since Darwin, species trees have been used as a simplified description of the relationships which summarize the complicated network N of reality. Recent evidence of hybridization and lateral gene transfer, however, suggest that there are situations where trees are inadequate. Consequently it is important to determine properties that characterize networks closely related to N and possibly more complicated than trees but lacking the full complexity of N. A connected surjective digraph map (CSD) is a map f from one network N to another network M such that every arc is either collapsed to a single vertex or is taken to an arc, such that f is surjective, and such that the inverse image of a vertex is always connected. CSD maps are shown to behave well under composition. It is proved that if there is a CSD map from N to M, then there is a way to lift an undirected version of M into N, often with added resolution. A CSD map from N to M puts strong constraints on N. In general, it may be useful to study classes of networks such that, for any N, there exists a CSD map from N to some standard member of that class. © 2012 IEEE."
|
|
|
Hyun Jung Park,
Guohua Jin and
Luay Nakhleh. Bootstrap-based Support of HGT Inferred by Maximum Parsimony. In BMCEB, Vol. 10:131, 2010. Keywords: bootstrap, explicit network, from sequences, lateral gene transfer, parsimony, phylogenetic network, phylogeny, Program Nepal, reconstruction. Note: http://dx.doi.org/10.1186/1471-2148-10-131.
"Background. Maximum parsimony is one of the most commonly used criteria for reconstructing phylogenetic trees. Recently, Nakhleh and co-workers extended this criterion to enable reconstruction of phylogenetic networks, and demonstrated its application to detecting reticulate evolutionary relationships. However, one of the major problems with this extension has been that it favors more complex evolutionary relationships over simpler ones, thus having the potential for overestimating the amount of reticulation in the data. An ad hoc solution to this problem that has been used entails inspecting the improvement in the parsimony length as more reticulation events are added to the model, and stopping when the improvement is below a certain threshold. Results. In this paper, we address this problem in a more systematic way, by proposing a nonparametric bootstrap-based measure of support of inferred reticulation events, and using it to determine the number of those events, as well as their placements. A number of samples is generated from the given sequence alignment, and reticulation events are inferred based on each sample. Finally, the support of each reticulation event is quantified based on the inferences made over all samples. Conclusions. We have implemented our method in the NEPAL software tool (available publicly at http://bioinfo.cs.rice.edu/), and studied its performance on both biological and simulated data sets. While our studies show very promising results, they also highlight issues that are inherently challenging when applying the maximum parsimony criterion to detect reticulate evolution. © 2010 Park et al; licensee BioMed Central Ltd."
|
|
|
Stephen J. Willson. Restricted trees: simplifying networks with bottlenecks. In BMB, Vol. 73(10):2322-2338, 2011. Keywords: from network, phylogenetic network. Note: http://arxiv.org/abs/1005.4956.
"Suppose N is a phylogenetic network indicating a complicated relationship among individuals and taxa. Often of interest is a much simpler network, for example, a species tree T, that summarizes the most fundamental relationships. The meaning of a species tree is made more complicated by the recent discovery of the importance of hybridizations and lateral gene transfers. Hence, it is desirable to describe uniform well-defined procedures that yield a tree given a network N. A useful tool toward this end is a connected surjective digraph (CSD) map φ:N→N′ where N′ is generally a much simpler network than N. A set W of vertices in N is "restricted" if there is at most one vertex u∉W from which there is an arc into W, thus yielding a bottleneck in N. A CSD map φ:N→N′ is "restricted" if the inverse image of each vertex in N′ is restricted in N. This paper describes a uniform procedure that, given a network N, yields a well-defined tree called the "restricted tree" of N. There is a restricted CSD map from N to the restricted tree. Many relationships in the tree can be proved to appear also in N. © 2011 The Author(s)."
|
|
|
Leo van Iersel,
Charles Semple and
Mike Steel. Locating a tree in a phylogenetic network. In IPL, Vol. 110(23), 2010. Keywords: cluster containment, explicit network, from network, level k phylogenetic network, normal network, NP complete, phylogenetic network, polynomial, regular network, time consistent network, tree containment, tree sibling network, tree-child network. Note: http://arxiv.org/abs/1006.3122.
"Phylogenetic trees and networks are leaf-labelled graphs that are used to describe evolutionary histories of species. The Tree Containment problem asks whether a given phylogenetic tree is embedded in a given phylogenetic network. Given a phylogenetic network and a cluster of species, the Cluster Containment problem asks whether the given cluster is a cluster of some phylogenetic tree embedded in the network. Both problems are known to be NP-complete in general. In this article, we consider the restriction of these problems to several well-studied classes of phylogenetic networks. We show that Tree Containment is polynomial-time solvable for normal networks, for binary tree-child networks, and for level-k networks. On the other hand, we show that, even for tree-sibling, time-consistent, regular networks, both Tree Containment and Cluster Containment remain NP-complete. © 2010 Elsevier B.V. All rights reserved."
|
|
|
Sophie Abby,
Eric Tannier,
Manolo Gouy and
Vincent Daubin. Detecting lateral gene transfers by statistical reconciliation of phylogenetic forests. In BMCB, Vol. 11:324, 2010. Keywords: agreement forest, explicit network, from rooted trees, from species tree, heuristic, lateral gene transfer, phylogenetic network, phylogeny, Program EEEP, Program PhyloNet, Program Prunier, reconstruction, software. Note: http://www.biomedcentral.com/1471-2105/11/324.
"Background: To understand the evolutionary role of Lateral Gene Transfer (LGT), accurate methods are needed to identify transferred genes and infer their timing of acquisition. Phylogenetic methods are particularly promising for this purpose, but the reconciliation of a gene tree with a reference (species) tree is computationally hard. In addition, the application of these methods to real data raises the problem of sorting out real and artifactual phylogenetic conflict.Results: We present Prunier, a new method for phylogenetic detection of LGT based on the search for a maximum statistical agreement forest (MSAF) between a gene tree and a reference tree. The program is flexible as it can use any definition of "agreement" among trees. We evaluate the performance of Prunier and two other programs (EEEP and RIATA-HGT) for their ability to detect transferred genes in realistic simulations where gene trees are reconstructed from sequences. Prunier proposes a single scenario that compares to the other methods in terms of sensitivity, but shows higher specificity. We show that LGT scenarios carry a strong signal about the position of the root of the species tree and could be used to identify the direction of evolutionary time on the species tree. We use Prunier on a biological dataset of 23 universal proteins and discuss their suitability for inferring the tree of life.Conclusions: The ability of Prunier to take into account branch support in the process of reconciliation allows a gain in complexity, in comparison to EEEP, and in accuracy in comparison to RIATA-HGT. Prunier's greedy algorithm proposes a single scenario of LGT for a gene family, but its quality always compares to the best solutions provided by the other algorithms. 
When the root position is uncertain in the species tree, Prunier is able to infer a scenario per root at a limited additional computational cost and can easily run on large datasets.Prunier is implemented in C++, using the Bio++ library and the phylogeny program Treefinder. It is available at: http://pbil.univ-lyon1.fr/software/prunier. © 2010 Abby et al; licensee BioMed Central Ltd."
|
|
|
Laura S. Kubatko. Identifying Hybridization Events in the Presence of Coalescence via Model Selection. In Systematic Biology, Vol. 58(5):478-488, 2009. Keywords: AIC, BIC, branch length, coalescent, explicit network, from rooted trees, from species tree, hybridization, lineage sorting, model selection, phylogenetic network, phylogeny, statistical model. Note: http://dx.doi.org/10.1093/sysbio/syp055.
|
|
|
Tao Sang and
Yang Zhong. Testing Hybridization Hypotheses Based on Incongruent Gene Trees. In Systematic Biology, Vol. 49(3):422-434, 2000. Keywords: bootstrap, from rooted trees, hybridization, lateral gene transfer, lineage sorting, phylogenetic network, phylogeny, reconstruction, statistical model. Note: http://dx.doi.org/10.1080/10635159950127321.
|
|
|
Chen Meng and
Laura S. Kubatko. Detecting hybrid speciation in the presence of incomplete lineage sorting using gene tree incongruence: A model. In Theoretical Population Biology, Vol. 75(1):35-45, 2009. Keywords: bayesian, coalescent, from network, from rooted trees, hybridization, likelihood, lineage sorting, phylogenetic network, phylogeny, statistical model. Note: http://dx.doi.org/10.1016/j.tpb.2008.10.004.
"The application of phylogenetic inference methods, to data for a set of independent genes sampled randomly throughout the genome, often results in substantial incongruence in the single-gene phylogenetic estimates. Among the processes known to produce discord between single-gene phylogenies, two of the best studied in a phylogenetic context are hybridization and incomplete lineage sorting. Much recent attention has focused on the development of methods for estimating species phylogenies in the presence of incomplete lineage sorting, but phylogenetic models that allow for hybridization have been more limited. Here we propose a model that allows incongruence in single-gene phylogenies to be due to both hybridization and incomplete lineage sorting, with the goal of determining the contribution of hybridization to observed gene tree incongruence in the presence of incomplete lineage sorting. Using our model, we propose methods for estimating the extent of the role of hybridization in both a likelihood and a Bayesian framework. The performance of our methods is examined using both simulated and empirical data. © 2008 Elsevier Inc. All rights reserved."
|
|
|
Nicolas Galtier. A model of horizontal gene transfer and the bacterial phylogeny problem. In Systematic Biology, Vol. 56(4):633-642, 2007. Keywords: explicit network, generation, lateral gene transfer, phylogenetic network, phylogeny, Program HGT_simul, software, statistical model. Note: http://dx.doi.org/10.1080/10635150701546231.
"How much horizontal gene transfer (HGT) between species influences bacterial phylogenomics is a controversial issue. This debate, however, lacks any quantitative assessment of the impact of HGT on phylogenies and of the ability of tree-building methods to cope with such events. I introduce a Markov model of genome evolution with HGT, accounting for the constraints on time: an HGT event can only occur between concomitantly living species. This model is used to simulate multigene sequence data sets with or without HGT. The consequences of HGT on phylogenomic inference are analyzed and compared to other well-known phylogenetic artefacts. It is found that supertree methods are quite robust to HGT, keeping high levels of performance even when gene trees are largely incongruent with each other. Gene tree incongruence per se is not indicative of HGT. HGT, however, removes the (otherwise observed) positive relationship between sequence length and gene tree congruence to the estimated species tree. Surprisingly, when applied to a bacterial and a eukaryotic multigene data set, this criterion rejects the HGT hypothesis for the former, but not the latter data set. Copyright © Society of Systematic Biologists."
|
|
|
Mihaela Baroni and
Mike Steel. Accumulation Phylogenies. In ACOM, Vol. 10(1):19-30, 2006. Keywords: abstract network, from clusters, from distances, phylogenetic network, phylogeny, polynomial, reconstruction, regular network. Note: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.137.1960.
"We investigate the computational complexity of a new combinatorial problem of inferring a smallest possible multi-labeled phylogenetic tree (MUL tree) which is consistent with each of the rooted triplets in a given set. We prove that even the restricted case of determining if there exists a MUL tree consistent with the input and having just one leaf duplication is NP-hard. Furthermore, we show that the general minimization problem is NP-hard to approximate within a ratio of n^(1-ε) for any constant 0<ε≤1, where n denotes the number of distinct leaf labels in the input set, although a simple polynomial-time approximation algorithm achieves the approximation ratio n. We also provide an exact algorithm for the problem running in O*(7^n) time and O*(3^n) space. © 2009 Springer-Verlag Berlin Heidelberg."
|
|
|
Mark T. Holder,
Jennifer A. Anderson and
Alisha K. Holloway. Difficulties in Detecting Hybridization. In Systematic Biology, Vol. 50(6):978-982, 2001. Keywords: bootstrap, from rooted trees, hybridization, lateral gene transfer, lineage sorting, phylogenetic network, phylogeny, reconstruction, statistical model. Note: http://dx.doi.org/10.1080/106351501753462911.
[No abstract available]
|
|
|
Marc Thuillard and
Vincent Moulton. Identifying and reconstructing lateral transfers from distance matrices by combining the Minimum Contradiction Method and Neighbor-Net. In JBCB, Vol. 9(4):453-470, 2011. Keywords: from distances, lateral gene transfer, minimum contradiction, NeighborNet, phylogenetic network, phylogeny, reconstruction. Note: http://dx.doi.org/10.1142/S0219720011005409, slides available at http://www.newton.ac.uk/programmes/PLG/seminars/062015501.html.
"Identifying lateral gene transfers is an important problem in evolutionary biology. Under a simple model of evolution, the expected values of an evolutionary distance matrix describing a phylogenetic tree fulfill the so-called Kalmanson inequalities. The Minimum Contradiction method for identifying lateral gene transfers exploits the fact that lateral transfers may generate large deviations from the Kalmanson inequalities. Here a new approach is presented to deal with such cases that combines the Neighbor-Net algorithm for computing phylogenetic networks with the Minimum Contradiction method. A subset of taxa, prescribed using Neighbor-Net, is obtained by measuring how closely the Kalmanson inequalities are fulfilled by each taxon. A criterion is then used to identify the taxa, possibly involved in a lateral transfer between nonconsecutive taxa. We illustrate the utility of the new approach by applying it to a distance matrix for Archaea, Bacteria, and Eukaryota. © 2011 Imperial College Press."
|
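The Kalmanson inequalities mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a symmetric distance matrix `d` and a candidate circular ordering `order` of the taxa; the function name `kalmanson_violations` and the tolerance parameter are hypothetical, and this is only the basic inequality check, not the Minimum Contradiction method itself. For taxa i, j, k, l appearing in that circular order, the inequalities require d(i,j) + d(k,l) ≤ d(i,k) + d(j,l) and d(i,l) + d(j,k) ≤ d(i,k) + d(j,l).

```python
from itertools import combinations


def kalmanson_violations(d, order, tol=1e-9):
    """Return the quadruples (i, j, k, l), taken in circular order,
    that violate at least one of the two Kalmanson inequalities."""
    violations = []
    # Every 4-subset of positions, kept in the order given by `order`.
    for a, b, c, e in combinations(range(len(order)), 4):
        i, j, k, l = order[a], order[b], order[c], order[e]
        crossing = d[i][k] + d[j][l]  # the "crossing" pairs must be largest
        if d[i][j] + d[k][l] > crossing + tol or d[i][l] + d[j][k] > crossing + tol:
            violations.append((i, j, k, l))
    return violations


# Path-length distances on the quartet tree ((0,1),(2,3)) with unit edge lengths.
d = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]

print(kalmanson_violations(d, [0, 1, 2, 3]))  # ordering compatible with the tree
print(kalmanson_violations(d, [0, 2, 1, 3]))  # ordering that separates the cherries
```

A distance matrix that fits a tree admits at least one ordering with no violations; large violations under the best ordering are the kind of deviation the abstract associates with possible lateral transfers.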
|
|
Klaus Schliep. Phangorn: Phylogenetic analysis in R. In Bioinformatics, Vol. 27(4):592-593, 2011. Keywords: abstract network, from distances, phylogenetic network, Program Phangorn, software, split, split network. Note: http://dx.doi.org/10.1093/bioinformatics/btq706.
"Summary: phangorn is a package for phylogenetic reconstruction and analysis in the R language. Previously it was only possible to estimate phylogenetic trees with distance methods in R. phangorn now offers the possibility of reconstructing phylogenies with distance based methods, maximum parsimony or maximum likelihood (ML) and performing Hadamard conjugation. Extending the general ML framework, this package provides the possibility of estimating mixture and partition models. Furthermore, phangorn offers several functions for comparing trees, phylogenetic models or splits, simulating character data and performing congruence analyses. © The Author(s) 2010. Published by Oxford University Press."
|
|
|
|
|
Lavanya Kannan,
Hua Li and
Arcady Mushegian. A Polynomial-Time Algorithm Computing Lower and Upper Bounds of the Rooted Subtree Prune and Regraft Distance. In JCB, Vol. 18(5):743-757, 2011. Keywords: bound, minimum number, polynomial, SPR distance. Note: http://dx.doi.org/10.1089/cmb.2010.0045.
"Rooted, leaf-labeled trees are used in biology to represent hierarchical relationships of various entities, most notably the evolutionary history of molecules and organisms. Rooted Subtree Prune and Regraft (rSPR) operation is a tree rearrangement operation that is used to transform a tree into another tree that has the same set of leaf labels. The minimum number of rSPR operations that transform one tree into another is denoted by d_rSPR and gives a measure of dissimilarity between the trees, which can be used to compare trees obtained by different approaches, or, in the context of phylogenetic analysis, to detect horizontal gene transfer events by finding incongruences between trees of different evolving characters. The problem of computing the exact d_rSPR measure is NP-hard, and most algorithms resort to finding sequences of rSPR operations that are sufficient for transforming one tree into another, thereby giving upper bound heuristics for the distance. In this article, we present an O(n^4) recursive algorithm D-Clust that gives both lower bound and upper bound heuristics for the distance between trees with n shared leaves and also gives a sequence of operations that transforms one tree into another. Our experiments on simulated pairs of trees containing up to 100 leaves showed that the two bounds are almost equal for small distances, thereby giving the nearly-precise actual value, and that the upper bound tends to be close to the upper bounds given by other approaches for all pairs of trees. © Copyright 2011, Mary Ann Liebert, Inc."
|
|
|
Yun Yu,
Cuong Than,
James H. Degnan and
Luay Nakhleh. Coalescent Histories on Phylogenetic Networks and Detection of Hybridization Despite Incomplete Lineage Sorting. In Systematic Biology, Vol. 60(2):138-149, 2011. Keywords: coalescent, hybridization, lineage sorting, reconstruction, statistical model. Note: http://www.cs.rice.edu/~nakhleh/Papers/YuEtAl-SB11.pdf.
"Analyses of the increasingly available genomic data continue to reveal the extent of hybridization and its role in the evolutionary diversification of various groups of species. We show, through extensive coalescent-based simulations of multilocus data sets on phylogenetic networks, how divergence times before and after hybridization events can result in incomplete lineage sorting with gene tree incongruence signatures identical to those exhibited by hybridization. Evolutionary analysis of such data under the assumption of a species tree model can miss all hybridization events, whereas analysis under the assumption of a species network model would grossly overestimate hybridization events. These issues necessitate a paradigm shift in evolutionary analysis under these scenarios, from a model that assumes a priori a single source of gene tree incongruence to one that integrates multiple sources in a unifying framework. We propose a framework of coalescence within the branches of a phylogenetic network and show how this framework can be used to detect hybridization despite incomplete lineage sorting. We apply the model to simulated data and show that the signature of hybridization can be revealed as long as the interval between the divergence times of the species involved in hybridization is not too small. We reanalyze a data set of 106 loci from 7 in-group Saccharomyces species for which a species tree with no hybridization has been reported in the literature. Our analysis supports the hypothesis that hybridization occurred during the evolution of this group, explaining a large amount of the incongruence in the data. Our findings show that an integrative approach to gene tree incongruence and its reconciliation is needed. Our framework will help in systematically analyzing genomic data for the occurrence of hybridization and elucidating its evolutionary role. [Coalescent history; incomplete lineage sorting; hybridization; phylogenetic network.]. 
© 2011 The Author(s)."
|
|
|
Steven Kelk,
Celine Scornavacca and
Leo van Iersel. On the elusiveness of clusters. In TCBB, Vol. 9(2):517-534, 2012. Keywords: explicit network, from clusters, from rooted trees, from triplets, level k phylogenetic network, phylogenetic network, phylogeny, Program Clustistic, reconstruction, software. Note: http://arxiv.org/abs/1103.1834.
|
|
|
|
|
Jeremy G. Sumner,
Barbara R. Holland and
Peter D. Jarvis. The algebra of the general Markov model on phylogenetic trees and networks. In BMB, Vol. 74(4):858-880, 2012. Keywords: abstract network, phylogenetic network, phylogeny, split, split network, statistical model. Note: http://arxiv.org/abs/1012.5165.
"It is known that the Kimura 3ST model of sequence evolution on phylogenetic trees can be extended quite naturally to arbitrary split systems. However, this extension relies heavily on mathematical peculiarities of the associated Hadamard transformation, and providing an analogous augmentation of the general Markov model has thus far been elusive. In this paper, we rectify this shortcoming by showing how to extend the general Markov model on trees to include incompatible edges; and even further to more general network models. This is achieved by exploring the algebra of the generators of the continuous-time Markov chain together with the "splitting" operator that generates the branching process on phylogenetic trees. For simplicity, we proceed by discussing the two state case and then show that our results are easily extended to more states with little complication. Intriguingly, upon restriction of the two state general Markov model to the parameter space of the binary symmetric model, our extension is indistinguishable from the Hadamard approach only on trees; as soon as any incompatible splits are introduced the two approaches give rise to differing probability distributions with disparate structure. Through exploration of a simple example, we give an argument that our extension to more general networks has desirable properties that the previous approaches do not share. In particular, our construction allows for convergent evolution of previously divergent lineages; a property that is of significant interest for biological applications. © 2011 Society for Mathematical Biology."
|
|
|
|
|
Gergely J. Szöllösi and
Vincent Daubin. Modeling Gene Family Evolution and Reconciling Phylogenetic Discord. In Evolutionary Genomics, Statistical and Computational Methods, Volume 2, Methods in Molecular Biology, Vol. 856:29-51, Chapter 2, Springer, 2011. Keywords: duplication, from multilabeled tree, lateral gene transfer, likelihood, phylogeny, reconstruction, statistical model. Note: ArXiv version entitled The pattern and process of gene family evolution.
"Large-scale databases are available that contain homologous gene families constructed from hundreds of complete genome sequences from across the three domains of life. Here, we discuss the approaches of increasing complexity aimed at extracting information on the pattern and process of gene family evolution from such datasets. In particular, we consider the models that invoke processes of gene birth (duplication and transfer) and death (loss) to explain the evolution of gene families. First, we review birth-and-death models of family size evolution and their implications in light of the universal features of family size distribution observed across different species and the three domains of life. Subsequently, we proceed to recent developments on models capable of more completely considering information in the sequences of homologous gene families through the probabilistic reconciliation of the phylogenetic histories of individual genes with the phylogenetic history of the genomes in which they have resided. To illustrate the methods and results presented, we use data from the HOGENOM database, demonstrating that the distribution of homologous gene family sizes in the genomes of the eukaryota, archaea, and bacteria exhibits remarkably similar shapes. We show that these distributions are best described by models of gene family size evolution, where for individual genes the death (loss) rate is larger than the birth (duplication and transfer) rate but new families are continually supplied to the genome by a process of origination. Finally, we use probabilistic reconciliation methods to take into consideration additional information from gene phylogenies, and find that, for prokaryotes, the majority of birth events are the result of transfer. © 2012 Springer Science+Business Media, LLC."
|
|
|
Lawrence A. David and
Eric J. Alm. Rapid evolutionary innovation during an Archaean genetic expansion. In Nature, Vol. 469:93-96, 2011. Keywords: duplication, dynamic programming, from multilabeled tree, from rooted trees, from species tree, parsimony, phylogenetic network, phylogeny, Program Angst. Note: http://dx.doi.org/10.1038/nature09649, Program Angst described here.
|
|
|
Bui Quang Minh,
Steffen Klaere and
Arndt von Haeseler. Taxon Selection under Split Diversity. In Systematic Biology, Vol. 58(6):586-594, 2009. Keywords: abstract network, circular split system, diversity, from network, phylogenetic network, split network. Note: http://dx.doi.org/10.1093/sysbio/syp058.
"The phylogenetic diversity (PD) measure of biodiversity is evaluated using a phylogenetic tree, usually inferred from morphological or molecular data. Consequently, it is vulnerable to errors in that tree, including those resulting from sampling error, model misspecification, or conflicting signals. To improve the robustness of PD, we can evaluate the measure using either a collection (or distribution) of trees or a phylogenetic network. Recently, it has been shown that these 2 approaches are equivalent but that the problem of maximizing PD in the general concept is NP-hard. In this study, we provide an efficient dynamic programming algorithm for maximizing PD when splits in the trees or network form a circular split system. We illustrate our method using a case study of game birds (Galliformes) and discuss the different choices of taxa based on our approach and PD."
|
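To make the notion of phylogenetic diversity on a split system more concrete, here is a small sketch under stated assumptions: each split is given as one of its two sides plus a weight, and the diversity of a taxon set is taken to be the total weight of the splits that separate at least two of the chosen taxa. The function name `split_diversity`, the toy weights, and the brute-force search are all illustrative; in particular, the exhaustive search is not the dynamic programming algorithm the abstract refers to.

```python
from itertools import combinations


def split_diversity(splits, chosen):
    """Sum the weights of all splits that separate the chosen taxa.

    `splits` is a list of (side, weight) pairs, where `side` is one part
    of the bipartition and the other part is its complement.
    """
    chosen = set(chosen)
    total = 0.0
    for side, weight in splits:
        inside = chosen & set(side)
        # The split separates `chosen` if taxa fall on both sides of it.
        if inside and inside != chosen:
            total += weight
    return total


# Toy circular split system on taxa a, b, c, d (circular order a, b, c, d).
taxa = ["a", "b", "c", "d"]
splits = [
    ({"a"}, 1.0),
    ({"b"}, 1.0),
    ({"c"}, 1.0),
    ({"d"}, 1.0),
    ({"a", "b"}, 2.0),   # a,b | c,d
    ({"a", "d"}, 0.5),   # a,d | b,c
]

# Exhaustive search over all pairs; feasible only for tiny examples.
best_pair = max(combinations(taxa, 2), key=lambda s: split_diversity(splits, s))
print(best_pair, split_diversity(splits, best_pair))
```

On a tree, every split corresponds to an edge, so this quantity reduces to the usual PD (the length of the subtree spanning the chosen taxa).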
|
|
Bui Quang Minh,
Fabio Pardi,
Steffen Klaere and
Arndt von Haeseler. Budgeted Phylogenetic Diversity on Circular Split Systems. In TCBB, Vol. 6(1):22-29, 2009. Keywords: abstract network, circular split system, dynamic programming, from network, phylogenetic network, polynomial, split, split network. Note: http://dx.doi.org/10.1109/TCBB.2008.54.
"In the last 15 years, Phylogenetic Diversity (PD) has gained interest in the community of conservation biologists as a surrogate measure for assessing biodiversity. We have recently proposed two approaches to select taxa for maximizing PD, namely PD with budget constraints and PD on split systems. In this paper, we will unify these two strategies and present a dynamic programming algorithm to solve the unified framework of selecting taxa with maximal PD under budget constraints on circular split systems. An improved algorithm will also be given if the underlying split system is a tree. © 2006 IEEE."
|
|
|
Andreas Spillner,
Binh T. Nguyen and
Vincent Moulton. Constructing and Drawing Regular Planar Split Networks. In TCBB, Vol. 9(2):395-407, 2012. Keywords: abstract network, from splits, phylogenetic network, phylogeny, reconstruction, visualization. Note: slides and presentation available at http://www.newton.ac.uk/programmes/PLG/seminars/062111501.html.
"Split networks are commonly used to visualize collections of bipartitions, also called splits, of a finite set. Such collections arise, for example, in evolutionary studies. Split networks can be viewed as a generalization of phylogenetic trees and may be generated using the SplitsTree package. Recently, the NeighborNet method for generating split networks has become rather popular, in part because it is guaranteed to always generate a circular split system, which can always be displayed by a planar split network. Even so, labels must be placed on the "outside" of the network, which might be problematic in some applications. To help circumvent this problem, it can be helpful to consider so-called flat split systems, which can be displayed by planar split networks where labels are allowed on the inside of the network too. Here, we present a new algorithm that is guaranteed to compute a minimal planar split network displaying a flat split system in polynomial time, provided the split system is given in a certain format. We will also briefly discuss two heuristics that could be useful for analyzing phylogeographic data and that allow the computation of flat split systems in this format in polynomial time. © 2006 IEEE."
|
|
|
Paul Phipps and
Sergey Bereg. Optimizing Phylogenetic Networks for Circular Split Systems. In TCBB, Vol. 9(2):535-547, 2012. Keywords: abstract network, from distances, from splits, phylogenetic network, phylogeny, Program PhippsNetwork, reconstruction, software.
"We address the problem of realizing a given distance matrix by a planar phylogenetic network with a minimum number of faces. With the help of the popular software SplitsTree4, we start by approximating the distance matrix with a distance metric that is a linear combination of circular splits. The main results of this paper are the necessary and sufficient conditions for the existence of a network with a single face. We show how such a network can be constructed, and we present a heuristic for constructing a network with few faces using the first algorithm as the base case. Experimental results on biological data show that this heuristic algorithm can produce phylogenetic networks with far fewer faces than the ones computed by SplitsTree4, without affecting the approximation of the distance matrix. © 2012 IEEE."
|
|
|
Steven Kelk and
Celine Scornavacca. Constructing minimal phylogenetic networks from softwired clusters is fixed parameter tractable. In ALG, Vol. 68(4):886-915, 2014. Keywords: explicit network, FPT, from clusters, level k phylogenetic network, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1108.3653.
"Here we show that, given a set of clusters C on a set of taxa X, where |X|=n, it is possible to determine in time f(k)×poly(n) whether there exists a level-≤k network (i.e. a network where each biconnected component has reticulation number at most k) that represents all the clusters in C in the softwired sense, and if so to construct such a network. This extends a result from Kelk et al. (in IEEE/ACM Trans. Comput. Biol. Bioinform. 9:517-534, 2012) which showed that the problem is polynomial-time solvable for fixed k. By defining "k-reticulation generators" analogous to "level-k generators", we then extend this fixed parameter tractability result to the problem where k refers not to the level but to the reticulation number of the whole network. © 2012 Springer Science+Business Media New York."
|
|
|
Andreas Spillner and
Vincent Moulton. Optimal algorithms for computing edge weights in planar split-networks. In Journal of Applied Mathematics and Computing, Vol. 39(1-2):1-13, 2012. Keywords: abstract network, from distances, phylogenetic network, phylogeny, reconstruction, split, split network. Note: http://dx.doi.org/10.1007/s12190-011-0506-z.
"In phylogenetics, biologists commonly compute split networks when trying to better understand evolutionary data. These graph-theoretical structures represent collections of weighted bipartitions or splits of a finite set, and provide a means to display conflicting evolutionary signals. The weights associated to the splits are used to scale the edges in the network and are often computed using some distance matrix associated with the data. In this paper we present optimal polynomial time algorithms for three basic problems that arise in this context when computing split weights for planar split-networks. These generalize algorithms that have been developed for special classes of split networks (namely, trees and outer-labeled planar networks). As part of our analysis, we also derive a Crofton formula for full flat split systems, structures that naturally arise when constructing planar split-networks. © 2011 Korean Society for Computational and Applied Mathematics."
|
|
|
Magnus Bordewich and
Charles Semple. Budgeted Nature Reserve Selection with diversity feature loss and arbitrary split systems. In JOMB, Vol. 64(1):69-85, 2012. Keywords: abstract network, approximation, diversity, phylogenetic network, polynomial, split network. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BS11.pdf.
"Arising in the context of biodiversity conservation, the Budgeted Nature Reserve Selection (BNRS) problem is to select, subject to budgetary constraints, a set of regions to conserve so that the phylogenetic diversity (PD) of the set of species contained within those regions is maximized. Here PD is measured across either a single rooted tree or a single unrooted tree. Nevertheless, in both settings, this problem is NP-hard. However, it was recently shown that, for each setting, there is a polynomial-time (1-1/e)-approximation algorithm for it and that this algorithm is tight. In the first part of the paper, we consider two extensions of BNRS. In the rooted setting we additionally allow for the disappearance of features, for varying survival probabilities across species, and for PD to be measured across multiple trees. In the unrooted setting, we extend to arbitrary split systems. We show that, despite these additional allowances, there remains a polynomial-time (1-1/e)-approximation algorithm for each extension. In the second part of the paper, we resolve a complexity problem on computing PD across an arbitrary split system left open by Spillner et al. © 2011 Springer-Verlag."
|
|
|
|
|
Marta Melé,
Asif Javed,
Marc Pybus,
Francesc Calafell,
Laxmi Parida,
Jaume Bertranpetit and
Genographic Consortium. A New Method to Reconstruct Recombination Events at a Genomic Scale. In PLoS Computational Biology, Vol. 6(11):e1001010.1-13, 2010. Keywords: explicit network, from sequences, phylogenetic network, phylogeny. Note: http://dx.doi.org/10.1371/journal.pcbi.1001010.
"Recombination is one of the main forces shaping genome diversity, but the information it generates is often overlooked. A recombination event creates a junction between two parental sequences that may be transmitted to the subsequent generations. Just like mutations, these junctions carry evidence of the shared past of the sequences. We present the IRiS algorithm, which detects past recombination events from extant sequences and specifies the place of each recombination and which are the recombinant sequences. We have validated and calibrated IRiS for the human genome using coalescent simulations replicating standard human demographic history and a variable recombination rate model, and we have fine-tuned IRiS parameters to simultaneously optimize for false discovery rate, sensitivity, and accuracy in placing the recombination events in the sequence. Newer recombinations overwrite traces of past ones and our results indicate more recent recombinations are detected by IRiS with greater sensitivity. IRiS analysis of the MS32 region, previously studied using sperm typing, showed good concordance with estimated recombination rates. We also applied IRiS to haplotypes for 18 X-chromosome regions in HapMap Phase 3 populations. Recombination events detected for each individual were recoded as binary allelic states and combined into recotypes. Principal component analysis and multidimensional scaling based on recotypes reproduced the relationships between the eleven HapMap Phase III populations that can be expected from known human population history, thus further validating IRiS. We believe that our new method will contribute to the study of the distribution of recombination events across the genomes and, for the first time, it will allow the use of recombination as a genetic marker to study human genetic variation. © 2010 Melé et al."
|
|
|
Sagi Snir and
Edward Trifonov. A Novel Technique for Detecting Putative Horizontal Gene Transfer in the Sequence Space. In JCB, Vol. 17(11):1535-1548, 2010. Keywords: from sequences, phylogenetic network, phylogeny, reconstruction. Note: http://research.haifa.ac.il/~ssagi/published%20papers/JCB-HGT.pdf.
"Horizontal transfer (HT) is the event of a DNA sequence being transferred between species not by inheritance. This phenomenon violates the tree-like evolution of the species under study turning the trees into networks. At the sequence level, HT offers basic characteristics that enable not only clear identification and distinguishing from other sequence similarity cases but also the possibility of dating the events. We developed a novel, self-contained technique to identify relatively recent horizontal transfer elements (HTEs) in the sequences. Appropriate formalism allows one to obtain confidence values for the events detected. The technique does not rely on such problematic prerequisites as reliable phylogeny and/or statistically justified pairwise sequence alignment. In conjunction with the unique properties of HT, it gives rise to a two-level sequence similarity algorithm that, to the best of our knowledge, has not been explored. From evolutionary perspective, the novelty of the work is in the combination of small scale and large scale mutational events. The technique is employed on both simulated and real biological data. The simulation results show high capability of discriminating between HT and conserved regions. On the biological data, the method detected documented HTEs along with their exact locations in the recipient genomes. Supplementary Material is available online at www.libertonline.com/cmb. Copyright 2010, Mary Ann Liebert, Inc."
|
|
|
Mukul S. Bansal,
Guy Banay,
J. Peter Gogarten and
Ron Shamir. Detecting Highways of Horizontal Gene Transfer. In JCB, Vol. 18(9):1087-1114, 2011. Keywords: explicit network, from rooted trees, from species tree, lateral gene transfer, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://people.csail.mit.edu/mukul/HighwayFull_preprint.pdf.
"In a horizontal gene transfer (HGT) event, a gene is transferred between two species that do not have an ancestor-descendant relationship. Typically, no more than a few genes are horizontally transferred between any two species. However, several studies identified pairs of species between which many different genes were horizontally transferred. Such a pair is said to be linked by a highway of gene sharing. We present a method for inferring such highways. Our method is based on the fact that the evolutionary histories of horizontally transferred genes disagree with the corresponding species phylogeny. Specifically, given a set of gene trees and a trusted rooted species tree, each gene tree is first decomposed into its constituent quartet trees and the quartets that are inconsistent with the species tree are identified. Our method finds a pair of species such that a highway between them explains the largest (normalized) fraction of inconsistent quartets. For a problem on n species and m input quartet trees, we give an efficient O(m+n^2)-time algorithm for detecting highways, which is optimal with respect to the quartets input size. An application of our method to a dataset of 1128 genes from 11 cyanobacterial species, as well as to simulated datasets, illustrates the efficacy of our method. © 2011, Mary Ann Liebert, Inc."
|
|
|
Celine Scornavacca,
Simone Linz and
Benjamin Albrecht. A first step towards computing all hybridization networks for two rooted binary phylogenetic trees. In JCB, Vol. 19:1227-1242, 2012. Keywords: agreement forest, explicit network, FPT, from rooted trees, phylogenetic network, phylogeny, Program Dendroscope, Program Hybroscale, reconstruction. Note: http://arxiv.org/abs/1109.3268.
Toggle abstract
"Recently, considerable effort has been put into developing fast algorithms to reconstruct a rooted phylogenetic network that explains two rooted phylogenetic trees and has a minimum number of hybridization vertices. With the standard app1235roach to tackle this problem being combinatorial, the reconstructed network is rarely unique. From a biological point of view, it is therefore of importance to not only compute one network, but all possible networks. In this article, we make a first step toward approaching this goal by presenting the first algorithm-called allMAAFs-that calculates all maximum-acyclic-agreement forests for two rooted binary phylogenetic trees on the same set of taxa. © Copyright 2012, Mary Ann Liebert, Inc. 2012."
|
|
|
Katharina Huber and
Vincent Moulton. Encoding and Constructing 1-Nested Phylogenetic Networks with Trinets. In ALG, Vol. 66(3):714-738, 2013. Keywords: explicit network, from subnetworks, from trinets, phylogenetic network, phylogeny, reconstruction, uniqueness. Note: http://arxiv.org/abs/1110.0728.
Toggle abstract
"Phylogenetic networks are a generalization of phylogenetic trees that are used in biology to represent reticulate or non-treelike evolution. Recently, several algorithms have been developed which aim to construct phylogenetic networks from biological data using triplets, i.e. binary phylogenetic trees on 3-element subsets of a given set of species. However, a fundamental problem with this approach is that the triplets displayed by a phylogenetic network do not necessarily uniquely determine or encode the network. Here we propose an alternative approach to encoding and constructing phylogenetic networks, which uses phylogenetic networks on 3-element subsets of a set, or trinets, rather than triplets. More specifically, we show that for a special, well-studied type of phylogenetic network called a 1-nested network, the trinets displayed by a 1-nested network always encode the network. We also present an efficient algorithm for deciding whether a dense set of trinets (i.e. one that contains a trinet on every 3-element subset of a set) can be displayed by a 1-nested network or not and, if so, constructs that network. In addition, we discuss some potential new directions that this new approach opens up for constructing and comparing phylogenetic networks. © 2012 Springer Science+Business Media, LLC."
|
|
|
Simon Joly,
Patricia A. McLenachan and
Peter J. Lockhart. A Statistical Approach for Distinguishing Hybridization and Incomplete Lineage Sorting. In The American Naturalist, Vol. 174(2):E54-E70, 2009. Keywords: hybridization, lineage sorting, phylogenetic network, phylogeny, reconstruction, statistical model. Note: http://www.plantevolution.org/pdf/Joly&al_2009_AmNat.pdf.
Toggle abstract
"The extent and evolutionary significance of hybridization is difficult to evaluate because of the difficulty in distinguishing hybridization from incomplete lineage sorting. Here we present a novel parametric approach for statistically distinguishing hybridization from incomplete lineage sorting based on minimum genetic distances of a nonrecombining locus. It is based on the idea that the expected minimum genetic distance between sequences from two species is smaller for some hybridization events than for incomplete lineage sorting scenarios. When applied to empirical data sets, distributions can be generated for the minimum interspecies distances expected under incomplete lineage sorting using coalescent simulations. If the observed distance between sequences from two species is smaller than its predicted distribution, incomplete lineage sorting can be rejected and hybridization inferred. We demonstrate the power of the method using simulations and illustrate its application on New Zealand alpine buttercups (Ranunculus). The method is robust and complements existing approaches. Thus it should allow biologists to assess with greater accuracy the importance of hybridization in evolution. © 2009 by The University of Chicago."
|
|
|
Simon Joly. JML: Testing hybridization from species trees. In Molecular Ecology Resources, Vol. 12(1):179-184, 2012. Keywords: from species tree, hybridization, lineage sorting, phylogenetic network, phylogeny, Program JML, statistical model. Note: http://www.plantevolution.org/pdf/JMLpaper_accepted.pdf.
Toggle abstract
"I introduce the software jml that tests for the presence of hybridization in multispecies sequence data sets by posterior predictive checking following Joly, McLenachan and Lockhart (2009, American Naturalist e54). Although their method could potentially be applied on any data set, the lack of appropriate software made its application difficult. The software jml thus fills a need for an easy application of the method but also includes improvements such as the possibility to incorporate uncertainty in the species tree topology. The jml software uses a posterior distribution of species trees, population sizes and branch lengths to simulate replicate sequence data sets using the coalescent with no migration. A test quantity, defined as the minimum pairwise sequence distance between sequences of two species, is then evaluated on the simulated data sets and compared to the one estimated from the original data. Because the test quantity is a good predictor of hybridization events, departure from the bifurcating species tree model could be interpreted as evidence of hybridization. Software performance in terms of computing time is evaluated for several parameters. I also show an application example of the software for detecting hybridization among native diploid North American roses. © 2011 Blackwell Publishing Ltd."
|
|
|
Zhi-Zhong Chen and
Lusheng Wang. Algorithms for Reticulate Networks of Multiple Phylogenetic Trees. In TCBB, Vol. 9(2):372-384, 2012. Keywords: explicit network, from rooted trees, minimum number, phylogenetic network, phylogeny, Program CMPT, Program MaafB, reconstruction, software. Note: http://rnc.r.dendai.ac.jp/~chen/papers/rMaaf.pdf.
Toggle abstract
"A reticulate network N of multiple phylogenetic trees may have nodes with two or more parents (called reticulation nodes). There are two ways to define the reticulation number of N. One way is to define it as the number of reticulation nodes in N in this case, a reticulate network with the smallest reticulation number is called an optimal type-I reticulate network of the trees. The better way is to define it as the total number of parents of reticulation nodes in N minus the number of reticulation nodes in N ; in this case, a reticulate network with the smallest reticulation number is called an optimal type-II reticulate network of the trees. In this paper, we first present a fast fixed-parameter algorithm for constructing one or all optimal type-I reticulate networks of multiple phylogenetic trees. We then use the algorithm together with other ideas to obtain an algorithm for estimating a lower bound on the reticulation number of an optimal type-II reticulate network of the input trees. To our knowledge, these are the first fixed-parameter algorithms for the problems. We have implemented the algorithms in ANSI C, obtaining programs CMPT and MaafB. Our experimental data show that CMPT can construct optimal type-I reticulate networks rapidly and MaafB can compute better lower bounds for optimal type-II reticulate networks within shorter time than the previously best program PIRN designed by Wu. © 2006 IEEE."
|
|
|
Stephen J. Willson. Tree-average distances on certain phylogenetic networks have their weights uniquely determined. In ALMOB, Vol. 7(13), 2012. Keywords: from distances, from network, normal network, phylogenetic network, phylogeny, reconstruction, tree-child network. Note: http://www.public.iastate.edu/~swillson/Tree-AverageDis10All.pdf.
Toggle abstract
"A phylogenetic network N has vertices corresponding to species and arcs corresponding to direct genetic inheritance from the species at the tail to the species at the head. Measurements of DNA are often made on species in the leaf set, and one seeks to infer properties of the network, possibly including the graph itself. In the case of phylogenetic trees, distances between extant species are frequently used to infer the phylogenetic trees by methods such as neighbor-joining.This paper proposes a tree-average distance for networks more general than trees. The notion requires a weight on each arc measuring the genetic change along the arc. For each displayed tree the distance between two leaves is the sum of the weights along the path joining them. At a hybrid vertex, each character is inherited from one of its parents. We will assume that for each hybrid there is a probability that the inheritance of a character is from a specified parent. Assume that the inheritance events at different hybrids are independent. Then for each displayed tree there will be a probability that the inheritance of a given character follows the tree; this probability may be interpreted as the probability of the tree. The tree-average distance between the leaves is defined to be the expected value of their distance in the displayed trees.For a class of rooted networks that includes rooted trees, it is shown that the weights and the probabilities at each hybrid vertex can be calculated given the network and the tree-average distances between the leaves. Hence these weights and probabilities are uniquely determined. The hypotheses on the networks include that hybrid vertices have indegree exactly 2 and that vertices that are not leaves have a tree-child. © 2012 Willson; licensee BioMed Central Ltd."
|
|
|
Jean-Philippe Doyon,
Vincent Ranwez,
Vincent Daubin and
Vincent Berry. Models, algorithms and programs for phylogeny reconciliation. In Briefings in Bioinformatics, Vol. 12(5):392-400, 2011. Keywords: explicit network, lateral gene transfer, phylogenetic network, phylogeny, reconstruction, survey.
Toggle abstract
"Gene sequences contain a goldmine of phylogenetic information. But unfortunately for taxonomists this information does not only tell the story of the species from which it was collected. Genes have their own complex histories which record speciation events, of course, but also many other events. Among them, gene duplications, transfers and losses are especially important to identify. These events are crucial to account for when reconstructing the history of species, and they play a fundamental role in the evolution of genomes, the diversification of organisms and the emergence of new cellular functions.We review reconciliations between gene and species trees, which are rigorous approaches for identifying duplications, transfers and losses that mark the evolution of a gene family. Existing reconciliation models and algorithms are reviewed and difficulties in modeling gene transfers are discussed. We also compare different reconciliation programs along with their advantages and disadvantages. © The Author 2011. Published by Oxford University Press."
|
|
|
Alix Boc and
Vladimir Makarenkov. Towards an accurate identification of mosaic genes and partial horizontal gene transfers. In NAR, Vol. 39(21):e144, 2011. Keywords: explicit network, from sequences, lateral gene transfer, phylogenetic network, phylogeny, Program T REX, reconstruction. Note: http://dx.doi.org/10.1093/nar/gkr735.
Toggle abstract
"Many bacteria and viruses adapt to varying environmental conditions through the acquisition of mosaic genes. A mosaic gene is composed of alternating sequence polymorphisms either belonging to the host original allele or derived from the integrated donor DNA. Often, the integrated sequence contains a selectable genetic marker (e.g. marker allowing for antibiotic resistance). An effective identification of mosaic genes and detection of corresponding partial horizontal gene transfers (HGTs) are among the most important challenges posed by evolutionary biology. We developed a method for detecting partial HGT events and related intragenic recombination giving rise to the formation of mosaic genes. A bootstrap procedure incorporated in our method is used to assess the support of each predicted partial gene transfer. The proposed method can be also applied to confirm or discard complete (i.e. traditional) horizontal gene transfers detected by any HGT inferring method. While working on a full-genome scale, the new method can be used to assess the level of mosaicism in the considered genomes as well as the rates of complete and partial HGT underlying their evolution. © 2011 The Author(s)."
|
|
|
Changiz Eslahchi,
Reza Hassanzadeh,
Ehsan Mottaghi,
Mahnaz Habibi,
Hamid Pezeshk and
Mehdi Sadeghi. Constructing circular phylogenetic networks from weighted quartets using simulated annealing. In MBIO, Vol. 235(2):123-127, 2012. Keywords: abstract network, from quartets, heuristic, phylogenetic network, phylogeny, Program SAQ-Net, Program SplitsTree, reconstruction, simulated annealing, software, split network. Note: http://dx.doi.org/10.1016/j.mbs.2011.11.003.
Toggle abstract
"In this paper, we present a heuristic algorithm based on the simulated annealing, SAQ-Net, as a method for constructing phylogenetic networks from weighted quartets. Similar to QNet algorithm, SAQ-Net constructs a collection of circular weighted splits of the taxa set. This collection is represented by a split network. In order to show that SAQ-Net performs better than QNet, we apply these algorithm to both the simulated and actual data sets containing salmonella, Bees, Primates and Rubber data sets. Then we draw phylogenetic networks corresponding to outputs of these algorithms using SplitsTree4 and compare the results. We find that SAQ-Net produces a better circular ordering and phylogenetic networks than QNet in most cases. SAQ-Net has been implemented in Matlab and is available for download at http://bioinf.cs.ipm.ac.ir/softwares/saq.net. © 2011 Elsevier Inc."
|
|
|
Benjamin Albrecht,
Celine Scornavacca,
Alberto Cenci and
Daniel H. Huson. Fast computation of minimum hybridization networks. In BIO, Vol. 28(2):191-197, 2012. Keywords: explicit network, from rooted trees, minimum number, phylogenetic network, phylogeny, Program Dendroscope, Program Hybroscale, reconstruction. Note: http://dx.doi.org/10.1093/bioinformatics/btr618.
Toggle abstract
"Motivation: Hybridization events in evolution may lead to incongruent gene trees. One approach to determining possible interspecific hybridization events is to compute a hybridization network that attempts to reconcile incongruent gene trees using a minimum number of hybridization events. Results: We describe how to compute a representative set of minimum hybridization networks for two given bifurcating input trees, using a parallel algorithm and provide a user-friendly implementation. A simulation study suggests that our program performs significantly better than existing software on biologically relevant data. Finally, we demonstrate the application of such methods in the context of the evolution of the Aegilops/Triticum genera. Availability and implementation: The algorithm is implemented in the program Dendroscope 3, which is freely available from www.dendroscope.org and runs on all three major operating systems. © The Author 2011. Published by Oxford University Press. All rights reserved."
|
|
|
|
|
Steven Kelk,
Leo van Iersel,
Nela Lekic,
Simone Linz,
Celine Scornavacca and
Leen Stougie. Cycle killer... qu'est-ce que c'est? On the comparative approximability of hybridization number and directed feedback vertex set. In SIDMA, Vol. 26(4):1635-1656, 2012. Keywords: agreement forest, approximation, explicit network, from rooted trees, minimum number, phylogenetic network, phylogeny, Program CycleKiller, reconstruction. Note: http://arxiv.org/abs/1112.5359, about the title.
Toggle abstract
"We show that the problem of computing the hybridization number of two rooted binary phylogenetic trees on the same set of taxa X has a constant factor polynomial-time approximation if and only if the problem of computing a minimum-size feedback vertex set in a directed graph (DFVS) has a constant factor polynomial-time approximation. The latter problem, which asks for a minimum number of vertices to be removed from a directed graph to transform it into a directed acyclic graph, is one of the problems in Karp's seminal 1972 list of 21 NP-complete problems. Despite considerable attention from the combinatorial optimization community, it remains to this day unknown whether a constant factor polynomial-time approximation exists for DFVS. Our result thus places the (in)approximability of hybridization number in a much broader complexity context, and as a consequence we obtain that it inherits inapproximability results from the problem Vertex Cover. On the positive side, we use results from the DFVS literature to give an O(log r log log r) approximation for the hybridization number where r is the correct value. Copyright © by SIAM."
|
|
|
Rosalba Radice. A Bayesian Approach to Modelling Reticulation Events with Application to the Ribosomal Protein Gene rps11 of Flowering Plants. In Australian & New Zealand Journal of Statistics, Vol. 54(4):401-426, 2012. Keywords: bayesian, phylogenetic network, phylogeny, reconstruction, statistical model.
Toggle abstract
"Traditional phylogenetic inference assumes that the history of a set of taxa can be explained by a tree. This assumption is often violated as some biological entities can exchange genetic material giving rise to non-treelike events often called reticulations. Failure to consider these events might result in incorrectly inferred phylogenies. Phylogenetic networks provide a flexible tool which allows researchers to model the evolutionary history of a set of organisms in the presence of reticulation events. In recent years, a number of methods addressing phylogenetic network parameter estimation have been introduced. Some of them are based on the idea that a phylogenetic network can be defined as a directed acyclic graph. Based on this definition, we propose a Bayesian approach to the estimation of phylogenetic network parameters which allows for different phylogenies to be inferred at different parts of a multiple DNA alignment. The algorithm is tested on simulated data and applied to the ribosomal protein gene rps11 data from five flowering plants, where reticulation events are suspected to be present. The proposed approach can be applied to a wide variety of problems which aim at exploring the possibility of reticulation events in the history of a set of taxa. © 2012 Australian Statistical Publishing Association Inc. Published by Wiley Publishing Asia Pty Ltd."
|
|
|
Philippe Gambette,
Vincent Berry and
Christophe Paul. Quartets and Unrooted Phylogenetic Networks. In JBCB, Vol. 10(4):1250004, 2012. Keywords: abstract network, circular split system, explicit network, from quartets, level k phylogenetic network, orientation, phylogenetic network, phylogeny, polynomial, reconstruction, split, split network. Note: http://hal.archives-ouvertes.fr/hal-00678046/en/.
Toggle abstract
"Phylogenetic networks were introduced to describe evolution in the presence of exchanges of genetic material between coexisting species or individuals. Split networks in particular were introduced as a special kind of abstract network to visualize conflicts between phylogenetic trees which may correspond to such exchanges. More recently, methods were designed to reconstruct explicit phylogenetic networks (whose vertices can be interpreted as biological events) from triplet data. In this article, we link abstract and explicit networks through their combinatorial properties, by introducing the unrooted analog of level-k networks. In particular, we give an equivalence theorem between circular split systems and unrooted level-1 networks. We also show how to adapt to quartets some existing results on triplets, in order to reconstruct unrooted level-k phylogenetic networks. These results give an interesting perspective on the combinatorics of phylogenetic networks and also raise algorithmic and combinatorial questions. © 2012 Imperial College Press."
|
|
|
Yun Yu,
James H. Degnan and
Luay Nakhleh. The probability of a gene tree topology within a phylogenetic network with applications to hybridization detection. In PLoS Genetics, Vol. 8(4):e1002660, 2012. Keywords: AIC, BIC, explicit network, hybridization, phylogenetic network, phylogeny, statistical model. Note: http://dx.doi.org/10.1371/journal.pgen.1002660.
Toggle abstract
"Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa. © 2012 Yu et al."
|
|
|
Bonnie Kirkpatrick,
Yakir Reshef,
Hilary Finucane,
Haitao Jiang,
Binhai Zhu and
Richard M. Karp. Comparing Pedigree Graphs. In JCB, Vol. 19(9):998-1014, 2012. Keywords: distance between networks, from network, pedigree. Note: http://arxiv.org/abs/1009.0909, preliminary version as poster at WABI 2010.
Toggle abstract
"Pedigree graphs, or family trees, are typically constructed by an expensive process of examining genealogical records to determine which pairs of individuals are parent and child. New methods to automate this process take as input genetic data from a set of extant individuals and reconstruct ancestral individuals. There is a great need to evaluate the quality of these methods by comparing the estimated pedigree to the true pedigree. In this article, we consider two main pedigree comparison problems. The first is the pedigree isomorphism problem, for which we present a linear-time algorithm for leaf-labeled pedigrees. The second is the pedigree edit distance problem, for which we present (1) several algorithms that are fast and exact in various special cases, and (2) a general, randomized heuristic algorithm. In the negative direction, we first prove that the pedigree isomorphism problem is as hard as the general graph isomorphism problem, and that the sub-pedigree isomorphism problem is NP-hard. We then show that the pedigree edit distance problem is APX-hard in general and NP-hard on leaf-labeled pedigrees. We use simulated pedigrees to compare our edit-distance algorithms to each other as well as to a branch-and-bound algorithm that always finds an optimal solution. © 2012, Mary Ann Liebert, Inc."
|
|
|
Reza Hassanzadeh,
Changiz Eslahchi and
Wing-Kin Sung. Constructing phylogenetic supernetworks based on simulated annealing. In MPE, Vol. 63(3):738-744, 2012. Keywords: abstract network, from unrooted trees, heuristic, phylogenetic network, phylogeny, Program SNSA, reconstruction, simulated annealing, software, split network. Note: http://dx.doi.org/10.1016/j.ympev.2012.02.009.
Toggle abstract
"Different partial phylogenetic trees can be derived from different sources of evidence and different methods. One important problem is to summarize these partial phylogenetic trees using a supernetwork. We propose a novel simulated annealing based method called SNSA which uses an optimization function to produce a simple network that still retains a great deal of phylogenetic information. We report the performance of this new method on real and simulated datasets. © 2012 Elsevier Inc."
|
|
|
Leo van Iersel and
Simone Linz. A quadratic kernel for computing the hybridization number of multiple trees. In IPL, Vol. 113:318-323, 2013. Keywords: explicit network, FPT, from rooted trees, kernelization, minimum number, phylogenetic network, phylogeny, Program Clustistic, Program MaafB, Program PIRN, reconstruction. Note: http://arxiv.org/abs/1203.4067, poster.
Toggle abstract
"It has recently been shown that the NP-hard problem of calculating the minimum number of hybridization events that is needed to explain a set of rooted binary phylogenetic trees by means of a hybridization network is fixed-parameter tractable if an instance of the problem consists of precisely two such trees. In this paper, we show that this problem remains fixed-parameter tractable for an arbitrarily large set of rooted binary phylogenetic trees. In particular, we present a quadratic kernel. © 2013 Elsevier B.V."
|
|
|
|
|
Tetsuo Asano,
Jesper Jansson,
Kunihiko Sadakane,
Ryuhei Uehara and
Gabriel Valiente. Faster computation of the Robinson–Foulds distance between phylogenetic networks. In Information Sciences, Vol. 197:77-90, 2012. Keywords: distance between networks, explicit network, level k phylogenetic network, phylogenetic network, polynomial, spread.
Toggle abstract
"The Robinson-Foulds distance, a widely used metric for comparing phylogenetic trees, has recently been generalized to phylogenetic networks. Given two phylogenetic networks N 1, N 2 with n leaf labels and at most m nodes and e edges each, the Robinson-Foulds distance measures the number of clusters of descendant leaves not shared by N 1 and N 2. The fastest known algorithm for computing the Robinson-Foulds distance between N 1 and N 2 runs in O(me) time. In this paper, we improve the time complexity to O(ne/log n) for general phylogenetic networks and O(nm/log n) for general phylogenetic networks with bounded degree (assuming the word RAM model with a word length of ⌈logn⌉ bits), and to optimal O(m) time for leaf-outerplanar networks as well as optimal O(n) time for level-1 phylogenetic networks (that is, galled-trees). We also introduce the natural concept of the minimum spread of a phylogenetic network and show how the running time of our new algorithm depends on this parameter. As an example, we prove that the minimum spread of a level-k network is at most k + 1, which implies that for one level-1 and one level-k phylogenetic network, our algorithm runs in O((k + 1)e) time. © 2012 Elsevier Inc. All rights reserved."
|
|
|
Hadi Poormohammadi,
Changiz Eslahchi and
Ruzbeh Tusserkani. TripNet: A Method for Constructing Rooted Phylogenetic Networks from Rooted Triplets. In PLoS ONE, Vol. 9(9):e106531, 2014. Keywords: explicit network, from triplets, heuristic, level k phylogenetic network, phylogenetic network, phylogeny, Program TripNet, reconstruction, software. Note: http://arxiv.org/abs/1201.3722.
Toggle abstract
"The problem of constructing an optimal rooted phylogenetic network from an arbitrary set of rooted triplets is an NP-hard problem. In this paper, we present a heuristic algorithm called TripNet, which tries to construct a rooted phylogenetic network with the minimum number of reticulation nodes from an arbitrary set of rooted triplets. Despite of current methods that work for dense set of rooted triplets, a key innovation is the applicability of TripNet to non-dense set of rooted triplets. We prove some theorems to clarify the performance of the algorithm. To demonstrate the efficiency of TripNet, we compared TripNet with SIMPLISTIC. It is the only available software which has the ability to return some rooted phylogenetic network consistent with a given dense set of rooted triplets. But the results show that for complex networks with high levels, the SIMPLISTIC running time increased abruptly. However in all cases TripNet outputs an appropriate rooted phylogenetic network in an acceptable time. Also we tetsed TripNet on the Yeast data. The results show that Both TripNet and optimal networks have the same clustering and TripNet produced a level-3 network which contains only one more reticulation node than the optimal network."
|
|
|
Chris Whidden,
Robert G. Beiko and
Norbert Zeh. Fixed-Parameter Algorithms for Maximum Agreement Forests. In SICOMP, Vol. 42(4):1431-1466, 2013. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, SPR distance. Note: http://arxiv.org/abs/1108.2664, slides.
Toggle abstract
"We present new and improved fixed-parameter algorithms for computing maximum agreement forests of pairs of rooted binary phylogenetic trees. The size of such a forest for two trees corresponds to their subtree prune-and-regraft distance and, if the agreement forest is acyclic, to their hybridization number. These distance measures are essential tools for understanding reticulate evolution. Our algorithm for computing maximum acyclic agreement forests is the first depth-bounded search algorithm for this problem. Our algorithms substantially outperform the best previous algorithms for these problems. © 2013 Society for Industrial and Applied Mathematics."
|
|
|
Stefan Grünewald,
Andreas Spillner,
Sarah Bastkowski,
Anja Bögershausen and
Vincent Moulton. SuperQ: Computing Supernetworks from Quartets. In TCBB, Vol. 10(1):151-160, 2013. Keywords: abstract network, circular split system, from quartets, heuristic, phylogenetic network, phylogeny, Program QNet, Program SplitsTree, Program SuperQ, software, split network.
Toggle abstract
"Supertrees are a commonly used tool in phylogenetics to summarize collections of partial phylogenetic trees. As a generalization of supertrees, phylogenetic supernetworks allow, in addition, the visual representation of conflict between the trees that is not possible to observe with a single tree. Here, we introduce SuperQ, a new method for constructing such supernetworks (SuperQ is freely available at >www.uea.ac.uk/computing/superq.). It works by first breaking the input trees into quartet trees, and then stitching these together to form a special kind of phylogenetic network, called a split network. This stitching process is performed using an adaptation of the QNet method for split network reconstruction employing a novel approach to use the branch lengths from the input trees to estimate the branch lengths in the resulting network. Compared with previous supernetwork methods, SuperQ has the advantage of producing a planar network. We compare the performance of SuperQ to the Z-closure and Q-imputation supernetwork methods, and also present an analysis of some published data sets as an illustration of its applicability. © 2004-2012 IEEE."
|
|
|
Lavanya Kannan and
Ward C Wheeler. Maximum Parsimony on Phylogenetic Networks. In ALMOB, Vol. 7:9, 2012. Keywords: dynamic programming, explicit network, from sequences, heuristic, parsimony, phylogenetic network, phylogeny. Note: http://dx.doi.org/10.1186/1748-7188-7-9.
"Background: Phylogenetic networks are generalizations of phylogenetic trees that are used to model evolutionary events in various contexts. Several different methods and criteria have been introduced for reconstructing phylogenetic trees. Maximum Parsimony is a character-based approach that infers a phylogenetic tree by minimizing the total number of evolutionary steps required to explain a given set of data assigned on the leaves. Exact solutions for optimizing parsimony scores on phylogenetic trees have been introduced in the past. Results: In this paper, we define the parsimony score on networks as the sum of the substitution costs along all the edges of the network; and show that certain well-known algorithms that calculate the optimum parsimony score on trees, such as the Sankoff and Fitch algorithms, extend naturally for networks, barring conflicting assignments at the reticulate vertices. We provide heuristics for finding the optimum parsimony scores on networks. Our algorithms can be applied for any cost matrix that may contain unequal substitution costs of transforming between different characters along different edges of the network. We analyzed this for experimental data on 10 leaves or fewer with at most 2 reticulations and found that for almost all networks, the bounds returned by the heuristics matched with the exhaustively determined optimum parsimony scores. Conclusion: The parsimony score we define here does not directly reflect the cost of the best tree in the network that displays the evolution of the character. However, when searching for the most parsimonious network that describes a collection of characters, it becomes necessary to add additional cost considerations to prefer simpler structures, such as trees over networks.
The parsimony score on a network that we describe here takes into account the substitution costs along the additional edges incident on each reticulate vertex, in addition to the substitution costs along the other edges which are common to all the branching patterns introduced by the reticulate vertices. Thus the score contains an in-built cost for the number of reticulate vertices in the network, and would provide a criterion that is comparable among all networks. Although the problem of finding the parsimony score on the network is believed to be computationally hard to solve, heuristics such as the ones described here would be beneficial in our efforts to find a most parsimonious network. © 2012 Kannan and Wheeler; licensee BioMed Central Ltd."
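The abstract above refers to the Fitch algorithm for computing the optimum parsimony score on a tree. As a minimal sketch of that tree case only (not the network extension or the heuristics of the paper), the following assumes unit substitution costs, a rooted binary tree written as nested pairs, and one character state per leaf; the function name `fitch` and the example data are hypothetical.

```python
def fitch(tree, states):
    """Return (candidate_states, cost) for a rooted binary tree.

    tree:   a leaf name (str) or a pair (left, right) of subtrees.
    states: dict mapping each leaf name to its observed character state.
    Assumes unit cost for every substitution (plain Fitch parsimony).
    """
    if isinstance(tree, str):
        # A leaf carries exactly its observed state and costs nothing.
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra substitution.
        return common, left_cost + right_cost
    # Children disagree: keep all candidates and pay one substitution.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical example: five leaves, one character.
tree = (("a", "b"), ("c", ("d", "e")))
states = {"a": "A", "b": "C", "c": "A", "d": "G", "e": "G"}
root_states, score = fitch(tree, states)
print(root_states, score)
```

Handling unequal substitution costs, as the abstract describes, would need a Sankoff-style cost table per node instead of state sets.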
|
|
|
Alix Boc,
Alpha B. Diallo and
Vladimir Makarenkov. T-REX: a web server for inferring, validating and visualizing phylogenetic trees and networks. In NAR, Vol. 40(W1):W573-W579, 2012. Keywords: from rooted trees, from species tree, lateral gene transfer, phylogenetic network, phylogeny, Program T REX, reconstruction, reticulogram, software. Note: http://dx.doi.org/10.1093/nar/gks485.
"T-REX (Tree and reticulogram REConstruction) is a web server dedicated to the reconstruction of phylogenetic trees, reticulation networks and to the inference of horizontal gene transfer (HGT) events. T-REX includes several popular bioinformatics applications such as MUSCLE, MAFFT, Neighbor Joining, NINJA, BioNJ, PhyML, RAxML, random phylogenetic tree generator and some well-known sequence-to-distance transformation models. It also comprises fast and effective methods for inferring phylogenetic trees from complete and incomplete distance matrices as well as for reconstructing reticulograms and HGT networks, including the detection and validation of complete and partial gene transfers, inference of consensus HGT scenarios and interactive HGT identification, developed by the authors. The included methods allow for validating and visualizing phylogenetic trees and networks which can be built from distance or sequence data. The web server is available at: www.trex.uqam.ca. © 2012 The Author(s)."
|
|
|
Daniel H. Huson and
Celine Scornavacca. Dendroscope 3: An Interactive Tool for Rooted Phylogenetic Trees and Networks. In Systematic Biology, Vol. 61(6):1061-1067, 2012. Keywords: from rooted trees, from triplets, phylogenetic network, phylogeny, Program Dendroscope, reconstruction, software, visualization.
"Dendroscope 3 is a new program for working with rooted phylogenetic trees and networks. It provides a number of methods for drawing and comparing rooted phylogenetic networks, and for computing them from rooted trees. The program can be used interactively or in command-line mode. The program is written in Java, use of the software is free, and installers for all 3 major operating systems can be downloaded from www.dendroscope.org. [Phylogenetic trees; phylogenetic networks; software.] © 2012 The Author(s)."
|
|
|
Zhi-Zhong Chen,
Lusheng Wang and
Satoshi Yamanaka. A fast tool for minimum hybridization networks. In BMCB, Vol. 13:155, 2012. Keywords: agreement forest, explicit network, from rooted trees, phylogenetic network, phylogeny, Program FastHN, reconstruction, software. Note: http://dx.doi.org/10.1186/1471-2105-13-155.
"Background: Due to hybridization events in evolution, studying two different genes of a set of species may yield two related but different phylogenetic trees for the set of species. In this case, we want to combine the two phylogenetic trees into a hybridization network with the fewest hybridization events. This leads to three computational problems, namely, the problem of computing the minimum size of a hybridization network, the problem of constructing one minimum hybridization network, and the problem of enumerating a representative set of minimum hybridization networks. The previously best software tools for these problems (namely, Chen and Wang's HybridNet and Albrecht et al.'s Dendroscope 3) run very slowly for large instances that cannot be reduced to relatively small instances. Indeed, when the minimum size of a hybridization network of two given trees is larger than 23 and the problem for the trees cannot be reduced to relatively smaller independent subproblems, then HybridNet almost always takes longer than 1 day and Dendroscope 3 often fails to complete. Thus, a faster software tool for the problems is in need. Results: We develop a software tool in ANSI C, named FastHN, for the following problems: Computing the minimum size of a hybridization network, constructing one minimum hybridization network, and enumerating a representative set of minimum hybridization networks. We obtain FastHN by refining HybridNet with three ideas. The first idea is to preprocess the input trees so that the trees become smaller or the problem becomes to solve two or more relatively smaller independent subproblems. The second idea is to use a fast algorithm for computing the rSPR distance of two given phylogenetic trees to cut more branches of the search tree in the exhaustive-search stage of the algorithm.
The third idea is that during the exhaustive-search stage of the algorithm, we find two sibling leaves in one of the two forests (obtained from the given trees by cutting some edges) such that they are as far as possible in the other forest. As the result, FastHN always runs much faster than HybridNet. Unlike Dendroscope 3, FastHN is a single-threaded program. Despite this disadvantage, our experimental data shows that FastHN runs substantially faster than the multi-threaded Dendroscope 3 on a PC with multiple cores. Indeed, FastHN can finish within 16 minutes (on average on a Windows-7 (x64) desktop PC with i7-2600 CPU) even if the minimum size of a hybridization network of two given trees is about 25, the trees each have 100 leaves, and the problem for the input trees cannot be reduced to two or more independent subproblems via cluster reductions. It is also worth mentioning that like HybridNet, FastHN does not use much memory (indeed, the amount of memory is at most quadratic in the input size). In contrast, Dendroscope 3 uses a huge amount of memory. Executables of FastHN for Windows XP (x86), Windows 7 (x64), Linux, and Mac OS are available (see the Results and discussion section for details). Conclusions: For both biological datasets and simulated datasets, our experimental results show that FastHN runs substantially faster than HybridNet and Dendroscope 3. The superiority of FastHN in speed over the previous tools becomes more significant as the hybridization number becomes larger. In addition, FastHN uses much less memory than Dendroscope 3 and uses the same amount of memory as HybridNet. © 2012 Chen et al.; licensee BioMed Central Ltd."
|
|
|
Michel Habib and
Thu-Hien To. Constructing a Minimum Phylogenetic Network from a Dense Triplet Set. In JBCB, Vol. 10(5):1250013, 2012. Keywords: explicit network, from triplets, level k phylogenetic network, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://arxiv.org/abs/1103.2266.
"For a given set L of species and a set T of triplets on L, we seek to construct a phylogenetic network which is consistent with T i.e. which represents all triplets of T. The level of a network is defined as the maximum number of hybrid vertices in its biconnected components. When T is dense, there exist polynomial time algorithms to construct level-0, 1 and 2 networks (Aho et al., 1981; Jansson, Nguyen and Sung, 2006; Jansson and Sung, 2006; Iersel et al., 2009). For higher levels, partial answers were obtained in the paper by Iersel and Kelk (2008), with a polynomial time algorithm for simple networks. In this paper, we detail the first complete answer for the general case, solving a problem proposed in Jansson and Sung (2006) and Iersel et al. (2009). For any k fixed, it is possible to construct a level-k network having the minimum number of hybrid vertices and consistent with T, if there is any, in time $O(|T|^{k+1} n^{\lfloor 4k/3 \rfloor + 1})$. © 2012 Imperial College Press."
|
|
|
Ruogu Sheng and
Sergey Bereg. Approximating Metrics with Planar Boundary-Labeled Phylogenetic Networks. In JBCB, Vol. 10(6):1250017, 2012. Keywords: abstract network, from distances, phylogenetic network, phylogeny, reconstruction.
"Phylogenetic networks are useful for visualizing evolutionary relationships between species with reticulate events such as hybridizations and horizontal gene transfers. In this paper, we consider the problem of constructing undirected phylogenetic networks that (1) are planar graphs and (2) admit embeddings in the plane where the vertices labeling all taxa are on the boundary of the network. We develop a new algorithm for constructing phylogenetic networks satisfying these constraints. First, we show that only approximate networks can be constructed for some distance matrices with at least five taxa. Then we prove that any five-point metric can be represented approximately by a planar boundary-labeled network with guaranteed fit value of 94.79. We extend the networks constructed in the proof to design an algorithm for computing planar boundary-labeled networks for any number of taxa. © 2012 Imperial College Press."
|
|
|
Teresa Piovesan and
Steven Kelk. A simple fixed parameter tractable algorithm for computing the hybridization number of two (not necessarily binary) trees. In TCBB, Vol. 10(1):18-25, 2013. Keywords: FPT, from rooted trees, phylogenetic network, phylogeny, Program TerminusEst, reconstruction. Note: http://arxiv.org/abs/1207.6090.
"Here, we present a new fixed parameter tractable algorithm to compute the hybridization number $r$ of two rooted, not necessarily binary phylogenetic trees on taxon set $X$ in time $(6^r r!) \cdot \mathrm{poly}(n)$, where $n = |X|$. The novelty of this approach is its use of terminals, which are maximal elements of a natural partial order on $X$, and several insights from the softwired clusters literature. This yields a surprisingly simple and practical bounded-search algorithm and offers an alternative perspective on the underlying combinatorial structure of the hybridization number problem. © 2004-2012 IEEE."
|
|
|
|
|
Joseph K. Pickrell and
Jonathan K. Pritchard. Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data. In PLoS Genetics, Vol. 8(11):e1002967, 2012. Keywords: explicit network, heuristic, likelihood, phylogenetic network, phylogeny, population genetics, Program TreeMix. Note: http://dx.doi.org/10.1371/journal.pgen.1002967.
"Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In our model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data, we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and "ancient" Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com. © 2012 Pickrell, Pritchard."
|
|
|
Nick J. Patterson,
Priya Moorjani,
Yontao Luo,
Swapan Mallick,
Nadin Rohland,
Yiping Zhan,
Teri Genschoreck,
Teresa Webster and
David Reich. Ancient Admixture in Human History. In Genetics, Vol. 192(3):1065-1093, 2012. Keywords: explicit network, phylogenetic network, phylogeny, population genetics, Program AdmixTools. Note: http://genetics.med.harvard.edu/reich/Reich_Lab/Welcome_files/2012_Patterson_AncientAdmixture_Genetics.pdf.
"Population mixture is an important process in biology. We present a suite of methods for learning about population mixtures, implemented in a software package called ADMIXTOOLS, that support formal tests for whether mixture occurred and make it possible to infer proportions and dates of mixture. We also describe the development of a new single nucleotide polymorphism (SNP) array consisting of 629,433 sites with clearly documented ascertainment that was specifically designed for population genetic analyses and that we genotyped in 934 individuals from 53 diverse populations. To illustrate the methods, we give a number of examples that provide new insights about the history of human admixture. The most striking finding is a clear signal of admixture into northern Europe, with one ancestral population related to present-day Basques and Sardinians and the other related to present-day populations of northeast Asia and the Americas. This likely reflects a history of admixture between Neolithic migrants and the indigenous Mesolithic population of Europe, consistent with recent analyses of ancient bones from Sweden and the sequencing of the genome of the Tyrolean "Iceman." © 2012 by the Genetics Society of America."
|
|
|
Leo van Iersel and
Vincent Moulton. Trinets encode tree-child and level-2 phylogenetic networks. In JOMB, Vol. 68(7):1707-1729, 2014. Keywords: explicit network, from subnetworks, from trinets, level k phylogenetic network, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1210.0362.
"Phylogenetic networks generalize evolutionary trees, and are commonly used to represent evolutionary histories of species that undergo reticulate evolutionary processes such as hybridization, recombination and lateral gene transfer. Recently, there has been great interest in trying to develop methods to construct rooted phylogenetic networks from triplets, that is rooted trees on three species. However, although triplets determine or encode rooted phylogenetic trees, they do not in general encode rooted phylogenetic networks, which is a potential issue for any such method. Motivated by this fact, Huber and Moulton recently introduced trinets as a natural extension of rooted triplets to networks. In particular, they showed that level-1 phylogenetic networks are encoded by their trinets, and also conjectured that all "recoverable" rooted phylogenetic networks are encoded by their trinets. Here we prove that recoverable binary level-2 networks and binary tree-child networks are also encoded by their trinets. To do this we prove two decomposition theorems based on trinets which hold for all recoverable binary rooted phylogenetic networks. Our results provide some additional evidence in support of the conjecture that trinets encode all recoverable rooted phylogenetic networks, and could also lead to new approaches to construct phylogenetic networks from trinets. © 2013 Springer-Verlag Berlin Heidelberg."
|
|
|
Stephen J. Willson. Reconstruction of certain phylogenetic networks from their tree-average distances. In BMB, Vol. 75(10):1840-1878, 2013. Keywords: explicit network, from distances, galled tree, normal network, phylogenetic network, phylogeny, unicyclic network. Note: http://www.public.iastate.edu/~swillson/Tree-AverageReconPaper9.pdf.
"Trees are commonly utilized to describe the evolutionary history of a collection of biological species, in which case the trees are called phylogenetic trees. Often these are reconstructed from data by making use of distances between extant species corresponding to the leaves of the tree. Because of increased recognition of the possibility of hybridization events, more attention is being given to the use of phylogenetic networks that are not necessarily trees. This paper describes the reconstruction of certain such networks from the tree-average distances between the leaves. For a certain class of phylogenetic networks, a polynomial-time method is presented to reconstruct the network from the tree-average distances. The method is proved to work if there is a single reticulation cycle. © 2013 Society for Mathematical Biology."
|
|
|
Peter J. Humphries,
Simone Linz and
Charles Semple. On the complexity of computing the temporal hybridization number for two phylogenies. In DAM, Vol. 161:871-880, 2013. Keywords: agreement forest, APX hard, characterization, from rooted trees, hybridization, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://ab.inf.uni-tuebingen.de/people/linz/publications/TAFapx.pdf.
"Phylogenetic networks are now frequently used to explain the evolutionary history of a set of species for which a collection of gene trees, reconstructed from genetic material of different parts of the species' genomes, reveal inconsistencies. However, in the context of hybridization, the reconstructed networks are often not temporal. If a hybridization network is temporal, then it satisfies the time constraint of instantaneously occurring hybridization events; i.e. all species that are involved in such an event coexist in time. Furthermore, although a collection of phylogenetic trees can often be merged into a hybridization network that is temporal, many algorithms do not necessarily find such a network since their primary optimization objective is to minimize the number of hybridization events. In this paper, we present a characterization for when two rooted binary phylogenetic trees admit a temporal hybridization network. Furthermore, we show that the underlying optimization problem is APX-hard and, therefore, NP-hard. Thus, unless P=NP, it is unlikely that there are efficient algorithms for either computing an exact solution or approximating it within a ratio arbitrarily close to one. © 2012 Elsevier B.V. All rights reserved."
|
|
|
|
|
Mareike Fischer,
Leo van Iersel,
Steven Kelk and
Celine Scornavacca. On Computing The Maximum Parsimony Score Of A Phylogenetic Network. In SIDMA, Vol. 29(1):559-585, 2015. Keywords: APX hard, cluster containment, explicit network, FPT, from network, from sequences, integer linear programming, level k phylogenetic network, NP complete, parsimony, phylogenetic network, phylogeny, polynomial, Program MPNet, reconstruction, software. Note: http://arxiv.org/abs/1302.2430.
|
|
|
Fenglou Mao,
David Williams,
Olga Zhaxybayeva,
Maria S. Poptsova,
Pascal Lapierre,
J. Peter Gogarten and
Ying Xu. Quartet decomposition server: a platform for analyzing phylogenetic trees. In BMCB, Vol. 13:123, 2012. Keywords: abstract network, from quartets, phylogenetic network, phylogeny, Program Quartet Decomposition, reconstruction, software, split network.
"Background: The frequent exchange of genetic material among prokaryotes means that extracting a majority or plurality phylogenetic signal from many gene families, and the identification of gene families that are in significant conflict with the plurality signal is a frequent task in comparative genomics, and especially in phylogenomic analyses. Decomposition of gene trees into embedded quartets (unrooted trees each with four taxa) is a convenient and statistically powerful technique to address this challenging problem. This approach was shown to be useful in several studies of completely sequenced microbial genomes. Results: We present here a web server that takes a collection of gene phylogenies, decomposes them into quartets, generates a Quartet Spectrum, and draws a split network. Users are also provided with various data download options for further analyses. Each gene phylogeny is to be represented by an assessment of phylogenetic information content, such as sets of trees reconstructed from bootstrap replicates or sampled from a posterior distribution. The Quartet Decomposition server is accessible at http://quartets.uga.edu. Conclusions: The Quartet Decomposition server presented here provides a convenient means to perform Quartet Decomposition analyses and will empower users to find statistically supported phylogenetic conflicts. © 2012 Mao et al.; licensee BioMed Central Ltd."
|
|
|
Jialiang Yang,
Stefan Grünewald and
Xiu-Feng Wan. Quartet-Net: A Quartet Based Method to Reconstruct Phylogenetic Networks. In MBE, Vol. 30(5):1206-1217, 2013. Keywords: from quartets, phylogenetic network, phylogeny, Program QuartetNet, reconstruction.
"Phylogenetic networks can model reticulate evolutionary events such as hybridization, recombination, and horizontal gene transfer. However, reconstructing such networks is not trivial. Popular character-based methods are computationally inefficient, whereas distance-based methods cannot guarantee reconstruction accuracy because pairwise genetic distances only reflect partial information about a reticulate phylogeny. To balance accuracy and computational efficiency, here we introduce a quartet-based method to construct a phylogenetic network from a multiple sequence alignment. Unlike distances that only reflect the relationship between a pair of taxa, quartets contain information on the relationships among four taxa; these quartets provide adequate capacity to infer a more accurate phylogenetic network. In applications to simulated and biological data sets, we demonstrate that this novel method is robust and effective in reconstructing reticulate evolutionary events and it has the potential to infer more accurate phylogenetic distances than other conventional phylogenetic network construction methods such as Neighbor-Joining, Neighbor-Net, and Split Decomposition. This method can be used in constructing phylogenetic networks from simple evolutionary events involving a few reticulate events to complex evolutionary histories involving a large number of reticulate events. A software called Quartet-Net is implemented and available at http://sysbio.cvm.msstate.edu/QuartetNet/. © 2013 The Author."
|
|
|
Thi-Hau Nguyen,
Vincent Ranwez,
Stéphanie Pointet,
Anne-Muriel Chifolleau Arigon,
Jean-Philippe Doyon and
Vincent Berry. Reconciliation and local gene tree rearrangement can be of mutual profit. In ALMOB, Vol. 8(12), 2013. Keywords: duplication, explicit network, from rooted trees, heuristic, lateral gene transfer, phylogenetic network, phylogeny, Program Mowgli, Program MowgliNNI, Program Prunier, reconstruction, software.
"Background: Reconciliation methods compare gene trees and species trees to recover evolutionary events such as duplications, transfers and losses explaining the history and composition of genomes. It is well-known that gene trees inferred from molecular sequences can be partly erroneous due to incorrect sequence alignments as well as phylogenetic reconstruction artifacts such as long branch attraction. In practice, this leads reconciliation methods to overestimate the number of evolutionary events. Several methods have been proposed to circumvent this problem, by collapsing the unsupported edges and then resolving the obtained multifurcating nodes, or by directly rearranging the binary gene trees. Yet these methods have been defined for models of evolution accounting only for duplications and losses, i.e. they cannot be applied to handle prokaryotic gene families. Results: We propose a reconciliation method accounting for gene duplications, losses and horizontal transfers, that specifically takes into account the uncertainties in gene trees by rearranging their weakly supported edges. Rearrangements are performed on edges having a low confidence value, and are accepted whenever they improve the reconciliation cost. We prove useful properties on the dynamic programming matrix used to compute reconciliations, which allows speeding up the tree space exploration when rearrangements are generated by Nearest Neighbor Interchanges (NNI) edit operations. Experiments on synthetic data show that gene trees modified by such NNI rearrangements are closer to the correct simulated trees and lead to better event predictions on average. Experiments on real data demonstrate that the proposed method leads to a decrease in the reconciliation cost and the number of inferred events.
Finally on a dataset of 30 k gene families, this reconciliation method shows a ranking of prokaryotic phyla by transfer rates identical to that proposed by a different approach dedicated to transfer detection [BMCBIOINF 11:324, 2010, PNAS 109(13):4962-4967, 2012]. Conclusions: Prokaryotic gene trees can now be reconciled with their species phylogeny while accounting for the uncertainty of the gene tree. More accurate and more precise reconciliations are obtained with respect to previous parsimony algorithms not accounting for such uncertainties [LNCS 6398:93-108, 2010, BIOINF 28(12): i283-i291, 2012]. Software implementing the method is freely available at http://www.atgc-montpellier.fr/Mowgli/. © 2013 Nguyen et al.; licensee BioMed Central Ltd."
|
|
|
|
|
Mukul S. Bansal,
Guy Banay,
Timothy J. Harlow,
J. Peter Gogarten and
Ron Shamir. Systematic inference of highways of horizontal gene transfer in prokaryotes. In BIO, Vol. 29(5):571-579, 2013. Keywords: duplication, explicit network, from species tree, from unrooted trees, lateral gene transfer, phylogenetic network, phylogeny, Program HiDe, Program RANGER-DTL, reconstruction. Note: http://people.csail.mit.edu/mukul/Bansal_Highways_Bioinformatics_2013.pdf.
|
|
|
|
|
Donovan H. Parks and
Robert G. Beiko. Measuring Community Similarity with Phylogenetic Networks. In MBE, Vol. 29(12):3947-3958, 2012. Keywords: abstract network, diversity, phylogenetic network, phylogeny, split network. Note: poster available at http://dparks.wdfiles.com/local--files/publications/SMBE_BetaDiversity_2011.pdf.
"Environmental drivers of biodiversity can be identified by relating patterns of community similarity to ecological factors. Community variation has traditionally been assessed by considering changes in species composition and more recently by incorporating phylogenetic information to account for the relative similarity of taxa. Here, we describe how an important class of measures including Bray-Curtis, Canberra, and UniFrac can be extended to allow community variation to be computed on a phylogenetic network. We focus on phylogenetic split systems, networks that are produced by the widely used median network and neighbor-net methods, which can represent incongruence in the evolutionary history of a set of taxa. Calculating β diversity over a split system provides a measure of community similarity averaged over uncertainty or conflict in the available phylogenetic signal. Our freely available software, Network Diversity, provides 11 qualitative (presence-absence, unweighted) and 14 quantitative (weighted) network-based measures of community similarity that model different aspects of community richness and evenness. We demonstrate the broad applicability of network-based diversity approaches by applying them to three distinct data sets: pneumococcal isolates from distinct geographic regions, human mitochondrial DNA data from the Indonesian island of Nias, and proteorhodopsin sequences from the Sargasso and Mediterranean Seas. Our results show that major expected patterns of variation for these data sets are recovered using network-based measures, which indicates that these patterns are robust to phylogenetic uncertainty and conflict. Nonetheless, network-based measures of community similarity can differ substantially from measures ignoring phylogenetic relationships or from tree-based measures when incongruent signals are present in the underlying data. 
Network-based measures provide a methodology for assessing the robustness of β-diversity results in light of incongruent phylogenetic signal and allow β diversity to be calculated over widely used network structures such as median networks. © 2012 The Author 2012."
|
|
|
Eric Bapteste,
Leo van Iersel,
Axel Janke,
Scott Kelchner,
Steven Kelk,
James O. McInerney,
David A. Morrison,
Luay Nakhleh,
Mike Steel,
Leen Stougie and
James B. Whitfield. Networks: expanding evolutionary thinking. In Trends in Genetics, Vol. 29(8):439-441, 2013. Keywords: abstract network, explicit network, phylogenetic network, phylogeny, reconstruction. Note: http://bioinf.nuim.ie/wp-content/uploads/2013/06/Bapteste-TiG-2013.pdf.
"Networks allow the investigation of evolutionary relationships that do not fit a tree model. They are becoming a leading tool for describing the evolutionary relationships between organisms, given the comparative complexities among genomes. © 2013 Elsevier Ltd."
|
|
|
Yun Yu,
R. Matthew Barnett and
Luay Nakhleh. Parsimonious Inference of Hybridization in the Presence of Incomplete Lineage Sorting. In Systematic Biology, Vol. 62(5):738-751, 2013. Keywords: from network, from rooted trees, hybridization, lineage sorting, parsimony, phylogenetic network, phylogeny, Program PhyloNet, reconstruction.
"Hybridization plays an important evolutionary role in several groups of organisms. A phylogenetic approach to detect hybridization entails sequencing multiple loci across the genomes of a group of species of interest, reconstructing their gene trees, and taking their differences as indicators of hybridization. However, methods that follow this approach mostly ignore population effects, such as incomplete lineage sorting (ILS). Given that hybridization occurs between closely related organisms, ILS may very well be at play and, hence, must be accounted for in the analysis framework. To address this issue, we present a parsimony criterion for reconciling gene trees within the branches of a phylogenetic network, and a local search heuristic for inferring phylogenetic networks from collections of gene-tree topologies under this criterion. This framework enables phylogenetic analyses while accounting for both hybridization and ILS. Further, we propose two techniques for incorporating information about uncertainty in gene-tree estimates. Our simulation studies demonstrate the good performance of our framework in terms of identifying the location of hybridization events, as well as estimating the proportions of genes that underwent hybridization. Also, our framework shows good performance in terms of efficiency on handling large data sets in our experiments. Further, in analysing a yeast data set, we demonstrate issues that arise when analysing real data sets. Although a probabilistic approach was recently introduced for this problem, and although parsimonious reconciliations have accuracy issues under certain settings, our parsimony framework provides a much more computationally efficient technique for this type of analysis. Our framework now allows for genome-wide scans for hybridization, while also accounting for ILS. [Phylogenetic networks; hybridization; incomplete lineage sorting; coalescent; multi-labeled trees.] © 2013 The Author(s). All rights reserved."
|
|
|
Juan Wang,
Maozu Guo,
Xiaoyan Liu,
Yang Liu,
Chunyu Wang,
Linlin Xing and
Kai Che. LNETWORK: An Efficient and Effective Method for Constructing Phylogenetic Networks. In BIO, Vol. 29(18):2269-2276, 2013. Keywords: explicit network, from rooted trees, phylogenetic network, phylogeny, Program LNetwork, reconstruction, software.
"Motivation: The evolutionary history of species is traditionally represented with a rooted phylogenetic tree. Each tree comprises a set of clusters, i.e. subsets of the species that are descended from a common ancestor. When rooted phylogenetic trees are built from several different datasets (e.g. from different genes), the clusters are often conflicting. These conflicting clusters cannot be expressed as a simple phylogenetic tree; however, they can be expressed in a phylogenetic network. Phylogenetic networks are a generalization of phylogenetic trees that can account for processes such as hybridization, horizontal gene transfer and recombination, which are difficult to represent in standard tree-like models of evolutionary histories. There is currently a large body of research aimed at developing appropriate methods for constructing phylogenetic networks from cluster sets. The Cass algorithm can construct a much simpler network than other available methods, but is extremely slow for large datasets or for datasets that need lots of reticulate nodes. The networks constructed by Cass are also greatly dependent on the order of input data, i.e. it generally derives different phylogenetic networks for the same dataset when different input orders are used.Results: In this study, we introduce an improved Cass algorithm, Lnetwork, which can construct a phylogenetic network for a given set of clusters. We show that Lnetwork is significantly faster than Cass and effectively weakens the influence of input data order. Moreover, we show that Lnetwork can construct a much simpler network than most of the other available methods. © The Author 2013."
|
|
|
Juan Wang,
Maozu Guo,
Linlin Xing,
Kai Che,
Xiaoyan Liu and
Chunyu Wang. BIMLR: A Method for Constructing Rooted Phylogenetic Networks from Rooted Phylogenetic Trees. In Gene, Vol. 527(1):344-351, 2013. Keywords: explicit network, from clusters, from rooted trees, phylogenetic network, phylogeny, Program BIMLR, Program Dendroscope, reconstruction, software.
"Rooted phylogenetic trees constructed from different datasets (e.g. from different genes) are often conflicting with one another, i.e. they cannot be integrated into a single phylogenetic tree. Phylogenetic networks have become an important tool in molecular evolution, and rooted phylogenetic networks are able to represent conflicting rooted phylogenetic trees. Hence, the development of appropriate methods to compute rooted phylogenetic networks from rooted phylogenetic trees has attracted considerable research interest of late. The CASS algorithm proposed by van Iersel et al. is able to construct much simpler networks than other available methods, but it is extremely slow, and the networks it constructs are dependent on the order of the input data. Here, we introduce an improved CASS algorithm, BIMLR. We show that BIMLR is faster than CASS and less dependent on the input data order. Moreover, BIMLR is able to construct much simpler networks than almost all other methods. BIMLR is available at http://nclab.hit.edu.cn/wangjuan/BIMLR/. © 2013 Elsevier B.V."
|
|
|
Zhi-Zhong Chen and
Lusheng Wang. An Ultrafast Tool for Minimum Reticulate Networks. In JCB, Vol. 20(1):38-41, 2013. Keywords: agreement forest, explicit network, from rooted trees, phylogenetic network, phylogeny, Program ultra-Net, reconstruction. Note: http://www.cs.cityu.edu.hk/~lwang/research/jcb2013.pdf.
"Due to hybridization events in evolution, studying different genes of a set of species may yield two or more related but different phylogenetic trees for the set of species. In this case, we want to combine the trees into a reticulate network with the fewest hybridization events. In this article, we develop a software tool (named UltraNet) for several fundamental problems related to the construction of minimum reticulate networks from two or more phylogenetic trees. Our experimental results show that UltraNet is much faster than all previous tools for these problems. © 2013 Mary Ann Liebert, Inc."
|
|
|
Anthony Labarre and
Sicco Verwer. Merging partially labelled trees: hardness and a declarative programming solution. In TCBB, Vol. 11(2):389-397, 2014. Keywords: abstract network, from unrooted trees, heuristic, NP complete, phylogenetic network, phylogeny, reconstruction. Note: https://hal-upec-upem.archives-ouvertes.fr/hal-00855669.
"Intraspecific studies often make use of haplotype networks instead of gene genealogies to represent the evolution of a set of genes. Cassens et al. proposed one such network reconstruction method, based on the global maximum parsimony principle, which was later recast by the first author of the present work as the problem of finding a minimum common supergraph of a set of t partially labelled trees. Although algorithms have been proposed for solving that problem on two graphs, the complexity of the general problem on trees remains unknown. In this paper, we show that the corresponding decision problem is NP-complete for t=3. We then propose a declarative programming approach to solving the problem to optimality in practice, as well as a heuristic approach, both based on the idpsystem, and assess the performance of both methods on randomly generated data. © 2004-2012 IEEE."
|
|
|
Peter J. Humphries,
Simone Linz and
Charles Semple. Cherry picking: a characterization of the temporal hybridization number for a set of phylogenies. In BMB, Vol. 75(10):1879-1890, 2013. Keywords: characterization, cherry-picking, from rooted trees, hybridization, NP complete, phylogenetic network, phylogeny, reconstruction, time consistent network. Note: http://ab.inf.uni-tuebingen.de/people/linz/publications/CPSpaper.pdf.
"Recently, we have shown that calculating the minimum-temporal-hybridization number for a set P of rooted binary phylogenetic trees is NP-hard and have characterized this minimum number when P consists of exactly two trees. In this paper, we give the first characterization of the problem for P being arbitrarily large. The characterization is in terms of cherries and the existence of a particular type of sequence. Furthermore, in an online appendix to the paper, we show that this new characterization can be used to show that computing the minimum-temporal hybridization number for two trees is fixed-parameter tractable. © 2013 Society for Mathematical Biology."
|
|
|
Alexey A. Morozov,
Yuri P. Galachyants and
Yelena V. Likhoshway. Inferring Phylogenetic Networks from Gene Order Data. In BMRI, Vol. 2013(503193):1-7, 2013. Keywords: abstract network, from distances, from gene order, NeighborNet, phylogenetic network, phylogeny, Program SplitsTree, reconstruction, split decomposition, split network.
"Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary), sets of trees, and distance matrices, but there are no methods to build them using the gene order data as an input. Here we describe several methods to build split networks from the gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from genome structures under study and used as an input for network construction algorithms. Three intermediates are used: set of jackknife trees, distance matrix, and binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and distance matrix (when used with Neighbor-Net algorithm). Binary encoding can also be useful, but only when the methods mentioned above cannot be used. © 2013 Alexey Anatolievich Morozov et al."
|
|
|
Celine Scornavacca,
Wojciech Paprotny,
Vincent Berry and
Vincent Ranwez. Representing a set of reconciliations in a compact way. In JBCB, Vol. 11(2):1250025, 2013. Keywords: duplication, explicit network, from network, from rooted trees, from species tree, phylogeny, Program GraphDTL, Program TERA, visualization. Note: http://hal-lirmm.ccsd.cnrs.fr/lirmm-00818801.
"Comparative genomic studies are often conducted by reconciliation analyses comparing gene and species trees. One of the issues with reconciliation approaches is that an exponential number of optimal scenarios is possible. The resulting complexity is masked by the fact that a majority of reconciliation software pick up a random optimal solution that is returned to the end-user. However, the alternative solutions should not be ignored since they tell different stories that parsimony considers as viable as the output solution. In this paper, we describe a polynomial space and time algorithm to build a minimum reconciliation graph-a graph that summarizes the set of all most parsimonious reconciliations. Amongst numerous applications, it is shown how this graph allows counting the number of non-equivalent most parsimonious reconciliations. © 2013 Imperial College Press."
|
|
|
Luay Nakhleh. Computational approaches to species phylogeny inference and gene tree reconciliation. In Trends in Ecology and Evolution, Vol. 28(12):719-728, 2013. Keywords: from rooted trees, from species tree, phylogenetic network, phylogeny, reconstruction, survey. Note: http://bioinfo.cs.rice.edu/sites/bioinfo.cs.rice.edu/files/TREE-Nakhleh13.pdf.
"An intricate relation exists between gene trees and species phylogenies, due to evolutionary processes that act on the genes within and across the branches of the species phylogeny. From an analytical perspective, gene trees serve as character states for inferring accurate species phylogenies, and species phylogenies serve as a backdrop against which gene trees are contrasted for elucidating evolutionary processes and parameters. In a 1997 paper, Maddison discussed this relation, reviewed the signatures left by three major evolutionary processes on the gene trees, and surveyed parsimony and likelihood criteria for utilizing these signatures to elucidate computationally this relation. Here, I review progress that has been made in developing computational methods for analyses under these two criteria, and survey remaining challenges. © 2013 Elsevier Ltd."
|
|
|
Thi-Hau Nguyen,
Vincent Ranwez,
Vincent Berry and
Celine Scornavacca. Support Measures to Estimate the Reliability of Evolutionary Events Predicted by Reconciliation Methods. In PLoS ONE, Vol. 8(10):e73667, 2013. Keywords: duplication, from rooted trees, from species tree, phylogenetic network, phylogeny, polynomial, Program GraphDTL, reconstruction. Note: http://dx.doi.org/10.1371/journal.pone.0073667.
"The genome content of extant species is derived from that of ancestral genomes, distorted by evolutionary events such as gene duplications, transfers and losses. Reconciliation methods aim at recovering such events and at localizing them in the species history, by comparing gene family trees to species trees. These methods play an important role in studying genome evolution as well as in inferring orthology relationships. A major issue with reconciliation methods is that the reliability of predicted evolutionary events may be questioned for various reasons: Firstly, there may be multiple equally optimal reconciliations for a given species tree-gene tree pair. Secondly, reconciliation methods can be misled by inaccurate gene or species trees. Thirdly, predicted events may fluctuate with method parameters such as the cost or rate of elementary events. For all of these reasons, confidence values for predicted evolutionary events are sorely needed. It was recently suggested that the frequency of each event in the set of all optimal reconciliations could be used as a support measure. We put this proposition to the test here and also consider a variant where the support measure is obtained by additionally accounting for suboptimal reconciliations. Experiments on simulated data show the relevance of event supports computed by both methods, while resorting to suboptimal sampling was shown to be more effective. Unfortunately, we also show that, unlike the majority-rule consensus tree for phylogenies, there is no guarantee that a single reconciliation can contain all events having above 50% support. In this paper, we detail how to rely on the reconciliation graph to efficiently identify the median reconciliation. Such median reconciliation can be found in polynomial time within the potentially exponential set of most parsimonious reconciliations. © 2013 Nguyen et al."
|
|
|
Mukul S. Bansal,
Eric J. Alm and
Manolis Kellis. Reconciliation Revisited: Handling Multiple Optima when Reconciling with Duplication, Transfer, and Loss. In JCB, Vol. 20(10):738-754, 2013. Keywords: duplication, from rooted trees, from species tree, loss, phylogenetic network, phylogeny, Program RANGER-DTL, reconstruction. Note: http://www.engr.uconn.edu/~mukul/Bansal_JCB2013.pdf.
"Phylogenetic tree reconciliation is a powerful approach for inferring evolutionary events like gene duplication, horizontal gene transfer, and gene loss, which are fundamental to our understanding of molecular evolution. While duplication-loss (DL) reconciliation leads to a unique maximum-parsimony solution, duplication-transfer-loss (DTL) reconciliation yields a multitude of optimal solutions, making it difficult to infer the true evolutionary history of the gene family. This problem is further exacerbated by the fact that different event cost assignments yield different sets of optimal reconciliations. Here, we present an effective, efficient, and scalable method for dealing with these fundamental problems in DTL reconciliation. Our approach works by sampling the space of optimal reconciliations uniformly at random and aggregating the results. We show that even gene trees with only a few dozen genes often have millions of optimal reconciliations and present an algorithm to efficiently sample the space of optimal reconciliations uniformly at random in O(mn 2) time per sample, where m and n denote the number of genes and species, respectively. We use these samples to understand how different optimal reconciliations vary in their node mappings and event assignments and to investigate the impact of varying event costs. We apply our method to a biological dataset of approximately 4700 gene trees from 100 taxa and observe that 93% of event assignments and 73% of mappings remain consistent across different multiple optima. Our analysis represents the first systematic investigation of the space of optimal DTL reconciliations and has many important implications for the study of gene family evolution. © 2013 Mary Ann Liebert, Inc."
|
|
|
Jesper Jansson and
Andrzej Lingas. Computing the rooted triplet distance between galled trees by counting triangles. In Journal of Discrete Algorithms, Vol. 25:66-78, 2014. Keywords: distance between networks, explicit network, from network, galled network, phylogenetic network, phylogeny, polynomial, triplet distance.
"We consider a generalization of the rooted triplet distance between two phylogenetic trees to two phylogenetic networks. We show that if each of the two given phylogenetic networks is a so-called galled tree with n leaves then the rooted triplet distance can be computed in o(n2.687) time. Our upper bound is obtained by reducing the problem of computing the rooted triplet distance between two galled trees to that of counting monochromatic and almost-monochromatic triangles in an undirected, edge-colored graph. To count different types of colored triangles in a graph efficiently, we extend an existing technique based on matrix multiplication and obtain several new algorithmic results that may be of independent interest: (i) the number of triangles in a connected, undirected, uncolored graph with m edges can be computed in o(m1.408) time; (ii) if G is a connected, undirected, edge-colored graph with n vertices and C is a subset of the set of edge colors then the number of monochromatic triangles of G with colors in C can be computed in o(n2.687) time; and (iii) if G is a connected, undirected, edge-colored graph with n vertices and R is a binary relation on the colors that is computable in O(1) time then the number of R-chromatic triangles in G can be computed in o(n2.687) time. © 2013 Elsevier B.V. All rights reserved."
|
|
|
Ward C Wheeler. Phyletic groups on networks. In Cladistics, Vol. 30(4):447-451, 2014. Keywords: explicit network, from network, phylogenetic network, phylogeny. Note: http://dx.doi.org/10.1111/cla.12062.
"Three additional phyletic group types, "periphyletic," "epiphyletic", and "anaphyletic" (in addition to Hennigian mono-, para-, and polyphyletic) are defined in terms of trees and phylogenetic networks (trees with directed reticulate edges) via a generalization of the algorithmic definitions of Farris. These designations concern groups defined as monophyletic on trees, but with additional gains or losses of members from network edges. These distinctions should be useful in discussion of systems with non-vertical inheritance such as recombination between viruses, horizontal exchange between bacteria, hybridization in plants and animals, as well as human linguistic evolution. Examples are illustrated with Indo-European language groups. © The Willi Hennig Society 2013."
|
|
|
Sarah Bastkowski,
Andreas Spillner and
Vincent Moulton. Fishing for minimum evolution trees with Neighbor-Nets. In IPL, Vol. 114(1-2):3-18, 2014. Keywords: circular split system, from distances, NeighborNet, phylogeny, polynomial.
"In evolutionary biology, biologists commonly use a phylogenetic tree to represent the evolutionary history of some set of species. A common approach taken to construct such a tree is to search through the space of all possible phylogenetic trees on the set so as to find one that optimizes some score function, such as the minimum evolution criterion. However, this is hampered by the fact that the space of phylogenetic trees is extremely large in general. Interestingly, an alternative approach, which has received somewhat less attention in the literature, is to instead search for trees within some set of bipartitions or splits of the set of species in question. Here we consider the problem of searching through a set of splits that is circular. Such sets can, for example, be generated by the NeighborNet algorithm for constructing phylogenetic networks. More specifically, we present an O(n4) time algorithm for finding an optimal minimum evolution tree in a circular set of splits on a set of species of size n. In addition, using simulations, we compare the performance of this algorithm when applied to NeighborNet output with that of FastME, a leading method for searching for minimum evolution trees in tree space. We find that, even though a circular set of splits represents just a tiny fraction of the total number of possible splits of a set, the trees obtained from circular sets compare quite favorably with those obtained with FastME, suggesting that the approach could warrant further investigation. © 2013 Elsevier B.V."
|
|
|
Leo van Iersel,
Steven Kelk,
Nela Lekic,
Chris Whidden and
Norbert Zeh. Hybridization Number on Three Rooted Binary Trees is EPT. In SIDMA, Vol. 30(3):1607-1631, 2016. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, reconstruction. Note: http://arxiv.org/abs/1402.2136.
|
|
|
Lavanya Kannan and
Ward C Wheeler. Exactly Computing the Parsimony Scores on Phylogenetic Networks Using Dynamic Programming. In JCB, Vol. 21(4):303-319, 2014. Keywords: explicit network, exponential algorithm, from network, from sequences, parsimony, phylogenetic network, phylogeny, reconstruction.
"Scoring a given phylogenetic network is the first step that is required in searching for the best evolutionary framework for a given dataset. Using the principle of maximum parsimony, we can score phylogenetic networks based on the minimum number of state changes across a subset of edges of the network for each character that are required for a given set of characters to realize the input states at the leaves of the networks. Two such subsets of edges of networks are interesting in light of studying evolutionary histories of datasets: (i) the set of all edges of the network, and (ii) the set of all edges of a spanning tree that minimizes the score. The problems of finding the parsimony scores under these two criteria define slightly different mathematical problems that are both NP-hard. In this article, we show that both problems, with scores generalized to adding substitution costs between states on the endpoints of the edges, can be solved exactly using dynamic programming. We show that our algorithms require O(mpk) storage at each vertex (per character), where k is the number of states the character can take, p is the number of reticulate vertices in the network, m = k for the problem with edge set (i), and m = 2 for the problem with edge set (ii). This establishes an O(nmpk2) algorithm for both the problems (n is the number of leaves in the network), which are extensions of Sankoff's algorithm for finding the parsimony scores for phylogenetic trees. We will discuss improvements in the complexities and show that for phylogenetic networks whose underlying undirected graphs have disjoint cycles, the storage at each vertex can be reduced to O(mk), thus making the algorithm polynomial for this class of networks. We will present some properties of the two approaches and guidance on choosing between the criteria, as well as traverse through the network space using either of the definitions. 
We show that our methodology provides an effective means to study a wide variety of datasets. © Copyright 2014, Mary Ann Liebert, Inc. 2014."
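Illustration (not from the paper): the abstract above describes its algorithms as extensions of Sankoff's algorithm for trees. A minimal sketch of the classical tree version only is shown below, for a single character on a rooted tree. The function name `sankoff` and the dictionary-based tree encoding are assumptions made for this example; the network extension with reticulate vertices is not attempted here.

```python
def sankoff(tree, root, leaf_state, states, cost):
    """Minimum total substitution cost for one character on a rooted tree.

    tree:       dict mapping each internal node to a list of its children
    leaf_state: dict mapping each leaf to its observed state
    states:     iterable of all possible states
    cost:       function cost(parent_state, child_state) -> number
    """
    states = list(states)
    inf = float("inf")

    def table(node):
        # table(node)[s] = best cost of the subtree below node if node has state s
        children = tree.get(node, [])
        if not children:
            return {s: (0 if s == leaf_state[node] else inf) for s in states}
        child_tables = [table(child) for child in children]
        return {
            s: sum(min(cost(s, t) + ct[t] for t in states) for ct in child_tables)
            for s in states
        }

    return min(table(root).values())


# Hypothetical example: tree ((a, b), c) with unit substitution costs.
tree = {"r": ["x", "c"], "x": ["a", "b"]}
unit = lambda s, t: 0 if s == t else 1
print(sankoff(tree, "r", {"a": "A", "b": "A", "c": "G"}, "ACGT", unit))  # 1
```

Each node stores one value per state, which corresponds to the k entries per vertex that the tree case needs.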
|
|
|
Jialiang Yang,
Stefan Grünewald,
Yifei Xu and
Xiu-Feng Wan. Quartet-based methods to reconstruct phylogenetic networks. In BMC Systems Biology, Vol. 8(21), 2014. Keywords: abstract network, from quartets, phylogenetic network, phylogeny, Program QuartetMethods, Program QuartetNet, Program SplitsTree, reconstruction. Note: http://dx.doi.org/10.1186/1752-0509-8-21.
"Background: Phylogenetic networks are employed to visualize evolutionary relationships among a group of nucleotide sequences, genes or species when reticulate events like hybridization, recombination, reassortant and horizontal gene transfer are believed to be involved. In comparison to traditional distance-based methods, quartet-based methods consider more information in the reconstruction process and thus have the potential to be more accurate.Results: We introduce QuartetSuite, which includes a set of new quartet-based methods, namely QuartetS, QuartetA, and QuartetM, to reconstruct phylogenetic networks from nucleotide sequences. We tested their performances and compared them with other popular methods on two simulated nucleotide sequence data sets: one generated from a tree topology and the other from a complicated evolutionary history containing three reticulate events. We further validated these methods to two real data sets: a bacterial data set consisting of seven concatenated genes of 36 bacterial species and an influenza data set related to recently emerging H7N9 low pathogenic avian influenza viruses in China.Conclusion: QuartetS, QuartetA, and QuartetM have the potential to accurately reconstruct evolutionary scenarios from simple branching trees to complicated networks containing many reticulate events. These methods could provide insights into the understanding of complicated biological evolutionary processes such as bacterial taxonomy and reassortant of influenza viruses. © 2014 Yang et al.; licensee BioMed Central Ltd."
|
|
|
Kevin J. Liu,
Jingxuan Dai,
Kathy Truong,
Ying Song,
Michael H. Kohn and
Luay Nakhleh. An HMM-Based Comparative Genomic Framework for Detecting Introgression in Eukaryotes. In PLoS Computational Biology, Vol. 10(6):e1003649, 2014. Keywords: explicit network, from network, phylogenetic network, phylogeny, Program PhyloNet-HMM. Note: http://arxiv.org/abs/1310.7989.
"One outcome of interspecific hybridization and subsequent effects of evolutionary forces is introgression, which is the integration of genetic material from one species into the genome of an individual in another species. The evolution of several groups of eukaryotic species has involved hybridization, and cases of adaptation through introgression have been already established. In this work, we report on PhyloNet-HMM-a new comparative genomic framework for detecting introgression in genomes. PhyloNet-HMM combines phylogenetic networks with hidden Markov models (HMMs) to simultaneously capture the (potentially reticulate) evolutionary history of the genomes and dependencies within genomes. A novel aspect of our work is that it also accounts for incomplete lineage sorting and dependence across loci. Application of our model to variation data from chromosome 7 in the mouse (Mus musculus domesticus) genome detected a recently reported adaptive introgression event involving the rodent poison resistance gene Vkorc1, in addition to other newly detected introgressed genomic regions. Based on our analysis, it is estimated that about 9% of all sites within chromosome 7 are of introgressive origin (these cover about 13 Mbp of chromosome 7, and over 300 genes). Further, our model detected no introgression in a negative control data set. We also found that our model accurately detected introgression and other evolutionary processes from synthetic data sets simulated under the coalescent model with recombination, isolation, and migration. Our work provides a powerful framework for systematic analysis of introgression while simultaneously accounting for dependence across sites, point mutations, recombination, and ancestral polymorphism. © 2014 Liu et al."
|
|
|
Alberto Apostolico,
Matteo Comin,
Andreas W. M. Dress and
Laxmi Parida. Ultrametric networks: a new tool for phylogenetic analysis. In Algorithms for Molecular Biology, Vol. 8(7):1-10, 2013. Keywords: abstract network, from distances, phylogenetic network, phylogeny, Program Ultranet. Note: http://dx.doi.org/10.1186/1748-7188-8-7.
"Background: The large majority of optimization problems related to the inference of distance-based trees used in phylogenetic analysis and classification is known to be intractable. One noted exception is found within the realm of ultrametric distances. The introduction of ultrametric trees in phylogeny was inspired by a model of evolution driven by the postulate of a molecular clock, now dismissed, whereby phylogeny could be represented by a weighted tree in which the sum of the weights of the edges separating any given leaf from the root is the same for all leaves. Both, molecular clocks and rooted ultrametric trees, fell out of fashion as credible representations of evolutionary change. At the same time, ultrametric dendrograms have shown good potential for purposes of classification in so far as they have proven to provide good approximations for additive trees. Most of these approximations are still intractable, but the problem of finding the nearest ultrametric distance matrix to a given distance matrix with respect to the L∞ distance has been long known to be solvable in polynomial time, the solution being incarnated in any minimum spanning tree for the weighted graph subtending to the matrix.Results: This paper expands this subdominant ultrametric perspective by studying ultrametric networks, consisting of the collection of all edges involved in some minimum spanning tree. It is shown that, for a graph with n vertices, the construction of such a network can be carried out by a simple algorithm in optimal time O(n2) which is faster by a factor of n than the direct adaptation of the classical O(n3) paradigm by Warshall for computing the transitive closure of a graph. This algorithm, called UltraNet, will be shown to be easily adapted to compute relaxed networks and to support the introduction of artificial points to reduce the maximum distance between vertices in a pair. 
Finally, a few experiments will be discussed to demonstrate the applicability of subdominant ultrametric networks.Availability: http://www.dei.unipd.it/~ciompin/main/Ultranet/Ultranet.html. © 2013 Apostolico et al.; licensee BioMed Central Ltd."
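Illustration (not from the paper): the abstract above contrasts the authors' $O(n^2)$ UltraNet algorithm with a direct $O(n^3)$ adaptation of Warshall's scheme. The sketch below shows only that simpler cubic baseline, under the standard fact that an edge belongs to some minimum spanning tree exactly when its weight equals the minimax path distance between its endpoints (the subdominant ultrametric). The function names are hypothetical, and this is not the UltraNet algorithm itself.

```python
def subdominant_ultrametric(dist):
    """Minimax path distances computed with a Warshall-style triple loop.

    dist is a symmetric matrix (list of lists) with a zero diagonal.
    u[i][j] is the smallest possible value of the largest edge weight
    on any path from i to j in the complete weighted graph.
    """
    n = len(dist)
    u = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(u[i][k], u[k][j])
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u


def mst_union_edges(dist):
    """Edges (i, j), i < j, that appear in at least one minimum spanning tree.

    Exact equality is used, so integer or otherwise exactly comparable
    weights are assumed.
    """
    u = subdominant_ultrametric(dist)
    n = len(dist)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if dist[i][j] == u[i][j]]


# Three points on a line at positions 0, 1 and 3.
d = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
print(mst_union_edges(d))  # [(0, 1), (1, 2)]
```

When several edges tie, more than n - 1 edges are returned, which is what turns the result into a network rather than a single tree.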
|
|
|
Mehdi Layeghifard,
Pedro R. Peres-Neto and
Vladimir Makarenkov. Inferring explicit weighted consensus networks to represent alternative evolutionary histories. In BMCEB, Vol. 13(274):1-25, 2013. Keywords: explicit network, from rooted trees, from species tree, phylogenetic network, phylogeny, Program ConsensusNetwork, reconstruction. Note: http://dx.doi.org/10.1186/1471-2148-13-274.
"Background: The advent of molecular biology techniques and constant increase in availability of genetic material have triggered the development of many phylogenetic tree inference methods. However, several reticulate evolution processes, such as horizontal gene transfer and hybridization, have been shown to blur the species evolutionary history by causing discordance among phylogenies inferred from different genes. Methods. To tackle this problem, we hereby describe a new method for inferring and representing alternative (reticulate) evolutionary histories of species as an explicit weighted consensus network which can be constructed from a collection of gene trees with or without prior knowledge of the species phylogeny. Results: We provide a way of building a weighted phylogenetic network for each of the following reticulation mechanisms: diploid hybridization, intragenic recombination and complete or partial horizontal gene transfer. We successfully tested our method on some synthetic and real datasets to infer the above-mentioned evolutionary events which may have influenced the evolution of many species. Conclusions: Our weighted consensus network inference method allows one to infer, visualize and validate statistically major conflicting signals induced by the mechanisms of reticulate evolution. The results provided by the new method can be used to represent the inferred conflicting signals by means of explicit and easy-to-interpret phylogenetic networks. © 2013 Layeghifard et al.; licensee BioMed Central Ltd."
|
|
|
Leo van Iersel,
Steven Kelk,
Nela Lekic and
Celine Scornavacca. A practical approximation algorithm for solving massive instances of hybridization number for binary and nonbinary trees. In BMCB, Vol. 15(127):1-12, 2014. Keywords: agreement forest, approximation, explicit network, from rooted trees, phylogenetic network, phylogeny, Program CycleKiller, Program TerminusEst, reconstruction. Note: http://dx.doi.org/10.1186/1471-2105-15-127.
|
|
|
Gergely J. Szöllösi,
Eric Tannier,
Nicolas Lartillot and
Vincent Daubin. Lateral Gene Transfer from the Dead. In Systematic Biology, Vol. 62(3):386-397, 2013. Keywords: duplication, lateral gene transfer, likelihood, loss, phylogeny, Program TERA, reconstruction. Note: http://dx.doi.org/10.1093/sysbio/syt003.
"In phylogenetic studies, the evolution of molecular sequences is assumed to have taken place along the phylogeny traced by the ancestors of extant species. In the presence of lateral gene transfer, however, this may not be the case, because the species lineage from which a gene was transferred may have gone extinct or not have been sampled. Because it is not feasible to specify or reconstruct the complete phylogeny of all species, we must describe the evolution of genes outside the represented phylogeny by modeling the speciation dynamics that gave rise to the complete phylogeny. We demonstrate that if the number of sampled species is small compared with the total number of existing species, the overwhelming majority of gene transfers involve speciation to and evolution along extinct or unsampled lineages. We show that the evolution of genes along extinct or unsampled lineages can to good approximation be treated as those of independently evolving lineages described by a few global parameters. Using this result, we derive an algorithm to calculate the probability of a gene tree and recover the maximum-likelihood reconciliation given the phylogeny of the sampled species. Examining 473 near-universal gene families from 36 cyanobacteria, we find that nearly a third of transfer events (28%) appear to have topological signatures of evolution along extinct species, but only approximately 6% of transfers trace their ancestry to before the common ancestor of the sampled cyanobacteria. © 2013 The Author(s)."
|
|
|
Gergely J. Szöllösi,
Wojciech Rosikiewicz,
Bastien Boussau,
Eric Tannier and
Vincent Daubin. Efficient Exploration of the Space of Reconciled Gene Trees. In Systematic Biology, Vol. 62(6):901-912, 2013. Keywords: duplication, explicit network, lateral gene transfer, likelihood, loss, phylogeny, Program ALE, reconstruction. Note: http://arxiv.org/abs/1306.2167.
"Gene trees record the combination of gene-level events, such as duplication, transfer and loss (DTL), and species-level events, such as speciation and extinction. Gene tree-species tree reconciliation methods model these processes by drawing gene trees into the species tree using a series of gene and species-level events. The reconstruction of gene trees based on sequence alone almost always involves choosing between statistically equivalent or weakly distinguishable relationships that could be much better resolved based on a putative species tree. To exploit this potential for accurate reconstruction of gene trees, the space of reconciled gene trees must be explored according to a joint model of sequence evolution and gene tree-species tree reconciliation. Here we present amalgamated likelihood estimation (ALE), a probabilistic approach to exhaustively explore all reconciled gene trees that can be amalgamated as a combination of clades observed in a sample of gene trees. We implement the ALE approach in the context of a reconciliation model (Szöllősi et al. 2013), which allows for the DTL of genes. We use ALE to efficiently approximate the sum of the joint likelihood over amalgamations and to find the reconciled gene tree that maximizes the joint likelihood among all such trees. We demonstrate using simulations that gene trees reconstructed using the joint likelihood are substantially more accurate than those reconstructed using sequence alone. Using realistic gene tree topologies, branch lengths, and alignment sizes, we demonstrate that ALE produces more accurate gene trees even if the model of sequence evolution is greatly simplified. Finally, examining 1099 gene families from 36 cyanobacterial genomes we find that joint likelihood-based inference results in a striking reduction in apparent phylogenetic discord, with 24%, 59%, and 46% reductions in the mean numbers of duplications, transfers, and losses per gene family, respectively.
The open source implementation of ALE is available from https://github.com/ssolo/ALE.git. © The Author(s) 2013."
|
|
|
Monika Balvociute,
Andreas Spillner and
Vincent Moulton. FlatNJ: A Novel Network-Based Approach to Visualize Evolutionary and Biogeographical Relationships. In Systematic Biology, Vol. 63(3):383-396, 2014. Keywords: abstract network, flat, phylogenetic network, phylogeny, Program FlatNJ, Program SplitsTree, split network. Note: http://dx.doi.org/10.1093/sysbio/syu001.
"Split networks are a type of phylogenetic network that allow visualization of conflict in evolutionary data. We present a new method for constructing such networks called FlatNetJoining (FlatNJ). A key feature of FlatNJ is that it produces networks that can be drawn in the plane in which labels may appear inside of the network. For complex data sets that involve, for example, non-neutral molecular markers, this can allow additional detail to be visualized as compared to previous methods such as split decomposition and NeighborNet. We illustrate the application of FlatNJ by applying it to whole HIV genome sequences, where recombination has taken place, fluorescent proteins in corals, where ancestral sequences are present, and mitochondrial DNA sequences from gall wasps, where biogeographical relationships are of interest. We find that the networks generated by FlatNJ can facilitate the study of genetic variation in the underlying molecular sequence data and, in particular, may help to investigate processes such as intra-locus recombination. FlatNJ has been implemented in Java and is freely available at www.uea.ac.uk/computing/software/flatnj. [flat split system; NeighborNet; Phylogenetic network; QNet; split; split network.] © The Author(s) 2014."
|
|
|
Johann-Mattis List,
Shijulal Nelson-Sathi,
Hans Geisler and
William Martin. Networks of lexical borrowing and lateral gene transfer in language and genome evolution. In BioEssays, Vol. 36(2):141-150, 2014. Keywords: explicit network, minimal lateral network, phylogenetic network, Program lingpy. Note: http://dx.doi.org/10.1002/bies.201300096.
"Like biological species, languages change over time. As noted by Darwin, there are many parallels between language evolution and biological evolution. Insights into these parallels have also undergone change in the past 150 years. Just like genes, words change over time, and language evolution can be likened to genome evolution accordingly, but what kind of evolution? There are fundamental differences between eukaryotic and prokaryotic evolution. In the former, natural variation entails the gradual accumulation of minor mutations in alleles. In the latter, lateral gene transfer is an integral mechanism of natural variation. The study of language evolution using biological methods has attracted much interest of late, most approaches focusing on language tree construction. These approaches may underestimate the important role that borrowing plays in language evolution. Network approaches that were originally designed to study lateral gene transfer may provide more realistic insights into the complexities of language evolution. © 2014 The Authors. BioEssays Published by WILEY Periodicals, Inc."
|
|
|
Zhi-Zhong Chen,
Fei Deng and
Lusheng Wang. Simultaneous Identification of Duplications, Losses, and Lateral Gene Transfers. In TCBB, Vol. 9(5):1515-1528, 2012. Keywords: duplication, explicit network, FPT, from rooted trees, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, reconstruction. Note: http://www.cs.cityu.edu.hk/~lwang/research/tcbb2012c.pdf.
"We give a fixed-parameter algorithm for the problem of enumerating all minimum-cost LCA-reconciliations involving gene duplications, gene losses, and lateral gene transfers (LGTs) for a given species tree S and a given gene tree G. Our algorithm can work for the weighted version of the problem, where the costs of a gene duplication, a gene loss, and an LGT are left to the user's discretion. The algorithm runs in O(m + 3^{k/c} n) time, where m is the number of vertices in S, n is the number of vertices in G, c is the smaller between a gene duplication cost and an LGT cost, and k is the minimum cost of an LCA-reconciliation between S and G. The time complexity is indeed better if the cost of a gene loss is greater than 0. In particular, when the cost of a gene loss is at least 0.614c, the running time of the algorithm is O(m + 2.78^{k/c} n). © 2004-2012 IEEE."
|
|
|
Juan Wang. A new algorithm to construct phylogenetic networks from trees. In Genetics and Molecular Research, Vol. 13(1):1456-1464, 2014. Keywords: explicit network, from clusters, heuristic, phylogenetic network, Program LNetwork, Program QuickCass, reconstruction. Note: http://dx.doi.org/10.4238/2014.March.6.4.
"Developing appropriate methods for constructing phylogenetic networks from tree sets is an important problem, and much research is currently being undertaken in this area. BIMLR is an algorithm that constructs phylogenetic networks from tree sets. The algorithm can construct a much simpler network than other available methods. Here, we introduce an improved version of the BIMLR algorithm, QuickCass. QuickCass changes the selection strategy of the labels of leaves below the reticulate nodes, i.e., the nodes with an indegree of at least 2 in BIMLR. We show that QuickCass can construct simpler phylogenetic networks than BIMLR. Furthermore, we show that QuickCass is a polynomial-time algorithm when the output network that is constructed by QuickCass is binary. © FUNPEC-RP."
|
|
|
Matthieu Willems,
Nadia Tahiri and
Vladimir Makarenkov. A new efficient algorithm for inferring explicit hybridization networks following the Neighbor-Joining principle. In JBCB, Vol. 12(5), 2014. Keywords: explicit network, from distances, heuristic, phylogenetic network, phylogeny, reconstruction.
"Several algorithms and software have been developed for inferring phylogenetic trees. However, there exist some biological phenomena such as hybridization, recombination, or horizontal gene transfer which cannot be represented by a tree topology. We need to use phylogenetic networks to adequately represent these important evolutionary mechanisms. In this article, we present a new efficient heuristic algorithm for inferring hybridization networks from evolutionary distance matrices between species. The famous Neighbor-Joining concept and the least-squares criterion are used for building networks. At each step of the algorithm, before joining two given nodes, we check if a hybridization event could be related to one of them or to both of them. The proposed algorithm finds the exact tree solution when the considered distance matrix is a tree metric (i.e. it is representable by a unique phylogenetic tree). It also provides very good hybrids recovery rates for large trees (with 32 and 64 leaves in our simulations) for both distance and sequence types of data. The results yielded by the new algorithm for real and simulated datasets are illustrated and discussed in detail. © Imperial College Press."
|
|
|
Katharina Huber,
Leo van Iersel,
Vincent Moulton and
Taoyang Wu. How much information is needed to infer reticulate evolutionary histories? In Systematic Biology, Vol. 64(1):102-111, 2015. Keywords: explicit network, from network, from rooted trees, from subnetworks, from trinets, identifiability, phylogenetic network, phylogeny, reconstruction, uniqueness. Note: http://dx.doi.org/10.1093/sysbio/syu076.
|
|
|
Paul Cordue,
Simone Linz and
Charles Semple. Phylogenetic Networks that Display a Tree Twice. In BMB, Vol. 76(10):2664-2679, 2014. Keywords: from rooted trees, normal network, phylogenetic network, phylogeny, reconstruction, tree-child network. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/CLS14.pdf.
"In the last decade, the use of phylogenetic networks to analyze the evolution of species whose past is likely to include reticulation events, such as horizontal gene transfer or hybridization, has gained popularity among evolutionary biologists. Nevertheless, the evolution of a particular gene can generally be described without reticulation events and therefore be represented by a phylogenetic tree. While this is not in contrast to each other, it places emphasis on the necessity of algorithms that analyze and summarize the tree-like information that is contained in a phylogenetic network. We contribute to the toolbox of such algorithms by investigating the question of whether or not a phylogenetic network embeds a tree twice and give a quadratic-time algorithm to solve this problem for a class of networks that is more general than tree-child networks. © 2014, Society for Mathematical Biology."
|
|
|
Katharina Huber,
Vincent Moulton,
Mike Steel and
Taoyang Wu. Folding and unfolding phylogenetic trees and networks. In JOMB, Vol. 73(6):1761-1780, 2016. Keywords: compressed network, explicit network, FU-stable network, NP complete, phylogenetic network, phylogeny, tree containment, tree sibling network. Note: http://arxiv.org/abs/1506.04438.
|
|
|
Steven Kelk,
Leo van Iersel,
Celine Scornavacca and
Mathias Weller. Phylogenetic incongruence through the lens of Monadic Second Order logic. In JGAA, Vol. 20(2):189-215, 2016. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, MSOL, phylogenetic network, phylogeny, reconstruction. Note: http://jgaa.info/accepted/2016/KelkIerselScornavaccaWeller2016.20.2.pdf.
|
|
|
Katharina Huber,
Leo van Iersel,
Vincent Moulton,
Celine Scornavacca and
Taoyang Wu. Reconstructing phylogenetic level-1 networks from nondense binet and trinet sets. In ALG, Vol. 77(1):173-200, 2017. Keywords: explicit network, FPT, from binets, from subnetworks, from trinets, NP complete, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://arxiv.org/abs/1411.6804.
|
|
|
Joel Sjöstrand,
Ali Tofigh,
Vincent Daubin,
Lars Arvestad,
Bengt Sennblad and
Jens Lagergren. A Bayesian Method for Analyzing Lateral Gene Transfer. In Systematic Biology, Vol. 63(3):409-420, 2014. Keywords: bayesian, duplication, from rooted trees, from sequences, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, Program JPrIME-DLTRS, reconstruction. Note: http://dx.doi.org/10.1093/sysbio/syu007.
|
|
|
Sajad Mirzaei and
Yufeng Wu. Fast Construction of Near Parsimonious Hybridization Networks for Multiple Phylogenetic Trees. In TCBB, Vol. 13(3):565-570, 2016. Keywords: bound, explicit network, from rooted trees, heuristic, phylogenetic network, phylogeny, Program PIRN, reconstruction, software. Note: http://www.engr.uconn.edu/~ywu/Papers/PIRNs-preprint.pdf.
|
|
|
Benjamin Albrecht. Computing all hybridization networks for multiple binary phylogenetic input trees. In BMCB, Vol. 16(236):1-15, 2015. Keywords: agreement forest, explicit network, exponential algorithm, FPT, from rooted trees, phylogenetic network, phylogeny, Program Hybroscale, Program PIRN, reconstruction. Note: http://dx.doi.org/10.1186/s12859-015-0660-7.
|
|
|
Vincent Ranwez,
Celine Scornavacca,
Jean-Philippe Doyon and
Vincent Berry. Inferring gene duplications, transfers and losses can be done in a discrete framework. In JOMB, Vol. 72(7):1811-1844, 2016. Keywords: duplication, explicit network, from rooted trees, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, reconstruction.
|
|
|
Adrià Alcalà Mena,
Mercè Llabrés,
Francesc Rosselló and
Pau Rullan. Tree-Child Cluster Networks. In Fundamenta Informaticae, Vol. 134(1-2):1-15, 2014. Keywords: explicit network, from clusters, phylogenetic network, phylogeny, Program PhyloNetwork, reconstruction, tree-child network.
|
|
|
Sha Zhu,
James H. Degnan,
Sharyn J. Goldstein and
Bjarki Eldon. Hybrid-Lambda: simulation of multiple merger and Kingman gene genealogies in species networks and species trees. In BMCB, Vol. 16(292):1-7, 2015. Keywords: explicit network, from network, phylogenetic network, phylogeny, Program Hybrid-Lambda, simulation, software. Note: http://dx.doi.org/10.1186/s12859-015-0721-y.
|
|
|
Gergely J. Szöllösi,
Adrián Arellano Davín,
Eric Tannier,
Vincent Daubin and
Bastien Boussau. Genome-scale phylogenetic analysis finds extensive gene transfer among fungi. In Philosophical Transactions of the Royal Society of London B: Biological Sciences, Vol. 370(1678):1-11, 2015. Keywords: duplication, from sequences, lateral gene transfer, loss, phylogenetic network, phylogeny, Program ALE, reconstruction. Note: http://dx.doi.org/10.1098/rstb.2014.0335.
|
|
|
Marc Thuillard and
Didier Fraix-Burnet. Phylogenetic Trees and Networks Reduce to Phylogenies on Binary States: Does It Furnish an Explanation to the Robustness of Phylogenetic Trees against Lateral Transfers? In Evolutionary Bioinformatics, Vol. 11:213-221, 2015. Keywords: circular split system, explicit network, from multistate characters, outerplanar, perfect, phylogenetic network, phylogeny, planar, polynomial, reconstruction, split. Note: http://dx.doi.org/10.4137%2FEBO.S28158.
|
|
|
François Chevenet,
Jean-Philippe Doyon,
Celine Scornavacca,
Edwin Jacox,
Emmanuelle Jousselin and
Vincent Berry. SylvX: a viewer for phylogenetic tree reconciliations. In BIO, Vol. 32(4):608-610, 2016. Keywords: duplication, explicit network, from rooted trees, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, Program SylvX, software, visualization. Note: https://www.researchgate.net/profile/Emmanuelle_Jousselin/publication/283446016_SylvX_a_viewer_for_phylogenetic_tree_reconciliations/links/5642146108aec448fa621efa.pdf.
|
|
|
Hussein A. Hejase and
Kevin J. Liu. A scalability study of phylogenetic network inference methods using empirical datasets and simulations involving a single reticulation. In BMCB, Vol. 17(422):1-12, 2016. Keywords: abstract network, evaluation, from sequences, phylogenetic network, phylogeny, Program PhyloNet, Program PhyloNetworks SNaQ, reconstruction, simulation, unicyclic network. Note: http://dx.doi.org/10.1186/s12859-016-1277-1.
|
|
|
Leo van Iersel,
Steven Kelk,
Giorgios Stamoulis,
Leen Stougie and
Olivier Boes. On unrooted and root-uncertain variants of several well-known phylogenetic network problems. In ALG, Vol. 80(11):2993-3022, 2018. Keywords: explicit network, FPT, from network, from unrooted trees, NP complete, phylogenetic network, phylogeny, reconstruction, tree containment. Note: https://hal.inria.fr/hal-01599716.
|
|
|
Philippe Gambette,
Leo van Iersel,
Steven Kelk,
Fabio Pardi and
Celine Scornavacca. Do branch lengths help to locate a tree in a phylogenetic network? In BMB, Vol. 78(9):1773-1795, 2016. Keywords: branch length, explicit network, FPT, from network, from rooted trees, NP complete, phylogenetic network, phylogeny, pseudo-polynomial, time consistent network, tree containment, tree sibling network. Note: http://arxiv.org/abs/1607.06285.
|
|
|
Philippe Gambette,
Katharina Huber and
Guillaume Scholz. Uprooted Phylogenetic Networks. In BMB, Vol. 79(9):2022-2048, 2017. Keywords: circular split system, explicit network, from splits, galled tree, phylogenetic network, phylogeny, polynomial, reconstruction, split network, uniqueness. Note: http://arxiv.org/abs/1511.08387.
|
|
|
Maria Anaya,
Olga Anipchenko-Ulaj,
Aisha Ashfaq,
Joyce Chiu,
Mahedi Kaiser,
Max Shoji Ohsawa,
Megan Owen,
Ella Pavlechko,
Katherine St. John,
Shivam Suleria,
Keith Thompson and
Corrine Yap. On Determining if Tree-based Networks Contain Fixed Trees. In BMB, Vol. 78(5):961-969, 2016. Keywords: explicit network, FPT, NP complete, phylogenetic network, phylogeny, tree-based network. Note: http://arxiv.org/abs/1602.02739.
|
|
|
Jessica W. Leigh and
David Bryant. PopART: full-feature software for haplotype network construction. In Methods in Ecology and Evolution, Vol. 6(9):1110-1116, 2015. Keywords: abstract network, from sequences, haplotype network, MedianJoining, phylogenetic network, phylogeny, population genetics, Program PopART, Program TCS, software. Note: http://dx.doi.org/10.1111/2041-210X.12410.
|
|
|
Julia Matsieva,
Steven Kelk,
Celine Scornavacca,
Chris Whidden and
Dan Gusfield. A Resolution of the Static Formulation Question for the Problem of Computing the History Bound. In TCBB, Vol. 14(2):404-417, 2017. Keywords: ARG, explicit network, from sequences, minimum number, phylogenetic network, phylogeny.
|
|
|
Gabriel Cardona,
Joan Carles Pons and
Francesc Rosselló. A reconstruction problem for a class of phylogenetic networks with lateral gene transfers. In ALMOB, Vol. 10(28):1-15, 2015. Keywords: explicit network, from rooted trees, lateral gene transfer, phylogenetic network, phylogeny, Program LGTnetwork, reconstruction, software, tree-based network. Note: http://dx.doi.org/10.1186/s13015-015-0059-z.
|
|
|
James Oldman,
Taoyang Wu,
Leo van Iersel and
Vincent Moulton. TriLoNet: Piecing together small networks to reconstruct reticulate evolutionary histories. In MBE, Vol. 33(8):2151-2162, 2016. Keywords: explicit network, from subnetworks, from trinets, galled tree, phylogenetic network, phylogeny, Program LEV1ATHAN, Program TriLoNet, reconstruction.
|
|
|
Sha Zhu and
James H. Degnan. Displayed Trees Do Not Determine Distinguishability Under the Network Multispecies Coalescent. In SB, Vol. 66(2):283-298, 2017. Keywords: branch length, coalescent, explicit network, from network, likelihood, phylogenetic network, phylogeny, Program Hybrid-coal, Program Hybrid-Lambda, Program PhyloNet, software, uniqueness. Note: presentation available at https://www.youtube.com/watch?v=JLYGTfEZG7g.
|
|
|
Misagh Kordi and
Mukul S. Bansal. On the Complexity of Duplication-Transfer-Loss Reconciliation with Non-Binary Gene Trees. In TCBB, Vol. 14(3):587-599, 2017. Keywords: duplication, from rooted trees, from species tree, lateral gene transfer, loss, NP complete, phylogenetic network, phylogeny, reconstruction. Note: http://compbio.engr.uconn.edu/papers/Kordi_DTLreconciliationPreprint2015.pdf.
|
|
|
Andreas Gunawan,
Bhaskar DasGupta and
Louxin Zhang. A decomposition theorem and two algorithms for reticulation-visible networks. In Information and Computation, Vol. 252:161-175, 2017. Keywords: cluster containment, explicit network, from clusters, from network, from rooted trees, phylogenetic network, phylogeny, polynomial, reticulation-visible network, tree containment. Note: https://www.cs.uic.edu/~dasgupta/resume/publ/papers/Infor_Comput_IC4848_final.pdf.
|
|
|
Juan Wang. A Survey of Methods for Constructing Rooted Phylogenetic Networks. In PLoS ONE, Vol. 11(11):e0165834, 2016. Keywords: evaluation, explicit network, from clusters, phylogenetic network, phylogeny, Program BIMLR, Program Dendroscope, Program LNetwork, reconstruction, survey. Note: http://dx.doi.org/10.1371/journal.pone.0165834.
|
|
|
Andrew R. Francis,
Katharina Huber,
Vincent Moulton and
Taoyang Wu. Bounds for phylogenetic network space metrics. In JOMB, Vol. 76(5):1229-1248, 2018. Keywords: bound, distance between networks, from network, NNI distance, NNI moves, phylogenetic network, phylogeny, SPR distance, TBR distance. Note: https://arxiv.org/abs/1702.05609.
|
|
|
Andrew R. Francis,
Katharina Huber and
Vincent Moulton. Tree-based unrooted phylogenetic networks. In BMB, Vol. 80(2):404-416, 2018. Keywords: characterization, explicit network, NP complete, phylogenetic network, phylogeny, tree containment, tree-based network, unrooted tree-based network. Note: https://arxiv.org/abs/1704.02062.
|
|
|
Magnus Bordewich,
Simone Linz and
Charles Semple. Lost in space? Generalising subtree prune and regraft to spaces of phylogenetic networks. In JTB, Vol. 423:1-12, 2017. Keywords: distance between networks, explicit network, phylogenetic network, phylogeny, reticulation-visible network, SPR distance, tree-based network, tree-child network. Note: https://simonelinz.files.wordpress.com/2017/04/bls171.pdf.
|
|
|
Magnus Bordewich,
Charles Semple and
Nihan Tokac. Constructing tree-child networks from distance matrices. In Algorithmica, Vol. 80(8):2240-2259, 2018. Keywords: compressed network, explicit network, from distances, phylogenetic network, phylogeny, polynomial, reconstruction, tree-child network, uniqueness. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BSN17.pdf.
|
|
|
Leo van Iersel,
Steven Kelk and
Celine Scornavacca. Kernelizations for the hybridization number problem on multiple nonbinary trees. In JCSS, Vol. 82(6):1075-1089, 2016. Keywords: explicit network, from rooted trees, kernelization, minimum number, phylogenetic network, phylogeny, Program Treeduce, reconstruction. Note: https://arxiv.org/abs/1311.4045v3.
|
|
|
Celine Scornavacca,
Joan Carles Pons and
Gabriel Cardona. Fast algorithm for the reconciliation of gene trees and LGT networks. In JTB, Vol. 418:129-137, 2017. Keywords: duplication, explicit network, from network, from rooted trees, lateral gene transfer, LGT network, loss, parsimony, phylogenetic network, phylogeny, polynomial, reconstruction.
|
|
|
Leo van Iersel,
Vincent Moulton,
Eveline De Swart and
Taoyang Wu. Binets: fundamental building blocks for phylogenetic networks. In BMB, Vol. 79(5):1135-1154, 2017. Keywords: approximation, explicit network, from binets, from subnetworks, galled tree, level k phylogenetic network, NP complete, phylogenetic network, phylogeny, reconstruction. Note: http://dx.doi.org/10.1007/s11538-017-0275-4.
|
|
|
Philippe Gambette,
Andreas Gunawan,
Anthony Labarre,
Stéphane Vialette and
Louxin Zhang. Solving the Tree Containment Problem in Linear Time for Nearly Stable Phylogenetic Networks. In DAM, Vol. 246:62-79, 2018. Keywords: explicit network, from network, from rooted trees, nearly-stable network, phylogenetic network, phylogeny, polynomial, tree containment. Note: https://hal-upec-upem.archives-ouvertes.fr/hal-01575001/en/.
|
|
|
Philippe Gambette,
Leo van Iersel,
Mark Jones,
Manuel Lafond,
Fabio Pardi and
Celine Scornavacca. Rearrangement Moves on Rooted Phylogenetic Networks. In PLoS Computational Biology, Vol. 13(8):e1005611.1-21, 2017. Keywords: distance between networks, explicit network, from network, NNI distance, NNI moves, phylogenetic network, phylogeny, SPR distance. Note: https://hal-upec-upem.archives-ouvertes.fr/hal-01572624/en/.
|
|
|
Remie Janssen,
Mark Jones,
Péter L. Erdös,
Leo van Iersel and
Celine Scornavacca. Exploring the tiers of rooted phylogenetic network space using tail moves. In BMB, Vol. 80(8):2177-2208, 2018. Keywords: distance between networks, explicit network, from network, NNI moves, orientation, phylogenetic network, phylogeny, SPR distance. Note: https://arxiv.org/abs/1708.07656.
|
|
|
Joan Carles Pons,
Charles Semple and
Mike Steel. Tree-based networks: characterisations, metrics, and support trees. In JOMB, Vol. 78(4):899-918, 2019. Keywords: characterization, explicit network, from network, phylogenetic network, phylogeny, time consistent network, tree-based network. Note: https://arxiv.org/abs/1710.07836.
|
|
|
Claudia Solís-Lemus,
Paul Bastide and
Cécile Ané. PhyloNetworks: A Package for Phylogenetic Networks. In MBE, Vol. 34(12):3292-3298, 2017. Keywords: from sequences, from trees, likelihood, phylogenetic network, phylogeny, Program PhyloNetworks SNaQ, reconstruction, software. Note: https://doi.org/10.1093/molbev/msx235.
|
|
|
Sarah Bastkowski,
Daniel Mapleson,
Andreas Spillner,
Taoyang Wu,
Monika Balvociute and
Vincent Moulton. SPECTRE: a Suite of PhylogEnetiC Tools for Reticulate Evolution. In BIO, Vol. 34(6):1057-1058, 2018. Keywords: abstract network, NeighborNet, phylogenetic network, phylogeny, Program FlatNJ, Program QNet, Program SplitsTree, reconstruction, software, split network. Note: https://doi.org/10.1101/169177.
|
|
|
Paul Bastide,
Claudia Solís-Lemus,
Ricardo Kriebel,
Kenneth William Sparks and
Cécile Ané. Phylogenetic Comparative Methods on Phylogenetic Networks with Reticulations. In SB, Vol. 67(5):800-820, 2018. Keywords: ancestral trait reconstruction, from network, likelihood, Program PhyloNetworks SNaQ, software, statistical model, statistical test. Note: https://doi.org/10.1101/194050.
|
|
|
Katharina Huber,
Vincent Moulton,
Charles Semple and
Taoyang Wu. Quarnet inference rules for level-1 networks. In BMB, Vol. 80:2137-2153, 2018. Keywords: explicit network, from quarnets, from subnetworks, galled tree, level k phylogenetic network, phylogenetic network, phylogeny, reconstruction. Note: https://arxiv.org/abs/1711.06720.
|
|
|
Klaus Schliep,
Alastair J. Potts,
David A. Morrison and
Guido W. Grimm. Intertwining phylogenetic trees and networks. In Methods in Ecology and Evolution, Vol. 8(10):1212-1220, 2017. Keywords: abstract network, from network, from unrooted trees, phylogenetic network, phylogeny, split network, visualization. Note: http://dx.doi.org/10.1111/2041-210X.12760.
|
|
|
Janosch Döcker and
Simone Linz. On the existence of a cherry-picking sequence. In TCS, Vol. 714:36-50, 2018. Keywords: cherry-picking, explicit network, from rooted trees, NP complete, phylogenetic network, phylogeny, reconstruction, temporal-hybridization number, time consistent network, tree-child network. Note: https://arxiv.org/abs/1712.04127.
|
|
|
Leo van Iersel,
Mark Jones and
Celine Scornavacca. Improved maximum parsimony models for phylogenetic networks. In SB, Vol. 67(3):518-542, 2018. Keywords: explicit network, FPT, from sequences, NP complete, parsimony, phylogenetic network, phylogeny, reconstruction, weakly displaying. Note: https://leovaniersel.files.wordpress.com/2017/12/improved_parsimony_networks.pdf.
|
|
|
Janosch Döcker,
Leo van Iersel,
Steven Kelk and
Simone Linz. Deciding the existence of a cherry-picking sequence is hard on two trees. In DAM, Vol. 260:131-143, 2019. Keywords: cherry-picking, explicit network, hybridization, minimum number, NP complete, phylogenetic network, phylogeny, reconstruction, temporal-hybridization number, time consistent network, tree-child network. Note: https://arxiv.org/abs/1712.02965.
|
|
|
Magnus Bordewich,
Katharina Huber,
Vincent Moulton and
Charles Semple. Recovering normal networks from shortest inter-taxa distance information. In JOMB, Vol. 77(3):571-594, 2018. Keywords: explicit network, from distances, normal network, phylogenetic network, phylogeny, polynomial, reconstruction, uniqueness. Note: http://www.math.canterbury.ac.nz/~c.semple/papers/BHMS18.pdf.
|
|
|
Chi Zhang,
Huw A. Ogilvie,
Alexei J. Drummond and
Tanja Stadler. Bayesian Inference of Species Networks from Multilocus Sequence Data. In MBE, Vol. 35(2):504-517, 2018. Keywords: bayesian, explicit network, from sequences, phylogenetic network, phylogeny, reconstruction, statistical model. Note: https://dx.doi.org/10.1093/molbev/msx307.
|
|
|
Edwin Jacox,
Cédric Chauve,
Gergely J. Szöllösi,
Yann Ponty and
Celine Scornavacca. EcceTERA: comprehensive gene tree-species tree reconciliation using parsimony. In BIO, Vol. 32(13):2056-2058, 2016. Keywords: duplication, explicit network, from rooted trees, from species tree, lateral gene transfer, loss, parsimony, phylogenetic network, phylogeny, polynomial, Program ecceTERA. Note: https://doi.org/10.1093/bioinformatics/btw105.
|
|
|
Edwin Jacox,
Mathias Weller,
Eric Tannier and
Celine Scornavacca. Resolution and reconciliation of non-binary gene trees with transfers, duplications and losses. In BIO, Vol. 33(7):980-987, 2017. Keywords: duplication, explicit network, FPT, from rooted trees, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, reconstruction. Note: http://dx.doi.org/10.1093/bioinformatics/btw778.
|
|
|
Andreas Gunawan,
Hongwei Yan and
Louxin Zhang. Compression of Phylogenetic Networks and Algorithm for the Tree Containment Problem. In JCB, Vol. 25(3), 2019. Keywords: explicit network, phylogenetic network, phylogeny, polynomial, quasi-reticulation-visible network, reticulation-visible network, tree containment, tree-child network. Note: https://arxiv.org/abs/1806.07625.
|
|
|
Dingqiao Wen,
Yun Yu,
Jiafan Zhu and
Luay Nakhleh. Inferring Phylogenetic Networks Using PhyloNet. In SB, Vol. 67(4):735-740, 2018. Keywords: bayesian, likelihood, parsimony, phylogenetic network, phylogeny, Program PhyloNet, reconstruction, software.
|
|
|
Jiafan Zhu,
Dingqiao Wen,
Yun Yu,
Heidi M. Meudt and
Luay Nakhleh. Bayesian inference of phylogenetic networks from bi-allelic genetic markers. In PLoS Computational Biology, Vol. 14(1):e1005932.1-32, 2018. Keywords: bayesian, explicit network, from multistate characters, phylogenetic network, phylogeny, Program PhyloNet. Note: https://doi.org/10.1371/journal.pcbi.1005932.
|
|
|
Gabriel Cardona and
Louxin Zhang. Counting and Enumerating Tree-Child Networks and Their Subclasses. In JCSS, Vol. 114:84-104, 2020. Keywords: counting, enumeration, explicit network, galled network, galled tree, normal network, phylogenetic network, phylogeny, tree-child network.
|
|
|
Yukihiro Murakami,
Leo van Iersel,
Remie Janssen,
Mark Jones and
Vincent Moulton. Reconstructing Tree-Child Networks from Reticulate-Edge-Deleted Subnetworks. In BMB, Vol. 81:3823-3863, 2019. Keywords: from subnetworks, level k phylogenetic network, phylogenetic network, phylogeny, reconstruction, tree-child network, uniqueness, valid network. Note: https://doi.org/10.1007/s11538-019-00641-w.
|
Juan Wang and
Maozu Guo. A review of metrics measuring dissimilarity for rooted phylogenetic networks. In Briefings in Bioinformatics, Vol. 20(6):1972-1980, 2019. Keywords: distance between networks, explicit network, from network, mu distance, phylogenetic network, phylogeny, survey, tree sibling network, tree-child network.
|
Leo van Iersel,
Remie Janssen,
Mark Jones,
Yukihiro Murakami and
Norbert Zeh. Polynomial-Time Algorithms for Phylogenetic Inference Problems Involving Duplication and Reticulation. In TCBB, Vol. 17(1):14-26, 2020. Keywords: hybridization, minimum number, parental hybridization, phylogenetic network, phylogeny, reconstruction, weakly displaying. Note: http://pure.tudelft.nl/ws/portalfiles/portal/71270795/08798653.pdf.
|
Hadi Poormohammadi,
Mohsen Sardari Zarchi and
Hossein Ghaneai. NCHB: A method for constructing rooted phylogenetic networks from rooted triplets based on height function and binarization. In JTB, Vol. 489(110144), 2020. Keywords: explicit network, from triplets, heuristic, phylogenetic network, phylogeny, Program Simplistic, Program TripNet, reconstruction. Note: https://doi.org/10.1016/j.jtbi.2019.110144.
|
Gabriel Cardona,
Joan Carles Pons and
Celine Scornavacca. Generation of Binary Tree-Child phylogenetic networks. In PLoS Computational Biology, Vol. 15(10):e1007440.1-29, 2019. Keywords: enumeration, explicit network, generation, phylogenetic network, phylogeny, Program PhyloNetwork, Program TCGenerators, software, tree-child network. Note: https://doi.org/10.1371/journal.pcbi.1007440.
|
Hans-Jürgen Bandelt. Phylogenetic Networks. In Verhandlungen des Naturwissenschaftlichen Vereins Hamburg, Vol. 34:51-71, 1994.
|
Sergey Bereg and
Kathryn Bean. Constructing Phylogenetic Networks from Trees. In BIBE05, Pages 299-305, 2005. Keywords: evaluation, from distances, phylogenetic network, phylogeny, Program SplitsTree, Program T REX, reconstruction, split, split network. Note: http://dx.doi.org/10.1109/BIBE.2005.19.
We present a new method of constructing a phylogenetic network from a given phylogenetic tree. It is based on a procedure that locally improves the tree. The procedure is quite general and can be applied to phylogenetic networks. By repeating local improvements user can introduce a given number of recombination cycles. A sequence of networks with decreasing distance deviation can be generated. The algorithm is efficient and shows a good performance on an example with plants. This is due to the fact that the update in every step is local and optimal. © 2005 IEEE.
|
Sergey Bereg and
Yuanyi Zhang. Phylogenetic Networks Based on the Molecular Clock Hypothesis. In BIBE05, Pages 320-323, 2005. Note: http://dx.doi.org/10.1109/BIBE.2005.46.
A classical result in phylogenetic trees is that a binary phylogenetic tree adhering to the molecular clock hypothesis exists if and only if the matrix of distances between taxa is ultrametric. The ultrametric condition is very restrictive. In this paper we study phylogenetic networks that can be constructed assuming the molecular clock hypothesis. We characterize distance matrices that admit such networks for 3 and 4 taxa. We design an efficient algorithm for a special class of phylogenetic networks that can detect the existence of a network and constructs it. © 2005 IEEE.
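The ultrametric condition mentioned in this abstract can be checked directly from a distance matrix. The following is a minimal sketch, not code from the paper: the function name `is_ultrametric`, the tolerance `tol`, and the two example matrices are illustrative assumptions. It uses the standard three-point condition, under which the two largest of the three pairwise distances among any three taxa must be equal.

```python
from itertools import combinations


def is_ultrametric(d, tol=1e-9):
    """Check the three-point condition on a symmetric distance matrix d.

    d is a list of lists; tol is an assumed tolerance for float comparisons.
    """
    n = len(d)
    for i, j, k in combinations(range(n), 3):
        # Sort the three pairwise distances among taxa i, j, k.
        _, second, largest = sorted((d[i][j], d[i][k], d[j][k]))
        # Ultrametric: the two largest distances must coincide.
        if largest - second > tol:
            return False
    return True


# Hypothetical 3-taxon examples (not taken from the paper).
clock_like = [[0, 2, 6], [2, 0, 6], [6, 6, 0]]
not_clock_like = [[0, 2, 6], [2, 0, 5], [6, 5, 0]]
```

A matrix that fails this check admits no clock-like tree, which is the case the networks studied in the paper are meant to address.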
|
David Bryant and
Vincent Moulton. Neighbor-Net: An Agglomerative Method for the Construction of Planar Phylogenetic Networks. In WABI02, Vol. 2452:375-391 of LNCS, springer, 2002. Keywords: abstract network, circular split system, from distances, NeighborNet, phylogenetic network, phylogeny, Program SplitsTree, reconstruction, split network. Note: http://dx.doi.org/10.1007/3-540-45784-4_28.
|
Charles Choy,
Jesper Jansson,
Kunihiko Sadakane and
Wing-Kin Sung. Computing the maximum agreement of phylogenetic networks. In Proceedings of Computing: the Tenth Australasian Theory Symposium (CATS'04), Vol. 91:134-147 of Electronic Notes in Theoretical Computer Science, 2004. Keywords: dynamic programming, FPT, level k phylogenetic network, MASN, NP complete, phylogenetic network, phylogeny. Note: http://www.df.lth.se/~jj/Publications/masn6_CATS2004.pdf.
"We introduce the maximum agreement phylogenetic subnetwork problem (MASN) of finding a branching structure shared by a set of phylogenetic networks. We prove that the problem is NP-hard even if restricted to three phylogenetic networks and give an O(n^2)-time algorithm for the special case of two level-1 phylogenetic networks, where n is the number of leaves in the input networks and where N is called a level-f phylogenetic network if every biconnected component in the underlying undirected graph contains at most f nodes having indegree 2 in N. Our algorithm can be extended to yield a polynomial-time algorithm for two level-f phylogenetic networks N_1, N_2 for any f which is upper-bounded by a constant; more precisely, its running time is O(|V(N_1)|·|V(N_2)|·4^f), where V(N_i) denotes the set of nodes of N_i. © 2004 Published by Elsevier B.V."
|
Elena Dubrova. Phylogenetic networks with edge-disjoint recombination cycles. In Proceedings of SPIE Bioengineered and Bioinspired Systems II (SPIE-BBS II), Vol. 5839:381-388, 2005. Keywords: galled tree, phylogenetic network, polynomial, site consistency. Note: http://dx.doi.org/10.1117/12.607910.
"Phylogenetic analysis is a branch of biology that establishes the evolutionary relationships between living organisms. The goal of phylogenetic analysis is to determine the order and approximate timing of speciation events in the evolution of a given set of species. Phylogenetic networks allow to represent evolutionary histories that include events like recombination and hybridization. In this paper, we introduce a class of phylogenetic networks called extended galled-trees in which recombination cycles share no edge. We show that the site consistency problem, which is NP-hard in general, can be solved in polynomial time for this class of phylogenetic networks."
|
Trinh N. D. Huynh,
Jesper Jansson,
Nguyen Bao Nguyen and
Wing-Kin Sung. Constructing a Smallest Refining Galled Phylogenetic Network. In RECOMB05, Vol. 3500:265-280 of LNCS, springer, 2005. Keywords: from rooted trees, galled tree, NP complete, phylogenetic network, phylogeny, polynomial, Program SPNet, reconstruction. Note: http://www.df.lth.se/~jj/Publications/refining_gn3_RECOMB2005.pdf.
|
Daniel H. Huson,
Tobias Kloepper,
Peter J. Lockhart and
Mike Steel. Reconstruction of Reticulate Networks from Gene Trees. In RECOMB05, Vol. 3500:233-249 of LNCS, springer, 2005. Keywords: from rooted trees, from splits, phylogenetic network, phylogeny, reconstruction, split, split network, visualization. Note: http://dx.doi.org/10.1007/11415770_18.
|
Daniel H. Huson and
Tobias Kloepper. Computing recombination networks from binary sequences. In ECCB05, Vol. 21(suppl. 2):ii159-ii165 of BIO, 2005. Keywords: from sequences, phylogenetic network, phylogeny, recombination. Note: http://dx.doi.org/10.1093/bioinformatics/bti1126.
"Motivation: Phylogenetic networks are becoming an important tool in molecular evolution, as the evolutionary role of reticulate events, such as hybridization, horizontal gene transfer and recombination, is becoming more evident, and as the available data is dramatically increasing in quantity and quality. Results: This paper addresses the problem of computing a most parsimonious recombination network for an alignment of binary sequences that are assumed to have arisen under the 'infinite sites' model of evolution with recombinations. Using the concept of a splits network as the underlying datastructure, this paper shows how a recent method designed for the computation of hybridization networks can be extended to also compute recombination networks. A robust implementation of the approach is provided and is illustrated using a number of real biological datasets. © The Author 2005. Published by Oxford University Press. All rights reserved."
|
Daniel H. Huson and
Tobias Kloepper. Beyond Galled Trees - Decomposition and Computation of Galled Networks. In RECOMB07, Vol. 4453:211-225 of LNCS, springer, 2007. Keywords: FPT, from splits, from trees, galled network, phylogenetic network, phylogeny, Program SplitsTree, reconstruction. Note: http://dx.doi.org/10.1007/978-3-540-71681-5_15, errata.
|
Leo van Iersel,
Judith Keijsper,
Steven Kelk,
Leen Stougie,
Ferry Hagen and
Teun Boekhout. Constructing level-2 phylogenetic networks from triplets. In RECOMB08, Vol. 4955:450-462 of LNCS, springer, 2008. Keywords: explicit network, from triplets, level k phylogenetic network, NP complete, phylogenetic network, phylogeny, polynomial, Program Level2, reconstruction. Note: http://homepages.cwi.nl/~iersel/level2full.pdf. An appendix with proofs can be found here http://arxiv.org/abs/0707.2890.
"Jansson and Sung showed that, given a dense set of input triplets T (representing hypotheses about the local evolutionary relationships of triplets of taxa), it is possible to determine in polynomial time whether there exists a level-1 network consistent with T, and if so, to construct such a network [24]. Here, we extend this work by showing that this problem is even polynomial time solvable for the construction of level-2 networks. This shows that, assuming density, it is tractable to construct plausible evolutionary histories from input triplets even when such histories are heavily nontree-like. This further strengthens the case for the use of triplet-based methods in the construction of phylogenetic networks. We also implemented the algorithm and applied it to yeast data. © 2009 IEEE."
|
Jesper Jansson and
Wing-Kin Sung. Inferring a level-1 phylogenetic network from a dense set of rooted triplets. In COCOON04, Vol. 3106:462-471 of LNCS, springer, 2004. Keywords: explicit network, from triplets, galled tree, level k phylogenetic network, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.df.lth.se/~jj/Publications/ipnrt6_COCOON2004.pdf.
|
Jesper Jansson,
Nguyen Bao Nguyen and
Wing-Kin Sung. Algorithms for Combining Rooted Triplets into a Galled Phylogenetic Network. In SODA05, Pages 349-358, 2005. Keywords: approximation, explicit network, from triplets, galled tree, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://portal.acm.org/citation.cfm?id=1070481.
|
Guohua Jin,
Luay Nakhleh,
Sagi Snir and
Tamir Tuller. A New Linear-time Heuristic Algorithm for Computing the Parsimony Score of Phylogenetic Networks: Theoretical Bounds and Empirical Performance. In ISBRA07, Vol. 4463:61-72 of LNCS, springer, 2007. Keywords: approximation, heuristic, parsimony, phylogenetic network, phylogeny, Program Nepal. Note: http://www.cs.rice.edu/~nakhleh/Papers/isbra07.pdf.
|
Rune Lyngsø,
Yun S. Song and
Jotun Hein. Minimum Recombination Histories by Branch and Bound. In WABI05, Vol. 3692:239-250 of LNCS, springer, 2005. Keywords: ARG, branch and bound, from sequences, minimum number, Program Beagle, recombination, reconstruction, software. Note: http://www.cs.ucdavis.edu/~yssong/Pub/WABI05-239.pdf.
|
Luay Nakhleh and
Li-San Wang. Phylogenetic Networks, Trees, and Clusters. In IWBRA05, Vol. 3515:919-926 of LNCS, springer, 2005. Keywords: cluster containment, evaluation, from clusters, from network, from rooted trees, phylogenetic network, phylogeny, polynomial, tree containment, tree-child network. Note: http://www.cs.rice.edu/~nakhleh/Papers/NakhlehWang.pdf.
|
Luay Nakhleh and
Li-San Wang. Phylogenetic Networks: Properties and Relationship to Trees and Clusters. In TCSB2, Vol. 3680:82-99 of LNCS, springer, 2005. Keywords: cluster containment, evaluation, from clusters, from network, from rooted trees, phylogenetic network, phylogeny, polynomial, tree containment, tree-child network. Note: http://www.cs.rice.edu/~nakhleh/Papers/LNCS_TCSB05.pdf.
|
Luay Nakhleh,
Jerry Sun,
Tandy Warnow,
C. Randal Linder,
Bernard M. E. Moret and
Anna Tholse. Towards the Development of Computational Tools for Evaluating Phylogenetic Network Reconstruction Methods. In PSB03, 2003. Keywords: distance between networks, evaluation, phylogenetic network, phylogeny, polynomial, tripartition distance. Note: http://www.cs.rice.edu/~nakhleh/Papers/psb03.pdf.
|
Luay Nakhleh,
Tandy Warnow and
C. Randal Linder. Reconstructing reticulate evolution in species - theory and practice. In RECOMB04, Pages 337-346, 2004. Keywords: from rooted trees, galled tree, phylogenetic network, phylogeny, polynomial, Program SPNet, reconstruction, software. Note: http://www.cs.rice.edu/~nakhleh/Papers/144-nakhleh.pdf.
|
Yun S. Song,
Yufeng Wu and
Dan Gusfield. Efficient computation of close lower and upper bounds on the minimum number of recombinations in biological sequence evolution. In ISMB05, Vol. 21:i413-i422 of BIO, 2005. Keywords: integer linear programming, minimum number, Program HapBound, Program SHRUB, recombination. Note: http://dx.doi.org/10.1093/bioinformatics/bti1033.
"Motivation: We are interested in studying the evolution of DNA single nucleotide polymorphism sequences which have undergone (meiotic) recombination. For a given set of sequences, computing the minimum number of recombinations needed to explain the sequences (with one mutation per site) is a standard question of interest, but it has been shown to be NP-hard, and previous algorithms that compute it exactly work either only on very small datasets or on problems with special structure. Results: In this paper, we present efficient, practical methods for computing both upper and lower bounds on the minimum number of needed recombinations, and for constructing evolutionary histories that explain the input sequences. We study in detail the efficiency and accuracy of these algorithms on both simulated and real data sets. The algorithms produce very close upper and lower bounds, which match exactly in a surprisingly wide range of data. Thus, with the use of new, very effective lower bounding methods and an efficient algorithm for computing upper bounds, this approach allows the efficient, exact computation of the minimum number of needed recombinations, with high frequency in a large range of data. When upper and lower bounds match, evolutionary histories found by our algorithm correspond to the most parsimonious histories. © The Author 2005. Published by Oxford University Press. All rights reserved."
|
Cuong Than,
Derek Ruths,
Hideki Innan and
Luay Nakhleh. Identifiability Issues in Phylogeny-Based Detection of Horizontal Gene Transfer. In Proceedings of the Fourth RECOMB Comparative Genomics Satellite Workshop (RECOMB-CG'06), Vol. 4205:215-229 of LNCS, springer, 2006. Keywords: explicit network, from rooted trees, from species tree, lateral gene transfer, phylogenetic network, phylogeny, Program LatTrans, Program PhyloNet. Note: http://www.cs.rice.edu/~nakhleh/Papers/recombcg06-final.pdf.
|
Lusheng Wang,
Kaizhong Zhang and
Louxin Zhang. Perfect phylogenetic networks with recombination. In SAC01, Pages 46-50, 2001. Keywords: from sequences, galled tree, NP complete, perfect, phylogenetic network, phylogeny, polynomial, recombination, reconstruction. Note: http://dx.doi.org/10.1145/372202.372271.
|
Bhaskar DasGupta,
Sergio Ferrarini,
Uthra Gopalakrishnan and
Nisha Raj Paryani. Inapproximability results for the lateral gene transfer problem. In Proceedings of the Ninth Italian Conference on Theoretical Computer Science (ICTCS'05), Pages 182-195, springer, 2005. Keywords: approximation, from rooted trees, from species tree, inapproximability, lateral gene transfer, parsimony, phylogenetic network, phylogeny. Note: http://www.cs.uic.edu/~dasgupta/resume/publ/papers/ictcs-final.pdf.
|
Mike Hallett,
Jens Lagergren and
Ali Tofigh. Simultaneous Identification of Duplications and Lateral Transfers. In RECOMB04, Pages 347-356, 2004. Keywords: duplication, explicit network, FPT, from rooted trees, from species tree, lateral gene transfer, loss, NP complete, parsimony, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.nada.kth.se/~jensl/p164-hallett.pdf.
|
Rune Lyngsø,
Yun S. Song and
Jotun Hein. Accurate Computation of Likelihoods in the Coalescent with Recombination via Parsimony. In RECOMB08, Vol. 4955:463-477 of LNCS, springer, 2008. Keywords: coalescent, likelihood, phylogenetic network, phylogeny, recombination, statistical model. Note: http://dx.doi.org/10.1007/978-3-540-78839-3_41.
"Understanding the variation of recombination rates across a given genome is crucial for disease gene mapping and for detecting signatures of selection, to name just a couple of applications. A widely-used method of estimating recombination rates is the maximum likelihood approach, and the problem of accurately computing likelihoods in the coalescent with recombination has received much attention in the past. A variety of sampling and approximation methods have been proposed, but no single method seems to perform consistently better than the rest, and there still is great value in developing better statistical methods for accurately computing likelihoods. So far, with the exception of some two-locus models, it has remained unknown how the true likelihood exactly behaves as a function of model parameters, or how close estimated likelihoods are to the true likelihood. In this paper, we develop a deterministic, parsimony-based method of accurately computing the likelihood for multi-locus input data of moderate size. We first find the set of all ancestral configurations (ACs) that occur in evolutionary histories with at most k crossover recombinations. Then, we compute the likelihood by summing over all evolutionary histories that can be constructed only using the ACs in that set. We allow for an arbitrary number of crossing over, coalescent and mutation events in a history, as long as the transitions stay within that restricted set of ACs. For given parameter values, by gradually increasing the bound k until the likelihood stabilizes, we can obtain an accurate estimate of the likelihood. At least for moderate crossover rates, the algorithm-based method described here opens up a new window of opportunities for testing and fine-tuning statistical methods for computing likelihoods. © 2008 Springer-Verlag Berlin Heidelberg."
|
Leo van Iersel and
Steven Kelk. Constructing the Simplest Possible Phylogenetic Network from Triplets. In ISAAC08, Vol. 5369:472-483 of LNCS, springer, 2008. Keywords: explicit network, from triplets, galled tree, level k phylogenetic network, minimum number, phylogenetic network, phylogeny, polynomial, Program Marlon, Program Simplistic. Note: http://arxiv.org/abs/0805.1859.
|
Cuong Than and
Luay Nakhleh. SPR-based Tree Reconciliation: Non-binary Trees and Multiple Solutions. In APBC08, Pages 251-260, 2008. Keywords: evaluation, from rooted trees, lateral gene transfer, phylogenetic network, phylogeny, Program LatTrans, Program PhyloNet, reconstruction, SPR distance. Note: http://www.cs.rice.edu/~nakhleh/Papers/apbc08.pdf.
|
Daniel H. Huson and
Regula Rupp. Summarizing Multiple Gene Trees Using Cluster Networks. In WABI08, Vol. 5251:296-305 of LNCS, springer, 2008. Keywords: abstract network, from clusters, from rooted trees, phylogenetic network, phylogeny, polynomial, Program Dendroscope. Note: http://dx.doi.org/10.1007/978-3-540-87361-7_25, slides from the MIEP Conference available at http://www.lirmm.fr/MIEP08/slides/11_13_rupp.pdf.
"The result of a multiple gene tree analysis is usually a number of different tree topologies that are each supported by a significant proportion of the genes. We introduce the concept of a cluster network that can be used to combine such trees into a single rooted network, which can be drawn either as a cladogram or phylogram. In contrast to split networks, which can grow exponentially in the size of the input, cluster networks grow only quadratically. A cluster network is easily computed using a modification of the tree-popping algorithm, which we call network-popping. The approach has been implemented as part of the Dendroscope tree-drawing program and its application is illustrated using data and results from three recent studies on large numbers of gene trees. © 2008 Springer-Verlag Berlin Heidelberg."
|
Lichen Bao and
Sergey Bereg. Clustered SplitsNetworks. In COCOA08, Vol. 5165:469-478 of LNCS, springer, 2008. Keywords: abstract network, from distances, NeighborNet, realization, reconstruction. Note: http://dx.doi.org/10.1007/978-3-540-85097-7_44, slides available at http://www.utdallas.edu/~besp/cocoa08talk.pdf.
"We address the problem of constructing phylogenetic networks using two criteria: the number of cycles and the fit value of the network. Traditionally the fit value is the main objective for evaluating phylogenetic networks. However, a small number of cycles in a network is desired and pointed out in several publications. We propose a new phylogenetic network called CS-network and a method for constructing it. The method is based on the well-known splitstree method. A CS-network contains a face which is k-cycle, k ≥ 3 (not as splitstree). We discuss difficulties of using non-parallelogram faces in splitstree networks. Our method involves clustering and optimization of weights of the network edges. The algorithm for constructing the underlying graph (except the optimization step) has a polynomial time. Experimental results show a good performance of our algorithm. © Springer-Verlag Berlin Heidelberg 2008."
|
Sagi Snir and
Tamir Tuller. Novel Phylogenetic Network Inference by Combining Maximum Likelihood and Hidden Markov Models. In WABI08, Vol. 5251:354-368 of LNCS, springer, 2008. Keywords: explicit network, from sequences, HMM, lateral gene transfer, likelihood, phylogenetic network, phylogeny, statistical model. Note: http://dx.doi.org/10.1007/978-3-540-87361-7_30.
"Horizontal Gene Transfer (HGT) is the event of transferring genetic material from one lineage in the evolutionary tree to a different lineage. HGT plays a major role in bacterial genome diversification and is a significant mechanism by which bacteria develop resistance to antibiotics. Although the prevailing assumption is of complete HGT, cases of partial HGT (which are also named chimeric HGT) where only part of a gene is horizontally transferred, have also been reported, albeit less frequently. In this work we suggest a new probabilistic model for analyzing and modeling phylogenetic networks, the NET-HMM. This new model captures the biologically realistic assumption that neighboring sites of DNA or amino acid sequences are not independent, which increases the accuracy of the inference. The model describes the phylogenetic network as a Hidden Markov Model (HMM), where each hidden state is related to one of the network's trees. One of the advantages of the NET-HMM is its ability to infer partial HGT as well as complete HGT. We describe the properties of the NET-HMM, devise efficient algorithms for solving a set of problems related to it, and implement them in software. We also provide a novel complementary significance test for evaluating the fitness of a model (NET-HMM) to a given data set. Using NET-HMM we are able to answer interesting biological questions, such as inferring the length of partial HGT's and the affected nucleotides in the genomic sequences, as well as inferring the exact location of HGT events along the tree branches. These advantages are demonstrated through the analysis of synthetical inputs and two different biological inputs. © 2008 Springer-Verlag Berlin Heidelberg."
|
Stefan Grünewald,
Andreas Spillner,
Kristoffer Forslund and
Vincent Moulton. Constructing Phylogenetic Supernetworks from Quartets. In WABI08, Vol. 5251:284-295 of LNCS, springer, 2008. Keywords: abstract network, from quartets, from unrooted trees, phylogenetic network, phylogeny, Program QNet, Program SplitsTree, reconstruction, split network. Note: http://dx.doi.org/10.1007/978-3-540-87361-7_24.
"In phylogenetics it is common practice to summarize collections of partial phylogenetic trees in the form of supertrees. Recently it has been proposed to construct phylogenetic supernetworks as an alternative to supertrees as these allow the representation of conflicting information in the trees, information that may not be representable in a single tree. Here we introduce SuperQ, a new method for constructing such supernetworks. It works by breaking the input trees into quartet trees, and stitching together the resulting set to form a network. The stitching process is performed using an adaptation of the QNet method for phylogenetic network reconstruction. In addition to presenting the new method, we illustrate the applicability of SuperQ to three data sets and discuss future directions for testing and development. © 2008 Springer-Verlag Berlin Heidelberg."
|
Gabriel Cardona,
Mercè Llabrés,
Francesc Rosselló and
Gabriel Valiente. Phylogenetic Networks: Justification, Models, Distances and Algorithms. In VI Jornadas de Matemática Discreta y Algorítmica (JMDA'08), 2008. Keywords: distance between networks, mu distance, phylogenetic network, phylogeny, polynomial, survey, time consistent network, tree-child network, tripartition distance, triplet distance. Note: http://bioinfo.uib.es/media/uploaded/jmda2008_submission_61-1.pdf.
|
Ernst Althaus and
Rouven Naujoks. Reconstructing Phylogenetic Networks with One Recombination. In Proceedings of the seventh International Workshop on Experimental Algorithms (WEA'08), Vol. 5038:275-288 of LNCS, springer, 2008. Keywords: enumeration, explicit network, exponential algorithm, from sequences, generation, parsimony, phylogenetic network, phylogeny, reconstruction, unicyclic network. Note: http://dx.doi.org/10.1007/978-3-540-68552-4_21.
"In this paper we propose a new method for reconstructing phylogenetic networks under the assumption that recombination events have occurred rarely. For a fixed number of recombinations, we give a generalization of the maximum parsimony criterion. Furthermore, we describe an exact algorithm for one recombination event and show that in this case our method is not only able to identify the recombined sequence but also to reliably reconstruct the complete evolutionary history. © 2008 Springer-Verlag Berlin Heidelberg."
|
Cuong Than,
Guohua Jin and
Luay Nakhleh. Integrating Sequence and Topology for Efficient and Accurate Detection of Horizontal Gene Transfer. In Proceedings of the Sixth RECOMB Comparative Genomics Satellite Workshop (RECOMB-CG'08), Vol. 5267:113-127 of LNCS, springer, 2008. Keywords: bootstrap, explicit network, from rooted trees, from sequences, lateral gene transfer, phylogenetic network, phylogeny, Program Nepal, Program PhyloNet, reconstruction. Note: http://www.cs.rice.edu/~nakhleh/Papers/recombcg-08.pdf, slides available at http://igm.univ-mlv.fr/RCG08/RCG08slides/Cuong_Than_RCG08.pdf.
|
Pawel Górecki. Reconciliation problems for duplication, loss and horizontal gene transfer. In RECOMB04, Pages 316-325, 2004. Keywords: duplication, explicit network, from rooted trees, from species tree, lateral gene transfer, loss, NP complete, parsimony, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://ai.stanford.edu/~serafim/CS374_2004/Papers/Gorecki_Reconciliation.pdf.
|
Pawel Górecki. Single step reconciliation algorithm for duplication, loss and horizontal gene transfer model. In ECCB03, 2003. Keywords: duplication, explicit network, from rooted trees, from species tree, lateral gene transfer, NP complete, parsimony, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.inra.fr/eccb2003/posters/pdf/short/S_gorecki.ps.
|
Bin Ma,
Lusheng Wang and
Ming Li. Fixed topology alignment with recombination. In CPM98, Vol. 1448:174-188 of LNCS, springer, 1998. Keywords: approximation, explicit network, from network, from sequences, galled tree, inapproximability, phylogenetic network, phylogeny, recombination. Note: http://dx.doi.org/10.1007/BFb0030789.
|
Thu-Hien To and
Michel Habib. Level-k Phylogenetic Networks Are Constructable from a Dense Triplet Set in Polynomial Time. In CPM09, Vol. 5577:275-288 of LNCS, springer, 2009. Keywords: explicit network, from triplets, level k phylogenetic network, minimum number, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://arxiv.org/abs/0901.1657.
"For a given dense triplet set Τ there exist two natural questions [7]: Does there exist any phylogenetic network consistent with Τ? In case such networks exist, can we find an effective algorithm to construct one? For cases of networks of levels k = 0, 1 or 2, these questions were answered in [1,6,7,8,10] with effective polynomial algorithms. For higher levels k, partial answers were recently obtained in [11] with an O(|Τ|^(k+1))-time algorithm for simple networks. In this paper, we give a complete answer to the general case, solving a problem proposed in [7]. The main idea of our proof is to use a special property of SN-sets in a level-k network. As a consequence, for any fixed k, we can also find a level-k network with the minimum number of reticulations, if one exists, in polynomial time. © 2009 Springer Berlin Heidelberg."
|
Philippe Gambette,
Vincent Berry and
Christophe Paul. The structure of level-k phylogenetic networks. In CPM09, Vol. 5577:289-300 of LNCS, springer, 2009. Keywords: coalescent, explicit network, galled tree, level k phylogenetic network, phylogenetic network, Program Recodon. Note: http://hal-lirmm.ccsd.cnrs.fr/lirmm-00371485/en/.
"Evolution is usually described as a phylogenetic tree, but due to some exchange of genetic material, it can be represented as a phylogenetic network which has an underlying tree structure. The notion of level was recently introduced as a parameter on realistic kinds of phylogenetic networks to express their complexity and tree-likeness. We study the structure of level-k networks, and how they can be decomposed into level-k generators. We also provide a polynomial time algorithm which takes as input the set of level-k generators and builds the set of level-(k + 1) generators. Finally, with a simulation study, we evaluate the proportion of level-k phylogenetic networks among networks generated according to the coalescent model with recombination. © 2009 Springer Berlin Heidelberg."
|
Daniel H. Huson,
Regula Rupp,
Vincent Berry,
Philippe Gambette and
Christophe Paul. Computing Galled Networks from Real Data. In ISMBECCB09, Vol. 25(12):i85-i93 of BIO, 2009. Keywords: abstract network, cluster containment, explicit network, FPT, from clusters, from rooted trees, galled network, NP complete, phylogenetic network, phylogeny, polynomial, Program Dendroscope, reconstruction. Note: http://hal-lirmm.ccsd.cnrs.fr/lirmm-00368545/en/.
"Motivation: Developing methods for computing phylogenetic networks from biological data is an important problem posed by molecular evolution and much work is currently being undertaken in this area. Although promising approaches exist, there are no tools available that biologists could easily and routinely use to compute rooted phylogenetic networks on real datasets containing tens or hundreds of taxa. Biologists are interested in clades, i.e. groups of monophyletic taxa, and these are usually represented by clusters in a rooted phylogenetic tree. The problem of computing an optimal rooted phylogenetic network from a set of clusters, is hard, in general. Indeed, even the problem of just determining whether a given network contains a given cluster is hard. Hence, some researchers have focused on topologically restricted classes of networks, such as galled trees and level-k networks, that are more tractable, but have the practical draw-back that a given set of clusters will usually not possess such a representation. Results: In this article, we argue that galled networks (a generalization of galled trees) provide a good trade-off between level of generality and tractability. Any set of clusters can be represented by some galled network and the question whether a cluster is contained in such a network is easy to solve. Although the computation of an optimal galled network involves successively solving instances of two different NP-complete problems, in practice our algorithm solves this problem exactly on large datasets containing hundreds of taxa and many reticulations in seconds, as illustrated by a dataset containing 279 prokaryotes. © 2009 The Author(s)."
|
|
|
Lichen Bao and
Sergey Bereg. Counting Faces in Split Networks. In ISBRA09, Vol. 5251:284-295 of LNCS, 2009. Note: http://dx.doi.org/10.1007/978-3-642-01551-9_12.
SplitsTree is a popular program for inferring and visualizing various phylogenetic networks including split networks. Split networks are useful for realizing metrics that are linear combinations of split metrics. We show that the realization is not unique in some cases and design an algorithm for computing split networks with minimum number of faces. We also prove that the minimum number of faces in a split network is equal to the number of pairs of incompatible splits.
|
|
|
|
|
|
|
|
|
Leo van Iersel,
Steven Kelk,
Regula Rupp and
Daniel H. Huson. Phylogenetic Networks Do not Need to Be Complex: Using Fewer Reticulations to Represent Conflicting Clusters. In ISMB10, Vol. 26(12):i124-i131 of BIO, 2010. Keywords: from clusters, level k phylogenetic network, Program Dendroscope, Program HybridInterleave, Program HybridNumber, reconstruction. Note: http://dx.doi.org/10.1093/bioinformatics/btq202, with proofs: http://arxiv.org/abs/0910.3082.
"Phylogenetic trees are widely used to display estimates of how groups of species evolved. Each phylogenetic tree can be seen as a collection of clusters, subgroups of the species that evolved from a common ancestor. When phylogenetic trees are obtained for several datasets (e.g. for different genes), then their clusters are often contradicting. Consequently, the set of all clusters of such a dataset cannot be combined into a single phylogenetic tree. Phylogenetic networks are a generalization of phylogenetic trees that can be used to display more complex evolutionary histories, including reticulate events, such as hybridizations, recombinations and horizontal gene transfers. Here, we present the new CASS algorithm that can combine any set of clusters into a phylogenetic network. We show that the networks constructed by CASS are usually simpler than networks constructed by other available methods. Moreover, we show that CASS is guaranteed to produce a network with at most two reticulations per biconnected component, whenever such a network exists. We have implemented CASS and integrated it into the freely available Dendroscope software. Contact: l.j.j.v.iersel@gmail.com. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author(s) 2010. Published by Oxford University Press."
|
|
|
|
|
|
|
Tetsuo Asano,
Jesper Jansson,
Kunihiko Sadakane,
Ryuhei Uehara and
Gabriel Valiente. Faster Computation of the Robinson-Foulds Distance between Phylogenetic Networks. In CPM10, Vol. 6129:190-201 of LNCS, springer, 2010. Keywords: distance between networks, explicit network, level k phylogenetic network, phylogenetic network, polynomial, spread. Note: http://hdl.handle.net/10119/9859, slides available at http://cs.nyu.edu/parida/CPM2010/MainPage_files/18.pdf.
"The Robinson-Foulds distance, which is the most widely used metric for comparing phylogenetic trees, has recently been generalized to phylogenetic networks. Given two networks N1, N2 with n leaves, m nodes, and e edges, the Robinson-Foulds distance measures the number of clusters of descendant leaves that are not shared by N1 and N2. The fastest known algorithm for computing the Robinson-Foulds distance between those networks runs in O(m(m + e)) time. In this paper, we improve the time complexity to O(n(m + e)/log n) for general networks and O(nm/log n) for general networks with bounded degree, and to optimal O(m + e) time for planar phylogenetic networks and bounded-level phylogenetic networks. We also introduce the natural concept of the minimum spread of a phylogenetic network and show how the running time of our new algorithm depends on this parameter. As an example, we prove that the minimum spread of a level-k phylogenetic network is at most k + 1, which implies that for two level-k phylogenetic networks, our algorithm runs in O((k + 1)(m + e)) time. © Springer-Verlag Berlin Heidelberg 2010."
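To make the quantity in this abstract concrete, here is a minimal Python sketch of the cluster-based Robinson-Foulds distance. It is a naive illustration, not the algorithm of the paper: the dictionary encoding of a network and the function names `clusters` and `rf_distance` are assumptions made for this example, and the distance is taken as the size of the symmetric difference of the two cluster sets (some definitions halve this value).

```python
from typing import Dict, FrozenSet, List, Set

# Assumed encoding: each internal node maps to its list of children;
# any node without children is treated as a leaf.
Network = Dict[str, List[str]]


def clusters(net: Network) -> Set[FrozenSet[str]]:
    """Return the set of descendant-leaf clusters over all nodes of a rooted DAG."""
    memo: Dict[str, FrozenSet[str]] = {}

    def below(node: str) -> FrozenSet[str]:
        # Memoised so that reticulation nodes (several parents) are visited once.
        if node not in memo:
            kids = net.get(node, [])
            if not kids:
                memo[node] = frozenset([node])
            else:
                memo[node] = frozenset().union(*(below(k) for k in kids))
        return memo[node]

    nodes = set(net) | {k for kids in net.values() for k in kids}
    return {below(v) for v in nodes}


def rf_distance(n1: Network, n2: Network) -> int:
    """Number of clusters present in exactly one of the two networks."""
    return len(clusters(n1) ^ clusters(n2))


# Two trees on {a, b, c} and one network in which leaf b sits below a reticulation h.
tree_ab = {"r": ["x", "c"], "x": ["a", "b"]}
tree_ac = {"r": ["y", "b"], "y": ["a", "c"]}
net_b = {"r": ["u", "v"], "u": ["a", "h"], "v": ["h", "c"], "h": ["b"]}

print(rf_distance(tree_ab, tree_ac))  # clusters {a,b} and {a,c} are unshared
print(rf_distance(tree_ab, net_b))    # only cluster {b,c} is unshared
```

This direct approach recomputes every cluster explicitly, which is the kind of cost the faster algorithms described in the abstract are designed to avoid.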
|
|
|
Yufeng Wu. Close Lower and Upper Bounds for the Minimum Reticulate Network of Multiple Phylogenetic Trees. In ISMB10, Vol. 26(12):i140-i148 of BIO, 2010. Keywords: explicit network, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program PIRN, software. Note: http://dx.doi.org/10.1093/bioinformatics/btq198.
"Motivation: Reticulate network is a model for displaying and quantifying the effects of complex reticulate processes on the evolutionary history of species undergoing reticulate evolution. A central computational problem on reticulate networks is: given a set of phylogenetic trees (each for some region of the genomes), reconstruct the most parsimonious reticulate network (called the minimum reticulate network) that combines the topological information contained in the given trees. This problem is well-known to be NP-hard. Thus, existing approaches for this problem either work with only two input trees or make simplifying topological assumptions. Results: We present novel results on the minimum reticulate network problem. Unlike existing approaches, we address the fully general problem: there is no restriction on the number of trees that are input, and there is no restriction on the form of the allowed reticulate network. We present lower and upper bounds on the minimum number of reticulation events in the minimum reticulate network (and infer an approximately parsimonious reticulate network). A program called PIRN implements these methods, which also outputs a graphical representation of the inferred network. Empirical results on simulated and biological data show that our methods are practical for a wide range of data. More importantly, the lower and upper bounds match for many datasets (especially when the number of trees is small or reticulation level is low), and this allows us to solve the minimum reticulate network problem exactly for these datasets. Availability: A software tool, PIRN, is available for download from the web page: http://www.engr.uconn.edu/ywu. Contact: ywu@engr.uconn.edu. Supplementary information: Supplementary data is available at Bioinformatics online. © The Author(s) 2010. Published by Oxford University Press."
|
|
|
Yufeng Wu and
Jiayin Wang. Fast Computation of the Exact Hybridization Number of Two Phylogenetic Trees. In ISBRA10, Vol. 6053:203-214 of LNCS, springer, 2010. Keywords: agreement forest, explicit network, from rooted trees, hybridization, integer linear programming, minimum number, phylogenetic network, phylogeny, Program HybridNumber, Program SPRDist, SPR distance. Note: http://www.engr.uconn.edu/~ywu/Papers/ISBRA10WuWang.pdf.
"Hybridization is a reticulate evolutionary process. An established problem on hybridization is computing the minimum number of hybridization events, called the hybridization number, needed in the evolutionary history of two phylogenetic trees. This problem is known to be NP-hard. In this paper, we present a new practical method to compute the exact hybridization number. Our approach is based on an integer linear programming formulation. Simulation results on biological and simulated datasets show that our method (as implemented in program SPRDist) is more efficient and robust than an existing method. © 2010 Springer-Verlag Berlin Heidelberg."
|
|
|
Jean-Philippe Doyon,
Celine Scornavacca,
Konstantin Yu Gorbunov,
Gergely J. Szöllösi,
Vincent Ranwez and
Vincent Berry. An efficient algorithm for gene/species trees parsimonious reconciliation with losses, duplications, and transfers. In Proceedings of the Eighth RECOMB Comparative Genomics Satellite Workshop (RECOMB-CG'10), Vol. 6398:93-108 of LNCS, springer, 2011. Keywords: branch length, duplication, dynamic programming, explicit network, from multilabeled tree, from species tree, from unrooted trees, lateral gene transfer, loss, phylogenetic network, phylogeny, polynomial, Program Mowgli, reconstruction. Note: http://www.lirmm.fr/~vberry/Publis/MPR-DoyonEtAl.pdf, software available at http://www.atgc-montpellier.fr/MPR/.
"Tree reconciliation methods aim at estimating the evolutionary events that cause discrepancy between gene trees and species trees. We provide a discrete computational model that considers duplications, transfers and losses of genes. The model yields a fast and exact algorithm to infer time consistent and most parsimonious reconciliations. Then we study the conditions under which parsimony is able to accurately infer such events. Overall, it performs well even under realistic rates, transfers being in general less accurately recovered than duplications. An implementation is freely available at http://www.atgc-montpellier.fr/MPR. © 2010 Springer-Verlag."
|
|
|
Mukul S. Bansal,
J. Peter Gogarten and
Ron Shamir. Detecting Highways of Horizontal Gene Transfer. In Proceedings of the Eighth RECOMB Comparative Genomics Satellite Workshop (RECOMB-CG'10), Vol. 6398:109-120 of LNCS, springer, 2011. Keywords: explicit network, from rooted trees, from species tree, lateral gene transfer, phylogenetic network, phylogeny, polynomial, reconstruction. Note: http://www.cs.iastate.edu/~bansal/Highways_RCG10.pdf.
"In a horizontal gene transfer (HGT) event a gene is transferred between two species that do not share an ancestor-descendant relationship. Typically, no more than a few genes are horizontally transferred between any two species. However, several studies identified pairs of species between which many different genes were horizontally transferred. Such a pair is said to be linked by a highway of gene sharing. We present a method for inferring such highways. Our method is based on the fact that the evolutionary histories of horizontally transferred genes disagree with the corresponding species phylogeny. Specifically, given a set of gene trees and a trusted rooted species tree, each gene tree is first decomposed into its constituent quartet trees and the quartets that are inconsistent with the species tree are identified. Our method finds a pair of species such that a highway between them explains the largest (normalized) fraction of inconsistent quartets. For a problem on n species, our method requires O(n^4) time, which is optimal with respect to the quartets input size. An application of our method to a dataset of 1128 genes from 11 cyanobacterial species, as well as to simulated datasets, illustrates the efficacy of our method. © 2010 Springer-Verlag."
|
|
|
|
|
Vincent Berry and
David Bryant. Faster reliable phylogenetic analysis. In RECOMB99, Pages 59-68, 1999. Keywords: abstract network, from quartets, phylogenetic network, phylogeny, polynomial, Program SplitsTree, reconstruction, split network, weakly compatible. Note: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.95.9151.
|
|
|
Hans-Jürgen Bandelt and
Andreas W. M. Dress. A relational approach to split decomposition. In
H.-H. Bock,
W. Lenski and
M. M. Richter editors, Information Systems and Data Analysis, Proceedings of the 17th Annual Conference of the Gesellschaft Für Klassifikation (GFKL93), Vol. 42:123-131 of Studies in Classification, Data Analysis, and Knowledge Organization, springer, 1994. Keywords: characterization, from quartets, phylogenetic network, weakly compatible.
|
|
|
Chris Whidden,
Robert G. Beiko and
Norbert Zeh. Fast FPT Algorithms for Computing Rooted Agreement Forests: Theory and Experiments. In Proceedings of the ninth International Symposium on Experimental Algorithms (SEA'10), Vol. 6049:141-153 of LNCS, springer, 2010. Keywords: agreement forest, explicit network, FPT, from rooted trees, hybridization, minimum number, phylogenetic network, phylogeny, Program HybridInterleave, reconstruction, SPR distance. Note: https://www.cs.dal.ca/sites/default/files/technical_reports/CS-2010-03.pdf.
"We improve on earlier FPT algorithms for computing a rooted maximum agreement forest (MAF) or a maximum acyclic agreement forest (MAAF) of a pair of phylogenetic trees. Their sizes give the subtree-prune-and-regraft (SPR) distance and the hybridization number of the trees, respectively. We introduce new branching rules that reduce the running time of the algorithms from O(3^k n) and O(3^k n log n) to O(2.42^k n) and O(2.42^k n log n), respectively. In practice, the speed up may be much more than predicted by the worst-case analysis. We confirm this intuition experimentally by computing MAFs for simulated trees and trees inferred from protein sequence data. We show that our algorithm is orders of magnitude faster and can handle much larger trees and SPR distances than the best previous methods, treeSAT and sprdist. © Springer-Verlag Berlin Heidelberg 2010."
|
|
|
Celine Scornavacca,
Franziska Zickmann and
Daniel H. Huson. Tanglegrams for Rooted Phylogenetic Trees and Networks. In ISMB11, Vol. 27(13):i248-i256 of BIO, 2011. Keywords: from network, heuristic, integer linear programming, phylogenetic network, phylogeny, Program Dendroscope, tanglegram, visualization. Note: http://dx.doi.org/10.1093/bioinformatics/btr210.
"Motivation: In systematic biology, one is often faced with the task of comparing different phylogenetic trees, in particular in multi-gene analysis or cospeciation studies. One approach is to use a tanglegram in which two rooted phylogenetic trees are drawn opposite each other, using auxiliary lines to connect matching taxa. There is an increasing interest in using rooted phylogenetic networks to represent evolutionary history, so as to explicitly represent reticulate events, such as horizontal gene transfer, hybridization or reassortment. Thus, the question arises how to define and compute a tanglegram for such networks. Results: In this article, we present the first formal definition of a tanglegram for rooted phylogenetic networks and present a heuristic approach for computing one, called the NN-tanglegram method. We compare the performance of our method with existing tree tanglegram algorithms and also show a typical application to real biological datasets. For maximum usability, the algorithm does not require that the trees or networks are bifurcating or bicombining, or that they are on identical taxon sets. © The Author(s) 2011. Published by Oxford University Press."
|
|
|
Hadas Birin,
Zohar Gal-Or,
Isaac Elias and
Tamir Tuller. Inferring Models of Rearrangements, Recombinations, and Horizontal Transfers by the Minimum Evolution Criterion. In WABI07, Vol. 4645:111-123 of LNCS, springer, 2007. Keywords: explicit network, from sequences, phylogenetic network, phylogeny, reconstruction. Note: http://safrabio.cs.tau.ac.il/download/Papers/Birin_et_al.pdf.
|
|
|
|
|
Hyun Jung Park and
Luay Nakhleh. MURPAR: A fast heuristic for inferring parsimonious phylogenetic networks from multiple gene trees. In ISBRA12, Vol. 7292:213-224 of LNCS, springer, 2012. Keywords: explicit network, from unrooted trees, heuristic, phylogenetic network, phylogeny, reconstruction, software. Note: https://www.researchgate.net/profile/Hyun_Jung_Park2/publication/262318595_MURPAR_A_Fast_Heuristic_for_Inferring_Parsimonious_Phylogenetic_Networks_from_Multiple_Gene_Trees/links/54b7e7b50cf269d8cbf58cc4.pdf.
"Phylogenetic networks provide a graphical representation of evolutionary histories that involve non-treelike evolutionary events, such as horizontal gene transfer (HGT). One approach for inferring phylogenetic networks is based on reconciling gene trees, assuming all incongruence among the gene trees is due to HGT. Several mathematical results and algorithms, both exact and heuristic, have been introduced to construct and analyze phylogenetic networks. Here, we address the computational problem of inferring phylogenetic networks with minimum reticulations from a collection of gene trees. As this problem is known to be NP-hard even for a pair of gene trees, the problem at hand is very hard. In this paper, we present an efficient heuristic, MURPAR, for inferring a phylogenetic network from a collection of gene trees by using pairwise reconciliations of trees in the collection. Given the development of efficient and accurate methods for pairwise gene tree reconciliations, MURPAR inherits this efficiency and accuracy. Further, the method includes a formulation for combining pairwise reconciliations that is naturally amenable to an efficient integer linear programming (ILP) solution. We show that MURPAR produces more accurate results than other methods and is at least as fast, when run on synthetic and biological data. We believe that our method is especially important for rapidly obtaining estimates of genome-scale evolutionary histories that can be further refined by more detailed and compute-intensive methods. © 2012 Springer-Verlag."
|
|
|
Pawel Górecki and
Jerzy Tiuryn. Inferring evolutionary scenarios in the duplication, loss and horizontal gene transfer model. In Logic and Program Semantics, Vol. 7230:83-105 of LNCS, springer, 2012. Keywords: duplication, explicit network, lateral gene transfer, loss, phylogenetic network, phylogeny, reconstruction. Note: http://dx.doi.org/10.1007/978-3-642-29485-3_7.
"An H-tree is a formal model of evolutionary scenario. It can be used to represent any processes with gene duplication and loss, horizontal gene transfer (HGT) and speciation events. The model of H-trees, introduced in [26], is an extension of the duplication-loss model (DL-model). Similarly to its ancestor, it has a number of interesting mathematical and biological properties. It is, however, more computationally complex than the DL-model. In this paper, we primarily address the problem of inferring H-trees that are compatible with a given gene tree and a given phylogeny of species with HGTs. These results create a mathematical and computational foundation for a more general and practical problem of inferring HGTs from given gene and species trees with HGTs. We also demonstrate how our model can be used to support HGT hypotheses based on empirical data sets. © 2012 Springer-Verlag Berlin Heidelberg."
|
|
|
Mukul S. Bansal,
Eric J. Alm and
Manolis Kellis. Efficient Algorithms for the Reconciliation Problem with Gene Duplication, Horizontal Transfer, and Loss. In ISMB12, Vol. 28(12):i283-i291 of BIO, 2012. Keywords: duplication, explicit network, from rooted trees, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, Program Angst, Program Mowgli, Program RANGER-DTL, reconstruction. Note: http://dx.doi.org/10.1093/bioinformatics/bts225.
"Motivation: Gene family evolution is driven by evolutionary events such as speciation, gene duplication, horizontal gene transfer and gene loss, and inferring these events in the evolutionary history of a given gene family is a fundamental problem in comparative and evolutionary genomics with numerous important applications. Solving this problem requires the use of a reconciliation framework, where the input consists of a gene family phylogeny and the corresponding species phylogeny, and the goal is to reconcile the two by postulating speciation, gene duplication, horizontal gene transfer and gene loss events. This reconciliation problem is referred to as duplication-transfer-loss (DTL) reconciliation and has been extensively studied in the literature. Yet, even the fastest existing algorithms for DTL reconciliation are too slow for reconciling large gene families and for use in more sophisticated applications such as gene tree or species tree reconstruction. Results: We present two new algorithms for the DTL reconciliation problem that are dramatically faster than existing algorithms, both asymptotically and in practice. We also extend the standard DTL reconciliation model by considering distance-dependent transfer costs, which allow for more accurate reconciliation and give an efficient algorithm for DTL reconciliation under this extended model. We implemented our new algorithms and demonstrated up to 100 000-fold speed-up over existing methods, using both simulated and biological datasets. This dramatic improvement makes it possible to use DTL reconciliation for performing rigorous evolutionary analyses of large gene families and enables its use in advanced reconciliation-based gene and species tree reconstruction methods. © The Author(s) 2012. Published by Oxford University Press."
|
|
|
An-Chiang Chu,
Jesper Jansson,
Richard Lemence,
Alban Mancheron and
Kun-Mao Chao. Asymptotic Limits of a New Type of Maximization Recurrence with an Application to Bioinformatics. In TAMC12, Vol. 7287:177-188 of LNCS, springer, 2012. Keywords: from triplets, galled network, level k phylogenetic network, phylogenetic network. Note: preliminary version.
"We study the asymptotic behavior of a new type of maximization recurrence, defined as follows. Let $k$ be a positive integer and $p_k(x)$ a polynomial of degree $k$ satisfying $p_k(0) = 0$. Define $A_0 = 0$ and for $n \geq 1$, let $A_n = \max_{0 \leq i < n} \{A_i + n^k p_k(i/n)\}$. We prove that $\lim_{n \to \infty} A_n / n^k = \sup\{p_k(x)/(1 - x^k) : 0 \leq x < 1\}$. We also consider two closely related maximization recurrences $S_n$ and $S'_n$, defined as $S_0 = S'_0 = 0$, and for $n \geq 1$, $S_n = \max_{0 \leq i < n} \{S_i + i(n-i)(n-i-1)/2\}$ and $S'_n = \max_{0 \leq i < n} \{S'_i + \binom{n-i}{3} + 2i\binom{n-i}{2} + (n-i)\binom{i}{2}\}$. We prove that $\lim_{n \to \infty} S'_n / (3\binom{n}{3}) = 2(\sqrt{3}-1)/3 \approx 0.488033\ldots$, resolving an open problem from Bioinformatics about rooted triplets consistency in phylogenetic networks. © 2012 Springer-Verlag."
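The second recurrence in this abstract is easy to evaluate numerically, which gives a quick sanity check on the stated limit. The Python sketch below is only an illustration under the reading S'_0 = 0 and S'_n = max over 0 ≤ i < n of S'_i + C(n-i,3) + 2i·C(n-i,2) + (n-i)·C(i,2), where C denotes a binomial coefficient. The function names `s_prime` and `ratio` are made up for this example, and the quadratic-time table fill is not taken from the paper.

```python
from math import comb, sqrt


def s_prime(n_max: int) -> list:
    """Return [S'_0, ..., S'_{n_max}] by filling the recurrence bottom-up."""
    s = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        # Try every split point i and keep the best value.
        s[n] = max(
            s[i] + comb(n - i, 3) + 2 * i * comb(n - i, 2) + (n - i) * comb(i, 2)
            for i in range(n)
        )
    return s


def ratio(n: int) -> float:
    """S'_n divided by 3 * C(n, 3), the quantity whose limit the abstract states."""
    return s_prime(n)[n] / (3 * comb(n, 3))


LIMIT = 2 * (sqrt(3) - 1) / 3  # the limit claimed in the abstract, about 0.488

print(s_prime(5))           # first few values of S'_n
print(ratio(200), LIMIT)    # the ratio should already be close to the limit
```

The first values can be checked by hand (S'_3 = 2, S'_4 = 7, S'_5 = 16), and the ratio decreases towards the limit as n grows.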
|
|
|
Changiz Eslahchi and
Reza Hassanzadeh. New Algorithm for Constructing Supernetworks from Partial Trees. In MCCMB11, Pages 106-107, 2011. Keywords: abstract network, from unrooted trees, heuristic, phylogenetic network, phylogeny, Program SNSA, reconstruction, simulated annealing, split network. Note: http://mccmb.belozersky.msu.ru/2011/mccmb11.pdf#page=106.
|
|
|
|
|
Jesper Jansson and
Andrzej Lingas. Computing the rooted triplet distance between galled trees by counting triangles. In CPM12, Vol. 7354:385-398 of LNCS, springer, 2012. Keywords: distance between networks, explicit network, from network, galled tree, phylogenetic network, phylogeny, polynomial, triplet distance. Note: http://www.df.lth.se/~jj/Publications/d_rt_for_Galled_Trees5_CPM_2012.pdf.
"We consider a generalization of the rooted triplet distance between two phylogenetic trees to two phylogenetic networks. We show that if each of the two given phylogenetic networks is a so-called galled tree with n leaves then the rooted triplet distance can be computed in o(n^2.688) time. Our upper bound is obtained by reducing the problem of computing the rooted triplet distance to that of counting monochromatic and almost-monochromatic triangles in an undirected, edge-colored graph. To count different types of colored triangles in a graph efficiently, we extend an existing technique based on matrix multiplication and obtain several new related results that may be of independent interest. © 2012 Springer-Verlag."
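For orientation, the rooted triplet distance that this paper generalizes can be computed for ordinary trees by brute force. The Python sketch below is a cubic-time baseline for trees only, not the triangle-counting method of the paper and not a galled-tree algorithm. The nested-tuple tree encoding and the names `leaf_paths`, `triplet` and `triplet_distance` are assumptions made for this illustration.

```python
from itertools import combinations


def leaf_paths(tree, prefix=()):
    """Map each leaf label to the tuple of child indices on its root-to-leaf path."""
    if not isinstance(tree, tuple):
        return {tree: prefix}
    paths = {}
    for idx, child in enumerate(tree):
        paths.update(leaf_paths(child, prefix + (idx,)))
    return paths


def lca_depth(p, q):
    """Depth of the lowest common ancestor, i.e. the length of the common path prefix."""
    depth = 0
    for a, b in zip(p, q):
        if a != b:
            break
        depth += 1
    return depth


def triplet(paths, a, b, c):
    """Return the pair forming the cherry of the induced triplet, or None if unresolved."""
    depths = {
        frozenset((a, b)): lca_depth(paths[a], paths[b]),
        frozenset((a, c)): lca_depth(paths[a], paths[c]),
        frozenset((b, c)): lca_depth(paths[b], paths[c]),
    }
    deepest = max(depths.values())
    winners = [pair for pair, d in depths.items() if d == deepest]
    return winners[0] if len(winners) == 1 else None


def triplet_distance(t1, t2):
    """Count the leaf triples whose induced rooted triplets differ between two trees."""
    p1, p2 = leaf_paths(t1), leaf_paths(t2)
    if p1.keys() != p2.keys():
        raise ValueError("both trees must have the same leaf set")
    return sum(
        triplet(p1, *abc) != triplet(p2, *abc)
        for abc in combinations(sorted(p1), 3)
    )


balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "b"), "c"), "d")
print(triplet_distance(balanced, caterpillar))  # triples {a,c,d} and {b,c,d} differ
```

Examining all triples explicitly takes cubic time in the number of leaves, which is the baseline that the subcubic bound in the abstract improves on.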
|
|
|
Leo van Iersel,
Steven Kelk,
Nela Lekic and
Celine Scornavacca. A practical approximation algorithm for solving massive instances of hybridization number. In WABI12, Vol. 7534:430-440 of LNCS, springer, 2012. Keywords: agreement forest, approximation, explicit network, from rooted trees, hybridization, phylogenetic network, phylogeny, Program CycleKiller, Program Dendroscope, Program HybridNET, reconstruction, software. Note: http://arxiv.org/abs/1205.3417.
"Reticulate events play an important role in determining evolutionary relationships. The problem of computing the minimum number of such events to explain discordance between two phylogenetic trees is a hard computational problem. In practice, exact solvers struggle to solve instances with reticulation number larger than 40. For such instances, one has to resort to heuristics and approximation algorithms. Here we present the algorithm CycleKiller which is the first approximation algorithm that can produce solutions verifiably close to optimality for instances with hundreds or even thousands of reticulations. Theoretically, the algorithm is an exponential-time 2-approximation (or 4-approximation in its fastest mode). However, using simulations we demonstrate that in practice the algorithm runs quickly for large and difficult instances, producing solutions within one percent of optimality. An implementation of this algorithm, which extends the theoretical work of [14], has been made publicly available. © 2012 Springer-Verlag."
|
|
|
Hyun Jung Park and
Luay Nakhleh. Inference of reticulate evolutionary histories by maximum likelihood: The performance of information criteria. In RECOMB-CG'12, Vol. 13(suppl 19):S12 of BMCB, 2012. Keywords: AIC, BIC, explicit network, heuristic, likelihood, phylogenetic network, phylogeny, reconstruction, statistical model. Note: http://www.biomedcentral.com/1471-2105/13/S19/S12.
|
|
|
Maureen Stolzer,
Han Lai,
Minli Xu,
Deepa Sathaye,
Benjamin Vernot and
Dannie Durand. Inferring Duplications, Losses, Transfers, and Incomplete Lineage Sorting with Non-Binary Species Trees. In ECCB12, Vol. 28(18):i409-i415 of BIO, 2012. Keywords: duplication, explicit network, from rooted trees, lateral gene transfer, loss, phylogenetic network, phylogeny, Program Notung, reconstruction. Note: http://dx.doi.org/10.1093/bioinformatics/bts386.
"Motivation: Gene duplication (D), transfer (T), loss (L) and incomplete lineage sorting (I) are crucial to the evolution of gene families and the emergence of novel functions. The history of these events can be inferred via comparison of gene and species trees, a process called reconciliation, yet current reconciliation algorithms model only a subset of these evolutionary processes. Results: We present an algorithm to reconcile a binary gene tree with a nonbinary species tree under a DTLI parsimony criterion. This is the first reconciliation algorithm to capture all four evolutionary processes driving tree incongruence and the first to reconcile nonbinary species trees with a transfer model. Our algorithm infers all optimal solutions and reports complete, temporally feasible event histories, giving the gene and species lineages in which each event occurred. It is fixed-parameter tractable, with polytime complexity when the maximum species outdegree is fixed. Application of our algorithms to prokaryotic and eukaryotic data shows that use of an incomplete event model has substantial impact on the events inferred and resulting biological conclusions. © The Author(s) 2012. Published by Oxford University Press."
|
|
|
|
|
Thi-Hau Nguyen,
Jean-Philippe Doyon,
Stéphanie Pointet,
Anne-Muriel Chifolleau Arigon,
Vincent Ranwez and
Vincent Berry. Accounting for Gene Tree Uncertainties Improves Gene Trees and Reconciliation Inference. In WABI12, Vol. 7534:123-134 of LNCS, springer, 2012. Keywords: duplication, heuristic, lateral gene transfer, phylogenetic network, phylogeny, Program Mowgli, reconstruction. Note: http://hal.archives-ouvertes.fr/hal-00718347/en/.
"We propose a reconciliation heuristic accounting for gene duplications, losses and horizontal transfers that specifically takes into account the uncertainties in the gene tree. Rearrangements are tried for gene tree edges that are weakly supported, and are accepted whenever they improve the reconciliation cost. We prove useful properties on the dynamic programming matrix used to compute reconciliations, which allows us to speed up the tree space exploration when rearrangements are generated by Nearest Neighbor Interchanges (NNI) edit operations. Experimental results on simulated and real data confirm that running times are greatly reduced when considering the above-mentioned optimization in comparison to the naïve rearrangement procedure. Results also show that gene trees modified by such NNI rearrangements are closer to the correct (simulated) trees and lead to more correct event predictions on average. The program is available at http://www.atgc-montpellier.fr/Mowgli/. © 2012 Springer-Verlag."
|
|
|
Katharina Huber,
Vincent Moulton,
Andreas Spillner,
Sabine Storandt and
Radoslaw Suchecki. Computing a consensus of multilabeled trees. In ALENEX12, Pages 84-92, 2012. Keywords: duplication, explicit network, exponential algorithm, phylogenetic network, phylogeny. Note: http://siam.omnibooksonline.com/2012ALENEX/data/papers/020.pdf.
In this paper we consider two challenging problems that arise in the context of computing a consensus of a collection of multilabeled trees, namely (1) selecting a compatible collection of clusters on a multiset from an ordered list of such clusters and (2) optimally refining high degree vertices in a multilabeled tree. Forming such a consensus is part of an approach to reconstruct the evolutionary history of a set of species for which events such as genome duplication and hybridization have occurred in the past. We present exact algorithms for solving (1) and (2) that have an exponential run-time in the worst case. To give some impression of their performance in practice, we apply them to simulated input and to a real biological data set highlighting the impact of several structural properties of the input on the performance.
|
|
|
Yufeng Wu. An Algorithm for Constructing Parsimonious Hybridization Networks with Multiple Phylogenetic Trees. In RECOMB13, Vol. 7821:291-303 of LNCS, springer, 2013. Keywords: explicit network, exponential algorithm, from rooted trees, phylogenetic network, phylogeny, Program PIRN, reconstruction. Note: http://www.engr.uconn.edu/~ywu/Papers/ExactNetRecomb2013.pdf.
"Phylogenetic network is a model for reticulate evolution. Hybridization network is one type of phylogenetic network for a set of discordant gene trees, and "displays" each gene tree. A central computational problem on hybridization networks is: given a set of gene trees, reconstruct the minimum (i.e. most parsimonious) hybridization network that displays each given gene tree. This problem is known to be NP-hard, and existing approaches for this problem are either heuristics or make simplifying assumptions (e.g. work with only two input trees or assume some topological properties). In this paper, we develop an exact algorithm (called PIRN_C) for inferring the minimum hybridization networks from multiple gene trees. The PIRN_C algorithm does not rely on structural assumptions. To the best of our knowledge, PIRN_C is the first exact algorithm for this formulation. When the number of reticulation events is relatively small (say four or fewer), PIRN_C runs reasonably efficiently even for moderately large datasets. For building more complex networks, we also develop a heuristic version of PIRN_C called PIRN_CH. Simulation shows that PIRN_CH usually produces networks with fewer reticulation events than those by an existing method. © 2013 Springer-Verlag."
|
|
|
Mukul S. Bansal,
Eric J. Alm and
Manolis Kellis. Reconciliation Revisited: Handling Multiple Optima when Reconciling with Duplication, Transfer, and Loss. In RECOMB13, Vol. 7821:1-13 of LNCS, springer, 2013. Keywords: duplication, from rooted trees, from species tree, loss, phylogenetic network, phylogeny, polynomial, Program RANGER-DTL, reconstruction. Note: http://people.csail.mit.edu/mukul/Bansal_RECOMB2013.pdf.
"Phylogenetic tree reconciliation is a powerful approach for inferring evolutionary events like gene duplication, horizontal gene transfer, and gene loss, which are fundamental to our understanding of molecular evolution. While Duplication-Loss (DL) reconciliation leads to a unique maximum-parsimony solution, Duplication-Transfer-Loss (DTL) reconciliation yields a multitude of optimal solutions, making it difficult to infer the true evolutionary history of the gene family. Here, we present an effective, efficient, and scalable method for dealing with this fundamental problem in DTL reconciliation. Our approach works by sampling the space of optimal reconciliations uniformly at random and aggregating the results. We present an algorithm to efficiently sample the space of optimal reconciliations uniformly at random in O(mn^2) time, where m and n denote the number of genes and species, respectively. We use these samples to understand how different optimal reconciliations vary in their node mapping and event assignments, and to investigate the impact of varying event costs. © 2013 Springer-Verlag."
|
|
|
Cayla McBee. Generalizing Fourier Calculus on Evolutionary Trees to Splits Networks. In ISPAN'12, Pages 149-155, 2012. Keywords: abstract network, from sequences, phylogenetic network, phylogeny, split network, statistical model.
"Biologists have been interested in Phylogenetics, the study of evolutionary relatedness among various groups of organisms, for more than 140 years. In spite of this, it has only been in the last 40 years that advances in technology and the availability of DNA sequences have led to statistical, computational and algorithmic work on determining evolutionary relatedness between organisms. One method of determining historical relationships between organisms is to assume a group based evolutionary model and use a discrete Fourier transform. The 1993 paper 'Fourier Calculus on Evolutionary Trees' by L.A. Szekely, M.A. Steel and P.L. Erdos outlines this process. The transform presented in Szekely et al provides an invertible relationship between phylogenetic trees and expected frequencies of nucleotide patterns in nucleotide sequences. This implies that given a set of nucleotide sequences from various organisms it is possible to construct a phylogenetic tree that represents the historical relationships of those organisms. Some scenarios are poorly described by phylogenetic trees and there are biological and statistical reasons for using networks to model phylogenetic relationships. Given this motivation I have generalized Szekely et al's result to apply to a specific type of phylogenetic network known as a splits network. © 2012 IEEE."
|
|
|
Hoa Vu,
Francis Chin,
Wing-Kai Hon,
Henry Leung,
Kunihiko Sadakane,
Wing-Kin Sung and
Siu-Ming Yiu. Reconstructing k-Reticulated Phylogenetic Network from a Set of Gene Trees. In ISBRA13, Vol. 7875:112-124 of LNCS, Springer, 2013. Keywords: from rooted trees, k-reticulated, phylogenetic network, phylogeny, polynomial, Program ARTNET, Program CMPT, reconstruction. Note: http://grid.cs.gsu.edu/~xguo9/publications/2013_Cloud%20computing%20for%20de%20novo%20metagenomic%20sequence%20assembly.pdf#page=123.
"The time complexity of existing algorithms for reconstructing a level-x phylogenetic network increases exponentially in x. In this paper, we propose a new classification of phylogenetic networks called k-reticulated network. A k-reticulated network can model all level-k networks and some level-x networks with x > k. We design algorithms for reconstructing k-reticulated network (k = 1 or 2) with minimum number of hybrid nodes from a set of m binary trees, each with n leaves in O(mn^2) time. The implication is that some level-x networks with x > k can now be reconstructed in a faster way. We implemented our algorithm (ARTNET) and compared it with CMPT. We show that ARTNET outperforms CMPT in terms of running time and accuracy. We also consider the case when there does not exist a 2-reticulated network for the input trees. We present an algorithm computing a maximum subset of the species set so that a new set of subtrees can be combined into a 2-reticulated network. © 2013 Springer-Verlag."
|
|
|
Leo van Iersel and
Steven Kelk. Kernelizations for the hybridization number problem on multiple nonbinary trees. In WG14, Vol. 8747:299-311 of LNCS, Springer, 2014. Keywords: explicit network, from rooted trees, kernelization, minimum number, phylogenetic network, phylogeny, Program Treeduce, reconstruction. Note: http://arxiv.org/abs/1311.4045.
|
|
|
|
|
Ran Libeskind-Hadas,
Yi-Chieh Wu,
Mukul S. Bansal and
Manolis Kellis. Pareto-optimal phylogenetic tree reconciliation. In ISMB14, Vol. 30:i87-i95 of BIO, 2014. Keywords: duplication, lateral gene transfer, loss, phylogenetic network, phylogeny, polynomial, Program Xscape, reconstruction. Note: http://dx.doi.org/10.1093/bioinformatics/btu289.
"Motivation: Phylogenetic tree reconciliation is a widely used method for reconstructing the evolutionary histories of gene families and species, hosts and parasites and other dependent pairs of entities. Reconciliation is typically performed using maximum parsimony, in which each evolutionary event type is assigned a cost and the objective is to find a reconciliation of minimum total cost. It is generally understood that reconciliations are sensitive to event costs, but little is understood about the relationship between event costs and solutions. Moreover, choosing appropriate event costs is a notoriously difficult problem. Results: We address this problem by giving an efficient algorithm for computing Pareto-optimal sets of reconciliations, thus providing the first systematic method for understanding the relationship between event costs and reconciliations. This, in turn, results in new techniques for computing event support values and, for cophylogenetic analyses, performing robust statistical tests. We provide new software tools and demonstrate their use on a number of datasets from evolutionary genomic and cophylogenetic studies. © 2014 The Author. Published by Oxford University Press. All rights reserved."
|
|
|
|
|
Chris Whidden and
Norbert Zeh. A Unifying View on Approximation and FPT of Agreement Forests. In WABI09, Vol. 5724:390-402 of LNCS, Springer, 2009. Keywords: agreement forest, approximation, explicit network, FPT, minimum number, phylogenetic network, phylogeny, reconstruction. Note: https://www.cs.dal.ca/sites/default/files/technical_reports/CS-2009-02.pdf.
"We provide a unifying view on the structure of maximum (acyclic) agreement forests of rooted and unrooted phylogenies. This enables us to obtain linear- or O(n log n)-time 3-approximation and improved fixed-parameter algorithms for the subtree prune and regraft distance between two rooted phylogenies, the tree bisection and reconnection distance between two unrooted phylogenies, and the hybridization number of two rooted phylogenies. © 2009 Springer Berlin Heidelberg."
|
|
|
Louxin Zhang,
Yen Kaow Ng,
Taoyang Wu and
Yu Zheng. Network model and efficient method for detecting relative duplications or horizontal gene transfers. In ICCABS11, Pages 214-219, 2011. Keywords: dynamic programming, explicit network, from network, from rooted trees, from species tree, phylogenetic network, phylogeny, polynomial, reconstruction.
"Background: Horizontal gene transfer and gene duplication are two significant forces behind genome evolution. As more and more well-supported examples of HGTs are being revealed, there is a growing awareness that HGT is more widespread than previously thought, occurring often not only within bacteria, but also between species remotely related such as bacteria and plants or plants and animals. Although a substantial number of genomic sequences are known, HGT inference remains challenging. Parsimony-based inferences of HGT events are typically NP-hard under the framework of gene tree and species tree comparison; it is even more time-consuming if the maximum likelihood approach is used. The fact that gene tree and species tree incongruence can be further confounded by gene duplication and gene loss events motivates us to incorporate considerations for these events into our inference of HGT events. Similarly, it will be beneficial if known HGT events are considered in the inference of gene duplications and gene losses. © 2011 IEEE."
|
|
|
Philippe Gambette,
Andreas Gunawan,
Anthony Labarre,
Stéphane Vialette and
Louxin Zhang. Locating a Tree in A Phylogenetic Network in Quadratic Time. In RECOMB15, Vol. 9029:96-107 of LNCS, Springer, 2015. Keywords: evaluation, explicit network, from network, from rooted trees, genetically stable network, nearly-stable network, phylogenetic network, phylogeny, polynomial, tree containment. Note: https://hal.archives-ouvertes.fr/hal-01116231/en.
|
|
|
|
|
Quan Nguyen and
Teemu Roos. Likelihood-based inference of phylogenetic networks from sequence data by PhyloDAG. In AlCoB15, Vol. 9199:126-140 of LNCS, Springer, 2015. Keywords: BIC, explicit network, from sequences, likelihood, phylogenetic network, phylogeny, Program PhyloDAG, reconstruction, software. Note: http://www.cs.helsinki.fi/u/ttonteri/pub/alcob2015.pdf.
|
|
|
|
|
Jittat Fakcharoenphol,
Tanee Kumpijit and
Attakorn Putwattana. A Faster Algorithm for the Tree Containment Problem for Binary Nearly Stable Phylogenetic Networks. In Proceedings of the 12th International Joint Conference on Computer Science and Software Engineering (JCSSE'15), Pages 337-342, IEEE, 2015. Keywords: dynamic programming, explicit network, from network, from rooted trees, nearly-stable network, phylogenetic network, phylogeny, polynomial, tree containment.
|
|
|
Andreas Gunawan,
Bhaskar DasGupta and
Louxin Zhang. Locating a Tree in a Reticulation-Visible Network in Cubic Time. In RECOMB16, Vol. 9649:266 of LNBI, Springer, 2016. Keywords: cluster containment, explicit network, from clusters, from network, from rooted trees, phylogenetic network, phylogeny, polynomial, reticulation-visible network, tree containment. Note: http://arxiv.org/abs/1507.02119.
|
|
|
Misagh Kordi and
Mukul S. Bansal. On the Complexity of Duplication-Transfer-Loss Reconciliation with Non-Binary Gene Trees. In ISBRA15, Vol. 9096:187-198 of LNCS, Springer, 2015. Keywords: duplication, from rooted trees, from species tree, lateral gene transfer, loss, NP complete, phylogenetic network, phylogeny, reconstruction. Note: http://compbio.engr.uconn.edu/papers/Kordi_ISBRA2015.pdf.
|
|
|
Yun Yu and
Luay Nakhleh. A Distance-Based Method for Inferring Phylogenetic Networks in the Presence of Incomplete Lineage Sorting. In ISBRA15, Vol. 9096:378-389 of LNCS, Springer, 2015. Keywords: bootstrap, explicit network, from distances, heuristic, incomplete lineage sorting, phylogenetic network, phylogeny, reconstruction. Note: http://bioinfo.cs.rice.edu/sites/bioinfo.cs.rice.edu/files/YuNakhleh-ISBRA15.pdf.
|
|
|
Philippe Gambette,
Andreas Gunawan,
Anthony Labarre,
Stéphane Vialette and
Louxin Zhang. Solving the Tree Containment Problem for Genetically Stable Networks in Quadratic Time. In IWOCA15, Vol. 9538:197-208 of LNCS, Springer, 2016. Keywords: explicit network, from network, from rooted trees, genetically stable network, phylogenetic network, phylogeny, polynomial, tree containment. Note: https://hal-upec-upem.archives-ouvertes.fr/hal-01226035.
|
|
|
|
|
Yun Yu and
Luay Nakhleh. A maximum pseudo-likelihood approach for phylogenetic networks. In RECOMB-CG15, Vol. 16(Suppl 10)(S10):1-10 of BMC Genomics, BioMed Central, 2015. Keywords: explicit network, from rooted trees, hybridization, incomplete lineage sorting, likelihood, phylogenetic network, phylogeny, Program PhyloNet, reconstruction, tripartition distance. Note: http://dx.doi.org/10.1186/1471-2164-16-S10-S10.
|
|
|
Jiafan Zhu,
Yun Yu and
Luay Nakhleh. In the Light of Deep Coalescence: Revisiting Trees Within Networks. In RECOMB-CG16, Vol. 17(suppl. 14):415.271-282 of BMCB, 2016. Keywords: branch length, evaluation, explicit network, incomplete lineage sorting, phylogenetic network, phylogeny, statistical model, tree-based network, weakly displaying. Note: http://arxiv.org/abs/1606.07350.
|
|
|
Mathias Weller. Linear-Time Tree Containment in Phylogenetic Networks. In RECOMB-CG18, Vol. 11183:309-323 of LNCS, Springer, 2018. Keywords: explicit network, from network, from rooted trees, nearly-stable network, phylogenetic network, phylogeny, polynomial, reconstruction, reticulation-visible network, tree containment. Note: https://arxiv.org/abs/1702.06364.
|
|
|
Hussein A. Hejase,
Natalie VandePol,
Gregory A. Bonito and
Kevin J. Liu. Fast and accurate statistical inference of phylogenetic networks using large-scale genomic sequence data. In RECOMB-CG18, Vol. 11183:242-259 of LNCS, Springer, 2018. Keywords: explicit network, from rooted trees, heuristic, phylogenetic network, phylogeny, Program FastNet, reconstruction. Note: http://biorxiv.org/content/early/2017/05/01/132795.
|
|
|
Bingxin Lu,
Louxin Zhang and
Hon Wai Leong. A program to compute the soft Robinson-Foulds distance between phylogenetic networks. In APBC17, Vol. 18(Suppl. 2):111 of BMC Genomics, 2017. Keywords: cluster containment, distance between networks, explicit network, exponential algorithm, from network, phylogenetic network, phylogeny, Program icelu-PhyloNetwork. Note: http://dx.doi.org/10.1186/s12864-017-3500-5.
|
|
|
Jesper Jansson,
Ramesh Rajaby and
Wing-Kin Sung. An Efficient Algorithm for the Rooted Triplet Distance Between Galled Trees. In AlCoB17, Vol. 10252:115-126 of LNCS, Springer, 2017. Keywords: distance between networks, from network, phylogenetic network, phylogeny, polynomial, reconstruction, triplet distance.
|
|
|
Andreas Gunawan. Solving the Tree Containment Problem for Reticulation-visible Networks in Linear Time. In AlCoB18, Vol. 10849:24-36 of LNCS, Springer, 2018. Keywords: explicit network, from network, from rooted trees, phylogenetic network, phylogeny, polynomial, reticulation-visible network, tree containment. Note: https://arxiv.org/abs/1702.04088.
|
|
|
Han Lai,
Maureen Stolzer and
Dannie Durand. Fast Heuristics for Resolving Weakly Supported Branches Using Duplication, Transfers, and Losses. In RECOMB-CG17, Vol. 10562:298-320 of LNCS, Springer, 2017. Keywords: duplication, explicit network, from rooted trees, from species tree, lateral gene transfer, loss, phylogenetic network, phylogeny, Program Notung, reconstruction.
|
|
|
Sebastien Roch and
Kun-Chieh Wang. Circular Networks from Distorted Metrics. In RECOMB18, Vol. 10812:167-176 of LNCS, Springer, 2018. Keywords: abstract network, circular split system, from distances, NeighborNet, phylogenetic network, phylogeny, reconstruction, split network. Note: https://arxiv.org/abs/1707.05722.
|
|
|
|
|
Kuang-Yu Chang,
Yun Cui,
Siu-Ming Yiu and
Wing-Kai Hon. Reconstructing One-Articulated Networks with Distance Matrices. In ISBRA17, Vol. 10330:34-45 of LNCS, Springer, 2017. Keywords: explicit network, from distances, k-reticulated, phylogenetic network, phylogeny, reconstruction. Note: https://link.springer.com/content/pdf/10.1007%2F978-3-319-59575-7.pdf#page=100.
|
|
|
|
|
Louxin Zhang. Recent Progresses in the Combinatorial and Algorithmic Study of Rooted Phylogenetic Networks. In WALCOM20, Vol. 12049:22-27 of LNCS, Springer, 2020. Keywords: cluster containment, galled network, galled tree, nearly-stable network, phylogenetic network, phylogeny, polynomial, reticulation-visible network, survey, time consistent network, tree containment, tree-based network, tree-child network.
|
|
|
Juan Wang and
Maozu Guo. IGNet: Constructing Rooted Phylogenetic Networks Based on Incompatible Graphs. In ICNC-FSKD19, Vol. 1075:894-900 of Advances in Intelligent Systems and Computing, Springer, 2019. Keywords: explicit network, from rooted trees, phylogenetic network, phylogeny, Program BIMLR, Program IGNet, Program LNetwork, reconstruction, software.
|
|
|
Leo van Iersel,
Remie Janssen,
Mark Jones,
Yukihiro Murakami and
Norbert Zeh. Polynomial-Time Algorithms for Phylogenetic Inference Problems. In AlCoB18, Vol. 10849:37-49 of LNCS, Springer, 2018. Keywords: hybridization, minimum number, parental hybridization, phylogenetic network, phylogeny, polynomial, reconstruction, weakly displaying. Note: https://research.tudelft.nl/files/53686721/10.1007_978_3_319_91938_6_4.pdf.
|
|
|
Remie Janssen,
Mark Jones and
Yukihiro Murakami. Combining Networks Using Cherry Picking Sequences. In AlCoB20, Vol. 12099:77-92 of LNCS, Springer, 2020. Keywords: cherry-picking, explicit network, FPT, from network, hybridization, orchard network, phylogenetic network, phylogeny, tree-child network.
|
|
|
|
|
Remie Janssen and
Yukihiro Murakami. Linear Time Algorithm for Tree-Child Network Containment. In AlCoB20, Vol. 12099:93-107 of LNCS, Springer, 2020. Keywords: explicit network, from network, isomorphism, phylogenetic network, phylogeny, polynomial, reconstruction, tree-child network, tree-child sequence. Note: https://doi.org/10.1007/978-3-030-42266-0_8.
|
|
|
Jesper Jansson,
Konstantinos Mampentzidis,
Ramesh Rajaby and
Wing-Kin Sung. Computing the Rooted Triplet Distance Between Phylogenetic Networks. In IWOCA19, Vol. 11638:290-303 of LNCS, Springer, 2019. Keywords: distance between networks, from network, phylogenetic network, phylogeny, polynomial, triplet distance.
|
|
|
|