|
Vineet Bafna and
Vikas Bansal. The Number of Recombination Events in a Sample History: Conflict Graph and Lower Bounds. In TCBB, Vol. 1(2):78-90, 2004. Keywords: ARG, bound, minimum number, phylogeny, recombination. Note: http://www-cse.ucsd.edu/users/vbafna/pub/tcbb04.pdf.
Toggle abstract
"We consider the following problem: Given a set of binary sequences, determine lower bounds on the minimum number of recombinations required to explain the history of the sample, under the infinite-sites model of mutation. The problem has implications for finding recombination hotspots and for the Ancestral Recombination Graph reconstruction problem. Hudson and Kaplan gave a lower bound based on the four-gamete test. In practice, their bound R m often greatly underestimates the minimum number of recombinations. The problem was recently revisited by Myers and Griffiths, who introduced two new lower bounds R h and R s which are provably better, and also yield good bounds in practice. However, the worst-case complexities of their procedures for computing R h and R s are exponential and super-exponential, respectively. In this paper, we show that the number of nontrivial connected components, Rc, in the conflict graph for a given set of sequences, computable in time O(nm 2), is also a lower bound on the minimum number of recombination events. We show that in many cases, R c is a better bound than R h. The conflict graph was used by Gusfield et al. to obtain a polynomial time algorithm for the galled tree problem, which is a special case of the Ancestral Recombination Graph (ARG) reconstruction problem. Our results also offer some insight into the structural properties of this graph and are of interest for the general Ancestral Recombination Graph reconstruction problem."
|
|
|
|
|
Yun S. Song,
Zhihong Ding,
Dan Gusfield,
Charles Langley and
Yufeng Wu. Algorithms to Distinguish the Role of Gene-Conversion from Single-Crossover Recombination in the Derivation of SNP Sequences in Populations. In JCB, Vol. 14(10):1273-1286, 2007. Keywords: ARG, from sequences, phylogenetic network, phylogeny, Program SHRUB, reconstruction. Note: http://dx.doi.org/10.1089/cmb.2007.0096.
Toggle abstract
"Meiotic recombination is a fundamental biological event and one of the principal evolutionary forces responsible for shaping genetic variation within species. In addition to its fundamental role, recombination is central to several critical applied problems. The most important example is "association mapping" in populations, which is widely hoped to help find genes that influence genetic diseases (Carlson et al., 2004; Clark, 2003). Hence, a great deal of recent attention has focused on problems of inferring the historical derivation of sequences in populations when both mutations and recombinations have occurred. In the algorithms literature, most of that recent work has been directed to single-crossover recombination. However, gene-conversion is an important, and more common, form of (two-crossover) recombination which has been much less investigated in the algorithms literature. In this paper, we explicitly incorporate gene-conversion into discrete methods to study historical recombination. We are concerned with algorithms for identifying and locating the extent of historical crossing-over and gene-conversion (along with single-nucleotide mutation), and problems of constructing full putative histories of those events. The novel technical issues concern the incorporation of gene-conversion into recently developed discrete methods (Myers and Griffiths, 2003; Song et al., 2005) that compute lower and upper-bound information on the amount of needed recombination without gene-conversion. We first examine the most natural extension of the lower bound methods from Myers and Griffiths (2003), showing that the extension can be computed efficiently, but that this extension can only yield weak lower bounds. We then develop additional ideas that lead to higher lower bounds, and show how to solve, via integer-linear programming, a more biologically realistic version of the lower bound problem. We also show how to compute effective upper bounds on the number of needed single-crossovers and gene-conversions, along with explicit networks showing a putative history of mutations, single-crossovers and gene-conversions. Both lower and upper bound methods can handle data with missing entries, and the upper bound method can be used to infer missing entries with high accuracy. We validate the significance of these methods by showing that they can be effectively used to distinguish simulation-derived sequences generated without gene-conversion from sequences that were generated with gene-conversion. We apply the methods to recently studied sequences of Arabidopsis thaliana, identifying many more regions in the sequences than were previously identified (Plagnol et al., 2006), where gene-conversion may have played a significant role. Demonstration software is available at www.csif.cs.ucdavis.edu/∼gusfield. © 2007 Mary Ann Liebert, Inc."
|
|
|
|
|
Julia Matsieva,
Steven Kelk,
Celine Scornavacca,
Chris Whidden and
Dan Gusfield. A Resolution of the Static Formulation Question for the Problem of Computing the History Bound. In TCBB, Vol. 14(2):404-417, 2017. Keywords: ARG, explicit network, from sequences, minimum number, phylogenetic network, phylogeny.
|
|
|
|
|
|