Dan Gusfield

Computer Science, UC Davis, California

"Efficient and Practical Algorithms for Studying the History of Recombination in Populations"

Monday February 19, 2007 04:00 PM
Location: 331, Daniels -- NOTE THE CHANGE ==>> NCSU Historical Campus
Abstract: The work discussed in this talk falls into the emerging area of Population Genomics. I will first introduce that area and then talk about specific problems and results involved in the inference of recombination from population data.

A phylogenetic network (or Ancestral Recombination Graph) is a generalization of a tree, allowing structural properties that are not tree-like. With the growth of genomic and population data (coming, for example, from the HAPMAP project), much of which does not fit ideal tree models, and the increasing appreciation of the genomic role of such phenomena as recombination (crossing-over and gene-conversion), recurrent and back mutation, horizontal gene transfer, and mobile genetic elements, there is a greater need to understand the algorithmics and combinatorics of phylogenetic networks.

In this talk I will survey a range of our recent results on phylogenetic networks with recombination and show applications of these results to several issues in Population Genomics: Association Mapping; Finding Recombination Hotspots in genotype sequences; Imputing the values of missing haplotype data; Determining the extent of recombination in the evolution of LPL sequences; Distinguishing the role of crossing-over from gene-conversion in Arabidopsis; Characterizing some aspects of the haplotypes produced by the program PHASE; Studying the effect of using genotype data in place of haplotype data, etc.

Various parts of this work are joint work with Satish Eddhu, Chuck Langley, Dean Hickerson, Yun Song, Yufeng Wu, V. Bansal, V. Bafna and Zhihong Ding. Papers and associated software can be accessed at

Short Bio: Professor Gusfield's background is in Combinatorial Optimization and its various applications. He has worked extensively on problems of network flow, matroid optimization, statistical data security, stable marriage and matching, string algorithms and sequence analysis, phylogenetic tree inference, haplotype inference, and inference of phylogenetic networks with homoplasy and recombination. He received his Ph.D. in 1980 from UC Berkeley, working with Richard Karp, and was an Assistant Professor at Yale University from 1980 to 1986.

Professor Gusfield moved to UC Davis in July 1986. Since then, he has mostly addressed problems in Computational Biology and Bioinformatics. He first addressed questions about building evolutionary trees, and then problems in molecular sequence analysis. He presently focuses mostly on optimization problems related to population genetics and population-scale genomics, in particular haplotype inference and inference of historical recombination. His work on computational biology and bioinformatics was initially supported mainly by the Department of Energy Human Genome Project through the Lawrence Berkeley Labs Human Genome Center, and then directly by the DOE Human Genome Project; since then, it has been funded by the NSF. His book, "Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology", has helped to define the intersection of computer science and bioinformatics. It has been translated into Russian, and a South Asian edition has recently been published. Professor Gusfield serves on the editorial board of the Journal of Computational Biology, and is the founding Editor-in-Chief of the IEEE/ACM Transactions on Computational Biology and Bioinformatics. The journal received an honorable mention for Best New Journal in 2004 from the Association of American Publishers. Other notable service to the Computational Biology community includes serving as Program Chair for the 2004 RECOMB conference.

At UCD, Professor Gusfield was chair of the Computer Science Department for four years. He wrote the bioinformatics section (one of three) of the Genomics/Bioinformatics initiative proposal that resulted in the creation of the UCD Genomics Center (which has hired 17 new faculty), and he continues to serve on its internal Steering committee. He is currently co-chair of the UCD campus initiative on "Computational Characterization and Exploitation of Biological Networks", which will hire seven new faculty in this area over the next three years.

Host: Steffen Heber, Computer Science, NCSU

