Phyloseminar

phyloseminar -- a free online seminar about phylogenetics

URL

http://phyloseminar.org/

December 15, 2014

11:00

Often, the summary statistics of population genetics are framed in the setting of Kingman's coalescent or related models. These statistics can alternatively be thought of as descriptive statistics of the realized population pedigree-with-recombination, a view that has become much more useful in the era of whole-genome sequencing. For instance, the pairwise number of nucleotide differences is proportional to the "effective population size", but it is sometimes more usefully thought of as an estimate of the average length of the path through the pedigree to the most recent common ancestor at a randomly chosen locus (with an explicit standard error). Another example is the pairwise distribution of long tracts of IBD, which provides an estimate of a functional of the entire distribution of such paths.
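
As a rough illustration of the first statistic mentioned above, the following Python sketch computes the mean pairwise number of nucleotide differences for a set of aligned sequences. It is not taken from the talk. The function names, the toy alignment, and the conversion N_e = pi / (4 * mu), which is the standard expectation for a diploid population under neutrality, are assumptions made for illustration, and the per-site mutation rate `mu_per_site` must be supplied by the user.

```python
from itertools import combinations


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatched positions for each pair, then average over pairs.
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


def effective_size_estimate(seqs, mu_per_site):
    """Illustrative N_e estimate, assuming E[pi per site] = 4 * N_e * mu (diploid)."""
    pi_per_site = mean_pairwise_differences(seqs) / len(seqs[0])
    return pi_per_site / (4.0 * mu_per_site)


# Hypothetical toy alignment of three sequences, eight sites each.
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
pi = mean_pairwise_differences(toy)
```

On the pedigree view described in the abstract, the same number would be read as a (scaled) estimate of the average path length to the most recent common ancestor, rather than as a parameter of a coalescent model.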

11:00

Complex demographic histories shape the genealogies of contemporary individuals and thus have a substantial impact on the genetic variation observed today. These genealogies are commonly modeled by the ancestral recombination graph (ARG), and we developed a novel demography-aware conditional sampling distribution (CSD) to approximate these ARGs under general demographic models. We apply this CSD in an expectation-maximization framework for demographic inference. We show that this method can accurately recover biologically relevant demographic parameters such as population divergence times, migration rates, and ancestral population sizes from simulated datasets. Furthermore, we apply the CSD to detect tracts of genetic material that introgressed from Neanderthals into modern humans. Our findings are in general agreement with previously published results, and we will discuss the similarities and differences, and their biological implications.

October 9, 2014

11:00

The fields of phylogenetics and population genetics share several important models including gene trees, species trees, ancestral recombination graphs (ARGs), and pedigrees. These models are all closely related and can be viewed as subgraphs of one another. Amongst them, the ARG is particularly central and if inferred efficiently can enable many applications such as inference of selection and demography. Here, I will review various helpful mathematical tools for working with ARGs, including what we call the threading algorithm, the branch graph, and the leaf trace visualization.

11:00

Phylodynamic methods are widely used to estimate demographic parameters and historical population dynamics from genealogies of individuals sampled from a population. In this phyloseminar, I will describe how we can understand genealogies in terms of basic demographic or ecological processes, and how these concepts can be used to develop statistical models for inference. In particular, I will discuss some similarities and differences between the two main modeling frameworks in phylodynamics: the coalescent and birth-death models. I will also briefly introduce some of the latest statistical methods currently used to fit these models to genealogies. I will end by discussing one of the main challenges facing the field: adequately representing the structure of complex, heterogeneous populations in phylodynamic models.
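
To make the link between a demographic process and a genealogy concrete, here is a minimal Python sketch of the simplest coalescent case. It is not from the talk. It assumes a constant-size, unstructured population under Kingman's coalescent, with time measured in coalescent units, so that while k lineages remain the waiting time to the next coalescence has mean 2 / (k (k - 1)). The function names are hypothetical.

```python
def expected_coalescent_intervals(n):
    """Expected waiting times while k = n, n-1, ..., 2 lineages remain.

    Assumes Kingman's coalescent with constant population size and time in
    coalescent units, so the k-lineage interval has mean 2 / (k * (k - 1)).
    """
    if n < 2:
        raise ValueError("need at least two sampled lineages")
    return [2.0 / (k * (k - 1)) for k in range(n, 1, -1)]


def expected_tmrca(n):
    """Expected time to the most recent common ancestor of n samples."""
    return sum(expected_coalescent_intervals(n))


# For four samples the intervals are 1/6, 1/3 and 1, which sum to 1.5.
intervals = expected_coalescent_intervals(4)
```

The sum equals 2 (1 - 1/n), so most of the expected tree height comes from the final interval with two lineages. Structured or changing populations, as discussed in the abstract, break this simple form.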

October 2, 2014

11:00

Major recent advances in genome sequencing technology make it feasible that in future epidemics, a sequence will be available for every clinical case that can be identified. In some scenarios, such as agricultural epidemics (where farm-to-farm spread is of more interest than animal-to-animal), diseases such as HIV (where most infected individuals will eventually present themselves to clinicians), and epidemics occurring in well-monitored populations such as hospital inpatients, we will as a consequence be able to acquire a set of sequences representing the pathogens infecting most or all cases in the transmission chain. Genetic data therefore provide an important new tool for the investigation of epidemics, in particular the determination of the epidemic's transmission tree, which describes which case infected which others. As the genetic diversity in a set of sequences taken from the same epidemic will not be enormous even for fast-evolving RNA viruses, the best approach would be to combine genetic and epidemiological data. I present here a new method for transmission tree reconstruction which is integrated into the Bayesian phylogenetics framework available in BEAST. It is based on the observation that if the phylogeny is known, there is a one-to-one correspondence between possible transmission trees and partitions of the internal nodes of the tree into connected subgraphs. The MCMC procedure in BEAST has been modified to sample from the space of trees with nodes partitioned in this way, simultaneously estimating both the phylogenetic tree and the transmission tree. Rather than assuming that the entire tree is generated by a single coalescent process, the posterior probability of a phylogeny is now calculated based on an individual-based model of disease transmission, which can take into account epidemiological characteristics of the host cases, such as spatial location.
I will outline results using simulated data and sequences from the 2003 Dutch epidemic of H7N7 avian influenza.
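
The correspondence between node partitions and transmission trees can be sketched in a few lines of Python. This is an illustrative toy, not the BEAST implementation. The data layout (a `parent` map for the phylogeny and a `host` map assigning every node, tip or internal, to a case), the function name, and the example tree are all assumptions. The sketch relies on the fact that, in a rooted tree, a set of nodes is connected exactly when only one of its nodes has a parent outside the set.

```python
def transmission_tree(parent, host):
    """Read a transmission tree off a phylogeny whose nodes are assigned to hosts.

    parent: dict mapping each node to its parent node (None for the root).
    host:   dict mapping each node to a host label (labels must not be None).
    Returns a dict mapping each host to its infector (None for the index case).
    Raises ValueError if some host's nodes do not form a connected subgraph.
    """
    infector = {}
    for node, par in parent.items():
        par_host = host[par] if par is not None else None
        if par_host == host[node]:
            continue  # this branch stays inside a single host
        # The branch crosses from par_host into host[node]: an infection event.
        if host[node] in infector:
            # A second entry point means this host's nodes are disconnected.
            raise ValueError(f"nodes of host {host[node]!r} are not connected")
        infector[host[node]] = par_host
    return infector


# Hypothetical phylogeny ((A,B),C) with internal nodes "n1" and "root".
parent = {"root": None, "n1": "root", "A": "n1", "B": "n1", "C": "root"}
# One possible partition: "root" is placed in case C, "n1" in case A.
host = {"A": "A", "B": "B", "C": "C", "n1": "A", "root": "C"}
who_infected_whom = transmission_tree(parent, host)
```

Under this assignment, case C is the index case, C infected A, and A infected B. Reassigning the internal nodes to different cases yields a different transmission tree on the same phylogeny, which is the space the abstract describes sampling over.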