November 21, 2014
Meeting: NEXTGEN BIOINFORMATICS USER GROUP and SCOTTISH PHYLOGENY DISCUSSION GROUP
University of St Andrews, UK, 8 December 2014
Invited speaker -
Dr Jo DICKS (National Collection of Yeast Cultures http://www.ncyc.co.uk, Institute of Food Research):
"Estimating and exploiting yeast NGS-based phylogenies for industrial biotechnology".
Contributed talks -
Emma CARROLL: "Assessing the influence of migratory culture on connectivity in the southern right whale".
Deepali BASOYA: "Viral/host gene expression profiles in lymphoid and feather follicle epithelial (FFE) cells infected with Marek's disease virus".
Miguel PINHEIRO: "Determine dimorphic nature of the zoonotic parasite Plasmodium knowlesi".
Georgios KOUTSOVOULOS: "Reconstructing the phylogenetic relationships of nematodes using draft genomes and transcriptomes".
Joanne TAYLOR: "Environment and host genotype influence on fungal endophyte assemblages of Scots Pine".
Attendance is free, but please register in advance.
DETAILS AND REGISTRATION:
Daniel Barker firstname.lastname@example.org
November 17, 2014
November 9, 2014
[Paper] Towards more accurate ancestral protein genotype-phenotype reconstructions with the use of species tree-aware gene trees
Erick Matsen wrote:
Using species-tree aware gene trees for ancestral reconstruction is a good thing!
www.ncbi.nlm.nih.gov Towards more accurate ancestral protein genotype-phenotype reconstructions with the use of species tree-aware gene trees. M Groussin, JK Hobbs, GJ Szöllősi, S Gribaldo, VL Arcus and M Gouy, Molecular biology and evolution, Nov 4 2014
The resurrection of ancestral proteins provides direct insight into how natural selection has shaped proteins found in nature. By tracing substitutions along a gene phylogeny, ancestral proteins can be reconstructed in silico and subsequently synthesized in vitro. This elegant strategy reveals the complex mechanisms responsible for the evolution of protein functions and structures. However, to date, all protein resurrection studies have used simplistic approaches for ancestral sequence reconstruction (ASR), including the assumption that a single sequence alignment alone is sufficient to accurately reconstruct the history of the gene family. The impact of such shortcuts on conclusions about ancestral functions has not been investigated. Here, we show with simulations that utilizing information on species history using a model that accounts for the duplication, horizontal transfer and loss (DTL) of genes statistically increases ASR accuracy. This underscores the importance of the tree topology in the inference of putative ancestors. We validate our in silico predictions using in vitro resurrection of the LeuB enzyme for the ancestor of the Firmicutes, a major and ancient bacterial phylum. With this particular protein, our experimental results demonstrate that information on the species phylogeny results in a biochemically more realistic and kinetically more stable ancestral protein. Additional resurrection experiments with different proteins are necessary to statistically quantify the impact of using species tree-aware gene trees on ancestral protein phenotypes. Nonetheless, our results suggest the need for incorporating both sequence and DTL information in future studies of protein resurrections to accurately define the genotype-phenotype space in which proteins diversify.
October 28, 2014
Erick Matsen wrote:
@cwhidden and I would like to sample from the subtree-prune-regraft (SPR) random walk on rooted phylogenetic trees. Does anyone know an easy way to do this? Chris could roll his own, but I'll bet that there is an easy solution out there.
If we wanted to sample from the random walk on unrooted trees, we could sample from the MrBayes prior. BEAST is rooted, which is nice, but has non-uniform priors on topologies. @mlandis would this be easy with revBayes?
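For small trees, one brute-force way to take steps of a rooted SPR walk is to enumerate every distinct neighbour and pick one uniformly. The sketch below assumes trees are nested 2-tuples with unique string leaf labels and at least three leaves; the function names (`canon`, `spr_neighbours`, `spr_walk`) are made up for illustration and do not come from any package. Note that a simple random walk of this kind visits trees in proportion to their neighbourhood size, so if those sizes differ between trees it is not uniform over topologies.

```python
import random

def canon(t):
    """Canonical form of a rooted binary tree: children ordered by repr."""
    if isinstance(t, str):
        return t
    return tuple(sorted((canon(t[0]), canon(t[1])), key=repr))

def paths(t, prefix=()):
    """Yield the path (tuple of 0/1 child indices) to every node of t."""
    yield prefix
    if not isinstance(t, str):
        yield from paths(t[0], prefix + (0,))
        yield from paths(t[1], prefix + (1,))

def get(t, path):
    """Return the subtree of t found by following path."""
    for i in path:
        t = t[i]
    return t

def replace(t, path, new):
    """Return a copy of t with the node at path replaced by new."""
    if not path:
        return new
    kids = list(t)
    kids[path[0]] = replace(t[path[0]], path[1:], new)
    return tuple(kids)

def prune(t, path):
    """Cut off the subtree at a non-root path; its sibling replaces the parent."""
    sub = get(t, path)
    sibling = get(t, path[:-1] + (1 - path[-1],))
    return replace(t, path[:-1], sibling), sub

def spr_neighbours(t):
    """All distinct trees one rooted SPR move away from t (t itself excluded)."""
    t = canon(t)
    out = set()
    for p in paths(t):
        if not p:            # the root itself cannot be pruned
            continue
        rest, sub = prune(t, p)
        for q in paths(rest):
            # regraft on the edge above node q (q == () makes a new root)
            out.add(canon(replace(rest, q, (get(rest, q), sub))))
    out.discard(t)
    return sorted(out, key=repr)

def spr_walk(t, steps, rng):
    """Simple random walk: move to a uniformly chosen SPR neighbour each step."""
    t = canon(t)
    for _ in range(steps):
        t = rng.choice(spr_neighbours(t))
    return t
```

For example, `spr_walk(((("a", "b"), "c"), ("d", "e")), 100, random.Random(1))` returns a tree on the same five leaves. Enumerating the whole neighbourhood costs roughly quadratic work per step, so this is only sensible for modest numbers of taxa.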
October 15, 2014
Erick Matsen wrote:
New from @alexei_drummond and his postdoc:
The space of ultrametric phylogenetic trees by Alex Gavruskin, Alexei J. Drummond
We introduce two metric spaces on ultrametric phylogenetic trees and compare them with existing models of tree space. We formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered.
I haven't read it in detail, but it seems that the version of the space that is most natural for time-trees (their t-space) has properties that make it mathematically difficult to analyze. The combinatorial machinery that helped out with the BHV space doesn't help here.
Theorem 8. The problem of computing geodesics in t-space is NP-hard. We will reduce the problem of computing NNI-distance to the problem of computing geodesics in t-space, but before going on to the proof of this result, we would like to develop some intuition of why t-space is so different from both BHV and τ-space. The key property for this difference is that the cone-path is rarely a geodesic in t-space. Indeed, in both BHV and τ-space the position of two cubes can result in a cone-path being the geodesic between any pair of trees from these cubes. Particularly, the measure of the set of pairs of trees between which the cone-path is a geodesic is positive. For example, if two trees T and R have topologies with no compatible splits, then the geodesic between T and R is a cone-path. A property such as this does not present in t-space. It will follow from the observations below that the measure of the set of pairs of trees between which the geodesic is a cone-path in t-space has measure 0.
I know @cwhidden has been reading it so perhaps he'll post some observations.
October 10, 2014
Erick Matsen wrote:
From Gascuel & co:
www.ncbi.nlm.nih.gov Searching for virus phylotypes. F Chevenet, M Jung, M Peeters, T de Oliveira and O Gascuel, Bioinformatics (Oxford, England), Mar 2013
Large phylogenies are being built today to study virus evolution, trace the origin of epidemics, establish the mode of transmission and survey the appearance of drug resistance. However, no tool is available to quickly inspect these phylogenies and combine them with extrinsic traits (e.g. geographic location, risk group, presence of a given resistance mutation), seeking to extract strain groups of specific interest or requiring surveillance.
News to me!
October 6, 2014
Rob Lanfear wrote:
I'm wondering what software folks use on Macs for automatically aligning sequences and then manually editing those alignments?
I know there's lots of software out there, but I'm wondering if there's something I've missed. In principle I like the offerings in Geneious (it includes various plug-ins for automated alignment, and a very serviceable manual editor), but the price tag is a bit steep if that's all you want it for...
Fast log likelihood given a fixed alignment, fixed tree, and fixed huge arbitrary sparse transition rate matrix
I know some ways to compute this, but I wonder who has the best current implementation? This would just be a tool for methods development testing rather than for doing anything practical, for example it wouldn't estimate anything and it wouldn't need to know anything about biology.
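One of the ways to compute this, sketched here for reference, is Felsenstein pruning where each branch applies exp(Qt) to a vector by uniformization, so the dense transition matrix is never formed and each step costs about one sparse matrix-vector product. Everything in the sketch is an assumption made for illustration: Q is a dict of dicts `{i: {j: rate}}` with rows summing to zero, states are integer indices, an internal node is a tuple of `(child, branch_length)` pairs, and a tip is a name string. There is no rescaling against underflow, and the plain `exp(-mu * t)` weight will underflow when `mu * t` is large, so this is a test-bench toy and not a tuned implementation.

```python
import math

def expm_action(Q, t, v, tol=1e-12, max_terms=10000):
    """Return exp(Q*t) @ v for a sparse rate matrix Q = {i: {j: rate}}."""
    mu = max((-row.get(i, 0.0) for i, row in Q.items()), default=0.0)
    if mu == 0.0 or t == 0.0:
        return list(v)
    # Uniformization: exp(Qt) = sum_k Poisson(k; mu*t) * P^k with P = I + Q/mu
    w = list(v)
    weight = math.exp(-mu * t)
    out = [weight * x for x in w]
    total, k = weight, 0
    while 1.0 - total > tol and k < max_terms:
        k += 1
        nxt = list(w)                      # identity part of P @ w
        for i, row in Q.items():           # sparse part: (Q/mu) @ w
            nxt[i] += sum(r * w[j] for j, r in row.items()) / mu
        w = nxt
        weight *= mu * t / k
        out = [o + weight * x for o, x in zip(out, w)]
        total += weight
    return out

def partial(node, tips, Q, n):
    """Conditional likelihood vector at node for a single site."""
    if isinstance(node, str):
        vec = [0.0] * n
        vec[tips[node]] = 1.0
        return vec
    vec = [1.0] * n
    for child, t in node:
        down = expm_action(Q, t, partial(child, tips, Q, n))
        vec = [a * b for a, b in zip(vec, down)]
    return vec

def log_likelihood(tree, alignment, Q, root_freqs):
    """Sum of per-site log likelihoods; alignment maps tip name -> state indices."""
    n = len(root_freqs)
    n_sites = len(next(iter(alignment.values())))
    total = 0.0
    for s in range(n_sites):
        tips = {name: seq[s] for name, seq in alignment.items()}
        vec = partial(tree, tips, Q, n)
        total += math.log(sum(p * x for p, x in zip(root_freqs, vec)))
    return total
```

A quick sanity check is a two-state symmetric model on a two-tip tree, where the answer is available in closed form. For a genuinely huge state space the per-site vectors would also want a sparse or array-based representation.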
September 30, 2014
Erick Matsen wrote:
New from @tanja_stadler and co:
www.ncbi.nlm.nih.gov On age and species richness of higher taxa. T Stadler, DL Rabosky, RE Ricklefs and F Bokma, The American naturalist, Oct 2014
Abstract Many studies have tried to identify factors that explain differences in numbers of species between clades against the background assumption that older clades contain more species because they have had more time for diversity to accumulate. The finding in several recent studies that species richness of clades is decoupled from stem age has been interpreted as evidence for ecological limits to species richness. Here we demonstrate that the absence of a positive age-diversity relationship, or even a negative relationship, may also occur when taxa are defined based on time or some correlate of time such as genetic distance or perhaps morphological distinctness. Thus, inferring underlying processes from distributions of species across higher taxa requires caution concerning the way in which higher taxa are defined. When this definition is unclear, crown age is superior to stem age as a measure of clade age.
They were thinking about what models might not have a monotonically positive age-diversity relationship for clades:
Several studies have investigated relations between species richness and ages of higher taxa. Three methodological articles (Magallón and Sanderson 2001; Bokma 2003; Paradis 2003) prominently featuring the idea that E[n] = e^{(λ − μ)t} have together been cited by more than 500 articles. Furthermore, Rabosky et al. (2012) investigated the behavior of a simple model where higher taxa originate under a Poisson process (see also Aldous et al. 2008; Maruvka et al. 2013). They found that such a model was expected to result in positive relationships between stem clade age and species richness, even when rates of species diversification varied among clades, provided that rates within clades were constant through time. As we have shown here, the expectation of a positive relationship between stem age and species richness may be incorrect, as it depends on the particular model of diversification and definition of higher taxa.
Many studies have identified young taxa as “unexpectedly” species rich, but our results show that such patterns can result from the manner in which higher taxa are delimited. For example, under scenarios i-b and ii-b, clades with young stem ages are expected to contain not fewer but more species than clades with old stem ages (table 1). In other words, studies may have incorrectly identified young taxa as unexpectedly species rich because they neglected how taxa were defined, and consequently incorrectly expected young taxa to be species poor.
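For anyone who wants to see the baseline expectation E[n] from the quote in action, here is a small sketch of a constant-rate birth-death process started from a single lineage, with a simulation to compare against the formula. The function names and the parameter values in the usage note are my own choices and are not taken from the paper. The expectation is unconditional: it averages over clades that have gone extinct (n = 0) as well as survivors.

```python
import math
import random

def expected_richness(lam, mu, t):
    """E[n] = exp((lam - mu) * t) for one lineage at time 0."""
    return math.exp((lam - mu) * t)

def simulate_richness(lam, mu, t, rng):
    """Number of species alive at time t under speciation rate lam, extinction rate mu."""
    n, now = 1, 0.0
    while n > 0:
        # waiting time to the next event among n lineages
        now += rng.expovariate(n * (lam + mu))
        if now > t:
            break
        # the event is a speciation with probability lam / (lam + mu)
        n += 1 if rng.random() < lam / (lam + mu) else -1
    return n

def mean_richness(lam, mu, t, reps, rng):
    """Monte Carlo estimate of E[n], including extinct replicates."""
    return sum(simulate_richness(lam, mu, t, rng) for _ in range(reps)) / reps
```

With, say, `lam=1.0`, `mu=0.5`, `t=2.0` the simulated mean over many replicates should sit close to `expected_richness(1.0, 0.5, 2.0)`. The point of the paper is that how clades are delimited can break the positive age-richness relationship that this baseline suggests.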
Here is the model they consider:
September 29, 2014
Erick Matsen wrote:
haldanessieve.org Author post: Predicting evolution from the shape of genealogical trees
Here's what I see as the essentials of their model:
September 26, 2014
September 19, 2014
Brian Foley wrote:
A user in another phylogenetics discussion group today had a question about analyzing more than 100 sequences, each more than 80,000 bases long, all from one gene. This led me to assume the sequences were from closely related organisms, because otherwise the introns could be too diverse to align while the exons were still alignable. This made me wonder: if we have 100 very long sequences from a single species of mammal (for example humans sampled around the world), what types of tests can be done to look for recombination, and how can we measure the phylogenetic signal-to-noise ratio in the data? The consistency index and retention index are two useful measurements, but I rarely see them reported for data sets, and most phylogenetic software packages do not compute them and display them with the results.
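Since few packages report them, here is a rough sketch of how the ensemble consistency index and retention index can be computed for a given tree using Fitch parsimony on unordered characters. It assumes a rooted binary tree written as nested 2-tuples of tip names and an alignment given as a dict of equal-length strings; the function names are made up for this example, and gaps or ambiguity codes are simply treated as extra states, which a real analysis would want to handle properly.

```python
from collections import Counter

def fitch_steps(tree, states):
    """Return (possible state set, number of changes) for one character."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_steps(tree[0], states)
    right_set, right_cost = fitch_steps(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def ci_ri(tree, alignment):
    """Ensemble CI = M/S and RI = (G - S)/(G - M), summed over all sites."""
    n_sites = len(next(iter(alignment.values())))
    m = s = g = 0
    for site in range(n_sites):
        column = {name: seq[site] for name, seq in alignment.items()}
        counts = Counter(column.values())
        m += len(counts) - 1                     # fewest steps on any tree
        g += len(column) - max(counts.values())  # most steps on any tree
        s += fitch_steps(tree, column)[1]        # steps on this tree
    ci = m / s if s else float("nan")
    ri = (g - s) / (g - m) if g > m else float("nan")
    return ci, ri
```

For instance, `ci_ri((("A", "B"), ("C", "D")), {"A": "AC", "B": "AT", "C": "GC", "D": "GT"})` has one site that fits the tree perfectly and one that needs two changes. Constant sites contribute nothing to either index, while singleton (autapomorphic) sites push CI upward without affecting RI, which is one reason the two are usually read together.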
September 11, 2014
Alex Jeffries wrote:
I'm about to run my Final Year BSc (hons) molecular phylogenetics unit again and am looking for some inspiration. In the past I have used Trypanosome genes as a dataset for the coursework exercises, but it's starting to feel stale to me. Does anyone have any suggestions for an interesting phylogenetic question and dataset that would allow students to collect sequences, align, make inferences and thereby test some sort of hypothesis? Preferably (and this is the hard bit) not previously published so they can't just crib from the papers.
Many thanks in advance.
September 7, 2014
Krzysztof M. Kozak wrote:
I am building a pipeline to automatically generate gene trees for about 10,000 CDS alignments (all genes from an exome). The genes were sequenced for 150 individuals in multiple species. Some individuals are worse than others and occasionally have little data in some alignments, and end up on obviously artificially inflated branches. Is anyone aware of a tool to prune those automatically? (I will also use tools to get rid of poor sequence first, but that's a different topic.)
Many thanks, Krzysztof Kozak