phylobabble.org

Latest topics

URL

XML feed
http://www.phylobabble.org/latest

November 21, 2014

10:14

Db60 wrote:

Meeting: NEXTGEN BIOINFORMATICS USER GROUP and SCOTTISH PHYLOGENY DISCUSSION GROUP

University of St Andrews, UK, 8 December 2014

https://genomics.ed.ac.uk/ngbug/next-meeting-st-andrews

Invited speaker -

Dr Jo DICKS (National Collection of Yeast Cultures http://www.ncyc.co.uk, Institute of Food Research):

"Estimating and exploiting yeast NGS-based phylogenies for industrial biotechnology".

Contributed talks -

Emma CARROLL: "Assessing the influence of migratory culture on connectivity in the southern right whale".

Deepali BASOYA: "Viral/host gene expression profiles in lymphoid and feather follicle epithelial (FFE) cells infected with Marek's disease virus".

Miguel PINHEIRO: "Determine dimorphic nature of the zoonotic parasite Plasmodium knowlesi".

Georgios KOUTSOVOULOS: "Reconstructing the phylogenetic relationships of nematodes using draft genomes and transcriptomes".

Joanne TAYLOR: "Environment and host genotype influence on fungal endophyte assemblages of Scots Pine".

Attendance is free, but please register in advance.

DETAILS AND REGISTRATION:

https://genomics.ed.ac.uk/ngbug/next-meeting-st-andrews

Daniel Barker db60@st-andrews.ac.uk

Posts: 1

Participants: 1

November 17, 2014

18:52

Erick Matsen wrote:

The next series on http://phyloseminar.org will be about ancestral recombination graphs.

What would people like to see after that?

Posts: 1

Participants: 1

November 9, 2014

07:48

Erick Matsen wrote:

Using species-tree aware gene trees for ancestral reconstruction is a good thing!

www.ncbi.nlm.nih.gov Towards more accurate ancestral protein genotype-phenotype reconstructions with the use of species tree-aware gene trees. M Groussin, JK Hobbs, GJ Szöllősi, S Gribaldo, VL Arcus and M Gouy, Molecular biology and evolution, Nov 4 2014

The resurrection of ancestral proteins provides direct insight into how natural selection has shaped proteins found in nature. By tracing substitutions along a gene phylogeny, ancestral proteins can be reconstructed in silico and subsequently synthesized in vitro. This elegant strategy reveals the complex mechanisms responsible for the evolution of protein functions and structures. However, to date, all protein resurrection studies have used simplistic approaches for ancestral sequence reconstruction (ASR), including the assumption that a single sequence alignment alone is sufficient to accurately reconstruct the history of the gene family. The impact of such shortcuts on conclusions about ancestral functions has not been investigated. Here, we show with simulations that utilizing information on species history using a model that accounts for the duplication, horizontal transfer and loss (DTL) of genes statistically increases ASR accuracy. This underscores the importance of the tree topology in the inference of putative ancestors. We validate our in silico predictions using in vitro resurrection of the LeuB enzyme for the ancestor of the Firmicutes, a major and ancient bacterial phylum. With this particular protein, our experimental results demonstrate that information on the species phylogeny results in a biochemically more realistic and kinetically more stable ancestral protein. Additional resurrection experiments with different proteins are necessary to statistically quantify the impact of using species tree-aware gene trees on ancestral protein phenotypes. Nonetheless, our results suggest the need for incorporating both sequence and DTL information in future studies of protein resurrections to accurately define the genotype-phenotype space in which proteins diversify.

Posts: 1

Participants: 1

October 28, 2014

13:16

Erick Matsen wrote:

Fellow babblers,

@cwhidden and I would like to sample from the subtree-prune-regraft (SPR) random walk on rooted phylogenetic trees. Does anyone know an easy way to do this? Chris could roll his own, but I'll bet that there is an easy solution out there.

If we wanted to sample from the random walk on unrooted trees, we could sample from the MrBayes prior. BEAST is rooted, which is nice, but has non-uniform priors on topologies. @mlandis would this be easy with RevBayes?

Thanks!
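For anyone who wants to play with the idea before a proper tool turns up, here is a minimal sketch of one way such a walk could be rolled by hand. It is an illustration only, not what any of the packages mentioned above do. The representation (leaves as strings, internal nodes as 2-tuples) and all function names are assumptions made up for this example. Each step picks a prune node and a regraft edge uniformly at random, so the walk is lazy (a step can return the same tree) and is not guaranteed to be uniform over distinct neighbouring trees.

```python
import random

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of 0/1 child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree):
            yield from subtrees(child, path + (i,))

def prune(tree, path):
    """Cut out the subtree at a non-empty path; return (remaining_tree, pruned_subtree)."""
    i = path[0]
    if len(path) == 1:
        # The sibling takes the place of the parent node.
        return tree[1 - i], tree[i]
    rest, pruned = prune(tree[i], path[1:])
    children = list(tree)
    children[i] = rest
    return tuple(children), pruned

def regraft(tree, path, subtree):
    """Attach subtree on the edge above the node at path (empty path = above the root)."""
    if not path:
        return (tree, subtree)
    i = path[0]
    children = list(tree)
    children[i] = regraft(tree[i], path[1:], subtree)
    return tuple(children)

def random_spr(tree, rng):
    """Apply one random rooted SPR move to a binary tree with at least two leaves."""
    prune_paths = [p for p, _ in subtrees(tree) if p]  # every node except the root
    rest, pruned = prune(tree, rng.choice(prune_paths))
    graft_paths = [p for p, _ in subtrees(rest)]       # every edge, plus above the root
    return regraft(rest, rng.choice(graft_paths), pruned)

def spr_walk(tree, steps, seed=None):
    """Yield the successive trees visited by the (hypothetical) SPR walk."""
    rng = random.Random(seed)
    for _ in range(steps):
        tree = random_spr(tree, rng)
        yield tree

def leaves(tree):
    """Sorted list of leaf labels."""
    if isinstance(tree, tuple):
        return sorted(leaves(tree[0]) + leaves(tree[1]))
    return [tree]
```

Something like `list(spr_walk(((("A", "B"), "C"), ("D", "E")), steps=1000, seed=1))` would give a chain of trees. Child order inside a tuple carries no meaning, so trees would need to be put in a canonical form before counting how often each topology is visited.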

Posts: 10

Participants: 4

October 15, 2014

17:39

Erick Matsen wrote:

New from @alexei_drummond and his postdoc:

The space of ultrametric phylogenetic trees by Alex Gavruskin, Alexei J. Drummond

We introduce two metric spaces on ultrametric phylogenetic trees and compare them with existing models of tree space. We formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered.

http://arxiv.org/abs/1410.3544

I haven't read it in detail, but it seems that the version of the space that is most natural for time-trees (their t-space) has properties that make it mathematically difficult to analyze. The combinatorial machinery that helped out with the BHV space doesn't help here.

Theorem 8. The problem of computing geodesics in t-space is NP-hard. We will reduce the problem of computing NNI-distance to the problem of computing geodesics in t-space, but before going on to the proof of this result, we would like to develop some intuition of why t-space is so different from both BHV and τ-space. The key property for this difference is that the cone-path is rarely a geodesic in t-space. Indeed, in both BHV and τ-space the position of two cubes can result in a cone-path being the geodesic between any pair of trees from these cubes. Particularly, the measure of the set of pairs of trees between which the cone-path is a geodesic is positive. For example, if two trees T and R have topologies with no compatible splits, then the geodesic between T and R is a cone-path [3]. A property such as this does not present in t-space. It will follow from the observations below that the measure of the set of pairs of trees between which the geodesic is a cone-path in t-space has measure 0.

I know @cwhidden has been reading it so perhaps he'll post some observations.

Posts: 1

Participants: 1

October 10, 2014

15:17

Erick Matsen wrote:

From Gascuel & co--

www.ncbi.nlm.nih.gov Searching for virus phylotypes. F Chevenet, M Jung, M Peeters, T de Oliveira and O Gascuel, Bioinformatics (Oxford, England), Mar 2013

Large phylogenies are being built today to study virus evolution, trace the origin of epidemics, establish the mode of transmission and survey the appearance of drug resistance. However, no tool is available to quickly inspect these phylogenies and combine them with extrinsic traits (e.g. geographic location, risk group, presence of a given resistance mutation), seeking to extract strain groups of specific interest or requiring surveillance.

http://lamarck.lirmm.fr/phylotype/

News to me!

Posts: 1

Participants: 1

October 6, 2014

22:41

Rob Lanfear wrote:

Hi All,

I'm wondering what software folks use for automatically aligning sequences and then manually editing those alignments on Macs?

I know there's lots of software out there, but I'm wondering if there's something I've missed. In principle I like the offerings in Geneious (it includes various plug-ins for automated alignment, and a very serviceable manual editor), but the price tag is a bit steep if that's all you want it for...

Cheers,

Rob

Posts: 7

Participants: 6

12:27

argriffing wrote:

I know some ways to compute this, but I wonder who has the best current implementation? This would just be a tool for methods development testing rather than for doing anything practical, for example it wouldn't estimate anything and it wouldn't need to know anything about biology.

Posts: 3

Participants: 3

September 30, 2014

14:28

Erick Matsen wrote:

New from @tanja_stadler and co:

www.ncbi.nlm.nih.gov On age and species richness of higher taxa. T Stadler, DL Rabosky, RE Ricklefs and F Bokma, The American naturalist, Oct 2014

Abstract: Many studies have tried to identify factors that explain differences in numbers of species between clades against the background assumption that older clades contain more species because they have had more time for diversity to accumulate. The finding in several recent studies that species richness of clades is decoupled from stem age has been interpreted as evidence for ecological limits to species richness. Here we demonstrate that the absence of a positive age-diversity relationship, or even a negative relationship, may also occur when taxa are defined based on time or some correlate of time such as genetic distance or perhaps morphological distinctness. Thus, inferring underlying processes from distributions of species across higher taxa requires caution concerning the way in which higher taxa are defined. When this definition is unclear, crown age is superior to stem age as a measure of clade age.

They were thinking about what models might not have a monotonically positive age-diversity relationship for clades:

Several studies have investigated relations between species richness and ages of higher taxa. Three methodological articles (Magallón and Sanderson 2001; Bokma 2003; Paradis 2003) prominently featuring the idea that E[n] = e^{(λ − μ)t} have together been cited by more than 500 articles. Furthermore, Rabosky et al. (2012) investigated the behavior of a simple model where higher taxa originate under a Poisson process (see also Aldous et al. 2008; Maruvka et al. 2013). They found that such a model was expected to result in positive relationships between stem clade age and species richness, even when rates of species diversification varied among clades, provided that rates within clades were constant through time. As we have shown here, the expectation of a positive relationship between stem age and species richness may be incorrect, as it depends on the particular model of diversification and definition of higher taxa.

Many studies have identified young taxa as “unexpectedly” species rich, but our results show that such patterns can result from the manner in which higher taxa are delimited. For example, under scenarios i-b and ii-b, clades with young stem ages are expected to contain not fewer but more species than clades with old stem ages (table 1). In other words, studies may have incorrectly identified young taxa as unexpectedly species rich because they neglected how taxa were defined, and consequently incorrectly expected young taxa to be species poor.
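To make the baseline expectation concrete: under a constant-rate birth-death process started from a single lineage (and not conditioned on survival), the expected number of species after time t is E[n] = e^{(λ − μ)t}, which grows with age whenever λ > μ. The sketch below just evaluates that formula; the function name and the example rates are made up for illustration and are not taken from the paper.

```python
import math

def expected_richness(speciation_rate, extinction_rate, age):
    """Expected species count E[n] = exp((lambda - mu) * t) for a clade of the given age.

    Assumes constant rates and a single starting lineage, with no
    conditioning on the clade surviving to the present.
    """
    return math.exp((speciation_rate - extinction_rate) * age)

# Hypothetical rates, in events per lineage per million years.
for age in (5, 10, 20):
    print(age, expected_richness(0.2, 0.1, age))
```

This is the positive age-diversity relationship that the quoted passage says can disappear, or even reverse, depending on how higher taxa are delimited.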

Here is the model they consider:

[Pasted image of the model, 1186x674; not reproduced here]

Posts: 7

Participants: 3

September 29, 2014

September 26, 2014

14:57

Erick Matsen wrote:

Hopefully we will get some nice simple shells out of this mess.

Posts: 3

Participants: 1

September 19, 2014

05:04

Brian Foley wrote:

A user in another phylogenetics discussion group today had a question about analyzing more than 100 sequences, each more than 80,000 bases long, all from one gene. This led me to assume the sequences were from closely related organisms, because otherwise the introns could be too diverse to align while the exons were still alignable. This made me wonder: if we have 100 very long sequences from a single species of mammal (for example, humans sampled around the world), what types of tests can be done to look for recombination, and how can we measure the phylogenetic signal-to-noise ratio in the data? The consistency index and retention index are two useful measurements, but I rarely see them reported for data sets, and most phylogenetic software packages do not compute them and display them with the results.
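For readers who have not met them, the two indices are simple ratios of parsimony step counts. With m the minimum possible number of steps for the data, s the number of steps observed on the tree, and g the maximum possible number of steps, CI = m / s and RI = (g − s) / (g − m). The sketch below computes both from step counts that are assumed to have been obtained already (for example, summed over characters from a parsimony program); the function names are made up for this example.

```python
def consistency_index(min_steps, observed_steps):
    """CI = m / s; equals 1.0 when the data fit the tree with no homoplasy."""
    if observed_steps <= 0:
        raise ValueError("observed_steps must be positive")
    return min_steps / observed_steps

def retention_index(min_steps, observed_steps, max_steps):
    """RI = (g - s) / (g - m); undefined when g == m (no informative characters)."""
    if max_steps == min_steps:
        raise ValueError("retention index is undefined when max_steps == min_steps")
    return (max_steps - observed_steps) / (max_steps - min_steps)

# Hypothetical step counts summed over all characters in an alignment.
print(consistency_index(120, 150))     # 0.8
print(retention_index(120, 150, 420))  # 0.9
```

Both values lie between 0 and 1 for informative data, with values near 1 indicating little homoplasy on the tree in question.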

Posts: 1

Participants: 1

September 11, 2014

04:07

Alex Jeffries wrote:

I'm about to run my Final Year BSc (hons) molecular phylogenetics unit again and am looking for some inspiration. In the past I have used Trypanosome genes as a dataset for the coursework exercises, but it's starting to feel stale to me. Does anyone have any suggestions for an interesting phylogenetic question and dataset that would allow students to collect sequences, align, make inferences and thereby test some sort of hypothesis? Preferably (and this is the hard bit) not previously published so they can't just crib from the papers.

Many thanks in advance.

Posts: 7

Participants: 4

September 7, 2014

14:34

Krzysztof M. Kozak wrote:

Dear All,

I am building a pipeline to automatically generate gene trees for about 10,000 CDS alignments (all genes from an exome). The genes were sequenced for 150 individuals in multiple species. Some individuals are worse than others and occasionally have little data in some alignments, and end up on obviously artificially inflated branches. Is anyone aware of a tool to prune those automatically? (I will also use tools to get rid of poor sequence first, but that's a different topic.)

Many thanks, Krzysztof Kozak

Posts: 3

Participants: 3
