Latest topics



October 28, 2014


Erick Matsen wrote:

Fellow babblers,

@cwhidden and I would like to sample from the subtree-prune-regraft (SPR) random walk on rooted phylogenetic trees. Does anyone know an easy way to do this? Chris could roll his own, but I'll bet that there is an easy solution out there.

If we wanted to sample from the random walk on unrooted trees, we could sample from the MrBayes prior. BEAST is rooted, which is nice, but has non-uniform priors on topologies. @mlandis would this be easy with revBayes?
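For anyone who ends up rolling their own, here is a minimal Python sketch of a random walk that applies rooted SPR moves to a binary tree. The names `Node`, `nodes`, `random_spr`, and `newick` are hypothetical, invented for this illustration, and are not taken from any of the packages mentioned above. The sketch picks the pruned subtree and the regraft edge uniformly at random, and it allows the null move that regrafts back onto the original sibling edge. It makes no claim about the stationary distribution of the walk: several moves can lead to the same neighbouring tree, so sampling uniformly over topologies would likely need an additional correction step such as Metropolis-Hastings.

```python
import random


class Node:
    """Node of a rooted binary tree (hypothetical helper class)."""

    def __init__(self, name=None, children=()):
        self.name = name
        self.parent = None
        self.children = []
        for child in children:
            child.parent = self
            self.children.append(child)


def nodes(root):
    """Return all nodes in the subtree below (and including) root."""
    stack, out = [root], []
    while stack:
        n = stack.pop()
        out.append(n)
        stack.extend(n.children)
    return out


def random_spr(root, rng=random):
    """Apply one random rooted SPR move in place and return the new root."""
    candidates = [n for n in nodes(root) if n.parent is not None]
    if not candidates:  # a single-leaf tree has nothing to prune
        return root
    v = rng.choice(candidates)  # root of the subtree to prune
    p = v.parent                # p travels with v and is reused on regraft
    s = next(c for c in p.children if c is not v)  # sibling of v
    g = p.parent
    # Prune: remove p from the tree and connect the sibling to the grandparent.
    p.children = [v]
    p.parent = None
    if g is None:
        s.parent = None
        root = s
    else:
        g.children[g.children.index(p)] = s
        s.parent = g
    # Regraft: pick an edge of the remaining tree, identified by its lower
    # node u. Choosing the root means regrafting above the root. Choosing s
    # restores the original tree, so the walk is lazy.
    u = rng.choice(nodes(root))
    h = u.parent
    p.children = [v, u]
    u.parent = p
    p.parent = h
    if h is None:
        root = p
    else:
        h.children[h.children.index(u)] = p
    return root


def newick(n):
    """Canonical Newick-like string, with children sorted for comparison."""
    if not n.children:
        return n.name
    return "(" + ",".join(sorted(newick(c) for c in n.children)) + ")"


if __name__ == "__main__":
    rng = random.Random(1)
    tree = Node(children=[Node(children=[Node("A"), Node("B")]),
                          Node(children=[Node("C"), Node("D")])])
    for _ in range(5):
        tree = random_spr(tree, rng)
        print(newick(tree) + ";")
```

Reusing the pruned parent node `p` as the new attachment node keeps the tree binary without allocating or deleting nodes on each step.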


Posts: 4

Participants: 3


October 15, 2014


Erick Matsen wrote:

New from @alexei_drummond and his postdoc:

The space of ultrametric phylogenetic trees by Alex Gavruskin, Alexei J. Drummond

We introduce two metric spaces on ultrametric phylogenetic trees and compare them with existing models of tree space. We formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered.

I haven't read it in detail, but it seems that the version of the space that is most natural for time-trees (their t-space) has properties that make it mathematically difficult to analyze. The combinatorial machinery that helped out with the BHV space doesn't help here.

Theorem 8. The problem of computing geodesics in t-space is NP-hard. We will reduce the problem of computing NNI-distance to the problem of computing geodesics in t-space, but before going on to the proof of this result, we would like to develop some intuition of why t-space is so different from both BHV and τ-space. The key property for this difference is that the cone-path is rarely a geodesic in t-space. Indeed, in both BHV and τ-space the position of two cubes can result in a cone-path being the geodesic between any pair of trees from these cubes. Particularly, the measure of the set of pairs of trees between which the cone-path is a geodesic is positive. For example, if two trees T and R have topologies with no compatible splits, then the geodesic between T and R is a cone-path [3]. A property such as this does not present in t-space. It will follow from the observations below that the measure of the set of pairs of trees between which the geodesic is a cone-path in t-space has measure 0.

I know @cwhidden has been reading it so perhaps he'll post some observations.

Posts: 1

Participants: 1


October 10, 2014


Erick Matsen wrote:

From Gascuel & co: Searching for virus phylotypes. F Chevenet, M Jung, M Peeters, T de Oliveira and O Gascuel, Bioinformatics (Oxford, England), Mar 2013

Large phylogenies are being built today to study virus evolution, trace the origin of epidemics, establish the mode of transmission and survey the appearance of drug resistance. However, no tool is available to quickly inspect these phylogenies and combine them with extrinsic traits (e.g. geographic location, risk group, presence of a given resistance mutation), seeking to extract strain groups of specific interest or requiring surveillance.

News to me!

Posts: 1

Participants: 1


October 6, 2014


Rob Lanfear wrote:

Hi All,

I'm wondering what software folks use for automatically aligning sequences and then manually editing those alignments on Macs?

I know there's lots of software out there, but I'm wondering if there's something I've missed. In principle I like the offerings in Geneious (it includes various plug-ins for automated alignment, and a very serviceable manual editor), but the price tag is a bit steep if that's all you want it for...



Posts: 6

Participants: 6



argriffing wrote:

I know some ways to compute this, but I wonder who has the best current implementation? This would just be a tool for methods development testing rather than for doing anything practical, for example it wouldn't estimate anything and it wouldn't need to know anything about biology.

Posts: 3

Participants: 3


September 30, 2014


Erick Matsen wrote:

New from @tanja_stadler and co: On age and species richness of higher taxa. T Stadler, DL Rabosky, RE Ricklefs and F Bokma, The American Naturalist, Oct 2014

Abstract: Many studies have tried to identify factors that explain differences in numbers of species between clades against the background assumption that older clades contain more species because they have had more time for diversity to accumulate. The finding in several recent studies that species richness of clades is decoupled from stem age has been interpreted as evidence for ecological limits to species richness. Here we demonstrate that the absence of a positive age-diversity relationship, or even a negative relationship, may also occur when taxa are defined based on time or some correlate of time such as genetic distance or perhaps morphological distinctness. Thus, inferring underlying processes from distributions of species across higher taxa requires caution concerning the way in which higher taxa are defined. When this definition is unclear, crown age is superior to stem age as a measure of clade age.

They were thinking about what models might not have a monotonically positive age-diversity relationship for clades:

Several studies have investigated relations between species richness and ages of higher taxa. Three methodological articles (Magallón and Sanderson 2001; Bokma 2003; Paradis 2003) prominently featuring the idea that $E[n] = e^{(\lambda - \mu)t}$ have together been cited by more than 500 articles. Furthermore, Rabosky et al. (2012) investigated the behavior of a simple model where higher taxa originate under a Poisson process (see also Aldous et al. 2008; Maruvka et al. 2013). They found that such a model was expected to result in positive relationships between stem clade age and species richness, even when rates of species diversification varied among clades, provided that rates within clades were constant through time. As we have shown here, the expectation of a positive relationship between stem age and species richness may be incorrect, as it depends on the particular model of diversification and definition of higher taxa.

Many studies have identified young taxa as “unexpectedly” species rich, but our results show that such patterns can result from the manner in which higher taxa are delimited. For example, under scenarios i-b and ii-b, clades with young stem ages are expected to contain not fewer but more species than clades with old stem ages (table 1). In other words, studies may have incorrectly identified young taxa as unexpectedly species rich because they neglected how taxa were defined, and consequently incorrectly expected young taxa to be species poor.
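To make the background expectation in the quoted passage concrete, here is a small Python sketch of the constant-rate birth-death result that the expected number of species after time t, starting from one lineage, is exp((λ − μ)t). The function name `expected_richness` and the rate values are made up for illustration and are not taken from the paper. Note that this expectation averages over all outcomes, including clades that have gone extinct.

```python
import math


def expected_richness(lam, mu, t):
    """Expected species count after time t under a constant-rate
    birth-death process started from a single lineage:
    E[n] = exp((lam - mu) * t)."""
    return math.exp((lam - mu) * t)


if __name__ == "__main__":
    # Hypothetical speciation and extinction rates (per unit time).
    lam, mu = 0.3, 0.1
    for age in (5, 10, 20):
        print(age, expected_richness(lam, mu, age))
```

Under this model, richness grows monotonically with clade age whenever λ > μ, which is the positive age-diversity relationship that the paper argues need not hold once the definition of higher taxa is taken into account.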

Here is the model they consider:

[Pasted image of the model, not reproduced here]

Posts: 6

Participants: 3


September 29, 2014

September 26, 2014


Erick Matsen wrote:

Hopefully we will get some nice simple shells out of this mess.

Posts: 3

Participants: 1


September 19, 2014


Brian Foley wrote:

A user in another phylogenetics discussion group today had a question about analyzing more than 100 sequences, each more than 80,000 bases long, all from one gene. This led me to assume the sequences were from closely related organisms, because otherwise the introns could be too diverse to align while the exons were still alignable. This made me wonder: if we have 100 very long sequences from a single species of mammal (for example, humans sampled around the world), what types of tests can be done to look for recombination, and how can we measure the phylogenetic signal-to-noise ratio in the data? The consistency index and retention index are two useful measurements, but I rarely see them reported for data sets, and most phylogenetic software packages do not compute them and display them with the results.
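Since few packages report these indices, here is a minimal Python sketch of the standard ensemble definitions: CI = M/S and RI = (G − S)/(G − M), where M is the minimum possible number of steps, S the number of steps observed on the tree, and G the maximum possible number of steps, each summed over characters. The function names are hypothetical. The sketch assumes unordered (Fitch-style) characters with no missing data or gaps, and it assumes the observed step counts per character have already been obtained from a parsimony program, since computing them requires the tree.

```python
from collections import Counter


def char_bounds(states):
    """Return (m, g) for one unordered character column.

    m: minimum possible steps on any tree (number of states - 1)
    g: maximum possible steps on any tree (taxa not in the commonest state)
    """
    counts = Counter(states)
    m = len(counts) - 1
    g = len(states) - max(counts.values())
    return m, g


def ensemble_indices(columns, observed_steps):
    """Return (CI, RI) summed over characters.

    columns: one string (or list) of states per alignment column
    observed_steps: parsimony steps per column on the tree being evaluated
    """
    total_m = total_g = 0
    for col in columns:
        m, g = char_bounds(col)
        total_m += m
        total_g += g
    total_s = sum(observed_steps)
    ci = total_m / total_s if total_s else float("nan")
    ri = ((total_g - total_s) / (total_g - total_m)
          if total_g != total_m else float("nan"))
    return ci, ri


if __name__ == "__main__":
    # Two made-up columns for four taxa, with made-up step counts:
    # the first fits the tree perfectly, the second needs an extra step.
    cols = ["AAGG", "ACAC"]
    steps = [1, 2]
    print(ensemble_indices(cols, steps))
```

Constant and otherwise uninformative columns contribute equally to G and M, so an alignment with no informative columns yields an undefined RI, which the sketch reports as NaN.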

Posts: 1

Participants: 1


September 11, 2014


Alex Jeffries wrote:

I'm about to run my Final Year BSc (hons) molecular phylogenetics unit again and am looking for some inspiration. In the past I have used Trypanosome genes as a dataset for the coursework exercises, but it's starting to feel stale to me. Does anyone have any suggestions for an interesting phylogenetic question and dataset that would allow students to collect sequences, align, make inferences and thereby test some sort of hypothesis? Preferably (and this is the hard bit) not previously published so they can't just crib from the papers.

Many thanks in advance.

Posts: 7

Participants: 4


September 7, 2014


Krzysztof M. Kozak wrote:

Dear All,

I am building a pipeline to automatically generate gene trees for about 10,000 CDS alignments (all genes from an exome). The genes were sequenced for 150 individuals in multiple species. Some individuals are worse than others and occasionally have little data in some alignments, and end up on obviously artificially inflated branches. Is anyone aware of a tool to prune those automatically? (I will also use tools to get rid of poor sequence first, but that's a different topic.)

Many thanks, Krzysztof Kozak
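As one possible starting point, here is a minimal Python sketch of a simple outlier heuristic: flag any tip whose terminal branch length exceeds the median by more than k median absolute deviations (MAD) within the same gene tree. This is an assumption about what "artificially inflated" could mean in practice, not a description of an existing tool. The function name `flag_long_tips` and the default `k=5.0` are hypothetical, and the terminal branch lengths are assumed to have been extracted from each gene tree beforehand with a tree library of your choice.

```python
from statistics import median


def flag_long_tips(tip_lengths, k=5.0):
    """Return the names of tips with unusually long terminal branches.

    tip_lengths: dict mapping tip name -> terminal branch length
    k: how many median absolute deviations above the median counts as long
    """
    values = list(tip_lengths.values())
    if not values:
        return []
    med = median(values)
    mad = median([abs(x - med) for x in values])
    # If the MAD is zero (most branches identical), every tip longer than
    # the median is flagged, so inspect such trees by hand.
    threshold = med + k * mad
    return sorted(name for name, x in tip_lengths.items() if x > threshold)


if __name__ == "__main__":
    # Made-up branch lengths for one gene tree.
    example = {"ind1": 0.010, "ind2": 0.012, "ind3": 0.011,
               "ind4": 0.009, "ind5": 0.500}
    print(flag_long_tips(example))
```

Using the median and MAD rather than the mean and standard deviation keeps a single very long branch from inflating the threshold that is supposed to catch it.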

Posts: 3

Participants: 3


August 17, 2014


Db60 wrote:

Does anyone know a good way to convert the alignment within a BEAST XML file to PHYLIP (or NEXUS, FASTA, etc.)?

I could script it myself, but I assume the problem has already been addressed by others.

I realise the non-sequence data within the BEAST XML file will be lost, but for my purposes that's OK.

Thank you very much, in advance.

Daniel Barker
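For the scripting route, here is a minimal Python sketch that pulls the sequences out of a BEAST XML document and writes them as FASTA, using only the standard library. It rests on two assumptions about the file layout: BEAST 1 style files store each sequence as text inside a `sequence` element that contains a `taxon` child with an `idref` attribute, and BEAST 2 style files store it in `taxon` and `value` attributes on the `sequence` element. Files that deviate from these layouts would need adjusting. The function name `beast_xml_to_fasta` is hypothetical.

```python
import xml.etree.ElementTree as ET


def beast_xml_to_fasta(xml_text):
    """Extract aligned sequences from BEAST XML text and return FASTA text."""
    root = ET.fromstring(xml_text)
    records = []
    for seq in root.iter("sequence"):
        if seq.get("value") is not None:
            # Assumed BEAST 2 layout: <sequence taxon="A" value="ACGT"/>
            name = seq.get("taxon")
            residues = seq.get("value")
        else:
            # Assumed BEAST 1 layout: <sequence><taxon idref="A"/>ACGT</sequence>
            taxon = seq.find("taxon")
            name = taxon.get("idref") if taxon is not None else None
            residues = "".join(seq.itertext())
        if name is None:
            continue  # skip anything that does not match either layout
        residues = "".join(residues.split())  # drop whitespace and newlines
        records.append((name, residues))
    return "".join(f">{name}\n{residues}\n" for name, residues in records)


if __name__ == "__main__":
    demo = """<beast>
      <alignment id="alignment" dataType="nucleotide">
        <sequence><taxon idref="A"/>
          ACGT
        </sequence>
        <sequence><taxon idref="B"/>
          AC-T
        </sequence>
      </alignment>
    </beast>"""
    print(beast_xml_to_fasta(demo), end="")
```

To use it on a real file, read the file contents into a string and pass that to the function. As noted in the post, everything other than the sequences (priors, operators, and so on) is discarded.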

Posts: 2

Participants: 2


August 14, 2014


Scott Handley wrote:

Hello Phylobabble community!

I am assisting in the organization of a Workshop on Molecular Evolution, which will be held in Český Krumlov, Czech Republic in January 2015. I have helped to organize this event before, but this year we are renewing the program and I am working with several new people to design something that we believe will be of interest to many in the phylogenetics/molecular evolution communities. More details below!

We also organize a Workshop on Genomics immediately prior to the Molecular Evolution Workshop for those interested in those sorts of topics:

2015 Workshop on Molecular Evolution, Český Krumlov, Czech Republic

Dates: 25 January - 7 February, 2015

Application Deadline: 15 October, 2014 is the preferred application deadline, after which time people will be admitted to the course following application review by the admissions committee. However, later applications will certainly be considered for admittance or for placement on a waiting list.

Registration Fee: $1500 USD. Fee includes opening reception and access to all course material, but does not include other meals or housing. Special discounted pricing has been arranged for hotels, pensions and hostels. Information regarding housing and travel will be made available to applicants following acceptance.


Useful Links: Direct Link to the Full Workshop Schedule: General Workshop information: Frequently Asked Questions (FAQ) about the Workshop and Český Krumlov can be found here:

Workshop Overview:

The 2015 Workshop on Molecular Evolution brings together an international collection of faculty members and Workshop participants to study and discuss current ideas and techniques for exploring molecular evolution. The Workshop on Molecular Evolution consists of a series of lectures, demonstrations and computer laboratories that cover theoretical and conceptual aspects of molecular evolution with a strong emphasis on data analysis.

The Workshop has a strong focus on molecular phylogenetics, and covers all aspects of phylogenetic workflows, including marker selection, phylogeny reconstruction, time-calibration, as well as detection of natural selection, phylogeography, diversification rates, and trait evolution patterns. A majority of the schedule is dedicated to hands-on learning activities designed by faculty and the workshop team. This interactive experience provides Workshop participants with the practical experience required to meet the challenges presented by modern evolutionary sciences.

Co-directors: Walter Salzburger, Michael Matschiner, Jan Stefka and Scott Handley

For more information and online application see the Workshop web site -

Posts: 1

Participants: 1
