April 16, 2015
I am putting together a collection of alignments with metadata (https://github.com/roblanf/PartitionedAlignments), and I'm looking for advice on file formats. The point of the collection is to make it simpler to test software and compare methods, by providing a well-annotated, tested set of published alignments that are all CC0.
The problem is formats. Each dataset has an alignment, various definitions of sites (i.e. which locus and genome each site comes from), taxon sets (e.g. outgroups), and other metadata (e.g. DOIs for the original study and data set, an estimate of the age of the root of the tree, etc.). Alignment formats are notoriously varied, so I'd like to stick with one of the standard formats (Nexus, PHYLIP, FASTA), plus at most one more file for metadata (e.g. YAML, CSV).
I'd be happy to hear anyone's thoughts on the various pros and cons of any options.
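As one illustration of the "definitions of sites" part: a Nexus sets block can carry named character sets. The following is a minimal sketch, assuming lines of the form `charset name = start-end\step;`. The charset names in the usage lines are made up, the function name `parse_charsets` is hypothetical, and this is not the format the repository actually uses. It also does not cover every form the Nexus standard allows (for example `.` for the last site).

```python
import re

# Matches "charset NAME = SPEC;" (case-insensitive keyword).
CHARSET_RE = re.compile(r"charset\s+(\S+)\s*=\s*([^;]+);", re.IGNORECASE)
# Matches "5", "1-100" or "1-100\3" (start, optional end, optional step).
RANGE_RE = re.compile(r"(\d+)(?:\s*-\s*(\d+))?(?:\s*\\\s*(\d+))?")


def parse_charsets(sets_block):
    """Return {charset name: list of 1-based site indices}."""
    charsets = {}
    for name, spec in CHARSET_RE.findall(sets_block):
        sites = []
        for start, end, step in RANGE_RE.findall(spec):
            first = int(start)
            last = int(end) if end else first
            stride = int(step) if step else 1
            sites.extend(range(first, last + 1, stride))
        charsets[name] = sites
    return charsets


# Hypothetical sets block, for illustration only.
example = r"""
begin sets;
    charset COI_pos1 = 1-10\3;
    charset rRNA16S = 11-15 20;
end;
"""
print(parse_charsets(example))
```

Keeping site definitions in the alignment file this way would leave the second file free for the remaining metadata (DOIs, root age, and so on).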
April 8, 2015
Would someone be able to point me to a generalisation of the Fitch algorithm that calculates the parsimony length for a topology and also works for non-binary trees?
Actually our goal is to calculate the consistency index correctly, in the face of (possibly) ambiguous DNA base symbols (R, Y, S, V, M, etc.).
Getting the consistency index involves knowing the minimum conceivable length of a tree, calculated for each character individually.
This part of the problem seems equivalent to calculating tree length, for each character separately, for a 'bush' topology in which all terminals are connected directly to the root.
I'm just not quite sure how to do that. But I'm sure it is known, in general.
Thanks a lot,
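One way to sketch the non-binary case is a counting pass of the kind usually attributed to Hartigan (1973): at each node, count how many children allow each base, keep the bases with the highest count, and add (number of children minus that count) steps. For two children this reduces to the usual Fitch intersection/union rule. The sketch below makes several assumptions that are not from the post: trees are nested tuples of taxon names, an ambiguity code means "any one of these bases", and gaps and `?` are treated as fully ambiguous. The function name `parsimony_length` is hypothetical.

```python
from collections import Counter

# IUPAC nucleotide codes mapped to the bases each one allows.
# Treating "-" and "?" as fully ambiguous is an assumption of this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT", "-": "ACGT", "?": "ACGT",
}


def parsimony_length(tree, column):
    """Minimum number of changes for one character on a rooted tree.

    tree   -- nested tuples; a leaf is a taxon name (str), an internal
              node is a tuple with any number of children
    column -- dict mapping taxon name to its IUPAC symbol
    """
    def visit(node):
        if isinstance(node, str):
            return frozenset(IUPAC[column[node].upper()]), 0
        results = [visit(child) for child in node]
        # For each base, count the children whose optimal set allows it.
        counts = Counter()
        for states, _ in results:
            counts.update(states)
        best = max(counts.values())
        kept = frozenset(s for s, k in counts.items() if k == best)
        # Every child that does not allow the chosen base costs one step.
        steps = sum(n for _, n in results) + len(results) - best
        return kept, steps

    return visit(tree)[1]


column = {"a": "A", "b": "R", "c": "C", "d": "Y"}
bush = ("a", "b", "c", "d")            # all terminals attached to the root
resolved = (("a", "b"), ("c", "d"))    # one fully binary resolution

print(parsimony_length(bush, column))      # 2
print(parsimony_length(resolved, column))  # 1
```

On this toy column the bush needs 2 steps while the resolved tree needs 1, so it may be worth checking that the bush length is really the quantity the consistency index calls for.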
Is it possible to normalize the number of observed SNPs to the size of the genome over which a good-quality call can be made?
Hello, I am trying to use BEAST to infer the ancestry of intra-specific isolates (we have ~200). I currently have a list of SNPs called from Illumina sequencing data. Since I'm basing my BEAST analysis on these SNPs, is there a way to normalize the number of mutational events observed in each isolate to the size of the genome over which a high-quality call is possible? Would this even be necessary? I assume one way of doing this would be to do some bootstrapping, but this would take too much computing time.
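To make the question concrete, the simplest form of such a normalization is SNPs per callable site. The sketch below only shows that arithmetic. The isolate names and counts are invented for illustration, the function name `snps_per_callable_site` is hypothetical, and nothing here says whether BEAST needs this step.

```python
def snps_per_callable_site(snp_count, callable_sites):
    """SNP count divided by the number of sites with a high-quality call."""
    if callable_sites <= 0:
        raise ValueError("callable_sites must be positive")
    return snp_count / callable_sites


# Hypothetical isolates: (SNPs observed, sites with a high-quality call).
isolates = {
    "isolate_1": (120, 4_000_000),
    "isolate_2": (95, 3_200_000),
}

for name, (snps, callable_sites) in isolates.items():
    rate = snps_per_callable_site(snps, callable_sites)
    print(f"{name}: {rate:.3e} SNPs per callable site")
```

In this made-up example isolate_2 has fewer SNPs in absolute terms, but once the smaller callable region is taken into account the two isolates are nearly identical per site.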
March 25, 2015
Interesting blog post on single precision arithmetic, compiler flags, and FastTree via Jonathan Eisen on Facebook (no, really) http://darlinglab.org/blog/2015/03/23/not-so-fast-fasttree.html
March 4, 2015
It looks like David Maddison, Rob Knight, and some others got an NSF grant to build out a website and have some meetings.
They are trying to stimulate some discussion around these topics:
March 3, 2015
20th International Bioinformatics Workshop on Virus Evolution and Molecular Epidemiology
The University of the West Indies (UWI), St. Augustine Campus, Trinidad and Tobago Sunday, August 9 - Friday, August 14, 2015
We are announcing the organization of the international workshop on Virus Evolution and Molecular Epidemiology (VEME) in 2015, hosted by the University of the West Indies (UWI), St. Augustine, Trinidad and Tobago on behalf of our main sponsor the International Committee for Genetic Engineering and Biotechnology. The workshop is co-organised by the University of Leuven (Belgium) and the J. Craig Venter Institute (USA).
We plan to organize a 'Phylogenetic Inference' module that offers the theoretical background and hands-on experience in phylogenetic analysis for those who have little or no prior expertise in sequence analysis. An 'Evolutionary Hypothesis Testing' module is targeted at participants who are already familiar with alignments and phylogenetic trees, and would like to extend their expertise to likelihood and Bayesian inference in phylogenetics, coalescent and phylogeographic analyses ('phylodynamics') and molecular adaptation. A 'Large Dataset Analysis' module will cover the more complex analysis of full genomes, huge datasets of pathogens including Next Generation Sequencing data, and combined analyses of pathogen and host. Practical sessions in these modules will involve software such as PHYLIP, PAUP*, PHYML, MEGA, PAML or HYPHY, TREE-PUZZLE, SplitsTree, BEAST, MrBayes, Simplot and RDP3.
We recommend that participants buy The Phylogenetic Handbook as a guide during the workshop, and bring their own data set.
The abstract and application deadline is March 15th.
Selections will be made by the beginning of May.
The registration fee of 850 Euro covers attendance, lunches and coffee breaks. Participation is limited to 30 scientists in each module and is dependent on a selection procedure based on the submitted abstract and statement of motivation. A limited number of grants are available for scientists who have difficulty attending for financial reasons.
Selection criteria: (in order of importance)
Quality of the abstract: abstracts will be reviewed and priority will be given to applicants who are first author on the abstract.
Letter of motivation: how urgent/important is your need for training?
Each module is preferably restricted to 1 participant from the same lab.
Priority will be given to participants from countries with limited resources.
Grant criteria: (in order of importance)
Priority to countries with limited resources.
Ranking according to the abstract quality.
Additional information and application forms are available on our website: http://www.rega.kuleuven.be/cev/veme-workshop/2015
We are confident that this course meets the needs of many molecular virologists and epidemiologists, and hope we can assist you in your search for training in Bioinformatics methods.
Christine Carrington, Karen Nelson and Annemie Vandamme
Organizers of the workshop
March 2, 2015
The Genealogical World of Phylogenetic Networks
BMC Evolutionary Biology