August 17, 2015
A binary phylogenetic network may or may not be obtainable from a tree by the addition of directed edges (arcs) between tree arcs. Here, we establish a precise and easily tested criterion (based on "2-SAT") that efficiently determines whether or not any given network can be realized in this way. Moreover, the proof provides a polynomial-time algorithm for finding one or more trees (when they exist) on which the network can be based. A number of interesting consequences are presented as corollaries; these lead to some further relevant questions and observations, which we outline in the conclusion.
Current Methods for Automated Filtering of Multiple Sequence Alignments Frequently Worsen Single-Gene Phylogenetic Inference
Phylogenetic inference is generally performed on the basis of multiple sequence alignments (MSA). Because errors in an alignment can lead to errors in tree estimation, there is a strong interest in identifying and removing unreliable parts of the alignment. In recent years several automated filtering approaches have been proposed, but despite their popularity, a systematic and comprehensive comparison of different alignment filtering methods on real data has been lacking. Here, we extend and apply recently introduced phylogenetic tests of alignment accuracy on a large number of gene families and contrast the performance of unfiltered versus filtered alignments in the context of single-gene phylogeny reconstruction. Based on multiple genome-wide empirical and simulated data sets, we show that the trees obtained from filtered MSAs are on average worse than those obtained from unfiltered MSAs. Furthermore, alignment filtering often leads to an increase in the proportion of well-supported branches that are actually wrong. We confirm that our findings hold for a wide range of parameters and methods. Although our results suggest that light filtering (up to 20% of alignment positions) has little impact on tree accuracy and may save some computation time, contrary to widespread practice, we do not generally recommend the use of current alignment filtering methods for phylogenetic inference. By providing a way to rigorously and systematically measure the impact of filtering on alignments, the methodology set forth here will guide the development of better filtering algorithms.
Species-Level Phylogeny and Polyploid Relationships in Hordeum (Poaceae) Inferred by Next-Generation Sequencing and In Silico Cloning of Multiple Nuclear Loci
Polyploidization is an important speciation mechanism in the barley genus Hordeum. To analyze evolutionary changes after allopolyploidization, knowledge of parental relationships is essential. One chloroplast and 12 nuclear single-copy loci were amplified by polymerase chain reaction (PCR) in all Hordeum plus six out-group species. Amplicons from each of 96 individuals were pooled, sheared, labeled with individual-specific barcodes and sequenced in a single run on a 454 platform. Reference sequences were obtained by cloning and Sanger sequencing of all loci for nine supplementary individuals. The 454 reads were assembled into contigs representing the 13 loci and, for polyploids, also homoeologues. Phylogenetic analyses were conducted for all loci separately and for a concatenated data matrix of all loci. For diploid taxa, a Bayesian concordance analysis and a coalescent-based dated species tree was inferred from all gene trees. Chloroplast matK was used to determine the maternal parent in allopolyploid taxa. The relative performance of different multilocus analyses in the presence of incomplete lineage sorting and hybridization was also assessed. The resulting multilocus phylogeny reveals for the first time species phylogeny and progenitor-derivative relationships of all di- and polyploid Hordeum taxa within a single analysis. Our study proves that it is possible to obtain a multilocus species-level phylogeny for di- and polyploid taxa by combining PCR with next-generation sequencing, without cloning and without creating a heavy load of sequence data.
Exploring Tree-Like and Non-Tree-Like Patterns Using Genome Sequences: An Example Using the Inbreeding Plant Species Arabidopsis thaliana (L.) Heynh
Genome sequence data contain abundant information about genealogical history, but methods for extracting and interpreting this information are not yet fully developed. We analyzed genome sequences for multiple accessions of the selfing plant, Arabidopsis thaliana, with the goal of better understanding its genealogical history. As expected from accessions of the same species, we found much discordance between nuclear gene trees. Nonetheless, we inferred the optimal population tree under the assumption that all discordance is due to incomplete lineage sorting. To cope with the size of the data (many genes and many taxa), our pipeline is based on parallel computing and divides the problem into four-taxon trees. However, just because a population tree can be estimated does not mean that the assumptions of the multispecies coalescent model hold. Therefore, we implemented a new, nonparametric test to evaluate whether a population tree adequately explains the observed quartet frequencies (the frequencies of gene trees with each resolution of each four-taxon set). This test also considers other models: panmixia and a partially resolved population tree, that is, a tree in which some nodes are collapsed into local panmixia. We found that a partially resolved population tree provides the best fit to the data, providing evidence for tree-like structure within A. thaliana, qualitatively similar to what might be expected between different, closely related species. Further, we show that the pattern of deviation from expectations can be used to identify instances of introgression and detect one clear case of reticulation among ecotypes that have come into contact in the United Kingdom. Our study illustrates how we can use genome sequence data to evaluate whether phylogenetic relationships are strictly tree-like or reticulating.
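The quartet frequencies mentioned above can be illustrated with a small counting sketch. This is not the authors' pipeline or their nonparametric test; it only tallies, for each four-taxon set, how often each resolution appears among a set of gene trees. The nested-tuple tree encoding and the function names (`clades`, `quartet_resolution`, `quartet_frequencies`) are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def clades(tree):
    """Return (leaf_set, clades) for a rooted tree given as nested tuples of leaf names."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)  # the clade rooted at this internal node
    return leaves, found


def quartet_resolution(tree, four):
    """Return the split {pair, other pair} the tree induces on `four`, or None if unresolved."""
    four = frozenset(four)
    leaves, all_clades = clades(tree)
    if not four <= leaves:
        return None  # the tree lacks at least one of the four taxa
    for clade in all_clades:
        pair = clade & four
        if len(pair) == 2:
            # A clade holding exactly two of the four taxa fixes the unrooted split.
            return frozenset([pair, four - pair])
    return None


def quartet_frequencies(trees, taxa):
    """Map each four-taxon set to the observed frequency of each of its resolutions."""
    freqs = {}
    for four in combinations(sorted(taxa), 4):
        counts = Counter()
        for tree in trees:
            resolution = quartet_resolution(tree, four)
            if resolution is not None:
                counts[resolution] += 1
        total = sum(counts.values())
        freqs[four] = {r: n / total for r, n in counts.items()} if total else {}
    return freqs


# Hypothetical gene trees on four placeholder taxa.
gene_trees = [
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
    (("A", "C"), ("B", "D")),
]
frequencies = quartet_frequencies(gene_trees, {"A", "B", "C", "D"})
```

Comparing such observed frequencies against those expected under a population tree is the part the abstract's test addresses; the sketch stops at the counting step.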
Topological heterogeneity among gene trees is widely observed in phylogenomic analyses and some of this variation is likely caused by systematic error in gene tree estimation. Systematic error can be mitigated by improving models of sequence evolution to account for all evolutionary processes relevant to each gene or identifying those genes whose evolution best conforms to existing models. However, the best method for identifying such genes is not well established. Here, we ask if filtering genes according to their clock-likeness or posterior predictive effect size (PPES, an inference-based measure of model violation) improves phylogenetic reliability and congruence. We compared these approaches to each other, and to the common practice of filtering based on rate of evolution, using two different metrics. First, we compared gene-tree topologies to accepted reference topologies. Second, we examined topological similarity among gene trees in filtered sets. Our results suggest that filtering genes based on clock-likeness and PPES can yield a collection of genes with more reliable phylogenetic signal. For the two exemplar data sets we explored, from yeast and amniotes, clock-likeness and PPES outperformed rate-based filtering in both congruence and reliability.
Two characters are stratigraphically compatible if some phylogenies indicate that their combinations (state-pairs) evolved without homoplasy and in an order consistent with the fossil record. Simulations assuming independent character change indicate that we expect approximately 95% of compatible character pairs to also be stratigraphically compatible over a wide range of sampling regimes and general evolutionary models. However, two general models of rate heterogeneity elevate expected stratigraphic incompatibility: "early burst" models, where rates of change are higher among early members of a clade than among later members of that clade, and "integration" models, where the evolution of characters is correlated in some manner. Both models have important theoretical and methodological implications. Therefore, we examine 259 metazoan clades for deviations from expected stratigraphic compatibility. We do so first assuming independent change with equal rates of character change through time. We then repeat the analysis assuming independent change with separate "early" and "late" rates (with "early" = the first third of taxa in a clade), with the early and late rates chosen to maximize the probability of the observed compatibility among the early taxa and then the whole clade. We single out Cambrian trilobites as a possible "control" group because morphometric studies suggest that integration patterns are not conserved among closely related species. Even allowing for early bursts, we see excess stratigraphic incompatibility (i.e., negative deviations) in significantly more clades than expected at 0.50, 0.25, and 0.05 P values. This pattern is particularly strong in chordates, echinoderms, and arthropods. However, stratigraphic compatibility among Cambrian trilobites matches the expectations of integration studies, as they (unlike post-Cambrian trilobites) do not deviate from the expectations of independent change with no early bursts. 
Thus, these results suggest that processes such as integration strongly affect the data that paleontologists use to study phylogeny, disparity, and rates.
Fossils provide the principal basis for temporal calibrations, which are critical to the accuracy of divergence dating analyses. Translating fossil data into minimum and maximum bounds for calibrations is the most important—often least appreciated—step of divergence dating. Properly justified calibrations require the synthesis of phylogenetic, paleontological, and geological evidence and can be difficult for nonspecialists to formulate. The dynamic nature of the fossil record (e.g., new discoveries, taxonomic revisions, updates of global or local stratigraphy) requires that calibration data be updated continually lest they become obsolete. Here, we announce the Fossil Calibration Database (http://fossilcalibrations.org), a new open-access resource providing vetted fossil calibrations to the scientific community. Calibrations accessioned into this database are based on individual fossil specimens and follow best practices for phylogenetic justification and geochronological constraint. The associated Fossil Calibration Series, a calibration-themed publication series at Palaeontologia Electronica, will serve as a key pipeline for peer-reviewed calibrations to enter the database.
Current science evaluation still relies on citation performance, despite criticisms of purely bibliometric research assessments. Biological taxonomy suffers from a drain of knowledge and manpower, with poor citation performance commonly held as one reason for this impediment. But is there really such a citation impediment in taxonomy? We compared the citation numbers of 306 taxonomic and 2291 non-taxonomic research articles (2009–2012) on mosses, orchids, ciliates, ants, and snakes, using Web of Science (WoS) and correcting for journal visibility. For three of the five taxa, significant differences were absent in citation numbers between taxonomic and non-taxonomic papers. This was also true for all taxa combined, although taxonomic papers received more citations than non-taxonomic ones. Our results show that, contrary to common belief, taxonomic contributions do not generally reduce a journal's citation performance and might even increase it. The scope of many journals rarely featuring taxonomy would allow editors to encourage a larger number of taxonomic submissions. Moreover, between 1993 and 2012, taxonomic publications accumulated faster than those from all biological fields. However, less than half of the taxonomic studies were published in journals in WoS. Thus, editors of highly visible journals inviting taxonomic contributions could benefit from taxonomy's strong momentum. The taxonomic output could increase even more than at its current growth rate if: (i) taxonomists currently publishing on other topics returned to taxonomy and (ii) non-taxonomists identifying the need for taxonomic acts started publishing these, possibly in collaboration with taxonomists. Finally, considering the high number of taxonomic papers attracted by the journal Zootaxa, we expect that the taxonomic community would indeed use increased chances of publishing in WoS indexed journals. 
We conclude that taxonomy's standing in the present citation-focused scientific landscape could easily improve—if the community becomes aware that there is no citation impediment in taxonomy.
Heterogeneous Rates of Molecular Evolution and Diversification Could Explain the Triassic Age Estimate for Angiosperms
Dating analyses based on molecular data imply that crown angiosperms existed in the Triassic, long before their undisputed appearance in the fossil record in the Early Cretaceous. Following a re-analysis of the age of angiosperms using updated sequences and fossil calibrations, we use a series of simulations to explore the possibility that the older age estimates are a consequence of (i) major shifts in the rate of sequence evolution near the base of the angiosperms and/or (ii) the representative taxon sampling strategy employed in such studies. We show that both of these factors do tend to yield substantially older age estimates. These analyses do not prove that younger age estimates based on the fossil record are correct, but they do suggest caution in accepting the older age estimates obtained using current relaxed-clock methods. Although we have focused here on the angiosperms, we suspect that these results will shed light on dating discrepancies in other major clades.
Support for Amborella as the sole survivor of an evolutionary lineage that is sister to all other angiosperms comes from positions in DNA multiple-sequence alignments that have a poor fit to time-reversible substitution models. These sites exhibit significant levels of homoplasy, compositional heterogeneity, and strong heterotachy. We report phylogenetic analyses with observed, randomized, and simulated data which show there is little or no expectation that these sites provide useful information for understanding relationships among basal angiosperms. Their inclusion in phylogenetic analyses leads to a long-branch attraction artifact that favors Amborella as sister to other angiosperms in reconstructed phylogenies. Using parametric simulations, we show that sites in chloroplast sequences that exhibit less homoplasy between angiosperms and gymnosperms provide more reliable information for inferring basal angiosperm relationships. We confirm our earlier findings that the basal angiosperm Amborella is most closely related to aquatic herbs. Our current and previously reported (Goremykin et al. 2013) analyses highlight an essential aspect of the total evidence approach to phylogenetic inference. They suggest that data partitioning aimed at identifying components of the data that better fit evolutionary models is a more reliable approach to phylogeny reconstruction at deep taxonomic levels.
August 16, 2015
Dear Colleagues, I would greatly appreciate your assistance with advertising a few research opportunities in Evolutionary Ecology and Ecological Genetics/Genomics with the Colautti Lab (bit.ly/colautti) at Queen’s University in Canada (www.queensu.ca). These internships are for 12 weeks in the spring/summer and are targeted at undergraduates who wish to gain research experience and explore the possibility of doing an MSc or PhD degree in Canada. Three positions are available in our lab for senior undergraduate students in the following countries: Australia, Brazil, China, France, India, Mexico, Saudi Arabia, Tunisia, and Vietnam. Interested students should apply by September 24, 4pm PDT, directly through the Mitacs website: http://bit.ly/1fjKGtw Many Thanks, Rob Dr. Robert I. Colautti Biology Department Queen’s University Biosciences Complex 116 Barrie St. Kingston, ON Canada K7L 3N6 firstname.lastname@example.org Phone: 613-533-2353 Fax: 613-533-6617 http://bit.ly/colautti
Intrasexual selection drives sensitivity to pitch, formants and duration in the competitive calls of fallow bucks
Background: Mammal vocal parameters such as fundamental frequency (or pitch; f0) and formant dispersion often provide information about quality traits of the producer (e.g. dominance and body size), suggesting that they are sexually selected. However, little experimental evidence exists demonstrating the importance of these cues in intrasexual competition, particularly f0. Male Fallow deer (bucks) produce an extremely low pitched groan. Bucks have a descended larynx and generate f0 well below what is expected for animals of their size. Groan parameters are linked to caller dominance, body size and condition, suggesting that groans are the product of sexual selection. Using a playback experiment, we presented bucks with groans that had been manipulated to alter vocal cues to these male characteristics and compared the response to the same, non-modified (natural) groans. Results: We experimentally examined the ability of bucks to utilise putative cues to dominance (f0), body size (formant frequencies) and condition (groan duration), when assessing competitors. We found that bucks treated groans with lowered f0 (more dominant), and lowered formant frequencies (larger caller) as more threatening. By contrast, groans with raised formant frequencies (smaller caller), and shorter durations (more fatigued caller) were treated as less threatening. Conclusions: Our results indicate that intrasexual selection is driving groans to concurrently convey caller dominance, body size and condition. They represent the first experimental demonstration of the importance of f0 in male competition in non-human mammals, and show that bucks have advanced perception abilities that allow them to extract information based on relatively small changes in key parameters.
Bioinformatics lies at the nexus of the biological sciences and the computational sciences. Therefore it is sometimes worth comparing these two disciplines.
Marcus Beck at the R is My Friend blog has looked at doctoral dissertation lengths via the digital archives at the University of Minnesota. His data are shown in this box plot. You can search through it for your own favorite discipline (click on the image to make it larger).
He also has several other graphical views in his blog post, including data on masters theses.
August 15, 2015
TO ALL STUDENTS AND RESEARCHERS IN FOREST ENTOMOLOGY, ECOLOGY, AND RELATED FIELDS: The Forest Entomology Lab at the University of Florida is pleased to invite you to the SECOND Bark & Ambrosia Beetle Academy. This comprehensive, fun and nerdy workshop on the most intriguing forest pests will be held in Gainesville, FL, on May 2-6, 2016. Are you a researcher or a student interested in bark and ambrosia beetles? Do you need to know more about beetle identification, classification, ecology or damage? Learn from international experts through hands-on labs, field demonstrations, lectures, and fun socializing. Choose one or both modules: Applied and Academic. See details at http://bit.ly/1ULEP07. Sign up soon! Last year all 30 seats were taken in a few weeks. Jiri Hulcr, Assistant Professor University of Florida | School of Forest Resources and Conservation 352-273-0299 | http://bit.ly/1hMunCS
Background: Squaliform sharks represent approximately 27 % of extant shark diversity, comprising more than 130 species with a predominantly deep-dwelling lifestyle. Many Squaliform species are highly specialized, including some that are bioluminescent, a character that is reported exclusively from Squaliform sharks within Chondrichthyes. The interfamiliar relationships within the order are still not satisfactorily resolved. Herein we estimate the phylogenetic interrelationships of a generic level sampling of “squaloid” sharks and closely related taxa using aligned sequences derived from a targeted gene capture approach. The resulting phylogenetic estimate is further used to evaluate the age of first occurrence of bioluminescence in Squaliformes. Results: Our dataset comprised 172 putative ortholog exon sequences. Phylogenetic estimates result in a fully resolved tree supporting a monophyletic lineage of Squaliformes excluding Echinorhinus. Non-luminous Squalidae are inferred to be the sister to a clade comprising all remaining Squaliform families. Our results suggest that the origin of photophores is coincident with an elevated diversification rate and the splitting of families Dalatiidae, Etmopteridae, Oxynotidae and Somniosidae at the transition of the Lower to the Upper Cretaceous. The presence of luminous organs was confirmed for the Sleeper shark genus Zameus. These results indicate that bioluminescence in sharks is not restricted solely to the families Etmopteridae and Dalatiidae as previously believed. Conclusions: The sister-clade to non-luminous Squalidae comprises five families. The presence of photophores is reported for extant members of three out of these five families based on results of this study, i.e. Lantern sharks (Etmopteridae), Kitefin sharks (Dalatiidae) and Sleeper sharks (Somniosidae). Our results suggest that the origin of luminous organs arose during the rapid diversification event that gave rise to the extant Squaliform families. 
These inferences are consistent with the idea of diversification of Squaliform sharks being associated with the emergence of new deep-sea habitats in the Lower Cretaceous, which may have been facilitated by the evolution of bioluminescence.
Phylogeny and biogeography of Primula sect. Armerina: implications for plant evolution under climate change and the uplift of the Qinghai-Tibet Plateau
Background: The historical orogenesis and associated climatic changes of mountain areas have been suggested to partly account for the occurrence of high levels of biodiversity and endemism. However, their effects on dispersal, differentiation and evolution of many groups of plants are still unknown. In this study, we examined the detailed diversification history of Primula sect. Armerina, and used biogeographic analysis and macro-evolutionary modeling to investigate a series of different questions concerning the evolution of the geographical and ecological distribution of the species in this section. Results: We sequenced five chloroplast and one nuclear genes for species of Primula sect. Armerina. Neither chloroplast nor nuclear trees support the monophyly of the section. The major incongruences between the two trees occur among closely related species and may be explained by hybridization. Our dating analyses based on the chloroplast dataset suggest that this section began to diverge from its relatives around 3.55 million years ago, largely coinciding with the last major uplift of the Qinghai-Tibet Plateau (QTP). Biogeographic analysis supports the origin of the section in the Himalayan Mountains and dispersal from the Himalayas to Northeastern QTP, Western QTP and Hengduan Mountains. Furthermore, evolutionary models of ecological niches show that the two P. fasciculata clades have significantly different climatic niche optima and rates of niche evolution, indicating niche evolution under climatic changes and further providing evidence for explaining their biogeographic patterns. Conclusion: Our results support the hypothesis that geologic and climatic events play important roles in driving biological diversification of organisms in the QTP area. The Pliocene uplift of the QTP and following climatic changes most likely promoted both the inter- and intraspecific divergence of Primula sect. Armerina. 
This study also illustrates how niche evolution under climatic changes influences biogeographic patterns.
August 14, 2015
Yet another barely thought out project, although this one has some crude code. If some 16,000 new taxonomic names are published each year, then that is roughly 40 per day. We don't have a single place that aggregates these, so any major biodiversity project is by definition out of date. GBIF itself hasn't had an updated list of fungi or plant names for several years, and at present doesn't have an up-to-date list of animal names. You just have to follow the Twitter feeds of ZooKeys and Zootaxa to feel swamped in new names.
And yet, most nomenclators are pumping out RSS feeds of new names, or have APIs that support time-based queries (i.e., send me the names added in the last month). Wouldn't it be great to have a single aggregator that took these "name streams", augmented them by adding links to the literature (it could, for example, harvest RSS feeds and Twitter streams of the relevant journals), and provided the biodiversity community with a feed of new names and associated supporting information? We could watch discoveries of new biodiversity unfold in near real time, as well as provide a stream of data for projects such as GBIF and others to ingest and keep their databases up to date.
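As a rough illustration of the aggregation step, here is a minimal sketch that merges several already-fetched "name streams" into one feed. It is not the crude code mentioned above. It assumes each stream has been parsed into simple records, and the record fields, the `merge_name_streams` function, and the example names and sources are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class NameRecord:
    name: str        # taxonomic name as published
    published: date  # date the stream reported the name
    source: str      # hypothetical label for the originating nomenclator
    link: str = ""   # optional link to the supporting literature


def merge_name_streams(streams, since):
    """Merge streams into one feed of names reported on or after `since`.

    Duplicate names (compared case-insensitively) are collapsed to their
    earliest report; the result is sorted newest first, then by name.
    """
    earliest = {}
    for stream in streams:
        for record in stream:
            if record.published < since:
                continue  # outside the requested time window
            key = record.name.strip().lower()
            seen = earliest.get(key)
            if seen is None or record.published < seen.published:
                earliest[key] = record
    return sorted(
        earliest.values(),
        key=lambda r: (-r.published.toordinal(), r.name),
    )


# Placeholder streams standing in for two nomenclators' feeds.
animal_stream = [
    NameRecord("Aus bus", date(2015, 8, 10), "animal-names"),
    NameRecord("Cus dus", date(2015, 7, 1), "animal-names"),
]
journal_stream = [
    NameRecord("aus bus", date(2015, 8, 12), "journal-feed"),
    NameRecord("Eus fus", date(2015, 8, 13), "journal-feed"),
]
feed = merge_name_streams([animal_stream, journal_stream], since=date(2015, 8, 1))
```

A real aggregator would also need to fetch and parse each feed and reconcile spelling variants, which this sketch leaves out.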
I need more time to sketch this out fully, but I think a case can be made for a taxonomy-centric (or, perhaps more usefully, a biodiversity-centric) clone of PubMed Central.
Here are some reasons:
The Genealogical World of Phylogenetic Networks
BMC Evolutionary Biology