Molecular Biology and Evolution

Molecular Biology and Evolution - RSS feed of current issue

URL

XML feed
http://mbe.oxfordjournals.org

March 20, 2014

23:37

Gonadotropin-releasing hormone (GnRH) is a critical reproductive regulator in vertebrates. Homologous peptides are also found in invertebrates, with a variety of characterized functions. In the amphioxus, an invertebrate that provides the best model for the transition to vertebrates, four GnRH receptors (GnRHRs) were previously described, but their native ligands were not identified. Using a more sensitive search methodology with hidden Markov models, we identified the first GnRH-like peptide confirmed in the amphioxus Branchiostoma floridae. This peptide specifically activated one of the four GnRHRs. Although the primary structure of this peptide was divergent from any previously isolated GnRH peptide, the minimal conserved residues found in all other GnRH superfamily members were retained. The peptide was immunolocalized in proximity of the central canal of the anterior nerve cord, a region where other neuropeptides and receptors have been found. Additionally, the amphioxus GnRH-like gene was positioned in a locus surrounded by syntenic homologs of the human GnRH paralogon. The amphioxus GnRH-like peptide, with its distinct primary structure, activated a receptor with equal potency to multiple ligands that span the GnRH superfamily.

23:37

Standard protein phylogenetic models use fixed rate matrices of amino acid interchange derived from analyses of large databases. Differences between the stationary amino acid frequencies of these rate matrices and those of a data set of interest are typically adjusted for by matrix multiplication that converts the empirical rate matrix to an exchangeability matrix which is then postmultiplied by the amino acid frequencies in the alignment. The result is a time-reversible rate matrix with stationary amino acid frequencies equal to the data set frequencies. On the basis of population genetics principles, we develop an amino acid substitution-selection model that parameterizes the fitness of an amino acid as the logarithm of the ratio of the frequency of the amino acid to the frequency of the same amino acid under no selection. The model gives rise to a different sequence of matrix multiplications to convert an empirical rate matrix to one that has stationary amino acid frequencies equal to the data set frequencies. We incorporated the substitution-selection model with an improved amino acid class frequency mixture (cF) model to partially take into account site-specific amino acid frequencies in the phylogenetic models. We show that 1) the selection models fit data significantly better than corresponding models without selection for most of the 21 test data sets; 2) both cF and cF selection models favored the phylogenetic trees that were inferred under current sophisticated models and methods for three difficult phylogenetic problems (the positions of microsporidia and breviates in eukaryote phylogeny and the position of the root of the angiosperm tree); and 3) for data simulated under site-specific residue frequencies, the cF selection models estimated trees closer to the generating trees than a standard model or cF without selection. We also explored several ways of estimating amino acid frequencies under neutral evolution that are required for these selection models. By better modeling the amino acid substitution process, the cF selection models will be valuable for phylogenetic inference and evolutionary studies.
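The standard frequency adjustment described at the start of this abstract can be sketched on a toy three-state alphabet. This is a minimal illustration, not the authors' substitution-selection model: the function names, the toy exchangeabilities, and the frequency vectors are all hypothetical, and a real analysis would use a 20-state empirical matrix. The last helper only restates the log-ratio fitness definition quoted in the abstract.

```python
import math

def exchangeabilities(Q, pi):
    """Recover symmetric exchangeabilities S[i][j] = Q[i][j] / pi[j] for i != j."""
    n = len(Q)
    return [[Q[i][j] / pi[j] if i != j else 0.0 for j in range(n)]
            for i in range(n)]

def rate_matrix(S, pi):
    """Postmultiply S by the target frequencies, then set each diagonal
    entry so that every row sums to zero."""
    n = len(S)
    Q = [[S[i][j] * pi[j] if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        Q[i][i] = -sum(Q[i])
    return Q

def log_ratio_fitness(freq, neutral_freq):
    """Fitness of each state as log(observed frequency / neutral frequency)."""
    return [math.log(f / g) for f, g in zip(freq, neutral_freq)]

# Hypothetical toy values (three states instead of 20 amino acids).
S_toy = [[0.0, 1.0, 2.0],
         [1.0, 0.0, 3.0],
         [2.0, 3.0, 0.0]]
pi_empirical = [0.5, 0.3, 0.2]   # stationary frequencies of the "database" matrix
pi_data = [0.2, 0.3, 0.5]        # frequencies observed in the "alignment"

Q_empirical = rate_matrix(S_toy, pi_empirical)
Q_data = rate_matrix(exchangeabilities(Q_empirical, pi_empirical), pi_data)
```

Because the exchangeabilities are symmetric, `Q_data` satisfies detailed balance with respect to `pi_data`, which is what makes the adjusted matrix time-reversible with the data set frequencies as its stationary distribution.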

23:37

Rafflesia is a genus of holoparasitic plants endemic to Southeast Asia that has lost the ability to undertake photosynthesis. With short-read sequencing technology, we assembled a draft sequence of the mitochondrial genome of Rafflesia lagascae Blanco, a species endemic to the Philippine island of Luzon, with ~350x sequencing depth coverage. Using multiple approaches, however, we were only able to identify small fragments of plastid sequences at low coverage depth (

23:37

The long time scale of adaptive evolution makes it difficult to directly observe the spread of most beneficial mutations through natural populations. Therefore, inferring attributes of beneficial mutations by studying the genomic signals left by directional selection is an important component of population genetics research. One kind of signal is a trough in nearby neutral genetic variation due to selective fixation of initially rare alleles, a phenomenon known as "genetic hitchhiking." Accumulated evidence suggests that a considerable fraction of substitutions in the Drosophila genome results from positive selection, most of which are expected to have small selection coefficients and influence the population genetics of sites in the immediate vicinity. Using Drosophila melanogaster population genomic data, we found that the heterogeneity in synonymous polymorphism surrounding different categories of coding fixations is readily observable even within 25 bp of focal substitutions, which we interpret as the result of small-scale hitchhiking effects. The strength of natural selection on different sites appears to be quite heterogeneous. Particularly, neighboring fixations that changed amino acid polarities in a way that maintained the overall polarities of a protein were under stronger selection than other categories of fixations. Interestingly, we found that substitutions in slow-evolving genes are associated with stronger hitchhiking effects. This is consistent with the idea that adaptive evolution may involve few substitutions with large effects or many substitutions with small effects. Because our approach only weakly depends on the numbers of recent nonsynonymous substitutions, it can provide a complementary view to the adaptive evolution inferred by other divergence-based evolutionary genetic methods.

23:37

Phylogenetic reconstruction of the evolutionary history of closely related organisms may be difficult because of the presence of unsorted lineages and of a relatively high proportion of heterozygous sites that are usually not handled well by phylogenetic programs. Genomic data may provide enough fixed polymorphisms to resolve phylogenetic trees, but the diploid nature of sequence data remains analytically challenging. Here, we performed a phylogenomic reconstruction of the evolutionary history of the common vole (Microtus arvalis) with a focus on the influence of heterozygosity on the estimation of intraspecific divergence times. We used genome-wide sequence information from 15 voles distributed across the European range. We provide a novel approach to integrate heterozygous information in existing phylogenetic programs by repeated random haplotype sampling from sequences with multiple unphased heterozygous sites. We evaluated the impact of the use of full, partial, or no heterozygous information for tree reconstructions on divergence time estimates. All results consistently showed four deep and strongly supported evolutionary lineages in the vole data. These lineages undergoing divergence processes split only at the end or after the last glacial maximum based on calibration with radiocarbon-dated paleontological material. However, the incorporation of information from heterozygous sites had a significant impact on absolute and relative branch length estimations. Ignoring heterozygous information led to an overestimation of divergence times between the evolutionary lineages of M. arvalis. We conclude that the exclusion of heterozygous sites from evolutionary analyses may cause biased and misleading divergence time estimates in closely related taxa.
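The repeated random haplotype sampling mentioned in this abstract can be sketched as follows. This is a hedged illustration only: it assumes heterozygous sites are encoded as two-base IUPAC ambiguity codes and that sites are resolved independently with equal probability, neither of which is confirmed by the abstract. The function names and the toy genotype are hypothetical.

```python
import random

# Two-base IUPAC ambiguity codes and the alleles each one stands for.
IUPAC_HET = {"R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC"}

def sample_haplotype(genotype, rng):
    """Resolve every heterozygous site to one of its two alleles at random;
    homozygous sites are copied unchanged."""
    return "".join(rng.choice(IUPAC_HET[base]) if base in IUPAC_HET else base
                   for base in genotype)

def sample_haplotypes(genotype, n_replicates, seed=0):
    """Draw n_replicates independent haplotypes from one unphased genotype."""
    rng = random.Random(seed)
    return [sample_haplotype(genotype, rng) for _ in range(n_replicates)]

# Hypothetical genotype with three unphased heterozygous sites (R, Y, W).
replicates = sample_haplotypes("ACRTYGGW", 5, seed=1)
```

Each replicate could then be passed to an existing phylogenetic program, and the spread of the resulting branch length estimates would reflect the uncertainty introduced by the unphased sites.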

23:37

The evolutionary origin of eukaryotes is a question of great interest for which many different hypotheses have been proposed. These hypotheses predict distinct patterns of evolutionary relationships for individual genes of the ancestral eukaryotic genome. The availability of numerous completely sequenced genomes covering the three domains of life makes it possible to contrast these predictions with empirical data. We performed a systematic analysis of the phylogenetic relationships of ancestral eukaryotic genes with archaeal and bacterial genes. In contrast with previous studies, we emphasize the critical importance of methods accounting for statistical support, horizontal gene transfer, and gene loss, and we disentangle the processes underlying the phylogenomic pattern we observe. We first recover a clear signal indicating that a fraction of the bacteria-like eukaryotic genes are of alphaproteobacterial origin. Then, we show that the majority of bacteria-related eukaryotic genes actually do not point to a relationship with a specific bacterial taxonomic group. We also provide evidence that eukaryotes branch close to the last archaeal common ancestor. Our results demonstrate that there is no phylogenetic support for hypotheses involving a fusion with a bacterium other than the ancestor of mitochondria. Overall, they leave only two possible interpretations, respectively, based on the early-mitochondria hypotheses, which suppose an early endosymbiosis of an alphaproteobacterium in an archaeal host and on the slow-drip autogenous hypothesis, in which early eukaryotic ancestors were particularly prone to horizontal gene transfers.

23:37

The Drosophila pseudoobscura dot chromosome acquired genes from the ancestral Drosophila Y chromosome in a Y-to-dot translocation event that occurred between 12.7 and 20.8 Ma. The formerly Y-linked genes mostly retained their testis-specific expression but shrank drastically in size, mostly through intron reduction, since becoming part of the dot chromosome in this species. We investigated the impact of this translocation on the evolution of both the Y-to-dot translocated region and the original segments of the dot chromosome in D. pseudoobscura. Our survey of polymorphism and divergence across the chromosome reveals a reduction in variation, a deletion polymorphism segregating at high frequency, and a shift in the frequency spectra, all consistent with a history of recent selective sweeps in the Y-to-dot translocated region but not on the rest of the dot chromosome. We do find evidence for recombination primarily as gene conversion on the dot chromosome; however, predicted recombination events are restricted to the part of the dot chromosome outside the translocation. It therefore appears that recombination has resulted in a degree of decoupling between the ancestral Y region and the conserved region of the dot chromosome.

23:37

Bacteria confined to intracellular environments experience extensive genome reduction. In extreme cases, insect endosymbionts have evolved genomes that are so gene-poor that they blur the distinction between bacteria and endosymbiotically derived organelles such as mitochondria and plastids. To understand the host’s role in this extreme gene loss, we analyzed gene content and expression in the nuclear genome of the psyllid Pachypsylla venusta, a sap-feeding insect that harbors an ancient endosymbiont (Carsonella) with one of the most reduced bacterial genomes ever identified. Carsonella retains many genes required for synthesis of essential amino acids that are scarce in plant sap, but most of these biosynthetic pathways have been disrupted by gene loss. Host genes that are upregulated in psyllid cells housing Carsonella appear to compensate for endosymbiont gene losses, resulting in highly integrated metabolic pathways that mirror those observed in other sap-feeding insects. The host contribution to these pathways is mediated by a combination of native eukaryotic genes and bacterial genes that were horizontally transferred from multiple donor lineages early in the evolution of psyllids, including one gene that appears to have been directly acquired from Carsonella. By comparing the psyllid genome to a recent analysis of mealybugs, we found that a remarkably similar set of functional pathways has been shaped by independent transfers of bacterial genes to the two hosts. These results show that horizontal gene transfer is an important and recurring mechanism driving coevolution between insects and their bacterial endosymbionts and highlight interesting similarities and contrasts with the evolutionary history of mitochondria and plastids.

23:37

The question of how genetic variation in a population influences phenotypic variation and evolution is of major importance in modern biology. Yet much is still unknown about the relative functional importance of different forms of genome variation and how they are shaped by evolutionary processes. Here we address these questions by population level sequencing of 42 strains from the budding yeast Saccharomyces cerevisiae and its closest relative S. paradoxus. We find that genome content variation, in the form of presence or absence as well as copy number of genetic material, is higher within S. cerevisiae than within S. paradoxus, despite genetic distances as measured in single-nucleotide polymorphisms being vastly smaller within the former species. This genome content variation, as well as loss-of-function variation in the form of premature stop codons and frameshifting indels, is heavily enriched in the subtelomeres, strongly reinforcing the relevance of these regions to functional evolution. Genes affected by these likely functional forms of variation are enriched for functions mediating interaction with the external environment (sugar transport and metabolism, flocculation, metal transport, and metabolism). Our results and analyses provide a comprehensive view of genomic diversity in budding yeast and expose surprising and pronounced differences between the variation within S. cerevisiae and that within S. paradoxus. We also believe that the sequence data and de novo assemblies will constitute a useful resource for further evolutionary and population genomics studies.

23:37

Anopheles gambiae s.l. are important malaria vectors, but little is known about their genomic variation in the wild. Here, we present inter- and intraspecies analysis of genome-wide RADseq data, in three Anopheles gambiae s.l. species collected from East Africa. The mosquitoes fall into three genotypic clusters representing described species (A. gambiae, A. arabiensis, and A. merus) with no evidence of cryptic breeding units. Anopheles merus is the most divergent of the three species, supporting a recent new phylogeny based on chromosomal inversions. Even though the species clusters are well separated, there is extensive shared polymorphism, particularly between A. gambiae and A. arabiensis. Divergence between A. gambiae and A. arabiensis does not vary across the autosomes but is higher in X-linked inversions than elsewhere on X or on the autosomes, consistent with the suggestion that this inversion (or a gene within it) is important in reproductive isolation between the species. The 2La/2L+a inversion shows no more evidence of introgression between A. gambiae and A. arabiensis than the rest of the autosomes. Population differentiation within A. gambiae and A. arabiensis is weak over approximately 190–270 km, implying no strong barriers to dispersal. Analysis of Tajima’s D and the allele frequency spectrum is consistent with modest population increases in A. arabiensis and A. merus, but a more complex demographic history of expansion followed by contraction in A. gambiae. Although they are less than 200 km apart, the two A. gambiae populations show evidence of different demographic histories.
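For readers unfamiliar with the Tajima's D statistic used in this abstract, a minimal sketch of its standard definition is given here. It is not the authors' RADseq pipeline: the function names and the toy haplotypes are hypothetical, and the code assumes aligned, phased haplotypes of equal length with no missing data.

```python
import math
from itertools import combinations

def summarize(haplotypes):
    """Return (sample size, number of segregating sites, mean pairwise differences)."""
    n = len(haplotypes)
    seg_sites = sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)
    pairs = list(combinations(haplotypes, 2))
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return n, seg_sites, total_diffs / len(pairs)

def tajimas_d(n, seg_sites, pi):
    """Tajima's D from sample size n, segregating sites S, and mean pairwise
    differences pi, using the standard variance constants."""
    if n < 4 or seg_sites < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)

# Hypothetical toy alignment: two singleton variants among four haplotypes.
toy_haplotypes = ["ACGTA", "ACGTT", "ACGTA", "TCGTA"]
n_toy, s_toy, pi_toy = summarize(toy_haplotypes)
d_toy = tajimas_d(n_toy, s_toy, pi_toy)
```

An excess of rare variants lowers the mean pairwise difference relative to the number of segregating sites and gives a negative D, the pattern usually read as consistent with population growth.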

23:37

Upstream regulatory sequences that control gene expression evolve rapidly, yet the expression patterns and functions of most genes are typically conserved. To address this paradox, we have reconstructed computationally and resurrected in vivo the cis-regulatory regions of the ancestral Drosophila eve stripe 2 element and evaluated its evolution using a mathematical model of promoter function. Our feed-forward transcriptional model predicts gene expression patterns directly from enhancer sequence. We used this functional model along with phylogenetics to generate a set of possible ancestral eve stripe 2 sequences for the common ancestors of 1) D. simulans and D. sechellia; 2) D. melanogaster, D. simulans, and D. sechellia; and 3) D. erecta and D. yakuba. These ancestral sequences were synthesized and resurrected in vivo. Using a combination of quantitative and computational analysis, we find clear support for functional compensation between the binding sites for Bicoid, Giant, and Krüppel over the course of 40–60 My of Drosophila evolution. We show that this compensation is driven by a coupling interaction between Bicoid activation and repression at the anterior and posterior border necessary for proper placement of the anterior stripe 2 border. A multiplicity of mechanisms for binding site turnover, exemplified by Bicoid, Giant, and Krüppel sites, explains how rapid sequence change may occur while maintaining the function of the cis-regulatory element.

23:37

Diversity of the mammalian olfactory receptor (OR) repertoire has been globally reshaped by niche specialization. However, little is known about the variability of the OR repertoire at a shallower evolutionary timeframe. The vast bat radiation exhibits an extraordinary variety of trophic and sensory specializations. Unlike other mammals, bats possess a unique and diverse OR gene repertoire. We elucidated whether the evolution of the OR gene repertoire can be linked to ecological niche specializations, such as sensory modalities and diet. The OR gene repertoires of 27 bat species spanning the chiropteran radiation were amplified and sequenced. For each species, intact and nonfunctional genes were assessed, and the OR gene abundances in each gene family were analyzed and compared. We identified a unique OR pattern linked to the frugivorous diet of New World fruit-eating bats and a similar convergent pattern in the Old World fruit-eating bats. Our results show a strong association between niche specialization and OR repertoire diversity even at a shallow evolutionary timeframe.

23:37

The acquisition by parasites of the capacity to infect resistant host genotypes, that is, resistance-breaking, is predicted to be hindered by across-host fitness trade-offs. All analyses of costs of resistance-breaking in plant viruses have focused on within-host multiplication without considering other fitness components, which may limit understanding of virus evolution. We have reported that host range expansion of tobamoviruses on L-gene resistant pepper genotypes was associated with severe within-host multiplication penalties. Here, we analyze whether resistance-breaking costs might affect virus survival in the environment by comparing tobamovirus pathotypes differing in infectivity on L-gene resistance alleles. We predicted particle stability from structural models, analyzed particle stability in vitro, and quantified virus accumulation in different plant organs and virus survival in the soil. Survival in the soil differed among tobamovirus pathotypes and depended on differential stability of virus particles. Structure model analyses showed that amino acid changes in the virus coat protein (CP) responsible for resistance-breaking affected the strength of the axial interactions among CP subunits in the rod-shaped particle, thus determining its stability and survival. Pathotypes ranked differently for particle stability/survival and for within-host accumulation. Resistance-breaking costs in survival add to, or subtract from, costs in multiplication according to pathotype. Hence, differential pathotype survival should be considered along with differential multiplication to understand the evolution of the virus populations. Results also show that plant resistance, in addition to selecting for resistance-breaking and for decreased multiplication, also selects for changes in survival, a trait unrelated to the host–pathogen interaction that may condition host range expansion.

23:37

The DNA damage response (DDR) is a crucial signaling network that preserves the integrity of the genome. This network is an ensemble of distinct but often overlapping subnetworks, where different components fulfill distinct functions in precise spatial and temporal scenarios. To understand how these elements have been assembled together in humans, we performed comparative genomic analyses in 47 selected species to trace back their emergence using systematic phylogenetic analyses and estimated gene ages. The emergence of the contribution of posttranslational modifications to the complex regulation of DDR was also investigated. This is the first time a systematic analysis has focused on the evolution of DDR subnetworks as a whole. Our results indicate that a DDR core, mostly constructed around metabolic activities, appeared soon after the emergence of eukaryotes, and that additional regulatory capacities appeared later through complex evolutionary processes. Potential key posttranslational modifications were also in place then, with interacting pairs preferentially appearing at the same evolutionary time, although modifications often led to the subsequent acquisition of new targets. We also found extensive gene loss in essential modules of the regulatory network in fungi, plants, and arthropods, important for their validation as model organisms for DDR studies.

23:37

In filamentous fungi, allorecognition takes the form of heterokaryon incompatibility, a cell death reaction triggered when genetically distinct hyphae fuse. Heterokaryon incompatibility is controlled by specific loci termed het-loci. In this article, we analyzed the natural variation in one such fungal allorecognition determinant, the het-c heterokaryon incompatibility locus of the filamentous ascomycete Podospora anserina. The het-c locus determines an allogenic incompatibility reaction together with two unlinked loci termed het-d and het-e. Each het-c allele is incompatible with a specific subset of the het-d and het-e alleles. We analyzed variability at the het-c locus in a population of 110 individuals, and in additional isolates from various localities. We identified a total of 11 het-c alleles, which define 7 distinct incompatibility specificity classes in combination with the known het-d and het-e alleles. We found that the het-c allorecognition gene of P. anserina is under diversifying selection. We found a highly unequal allele distribution of het-c in the population, which contrasts with the more balanced distribution of functional groups of het-c based on their allorecognition function. One explanation for the observed het-c diversity in the population is its function in allorecognition. However, alleles that are most efficient in allorecognition are rare. An alternative and not exclusive explanation for the observed diversity is that het-c is involved in pathogen recognition. In Arabidopsis thaliana, a homolog of het-c is a pathogen effector target, supporting this hypothesis. We hypothesize that the het-c diversity in P. anserina results from both its functions, in pathogen defense and in allorecognition.

23:37

Lactase persistence (LP) is a genetically determined trait whereby the enzyme lactase is expressed throughout adult life. Lactase is necessary for the digestion of lactose—the main carbohydrate in milk—and its production is downregulated after the weaning period in most humans and all other mammals studied. Several sources of evidence indicate that LP has evolved independently, in different parts of the world over the last 10,000 years, and has been subject to strong natural selection in dairying populations. In Europeans, LP is strongly associated with, and probably caused by, a single C to T mutation 13,910 bp upstream of the lactase (LCT) gene (-13,910*T). Despite a considerable body of research, the reasons why LP should provide such a strong selective advantage remain poorly understood. In this study, we examine one of the most widely cited hypotheses for selection on LP—that fresh milk consumption supplemented the poor vitamin D and calcium status of northern Europe’s early farmers (the calcium assimilation hypothesis). We do this by testing for natural selection on -13,910*T using ancient DNA data from the skeletal remains of eight late Neolithic Iberian individuals, whom we would not expect to have poor vitamin D and calcium status because of relatively high incident UVB light levels. None of the eight samples successfully typed in the study had the derived T-allele. In addition, we reanalyze published data from French Neolithic remains to both test for population continuity and further examine the evolution of LP in the region. Using simulations that accommodate genetic drift, natural selection, uncertainty in calibrated radiocarbon dates, and sampling error, we find that natural selection is still required to explain the observed increase in allele frequency. We conclude that the calcium assimilation hypothesis is insufficient to explain the spread of LP in Europe.