Molecular Biology and Evolution - RSS feed of current issue
URL: http://mbe.oxfordjournals.org
Last update: 1 hour 55 min ago
May 9, 2013, 07:10
Transposable elements (TEs) are mobile genetic sequences that can jump around the genome from one location to another, behaving as genomic parasites. TEs have been particularly effective in colonizing mammalian genomes, and such a heavy TE load is expected to have conditioned genome evolution. Indeed, studies conducted both at the gene and genome levels have uncovered TE insertions that seem to have been co-opted (or exapted) by providing transcription factor binding sites (TFBSs) that serve as promoters and enhancers, leading to the hypothesis that TE exaptation is a major factor in the evolution of gene regulation. Here, we critically review the evidence for exaptation of TE-derived sequences as TFBSs, promoters, enhancers, and silencers/insulators both at the gene and genome levels. We classify the functional impact attributed to TE insertions into four categories of increasing complexity and argue that so far very few studies have conclusively demonstrated exaptation of TEs as transcriptional regulatory regions. We also contend that many genome-wide studies dealing with TE exaptation in recent lineages of mammals are still inconclusive and that the hypothesis of rapid transcriptional regulatory rewiring mediated by TE mobilization must be taken with caution. Finally, we suggest experimental approaches that may help attribute higher-order functions to candidate exapted TEs.
Computational predictions have become indispensable for evaluating the disease-related impact of nonsynonymous single-nucleotide variants discovered in exome sequencing. Many such methods have their roots in molecular evolution, as they use information derived from multiple sequence alignments. We show that the performance of current methods (e.g., PolyPhen-2 and SIFT) is improved significantly by optimizing their statistical models on evolutionarily balanced training data, where equal numbers of positive and negative controls within each evolutionary conservation class are used. Evolutionary balancing significantly reduces the false-positive rates for variants observed at highly conserved sites and false-negative rates for variants observed at fast-evolving sites. Use of these improved methods enables more accurate forecasting when concordant diagnosis from multiple methods is regarded as a more reliable indicator of the prediction. Applied to a large exome variation data set, we find that the current methods produce concordant predictions for less than half of the population variants. These advances are implemented in a web resource for use in practical applications (www.mypeg.info, last accessed March 13, 2013).
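The idea of evolutionarily balanced training data described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the tuple layout `(variant_id, conservation_class, is_disease)`, the function name `balance_by_conservation`, and the choice of random downsampling are hypothetical and are not taken from the abstract or from the tools it mentions.

```python
import random
from collections import defaultdict


def balance_by_conservation(variants, seed=0):
    """Downsample so that every conservation class contributes equal
    numbers of positive (disease) and negative (neutral) controls.

    `variants` is a list of (variant_id, conservation_class, is_disease)
    tuples. This layout is a hypothetical one chosen for illustration.
    """
    rng = random.Random(seed)
    groups = defaultdict(lambda: {True: [], False: []})
    for variant_id, cons_class, is_disease in variants:
        groups[cons_class][bool(is_disease)].append(variant_id)

    balanced = []
    for cons_class in sorted(groups):
        by_label = groups[cons_class]
        # Keep as many of each label as the rarer label provides.
        n = min(len(by_label[True]), len(by_label[False]))
        for label in (True, False):
            for variant_id in rng.sample(by_label[label], n):
                balanced.append((variant_id, cons_class, label))
    return balanced
```

A classifier would then be fitted on the balanced list rather than on the raw controls; how the conservation classes are defined is left open here.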
Large-scale, genome-level molecular phylogenetic analyses present both opportunities and challenges for bacterial evolutionary and ecological studies. We constructed a phylum-level bacterial phylogenetic marker database by surveying all complete bacterial genomes and identifying single-copy genes that were widely distributed in each of the 20 bacterial phyla. We showed that phylum trees made using these markers were highly resolved and were more robust than the bacterial genome tree based on 31 universal bacterial marker genes. In addition, using the Global Ocean Sampling data set as an example, we demonstrated that the expanded marker database greatly increased the power of metagenomic phylotyping. We incorporated the database into an automated phylogenomic inference application (Phyla-AMPHORA) and made it publicly available. We believe that this centralized resource should have broad applicability in bacterial systematics, phylogenetics, and metagenomic studies.
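The marker criterion mentioned above (single-copy genes that are widely distributed within a phylum) can be sketched as a simple filter. This is not the Phyla-AMPHORA procedure: the input layout, the function name `select_markers`, and the `min_fraction` threshold of 0.9 are assumptions made for illustration only.

```python
def select_markers(copy_counts, genomes, min_fraction=0.9):
    """Return gene families usable as phylum-level markers.

    `copy_counts` maps a gene family name to {genome: copy number} for the
    genomes of one phylum. A family qualifies when it is single-copy in
    every genome where it occurs and occurs in at least `min_fraction` of
    the genomes. The 0.9 default is an arbitrary illustrative threshold.
    """
    markers = []
    for family, counts in copy_counts.items():
        present = [g for g in genomes if counts.get(g, 0) > 0]
        single_copy = all(counts[g] == 1 for g in present)
        if single_copy and len(present) >= min_fraction * len(genomes):
            markers.append(family)
    return sorted(markers)
```

Lowering `min_fraction` trades marker coverage across genomes for a larger marker set.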
Gene duplication generates genetic novelty and redundancy and is a major mechanism of evolutionary change in bacteria and eukaryotes. To date, however, gene duplication has been reported only rarely in RNA viruses. Using a conservative BLAST approach, we systematically screened for the presence of duplicated (i.e., paralogous) proteins in all RNA viruses for which full genome sequences are publicly available. Strikingly, we found only nine significantly supported cases of gene duplication, two of which are newly described here: in the 25 and 26 kDa proteins of Beet necrotic yellow vein virus (genus Benyvirus) and in the U1 and U2 proteins of Wongabel virus (family Rhabdoviridae). Hence, gene duplication has occurred at a far lower frequency in the recent evolutionary history of RNA viruses than in other organisms. Although the rapidity of RNA virus evolution means that older gene duplication events will be difficult to detect through sequence-based analyses alone, it is likely that specific features of RNA virus biology, and particularly intrinsic constraints on genome size, reduce the likelihood of the fixation and maintenance of duplicated genes.
Markov models of codon substitution naturally incorporate the structure of the genetic code and the selection intensity at the protein level, providing a more realistic representation of protein-coding sequences compared with nucleotide or amino acid models. Thus, for protein-coding genes, phylogenetic inference is expected to be more accurate under codon models. So far, phylogeny reconstruction under codon models has been elusive due to the computational difficulties of dealing with high-dimensional matrices. Here, we present a fast maximum likelihood (ML) package for phylogenetic inference, CodonPhyML, which offers hundreds of different codon models, the largest variety to date. CodonPhyML is tested on simulated and real data and is shown to offer excellent speed and convergence properties. In addition, CodonPhyML includes the most recent fast methods for estimating phylogenetic branch supports and provides an integral framework for model selection, including amino acid and DNA models.
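To make concrete how a codon model encodes the genetic code and selection at the protein level, the sketch below builds a 61 x 61 rate matrix over sense codons in the general style of Goldman-Yang-type models. It is a simplified illustration, not CodonPhyML code: equal codon frequencies are assumed, and the parameter values `kappa=2.0` (transition/transversion ratio) and `omega=0.5` (nonsynonymous/synonymous ratio) are arbitrary.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SENSE = [codon for codon in CODE if CODE[codon] != "*"]  # 61 sense codons
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def relative_rate(a, b, kappa=2.0, omega=0.5):
    """Relative rate of change from codon a to codon b.

    Zero unless the codons differ at exactly one position; multiplied by
    kappa for a transition and by omega for an amino acid change.
    Codon frequencies are assumed equal (a simplifying assumption).
    """
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    if len(diffs) != 1:
        return 0.0
    rate = 1.0
    if frozenset(diffs[0]) in TRANSITIONS:
        rate *= kappa
    if CODE[a] != CODE[b]:
        rate *= omega
    return rate


def rate_matrix(kappa=2.0, omega=0.5):
    """Build the 61 x 61 matrix with each row summing to zero."""
    q = [[relative_rate(a, b, kappa, omega) for b in SENSE] for a in SENSE]
    for i, row in enumerate(q):
        row[i] = -sum(row)
    return q
```

The matrix has 61 rows and columns, compared with 4 for nucleotide models and 20 for amino acid models, which is the source of the computational cost the abstract refers to.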
The Candida Gene Order Browser (CGOB) was developed as a tool to visualize and analyze synteny relationships in multiple Candida species, and to provide an accurate, manually curated set of orthologous Candida genes for evolutionary analyses. Here, we describe major improvements to CGOB. The underlying structure of the database has been changed significantly. Genomic features are now based directly on genome annotations rather than on protein sequences, which allows non-protein features such as centromere locations in Candida albicans and tRNA genes in all species to be included. The data set has been expanded to 13 species, including genomes of pathogens (C. albicans, C. parapsilosis, C. tropicalis, and C. orthopsilosis), and those of xylose-degrading species with important biotechnological applications (C. tenuis, Scheffersomyces stipitis, and Spathaspora passalidarum). Updated annotations of C. parapsilosis, C. dubliniensis, and Debaryomyces hansenii have been incorporated. We discovered more than 1,500 previously unannotated genes among the 13 genomes, ranging in size from 29 to 3,850 amino acids. Poorly conserved and rapidly evolving genes were also identified. Re-analysis of the mating type loci of the xylose degraders suggests that C. tenuis is heterothallic, whereas both Spa. passalidarum and S. stipitis are homothallic. As well as hosting the browser, the CGOB website (http://cgob.ucd.ie) gives direct access to all the underlying genome annotations, sequences, and curated orthology data.
It is currently unclear whether the amino acid substitutions that occur during protein evolution are primarily driven by adaptation, or reflect the random accumulation of neutral changes. When estimated from genomic data, the proportion of adaptive amino acid substitutions, called α, was found to vary greatly across species, from nearly zero in humans to above 0.5 in Drosophila. These variations have been interpreted as reflecting differences in effective population size, adaptation being supposedly more efficient in large populations. Here, we investigate the influence of effective population size and other biological parameters on the rate of adaptive evolution by simulating the evolution of a coding sequence under Fisher’s geometric formalism. We explicitly model recurrent environmental changes and the subsequent adaptive walks, followed by periods of stasis during which purifying selection dominates. We show that, under a variety of conditions, the effective population size has only a moderate influence on α, and an even weaker influence on the per-generation rate of selective sweeps, modifying the prevalent view in current literature. The rate of environmental change and, interestingly, the dimensionality of the phenotypic space (organismal complexity) affect the adaptive rate more deeply than does the effective population size. We discuss the reasons why verbal arguments have been misleading on that subject and revisit the empirical evidence. Our results question the relevance of the "α" parameter as an indicator of the efficiency of molecular adaptation.
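For readers unfamiliar with α, one widely used way to estimate it from genomic data is the McDonald-Kreitman-based estimator α = 1 - (Ds · Pn) / (Dn · Ps), where Dn and Ds are nonsynonymous and synonymous divergence counts and Pn and Ps are the corresponding polymorphism counts. The abstract does not say which estimator the cited studies used, so the sketch below is only an illustration of that standard formula; the function name and the toy counts in the usage lines are hypothetical.

```python
def alpha_mk(dn, ds, pn, ps):
    """McDonald-Kreitman-based estimate of the proportion of adaptive
    amino acid substitutions: alpha = 1 - (Ds * Pn) / (Dn * Ps).

    dn, ds: nonsynonymous and synonymous fixed differences between species.
    pn, ps: nonsynonymous and synonymous polymorphisms within a species.
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


# Toy counts, invented purely for illustration.
print(alpha_mk(dn=20, ds=40, pn=10, ps=40))  # prints 0.5
```

In this simple form, the estimator assumes that segregating nonsynonymous polymorphisms are effectively neutral, so slightly deleterious variants bias it downward.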
Escherichia coli K12 is a commensal bacterium and one of the best-studied model organisms. Salmonella enterica serovar Typhimurium, on the other hand, is a facultative intracellular pathogen. These two prokaryotic species can be considered related phylogenetically, and they share a large amount of their genetic material, which is commonly termed the "core genome." Despite their shared core genome, the two species display very different lifestyles, and it is unclear to what extent the core genome, apart from the species-specific genes, plays a role in this lifestyle divergence. In this study, we focus on the differences in expression domains for the orthologous genes in E. coli and S. Typhimurium. The iterative comparison of coexpression methodology was used on large expression compendia of both species to uncover the conservation and divergence of gene expression. We found that gene expression conservation occurs mostly independently from amino acid similarity. According to our estimates, more than one quarter of the orthologous genes have a different expression domain in E. coli than in S. Typhimurium. Genes involved in key cellular processes are most likely to have conserved their expression domains, whereas genes showing diverged expression are associated with metabolic processes that, although present in both species, are regulated differently. The expression domains of the shared "core" genome of E. coli and S. Typhimurium, consisting of highly conserved orthologs, have been tuned to help accommodate the differences in lifestyle and the pathogenic potential of Salmonella.
Evolution of sequences mostly involves independent changes at different sites. However, substitutions at neighboring sites may co-occur as multinucleotide replacement events (MNRs). Here, we compare noncoding sequences of several species of primates, and of three species of Drosophila fruit flies, in a phylogenetic analysis of the replacements that occurred between species at nearby nucleotide sites. Both in primates and in Drosophila, the frequency of single-nucleotide replacements is substantially elevated within 10 nucleotides from other replacements that occurred on the same lineage but not on another lineage. The data imply that dinucleotide replacements (DNRs) affecting sites at distances of up to 10 nucleotides from each other are responsible for 2.3% of single-nucleotide replacements in primate genomes and for 5.6% in Drosophila genomes. Among these DNRs, 26% and 69%, respectively, are in fact parts of replacements of three or more nucleotides (trinucleotide replacements, TNRs). The plurality of MNRs affect nearby nucleotides, so that at least six times as many DNRs affect two adjacent nucleotide sites as sites 10 nucleotides apart. Still, approximately 60% of DNRs, and approximately 90% of TNRs, span distances of more than two (or three) nucleotides. MNRs make a major contribution to the observed clustering of substitutions: In the human–chimpanzee comparison, DNRs are responsible for 50% of cases when two nearby replacements are observed on the human lineage, and TNRs are responsible for 83% of cases when three replacements at three immediately adjacent sites are observed on the human lineage. The prevalence of MNRs matches that observed in data on de novo mutations and is also observed in the regions with the lowest sequence conservation, suggesting that MNRs mainly have mutational origin; however, epistatic selection and/or gene conversion may also play a role.
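The core comparison above, replacements clustered within 10 nucleotides on the same lineage versus on different lineages, can be sketched as a pair count. This is a toy illustration, not the authors' pipeline: the function names are hypothetical, positions are assumed to be distinct integer alignment coordinates, and the statistical treatment needed to turn counts into DNR frequencies is omitted.

```python
from itertools import combinations


def same_lineage_pairs(positions, max_dist=10):
    """Count unordered pairs of replacements on one lineage that lie
    within `max_dist` nucleotides of each other."""
    return sum(
        1
        for a, b in combinations(sorted(positions), 2)
        if b - a <= max_dist
    )


def cross_lineage_pairs(positions_a, positions_b, max_dist=10):
    """Count pairs made of one replacement from each lineage that lie
    within `max_dist` nucleotides of each other (the baseline)."""
    return sum(
        1
        for a in positions_a
        for b in positions_b
        if 0 < abs(a - b) <= max_dist
    )
```

An excess of same-lineage pairs over the cross-lineage baseline, after accounting for the number of replacements on each lineage, is the kind of signal the abstract attributes to multinucleotide events.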
Tuberculosis (TB) is a global health problem estimated to kill 1.4 million people per year. Recent advances in the genomics of the causative agents of TB, bacteria known as the Mycobacterium tuberculosis complex (MTBC), have allowed a better understanding of its population structure and provided the foundation for molecular evolution analyses. These studies are crucial for a better understanding of TB, including the variation of vaccine efficacy and disease outcome, together with the emergence of drug resistance. Starting from the analysis of 73 publicly available genomes from all the main MTBC lineages, we screened a set of 576 genes previously associated with drug resistance or encoding membrane proteins for evidence of positive selection. As expected, because antibiotics constitute strong selective pressure, some of the codons identified correspond to the position of confirmed drug-resistance-associated substitutions in the genes embB, rpoB, and katG. Furthermore, we identified diversifying selection in specific codons of the genes Rv0176 and Rv1872c, coding for MCE1-associated transmembrane protein and a putative l-lactate dehydrogenase, respectively. Amino acid sequence analyses showed that in Rv0176, sites undergoing diversifying selection were in a predicted antigen region that varies between "modern" lineages and "ancient" MTBC/BCG strains. In Rv1872c, some of the sites under selection are predicted to impact protein function and thus might result from metabolic adaptation. These results illustrate that diversifying selection in MTBC is happening as a consequence of both antibiotic treatment and other evolutionary pressures.
Zymoseptoria tritici is an important fungal pathogen of wheat that originated in the Fertile Crescent. Its closely related sister species Z. pseudotritici and Z. ardabiliae infect wild grasses in the same region. This recently emerged host–pathogen system provides a rare opportunity to investigate the evolutionary processes shaping the genome of an emerging pathogen. Here, we investigate genetic signatures in plant cell wall degrading enzymes (PCWDEs) that are likely affected by or driving coevolution in plant–pathogen systems. We hypothesize four main evolutionary scenarios and combine comparative genomics, transcriptomics, and selection analyses to assign the majority of PCWDEs in Z. tritici to one of these scenarios. We found widespread differential transcription among different members of the same gene family, challenging the idea of functional redundancy and suggesting instead that specialized enzymatic activity occurs during different stages of the pathogen life cycle. We also found that natural selection has significantly affected at least 19 of the 48 identified PCWDEs. The majority of genes showed signatures of purifying selection, typical for the scenario of conserved substrate optimization. However, six genes showed diversifying selection that could be attributed to either host adaptation or host evasion. This study provides a powerful framework to better understand the roles played by different members of multigene families and to determine which genes are the most appropriate targets for wet laboratory experimentation, for example, to elucidate enzymatic function during relevant phases of a pathogen’s life cycle.
The orphan nuclear receptor gene knirps and its relatives encode a small family of highly conserved proteins. We take advantage of the conservation of the family, using the recent prevalence of genomic data, to reconstruct its evolutionary history, identifying duplication events and tracing the intron–exon structure of the genes over evolution. Many arthropod species have two or three members of this family, but the orthology between members is unclear. We have analyzed the protein coding sequences of members of this family from 15 arthropod species covering all four main arthropod classes, including a total of 28 genes. All members of the family encode a highly conserved 94 amino acid core sequence, part of which is encoded by a single invariant exon. We find that many of the automated predictions of these genes contain errors, while some copies of the gene were not uncovered by automated pipelines, requiring manual corrections and curation. We use the coding sequences to present a phylogenetic analysis of the knirps family. Our analysis indicates that there was a duplication of a single ancestral gene in the lineage leading to insects, which gave rise to two paralogs, eagle and knirps-related. Descendants of this duplication can be identified by the presence or absence of a short protein-coding motif. Independent, lineage-specific duplications occurred in the two crustaceans we sampled. Within the insects, the knirps-related gene underwent further lineage-specific duplications, giving rise to, among others, the Drosophila gap gene knirps.
Change in gene expression is a major facilitator of phenotypic evolution. Understanding the evolutionary potential of gene expression requires taking into account complex systems of regulatory networks, the structure of which could potentially bias evolutionary trajectories. We analyzed the evolutionary potential and divergence of multigene expression in three well-characterized signaling pathways in Drosophila, the mitogen-activated protein kinase (MapK), the Toll, and the insulin receptor/Foxo (InR/Foxo or InR/TOR) pathways, in a multivariate quantitative genetic framework. Gene expression data from a natural population of D. melanogaster were used to estimate the genetic variance–covariance matrices (G) for each network. Although most genes within each pathway exhibited significant genetic variance, the number of independent dimensions of multivariate genetic variance was fewer than the number of genes analyzed. However, for expression, the reduction in dimensionality was not as large as seen for other trait types such as morphology. We then tested whether gene expression divergence between D. melanogaster and an additional six species of the Drosophila genus was biased along the major axes of standing variation observed in D. melanogaster. In many cases, divergence was restricted to directions of phenotypic space harboring above average levels of genetic variance in D. melanogaster, indicating that genetic covariances between genes within pathways have biased interspecific divergence. We tested whether co-expression of genes in both sexes has also biased the pattern of divergence. Including cross-sex genetic covariances increased the degree to which divergence was biased along major axes of genetic variance, suggesting that the co-expression of genes in males and females can generate further constraints on divergence across the Drosophila phylogeny. In contrast to patterns seen for morphological traits in vertebrates, transcriptional constraints do not appear to break down as divergence time between species increases; instead, they persist over tens of millions of years of divergence.
Saccharomyces cerevisiae and S. uvarum are two domesticated species of the Saccharomyces sensu stricto clade that diverged around 100 Ma after whole-genome duplication. Both have retained many duplicated genes associated with glucose fermentation and are characterized by the ability to achieve grape must fermentation. Nevertheless, these two species differ for many other traits, indicating that they underwent different evolutionary histories. To determine how the evolutionary histories of S. cerevisiae and S. uvarum are mirrored on the proteome, we analyzed the genetic variability of the proteomes of domesticated strains of these two species by quantitative mass spectrometry. Overall, 445 proteins were quantified. Massive variations of protein abundances were found that clearly differentiated the two species. Abundance variations in specific metabolic pathways could be related to phenotypic traits known to discriminate the two species. In addition, proteins encoded by duplicated genes were shown to be differently recruited in each species. Comparing the strain differentiation based on the proteome variability to those based on the phenotypic and genetic variations further revealed that the strains of S. uvarum and some strains of S. cerevisiae displayed similar fermentative performances despite strong proteomic and genomic differences. Altogether, these results indicate that the ability of S. cerevisiae and S. uvarum to complete grape must fermentation arose through different evolutionary roads, involving different metabolic pathways and duplicated genes.
A functional understanding of processes involved in adaptive divergence is one of the opportunities afforded by high-throughput transcriptomic technologies that has yet to be realized. Functional analysis of coexpressed genes has succeeded in the biomedical field in identifying key drivers of disease pathways. However, in ecology and evolutionary biology, functional interpretation of transcriptomic data is still limited. Here, we used Weighted Gene Co-Expression Network Analysis (WGCNA) to identify modules of coexpressed genes in muscle and brain tissue of a lake whitefish backcross progeny. Modules were connected to gradients of known adaptive traits involved in the ecological speciation process between benthic and limnetic ecotypes. Key drivers, that is, hub genes of functional modules related to reproduction, growth, and behavior, were identified, and module preservation was assessed in natural populations. Using this approach, we identified modules of coexpressed genes involved in phenotypic divergence and their key drivers, and further identified part of a module that was specifically rewired in the backcross progeny. Functional analysis of transcriptomic data can significantly contribute to the understanding of the mechanisms underlying ecological speciation. Our findings point to bone morphogenetic protein and calcium signaling as common pathways involved in coordinated evolution of trophic behavior, trophic morphology (gill rakers), and reproduction. Results also point to pathways implicating hemoglobins and constitutive stress response (HSP70) governing growth in lake whitefish.
Activation of the contact system leads to the cleavage of kininogen by plasma kallikrein resulting in kinin release and in the initiation of the intrinsic pathway of coagulation. Proteolysis of kininogen also generates antimicrobial peptides (AMPs) and can be induced by diverse pathogens. Thus, the contact system is regarded as a branch of innate immunity. We performed an evolutionary analysis of contact system genes by analyzing both inter- and intraspecies diversity. Results indicated that mammalian kininogen genes evolved adaptively. Positively selected sites are located in all protein domains with the exclusion of the bradykinin region and also involve AMP sequences (including the highly effective NAT26 peptide); positively selected sites also occur at alternative cleavage sites for neutrophil-released kinins. Population genetic analysis in humans indicated that a region of the kininogen gene (KNG1) has been a target of long-standing multiallelic balancing selection and that the coalescence time of the haplotype phylogeny dates back to the split between humans and chimpanzees. No selection signature was detected in the Pan troglodytes KNG1 gene or in human genes encoding other components of the contact system. The selection targets in human KNG1 might be accounted for by variants with transcriptional regulatory activity. Results herein indicate a continuum in selective pressure acting on different timescales and targeting KNG1. This is in line with evidence suggesting a central role for kininogen in modulating the immune response and with its being a target of an extremely diverse array of pathogen species.