February 13, 2015
Population Diversity and Adaptive Evolution in Keratinization Genes: Impact of Environment in Shaping Skin Phenotypes
Several studies have demonstrated the role of climatic factors in shaping skin phenotypes, particularly pigmentation. Keratinization is another well-designed feature of human skin, which is involved in modulating transepidermal water loss (TEWL). Although this physiological process is closely linked to climate, presently it is not clear whether genetic diversity is observed in keratinization and whether this process also responds to the environmental pressure. To address this, we adopted a multipronged approach, which involved analysis of 1) copy number variations in diverse Indian and HapMap populations from varied geographical regions; 2) genetic association with geoclimatic parameters in 61 populations of dbCLINE database in a set of 549 genes from four processes namely keratinization, pigmentation, epidermal differentiation, and housekeeping functions; 3) sequence divergence in 4,316 orthologous promoters and corresponding exonic regions of human and chimpanzee with macaque as outgroup, and 4) protein sequence divergence (Ka/Ks) across nine vertebrate classes, which differ in their extent of TEWL. Our analyses demonstrate that keratinization and epidermal differentiation genes are under accelerated evolution in the human lineage, relative to pigmentation and housekeeping genes. We show that this entire pathway may have been driven by environmental selection pressure through concordant functional polymorphisms across several genes involved in skin keratinization. Remarkably, this underappreciated function of skin may be a crucial determinant of adaptation to diverse environmental pressures across world populations.
Late Pleistocene Australian Marsupial DNA Clarifies the Affinities of Extinct Megafaunal Kangaroos and Wallabies
Understanding the evolution of Australia’s extinct marsupial megafauna has been hindered by a relatively incomplete fossil record and convergent or highly specialized morphology, which confound phylogenetic analyses. Further, the harsh Australian climate and early date of most megafaunal extinctions (39–52 ka) mean that the vast majority of fossil remains are unsuitable for ancient DNA analyses. Here, we apply cross-species DNA capture to fossils from relatively high latitude, high altitude caves in Tasmania. Using low-stringency hybridization and high-throughput sequencing, we were able to retrieve mitochondrial sequences from two extinct megafaunal macropodid species. The two specimens, Simosthenurus occidentalis (giant short-faced kangaroo) and Protemnodon anak (giant wallaby), have been radiocarbon dated to 46–50 and 40–45 ka, respectively. This is significantly older than any Australian fossil that has previously yielded DNA sequence information. Processing the raw sequence data from these samples posed a bioinformatic challenge due to the poor preservation of DNA. We explored several approaches in order to maximize the signal-to-noise ratio in retained sequencing reads. Our findings demonstrate the critical importance of adopting stringent processing criteria when distant outgroups are used as references for mapping highly fragmented DNA. Based on the most stringent nucleotide data sets (879 bp for S. occidentalis and 2,383 bp for P. anak), total-evidence phylogenetic analyses confirm that macropodids consist of three primary lineages: sthenurines such as Simosthenurus (extinct short-faced kangaroos), the macropodines (all other wallabies and kangaroos), and the enigmatic living banded hare-wallaby Lagostrophus fasciatus (Lagostrophinae). Protemnodon emerges as a close relative of Macropus (large living kangaroos), a position not supported by recent morphological phylogenetic analyses.
Trans-Splicing and Operons in Metazoans: Translational Control in Maternally Regulated Development and Recovery from Growth Arrest
Polycistronic mRNAs transcribed from operons are resolved via the trans-splicing of a spliced-leader (SL) RNA. Trans-splicing also occurs at monocistronic transcripts. The phylogenetically sporadic appearance of trans-splicing and operons has made the driving force(s) for their evolution in metazoans unclear. Previous work has proposed that germline expression drives operon organization in Caenorhabditis elegans, and a recent hypothesis proposes that operons provide an evolutionary advantage via the conservation of transcriptional machinery during recovery from growth-arrested states. Using a modified cap analysis of gene expression protocol, we mapped sites of SL trans-splicing genome-wide in the marine chordate Oikopleura dioica. Tiled microarrays revealed the expression dynamics of trans-spliced genes across development and during recovery from growth arrest. Operons did not facilitate recovery from growth arrest in O. dioica. Instead, we found that trans-spliced transcripts were predominantly maternal. We then analyzed data from C. elegans and Ciona intestinalis and found that an enrichment of trans-splicing and operon gene expression in maternal mRNA is shared among all three species, suggesting that this may be a driving force for operon evolution in metazoans. Furthermore, we found that the majority of known terminal oligopyrimidine (TOP) mRNAs are trans-spliced in O. dioica and that the SL contains a TOP-like motif. This suggests that the SL in O. dioica confers nutrient-dependent translational control to trans-spliced mRNAs via the TOR-signaling pathway. We hypothesize that SL-trans-splicing provides an evolutionary advantage in species that depend on translational control for regulating early embryogenesis, growth and oocyte production in response to nutrient levels.
Although population-level genomic sequence data have been gathered extensively for humans, similar data from our closest living relatives are just beginning to emerge. Examination of genomic variation within great apes offers many opportunities to increase our understanding of the forces that have differentially shaped the evolutionary history of hominid taxa. Here, we expand upon the work of the Great Ape Genome Project by analyzing medium to high coverage whole-genome sequences from 14 western lowland gorillas (Gorilla gorilla gorilla), 2 eastern lowland gorillas (G. beringei graueri), and a single Cross River individual (G. gorilla diehli). We infer that the ancestors of western and eastern lowland gorillas diverged from a common ancestor approximately 261 ka, and that the ancestors of the Cross River population diverged from the western lowland gorilla lineage approximately 68 ka. Using a diffusion approximation approach to model the genome-wide site frequency spectrum, we infer a history of western lowland gorillas that includes an ancestral population expansion of 1.4-fold around 970 ka and a recent 5.6-fold contraction in population size 23 ka. The latter may correspond to a major reduction in African equatorial forests around the Last Glacial Maximum. We also analyze patterns of variation among western lowland gorillas to identify several genomic regions with strong signatures of recent selective sweeps. We find that processes related to taste, pancreatic and saliva secretion, sodium ion transmembrane transport, and cardiac muscle function are overrepresented in genomic regions predicted to have experienced recent positive selection.
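The site frequency spectrum mentioned in the abstract above can be illustrated with a minimal sketch. It assumes haplotypes are coded as lists of 0 (ancestral) and 1 (derived) alleles; the function name and the toy data are hypothetical and are not taken from the study, which fits a diffusion approximation to a spectrum of this kind rather than merely counting it.

```python
def site_frequency_spectrum(haplotypes):
    """Unfolded SFS: entry k-1 counts sites where the derived allele
    appears in exactly k of the n sampled haplotypes (0 < k < n)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")

    sfs = [0] * (n - 1)
    for site in range(n_sites):
        derived = sum(h[site] for h in haplotypes)
        # Sites fixed for either allele are not segregating and are skipped.
        if 0 < derived < n:
            sfs[derived - 1] += 1
    return sfs


# Toy data (hypothetical): three haplotypes, four sites.
toy = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]
print(site_frequency_spectrum(toy))
```

An excess of low-frequency classes relative to the neutral expectation is the kind of signal that demographic models of expansion and contraction try to explain.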
MicroRNAs (miRNAs) mediate gene regulation posttranscriptionally through pairing of their seed (2–7 nt) to 3'-untranslated regions (3'-UTRs) or coding regions (coding sequences [CDSs]) of their target genes. CDS target sites generally show weaker repression effects than 3'-UTR sites. However, little is known about the conservation of the function, that is, repression effect, for these two groups of target sites. In addition, no systematic analysis of the evolutionary constraint on CDS sites exists to date. To address these questions, we performed RNA-sequencing to quantify the regulatory effect of miR-15a/miR-16 and miR-92a on their CDS and 3'-UTR targets in human and macaque cells. These miRs were knocked down transiently so the repression effect could be tracked immediately. Although on average CDS targets are less derepressed than 3'-UTR targets in both species, both the 3'-UTR targets and the CDS targets are functionally conserved. The evolutionary analysis of miRNA target sites shows that CDS sites are more conserved than nontarget control, albeit to a lesser extent than 3'-UTR sites. In conclusion, CDS target sites are functional, even though they are subject to less functional constraint than 3'-UTR target sites.
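Seed pairing as described above (miRNA positions 2–7 matching a target site) can be sketched as a simple string search. This is a minimal illustration that assumes perfect Watson-Crick complementarity and ignores G:U wobble, flanking context, and site type; the miRNA and target sequences below are illustrative placeholders, not verified database entries, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Reverse complement of an RNA string over the alphabet A, C, G, U."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def seed_match_positions(mirna, target, seed_start=2, seed_end=7):
    """Return 0-based start indices in `target` (5'->3') that are perfectly
    complementary to the miRNA seed (1-based positions seed_start..seed_end)."""
    seed = mirna[seed_start - 1:seed_end]
    site = reverse_complement_rna(seed)  # what the target must contain
    width = len(site)
    return [i for i in range(len(target) - width + 1)
            if target[i:i + width] == site]


# Illustrative sequences (assumptions, not checked against any database).
example_mirna = "UAGCAGCACGUAAAUAUUGGCG"
example_target = "AAGCUGCUAA"
print(seed_match_positions(example_mirna, example_target))
```

The same search can be run separately over CDS and 3'-UTR sequences to tabulate candidate sites in each region before comparing their conservation or repression effects.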
Mitochondrial genomes of lycophytes are surprisingly diverse, including strikingly different transfer RNA (tRNA) gene complements: No mitochondrial tRNA genes are present in the spikemoss Selaginella moellendorffii, whereas 26 tRNAs are encoded in the chondrome of the clubmoss Huperzia squarrosa. Reinvestigating the latter we found that trnL(gag) and trnS(gga) had never before been identified in any other land plant mitochondrial DNA. Sensitive sequence comparisons showed these two tRNAs as well as trnN(guu) and trnS(gcu) to be very similar to their respective counterparts in chlamydial bacteria. We identified homologs of these chlamydial-type tRNAs also in other lycophyte, fern, and gymnosperm DNAs, suggesting horizontal gene transfer (HGT) into mitochondria in the early vascular plant stem lineages. These findings extend plant mitochondrial HGT to affect individual tRNA genes, to include bacterial donors, and suggest that Chlamydiae on top of their recently proposed key role in primary chloroplast establishment may also have participated in early tracheophyte genome evolution.
Zebrafish (Danio rerio) is an important model for vertebrate development, genomics, physiology, behavior, toxicology, and disease. Additionally, work on numerous Danio species is elucidating evolutionary mechanisms for morphological development. Yet, the relationships of zebrafish and its closest relatives remain unclear possibly due to incomplete lineage sorting, speciation with gene flow, and interspecies hybridization. To clarify these relationships, we first constructed phylogenomic data sets from 30,801 restriction-associated DNA (RAD)-tag loci (483,026 variable positions) with clear orthology to a single location in the sequenced zebrafish genome. We then inferred a well-supported species tree for Danio and tested for gene flow during the diversification of the genus. An approach independent of the sequenced zebrafish genome verified all inferred relationships. Although identification of the sister taxon to zebrafish has been contentious, multiple RAD-tag data sets and several analytical methods provided strong evidence for Danio aesculapii as the most closely related extant zebrafish relative studied to date. Data also displayed patterns consistent with gene flow during speciation and postspeciation introgression in the lineage leading to zebrafish. The incorporation of biogeographic data with phylogenomic analyses put these relationships in a phylogeographic context and supplied additional support for D. aesculapii as the sister species to D. rerio. The clear resolution of this study establishes a framework for investigating the evolutionary biology of Danio and the heterogeneity of genome evolution in the recent history of a model organism within an emerging model genus for genetics, development, and evolution.
Understanding the genetic structure of human populations has important implications for the design and interpretation of disease mapping studies and reconstructing human evolutionary history. To date, inferences of human population structure have primarily been made with common variants. However, recent large-scale resequencing studies have shown an abundance of rare variation in humans, which may be particularly useful for making inferences of fine-scale population structure. To this end, we used an information theory framework and extensive coalescent simulations to rigorously quantify the informativeness of rare and common variation to detect signatures of fine-scale population structure. We show that rare variation affords unique insights into patterns of recent population structure. Furthermore, to empirically assess our theoretical findings, we analyzed high-coverage exome sequences in 6,515 European and African American individuals. As predicted, rare variants are more informative than common polymorphisms in revealing a distinct cluster of European–American individuals, and subsequent analyses demonstrate that these individuals are likely of Ashkenazi Jewish ancestry. Our results provide new insights into the population structure using rare variation, which will be an important factor to account for in rare variant association studies.
The Y-Chromosome Tree Bursts into Leaf: 13,000 High-Confidence SNPs Covering the Majority of Known Clades
Many studies of human populations have used the male-specific region of the Y chromosome (MSY) as a marker, but MSY sequence variants have traditionally been subject to ascertainment bias. Also, dating of haplogroups has relied on Y-specific short tandem repeats (STRs), involving problems of mutation rate choice, and possible long-term mutation saturation. Next-generation sequencing can ascertain single nucleotide polymorphisms (SNPs) in an unbiased way, leading to phylogenies in which branch-lengths are proportional to time, and allowing the times-to-most-recent-common-ancestor (TMRCAs) of nodes to be estimated directly. Here we describe the sequencing of 3.7 Mb of MSY in each of 448 human males at a mean coverage of 51x, yielding 13,261 high-confidence SNPs, 65.9% of which are previously unreported. The resulting phylogeny covers the majority of the known clades, provides date estimates of nodes, and constitutes a robust evolutionary framework for analyzing the history of other classes of mutation. Different clades within the tree show subtle but significant differences in branch lengths to the root. We also apply a set of 23 Y-STRs to the same samples, allowing SNP- and STR-based diversity and TMRCA estimates to be systematically compared. Ongoing purifying selection is suggested by our analysis of the phylogenetic distribution of nonsynonymous variants in 15 MSY single-copy genes.
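When branch lengths are proportional to time, a node's TMRCA can be approximated by dividing the mean number of SNP differences between the node and its descendant tips by the expected number of mutations per year across the sequenced region. The sketch below shows only that arithmetic. The 3.7 Mb region size comes from the abstract, but the mutation rate and mutation count are placeholder assumptions for illustration, and the function name is hypothetical; the authors' actual dating method is not described here.

```python
def tmrca_years(mean_mutations_to_tips, region_bp, mu_per_bp_per_year):
    """Rough TMRCA estimate: mutations accumulated on the path from a node
    to its tips, divided by the expected mutations per year in the region."""
    if region_bp <= 0 or mu_per_bp_per_year <= 0:
        raise ValueError("region size and mutation rate must be positive")
    return mean_mutations_to_tips / (region_bp * mu_per_bp_per_year)


# Placeholder inputs: 37 mutations on average from node to tips,
# 3.7 Mb of sequence, and an ASSUMED rate of 1e-9 per bp per year.
estimate = tmrca_years(37, 3.7e6, 1e-9)
print(round(estimate))
```

Because the estimate scales inversely with the assumed rate, any uncertainty in the mutation rate carries over directly to the date.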
The Evolution and Adaptive Potential of Transcriptional Variation in Sticklebacks--Signatures of Selection and Widespread Heritability
Evidence implicating differential gene expression as a significant driver of evolutionary novelty continues to accumulate, but our understanding of the underlying sources of variation in expression, both environmental and genetic, is wanting. Heritability in particular may be underestimated when inferred from genetic mapping studies, the predominant "genetical genomics" approach to the study of expression variation. Such uncertainty represents a fundamental limitation to testing for adaptive evolution at the transcriptomic level. By studying the inheritance of expression levels in 10,495 genes (10,527 splice variants) in a threespine stickleback pedigree consisting of 563 individuals, half of which were subjected to a thermal treatment, we show that 74–98% of transcripts exhibit significant additive genetic variance. Dominance variance is also prevalent (41–99% of transcripts), and genetic sources of variation seem to play a more significant role in expression variance in the liver than a key environmental variable, temperature. Among-population comparisons suggest that the majority of differential expression in the liver is likely due to neutral divergence; however, we also show that signatures of directional selection may be more prevalent than those of stabilizing selection. This predominantly aligns with the neutral model of evolution for gene expression but also suggests that natural selection may still act on transcriptional variation in the wild. As genetic variation both within- and among-populations ultimately defines adaptive potential, these results indicate that broad adaptive potential may be found within the transcriptome.
Comparative Transcriptomics of Convergent Evolution: Different Genes but Conserved Pathways Underlie Caste Phenotypes across Lineages of Eusocial Insects
An area of great interest in evolutionary genomics is whether convergently evolved traits are the result of convergent molecular mechanisms. The presence of queen and worker castes in insect societies is a spectacular example of convergent evolution and phenotypic plasticity. Multiple insect lineages have evolved environmentally induced alternative castes. Given multiple origins of eusociality in Hymenoptera (bees, ants, and wasps), it has been proposed that insect castes evolved from common genetic "toolkits" consisting of deeply conserved genes. Here, we combine data from previously published studies on fire ants and honey bees with new data for Polistes metricus paper wasps to assess the toolkit idea by presenting the first comparative transcriptome-wide analysis of caste determination among three major hymenopteran social lineages. Overall, we found few shared caste differentially expressed transcripts across the three social lineages. However, there is substantially more overlap at the levels of pathways and biological functions. Thus, there are shared elements but not on the level of specific genes. Instead, the toolkit appears to be relatively "loose," that is, different lineages show convergent molecular evolution involving similar metabolic pathways and molecular functions but not the exact same genes. Additionally, our paper wasp data do not support a complementary hypothesis that "novel" taxonomically restricted genes are related to caste differences.
Episodic Nucleotide Substitutions in Seasonal Influenza Virus H3N2 Can Be Explained by Stochastic Genealogical Process without Positive Selection
Nucleotide substitutions in the HA1 domain of seasonal influenza virus H3N2 occur in temporal clusters, which has been interpreted as a result of recurrent selective sweeps underlying antigenic drift. However, classical theory by Watterson suggests that episodic substitutions are mainly due to stochastic genealogy combined with the unique genetic structure of influenza virus: a high mutation rate over a nonrecombining viral segment. This explains why even larger variance in the number of allelic fixations per year is observed in nonantigenic gene segments of H3N2 than in antigenic (hemagglutinin and neuraminidase) segments. Using simulation, we confirm that allelic substitutions at nonrecombining segments with high mutation rate become temporally clustered without selection. We conclude that temporal clustering of fixations, as it is primarily caused by inherent randomness in the genealogical process at linked sites, cannot be used as evidence of positive selection in the H3N2 population. This effect of linkage and high mutation rate should be carefully considered in analyzing the genomic patterns of allelic substitutions in asexually reproducing systems in general.
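One simple way to quantify how "episodic" a series of yearly fixation counts is uses the index of dispersion, the variance divided by the mean: a Poisson process gives a value near 1, and temporal clustering pushes it above 1. The sketch below is an illustrative summary statistic only, with made-up counts; it is not claimed to be the statistic or simulation used in the study.

```python
from statistics import mean, pvariance


def dispersion_index(counts_per_year):
    """Variance-to-mean ratio of fixation counts per year.
    Values well above 1 indicate temporally clustered substitutions."""
    m = mean(counts_per_year)
    if m == 0:
        raise ValueError("mean is zero; the index is undefined")
    return pvariance(counts_per_year) / m


# Hypothetical yearly counts: same total, different temporal spread.
clustered = [0, 0, 6, 0, 0, 6]
even = [2, 2, 2, 2, 2, 2]
print(dispersion_index(clustered), dispersion_index(even))
```

As the abstract argues, a high value on its own does not distinguish selective sweeps from the randomness of the genealogical process at linked sites.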
We investigated diverse genomic selections using high-density single nucleotide polymorphism data of five distinct cattle breeds. Based on allele frequency differences, we detected hundreds of candidate regions under positive selection across Holstein, Angus, Charolais, Brahman, and N'Dama. In addition to well-known genes such as KIT, MC1R, ASIP, GHR, LCORL, NCAPG, WIF1, and ABCA12, we found evidence for a variety of novel and less-known genes under selection in cattle, such as LAP3, SAR1B, LRIG3, FGF5, and NUDCD3. Selective sweeps near LAP3 were then validated by next-generation sequencing. Genome-wide association analysis involving 26,362 Holsteins confirmed that LAP3 and SAR1B were related to milk production traits, suggesting that our candidate regions were likely functional. In addition, haplotype network analyses further revealed distinct selective pressures and evolution patterns across these five cattle breeds. Our results provided a glimpse into diverse genomic selection during cattle domestication, breed formation, and recent genetic improvement. These findings will facilitate genome-assisted breeding to improve animal production and health.
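The abstract above says candidate regions were detected from allele frequency differences between breeds but does not name the statistic. As one common illustration of the idea, the sketch below computes Wright's F_ST for a single biallelic SNP between two equally weighted populations; the choice of F_ST, the function name, and the example frequencies are assumptions made for this example.

```python
def fst_two_populations(p1, p2):
    """Wright's F_ST for one biallelic SNP, given the frequency of the
    same allele in two populations of equal weight."""
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                      # pooled heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    if h_total == 0:
        return 0.0  # monomorphic in both populations: no differentiation
    return (h_total - h_within) / h_total


# Hypothetical frequencies: identical, moderately different, and fixed differences.
print(fst_two_populations(0.3, 0.3))
print(fst_two_populations(0.1, 0.9))
print(fst_two_populations(1.0, 0.0))
```

In a genome scan, values like these are typically averaged over windows of neighboring SNPs, and the most extreme windows are taken as candidate regions.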
The origin of the eukaryotic cell is one of the most important transitions in the history of life. However, the emergence and early evolution of eukaryotes remains poorly understood. Recent data have shown that the last eukaryotic common ancestor (LECA) was much more complex than previously thought. The LECA already had the genetic machinery encoding the endomembrane apparatus, spliceosome, nuclear pore, and myosin and kinesin cytoskeletal motors. It is unclear, however, when the functional regulation of these cellular components evolved. Here, we address this question by analyzing the origin and evolution of the ubiquitin (Ub) signaling system, one of the most important regulatory layers in eukaryotes. We delineated the evolution of the whole Ub, Small-Ub-related MOdifier (SUMO), and Ub-fold modifier 1 (Ufm1) signaling networks by analyzing representatives from all major eukaryotic, bacterial, and archaeal lineages. We found that the Ub toolkit had a pre-eukaryotic origin and is present in three extant archaeal groups. The pre-eukaryotic Ub toolkit greatly expanded during eukaryogenesis, through massive gene innovation and diversification of protein domain architectures. This resulted in a LECA with essentially all of the Ub-related genes, including the SUMO and Ufm1 Ub-like systems. Ub and SUMO signaling further expanded during eukaryotic evolution, especially labeling and delabeling enzymes responsible for substrate selection. Additionally, we analyzed protein domain architecture evolution and found that multicellular lineages have the most complex Ub systems in terms of domain architectures. Together, we demonstrate that the Ub system predates the origin of eukaryotes and that a burst of innovation during eukaryogenesis led to a LECA with complex posttranslational regulation.
Nematocytes, the stinging cells of cnidarians, are the most evolutionarily ancient venom apparatus. These nanosyringe-like weaponry systems reach pressures of approximately 150 atmospheres before discharging and punching through the outer layer of the prey or predator at accelerations of more than 5 million g, making them one of the fastest biomechanical events known. To gain better understanding of the function of the complex, phylum-specific nematocyst organelle, and its venom payload, we compared the soluble nematocyst’s proteome from the sea anemone Anemonia viridis, the jellyfish Aurelia aurita, and the hydrozoan Hydra magnipapillata, each belonging to one of the three basal cnidarian lineages which diverged over 600 Ma. Although the basic morphological and functional characteristics of the nematocysts of the three organisms are similar, out of hundreds of proteins identified in each organism, only six are shared. These include structural proteins, a chaperone which may help maintain venom activity over extended periods, and dickkopf, an enigmatic Wnt ligand which may also serve as a toxin. Nevertheless, many protein domains are shared between the three organisms’ nematocyst content, suggesting common proteome functionalities. The venoms of Hydra and Aurelia appear to be functionally similar and composed mainly of cytotoxins and enzymes, whereas the venom of Anemonia is markedly unique and based on peptide neurotoxins. Cnidarian venoms show evidence for functional recruitment, yet evidence for diversification through positive selection, common to other venoms, is lacking. The final injected nematocyst payload comprises a mixture of dynamically evolving proteins involved in the development, maturation, maintenance, and discharge of the nematocysts, which is unique to each organism and potentially to each nematocyst type.
Venom-Related Transcripts from Bothrops jararaca Tissues Provide Novel Molecular Insights into the Production and Evolution of Snake Venom
Attempts to reconstruct the evolutionary history of snake toxins in the context of their co-option to the venom gland rarely account for nonvenom snake genes that are paralogous to toxins, and which therefore represent important connectors to ancestral genes. In order to reevaluate this process, we conducted a comparative transcriptomic survey on body tissues from a venomous snake. A nonredundant set of 33,000 unigenes (assembled transcripts of reference genes) was independently assembled from six organs of the medically important viperid snake Bothrops jararaca, providing a reference list of 82 full-length toxins from the venom gland and specific products from other tissues, such as pancreatic digestive enzymes. Unigenes were then screened for nontoxin transcripts paralogous to toxins revealing 1) low level coexpression of approximately 20% of toxin genes (e.g., bradykinin-potentiating peptide, C-type lectin, snake venom metalloproteinase, snake venom nerve growth factor) in body tissues, 2) the identity of the closest paralogs to toxin genes in eight classes of toxins, 3) the location and level of paralog expression, indicating that, in general, co-expression occurs in a higher number of tissues and at lower levels than observed for toxin genes, and 4) strong evidence of a toxin gene reverting back to selective expression in a body tissue. In addition, our differential gene expression analyses identify specific cellular processes that make the venom gland a highly specialized secretory tissue. Our results demonstrate that the evolution and production of venom in snakes is a complex process that can only be understood in the context of comparative data from other snake tissues, including the identification of genes paralogous to venom toxins.