Selection in Europeans on Fatty Acid Desaturases Associated with Dietary Changes

Farhad Hormozdiari and Eleazar Eskin recently applied an extension of CAVIAR to assess signals of selection in populations of European ancestry. CAVIAR is a probabilistic method that identifies a confidence set of SNPs containing all the causal variants in a locus with a predefined probability (e.g., 90% or 95%), while taking into account biases generated by linkage disequilibrium. Farhad, now a post-doctoral scholar at Boston University, developed CAVIAR while a PhD student at UCLA.
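
To make the idea of a confidence set concrete, here is a minimal sketch in Python. It is not CAVIAR itself: it assumes a single causal variant per locus and takes per-SNP posterior probabilities as given, whereas CAVIAR models multiple causal variants jointly and accounts for linkage disequilibrium. The function name and SNP identifiers are hypothetical.

```python
def confidence_set(posteriors, rho=0.95):
    """Return the smallest list of SNP ids whose posterior mass reaches rho.

    posteriors: dict mapping SNP id -> posterior probability of being causal
                (assumed given; they are renormalized to sum to 1).
    rho:        target probability that the set contains the causal SNP.
    """
    if not 0 < rho <= 1:
        raise ValueError("rho must be in (0, 1]")
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posteriors must have positive total mass")

    # Rank SNPs from most to least likely to be causal.
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)

    chosen, cumulative = [], 0.0
    for snp, prob in ranked:
        chosen.append(snp)
        cumulative += prob / total
        if cumulative >= rho:
            break
    return chosen


# Hypothetical posteriors for four SNPs in one locus.
example = {"rs_a": 0.60, "rs_b": 0.25, "rs_c": 0.10, "rs_d": 0.05}
print(confidence_set(example, rho=0.9))
```

A higher rho yields a larger set, which reflects the trade-off between confidence and the number of SNPs that need follow-up.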

This project was led by Matthew T. Buckley and Fernando Racimo at the University of California, Berkeley, and Morten E. Allentoft at the University of Copenhagen. Alleles with strong selection signals have been recently selected for and are thought to carry an evolutionary advantage for individuals in the population. Identifying these alleles helps expand our understanding of the selective pressures that shaped historic populations.

Allele frequency changes across FADS region. For more information, see our full paper.

To analyze the selective processes in Europeans across space and time, the project compared sequencing data from FADS genes obtained from present-day and Bronze Age (5000 to 3000 years ago) Europeans. We focused on FADS genes because prior studies indicate they have been subject to strong positive selection in Africa, South Asia, Greenland, and Europe. FADS genes encode fatty acid desaturases that are important for the conversion of short chain polyunsaturated fatty acids (PUFAs) to long chain fatty acids. In other words, selective pressure on the FADS genes may be linked to dietary adaptations.

Other analyses conducted by the project show that alleles in the FADS2 gene display the strongest changes in allele frequency since the Bronze Age, and this change shows associations with expression changes and multiple lipid-related phenotypes. Farhad and Eleazar used CAVIAR to look for the presence of allelic heterogeneity, an adaptive process in which different mutations at the same locus cause the same phenotype. In an evolutionary context, its presence suggests that a strong selective pressure likely acted upon the population.

Application of CAVIAR to genomic data from the 1000 Genomes Project and 54 Bronze Age Europeans revealed that specific causal variants within the FADS2 gene have been subjected to selective pressure. In particular, FADS2 shows evidence of allelic heterogeneity in three tissue types: transformed fibroblast cells (Pr(2 causal variants) = 0.72), left heart ventricle (Pr(2 causal variants) = 0.74), and whole blood (Pr(3 causal variants) = 0.74).

The project’s comparison of modern to Bronze Age European genomic data shows that selection has indeed strongly acted on the FADS gene cluster over the past 3000 years. The selective patterns observed in European data may be driven by a change in the dietary composition of fatty acids following the human transition from hunting-and-gathering to agriculture. As Europeans obtained more lipids from plants, rather than from fish and mammals, their genes adapted to optimize metabolism of these cereal-based lipids.

For more information, see our paper, which is available for download through Molecular Biology and Evolution: https://www.ncbi.nlm.nih.gov/pubmed/28333262.

The full citation to our paper is: 

Buckley, M.T., Racimo, F., Allentoft, M.E., Jensen, M.K., Jonsson, A., Huang, H., Hormozdiari, F., Sikora, M., Marnetto, D., Eskin, E. and Jørgensen, M.E., 2017. Selection in Europeans on fatty acid desaturases associated with dietary changes. Molecular biology and evolution.

This project used a method introduced in a previous publication: 

Hormozdiari, Farhad; Kostem, Emrah ; Kang, Eun Yong ; Pasaniuc, Bogdan ; Eskin, Eleazar

Identifying causal variants at Loci with multiple signals of association. Journal Article

In: Genetics, 198 (2), pp. 497-508, 2014, ISSN: 1943-2631.

CAVIAR was created by Farhad Hormozdiari, Emrah Kostem, Eun Yong Kang, Bogdan Pasaniuc, and Eleazar Eskin. Visit the following page to download CAVIAR and eCAVIAR: http://genetics.cs.ucla.edu/caviar/.

Incorporating prior information into association studies

Genome-wide association studies (GWAS) seek to identify genetic variants involved in specific traits. GWAS are advantageous for linking variants with traits, because they interrogate the genome in a uniform way. In other words, they examine the whole genome without a preconceived notion of where the associations may lie.

However, we now know a lot about the putative function of genetic variants due to tremendous progress in functional genomics. In many cases, we even know which variants are more likely to be involved in disease when compared to others. Advancements in our understanding of functional genomics motivate the strategic incorporation of prior information in GWAS.

Our group has been interested in this problem for many years. One challenge to addressing this problem is that the widely utilized approach for GWAS involves evaluating an association statistic at each single nucleotide polymorphism (SNP), and these methods take into account only one SNP at a time. The results are then adjusted for multiple testing, and an association is identified if a statistic exceeds a certain threshold. This approach can be described as a frequentist approach. On the other hand, one can incorporate prior information on which SNPs are likely to be the causal variants affecting the trait. This approach is inherently a Bayesian concept. Reconciling these two approaches is not straightforward.

Average power under varying relative risks. For more information, see our paper.

In a 2008 paper published in Genome Research, our group proposed a modification of the multiple testing framework to address this problem. Instead of using the same specific threshold for all of the association statistics, we use a different threshold for each association statistic, where the thresholds are adjusted based on the prior information. Our method takes advantage of the correlation structure by considering multiple markers within a region. In our paper, we demonstrate how to set the thresholds in order to optimally utilize prior information and maximize statistical power.
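
As a rough illustration of per-statistic thresholds, the sketch below uses a weighted Bonferroni correction. This is a simpler stand-in, not the optimized thresholds derived in the paper: it assumes each SNP comes with a nonnegative prior weight (the values shown are hypothetical) and it ignores the correlation between markers. The function names are illustrative only.

```python
def weighted_thresholds(priors, alpha=0.05):
    """Split an overall significance level alpha across SNPs by prior weight.

    Each SNP i gets threshold alpha * priors[i] / sum(priors), so the
    thresholds sum to alpha and the union bound keeps the overall
    false-positive rate at or below alpha.
    """
    if any(p < 0 for p in priors):
        raise ValueError("priors must be nonnegative")
    total = sum(priors)
    if total <= 0:
        raise ValueError("at least one prior must be positive")
    return [alpha * p / total for p in priors]


def significant(pvalues, thresholds):
    """Return indices of SNPs whose p-value falls below their own threshold."""
    return [i for i, (p, t) in enumerate(zip(pvalues, thresholds)) if p < t]


# Hypothetical example: SNP 0 is believed four times more likely to be causal.
pvalues = [0.02, 0.5, 0.008, 0.3, 0.9]
uniform = weighted_thresholds([1, 1, 1, 1, 1])   # plain Bonferroni
informed = weighted_thresholds([4, 1, 1, 1, 1])  # prior favors SNP 0

print(significant(pvalues, uniform))
print(significant(pvalues, informed))
```

With equal priors this reduces to the usual Bonferroni threshold. Raising the weight of one SNP relaxes its threshold at the cost of stricter thresholds elsewhere, which is why the quality of the prior information matters.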

Using prior information in genetic association studies increases power over traditional association studies while maintaining the same overall false-positive rate. Compared to standard methods, our approach is equally simple to apply to association studies, produces interpretable results as p-values, and is optimal in its use of prior information with regard to statistical power.

In 2012, we extended this work to use only tag SNPs for the putative causal variant. This project was developed by Gregory Darnell (then UCLA undergraduate, now PhD student at Princeton University), Dat Duong (then UCLA undergraduate, now UCLA PhD student), and Buhm Han.

More recently, we have applied this framework to incorporate functional information in analysis of eQTL data. In this case, incorporating genomic annotation of variants significantly increases the statistical power of existing eQTL methods and detects more eGenes in comparison to standard approaches. Read the blog post on this paper, and download the full article.

For more information on our general approach, see our paper, which is available for download through Bioinformatics:
https://academic.oup.com/bioinformatics/article/28/12/i147/269880/Incorporating-prior-information-into-association
In addition, the open source implementation of our 2012 paper, MASA, which was developed by Greg Darnell and Dat Duong, is freely available for download at http://masa.cs.ucla.edu/.

The full citations to our papers on this topic are:

Darnell, Gregory; Duong, Dat ; Han, Buhm ; Eskin, Eleazar

Incorporating prior information into association studies. Journal Article

In: Bioinformatics, 28 (12), pp. i147-i153, 2012, ISSN: 1367-4811.

Eleazar Eskin. “Increasing Power in Association Studies by using Linkage Disequilibrium Structure and Molecular Function as Prior Information.” Genome Research. 18(4):653-60 Special Issue Proceedings of the 12th Annual Conference on Research in Computational Biology (RECOMB-2008), 2008.

Hypothalamic transcriptomes of 99 mouse strains reveal trans eQTL hotspots, splicing QTLs and novel non-coding genes

In a recent project, Farhad Hormozdiari and Eleazar Eskin contributed data analysis and interpretation to a project identifying new genes and genomic regions associated with metabolic function in mice. Our paper presents a comprehensive picture of the transcriptome of the mouse hypothalamus and its genetic variation and regulation. This project, which was published in eLife, was led by fellow UCLA researchers Yehudit Hasin-Brumshtein, Jake Lusis, and Desmond Smith.

Mice and humans share virtually the same set of genes; thus, mapping the mouse genome is an important step toward understanding genetic factors in common, complex human diseases such as obesity, heart disease, and diabetes. In metabolic tissues, the integration of genome-wide expression profiles with genetic and phenotypic variance can provide valuable insight into a disease’s underlying molecular mechanism. Measuring gene activity can reveal new molecules that clinical translation efforts may target to treat metabolic disorders.

Our project uses RNA-Seq to characterize the transcriptome in 99 inbred strains of mice from the Hybrid Mouse Diversity Panel (HMDP), a reference resource population for cardiovascular and metabolic traits. Mice were fed a high fat, high sugar diet, and all strains were comprehensively genotyped and phenotyped for 150 metabolic traits. Our study examines tissues relevant to the hypothalamus, the brain region that controls metabolism and regulates body weight and appetite.

We sequenced 285 samples from all 99 strains of the HMDP. Using methods described in our paper, we identified thousands of new isoforms and >400 new genes. The HMDP allowed us to map expression quantitative trait loci (eQTLs) with high resolution and power, identifying both local and trans-acting variants, that is, variants located near the gene whose expression they affect and variants located elsewhere in the genome, respectively.
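
The distinction between local and trans-acting variants can be sketched as a simple position check. The 1 Mb window below is an assumption chosen for illustration, not necessarily the cutoff used in our paper, and the function name and coordinates are hypothetical.

```python
def classify_eqtl(variant_chrom, variant_pos, gene_chrom, gene_start, gene_end,
                  window=1_000_000):
    """Label a variant-gene pair as 'local' or 'trans'.

    A variant is called local when it lies on the same chromosome as the gene
    and within `window` base pairs of the gene body; everything else,
    including variants on other chromosomes, is called trans.
    """
    if variant_chrom != gene_chrom:
        return "trans"
    if gene_start - window <= variant_pos <= gene_end + window:
        return "local"
    return "trans"


# Hypothetical gene spanning chr1:2,000,000-2,050,000.
print(classify_eqtl("chr1", 1_500_000, "chr1", 2_000_000, 2_050_000))
print(classify_eqtl("chr7", 1_500_000, "chr1", 2_000_000, 2_050_000))
```

A trans eQTL hotspot would then correspond to a single genomic region whose variants are labeled trans for many different genes.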

Groups of genes are associated with multiple related phenotypes in HMDP, although not necessarily enriched for GO ontology or specific pathways. For more information, see our paper.

We report numerous novel transcripts supported by proteomic analyses, as well as novel non-coding RNAs. High resolution genetic mapping of transcript levels in the HMDP reveals both local and trans eQTLs, identifying two trans eQTL ‘hotspots’ associated with expression of hundreds of genes. We also report thousands of alternative splicing events regulated by genetic variants. We further showed that the genes associated with trans eQTL hotspots correlate with physiological phenotypes, such as HDL and triglyceride levels. This discovery provides insight into the mechanism behind correlation of these genotypes with complex traits.

Our data capture the various non-neuronal cell types, such as microglia or astrocytes, which are often overlooked in the mostly neuron-focused studies of the hypothalamus. These cells are important mediators of hypothalamic inflammation and other processes induced by a high fat diet. Regulation of gene expression in these cell types impacts every aspect of metabolism, and our data provide a robust framework recapitulating transcriptional processes affecting multiple cell populations. Our approach is thus complementary to ongoing cell type-specific transcriptomic efforts.

For more information, see our paper, which is available for download through eLife: https://elifesciences.org/content/5/e15614.

The full citation to our paper is: 

Hasin-Brumshtein, Yehudit; Hormozdiari, Farhad ; Martin, Lisa ; van Nas, Atila ; Eskin, Eleazar ; Lusis, Aldons J; Drake, Thomas A

Allele-specific expression and eQTL analysis in mouse adipose tissue. Journal Article

In: BMC Genomics, 15 (1), pp. 471, 2014, ISSN: 1471-2164.

See our blog post on a recent paper reviewing the HMDP data set: http://zarlab.cs.ucla.edu/the-hybrid-mouse-diversity-panel-a-resource-for-systems-genetics-analyses-of-metabolic-and-cardiovascular-traits/