Discovering Genetic Variation that Affects Expression in Multiple Tissues

Over the past several years, genome-wide association studies (GWAS) have discovered hundreds of genetic variants involved in complex diseases (10.1056/NEJMra0905980).  The vast majority of these variants do not lie in the protein-coding regions of genes and thus do not affect what the gene produces, but instead likely affect how the genes are regulated.  For this reason, the study of how genetic variation affects gene activity levels (referred to as expression levels) has been a major focus of research for many years.  Genetic variants that affect gene expression are referred to as expression quantitative trait loci (eQTLs) (10.1038/nrg2969).

Several studies collect expression data from multiple tissues, which raises the question of whether the same genetic variants affect expression in multiple tissues (10.1038/ng.2653).  Another way to ask this question is: Are eQTLs tissue specific or shared across tissues?

A challenge in this type of analysis is that an eQTL may affect expression in multiple tissues but, because of small sample sizes, be detected in only one of them.  Thus, traditional eQTL techniques that analyze each tissue on its own are systematically biased against detecting eQTLs shared across tissues.

Jae-Hoon Sul and Buhm Han in our group developed a method to address this issue, which builds upon recent methods in random effects meta-analysis (10.1016/j.ajhg.2011.04.014, 10.1371/journal.pgen.1002555).  To apply these methods, we first analyze each tissue separately and then use the meta-analysis method to combine the results across tissues.  Since our methods are specifically designed to handle “heterogeneity,” meaning that the effect size can differ from study to study, our method performs well whether the effect is present in all of the tissues or in just some of them.  More information about our meta-analysis research is here.
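To make the "analyze each tissue separately, then combine" idea concrete, here is a minimal sketch of a textbook inverse-variance meta-analysis with Cochran's Q and the DerSimonian-Laird estimate of between-study variance. This is not the method from our paper, which uses a different random effects model; the function name `meta_analyze` and the idea of passing one effect estimate and one standard error per tissue are assumptions made only for illustration.

```python
import math


def meta_analyze(betas, ses):
    """Combine per-tissue effect estimates (hypothetical helper, textbook formulas).

    betas: effect size estimate from each tissue
    ses:   standard error of each estimate
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("need one standard error per effect estimate")

    # Fixed effects: weight each tissue by the inverse of its variance.
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta_fixed = sum(w * b for w, b in zip(weights, betas)) / total

    # Cochran's Q measures how much the tissues disagree with the pooled estimate.
    q = sum(w * (b - beta_fixed) ** 2 for w, b in zip(weights, betas))

    # DerSimonian-Laird estimate of the between-tissue variance (heterogeneity).
    c = total - sum(w * w for w in weights) / total
    tau2 = max(0.0, (q - (len(betas) - 1)) / c) if c > 0 else 0.0

    # Random effects: add the heterogeneity variance to each tissue's variance.
    weights_re = [1.0 / (se * se + tau2) for se in ses]
    total_re = sum(weights_re)
    beta_random = sum(w * b for w, b in zip(weights_re, betas)) / total_re

    return {
        "beta_fixed": beta_fixed,
        "se_fixed": math.sqrt(1.0 / total),
        "q": q,
        "tau2": tau2,
        "beta_random": beta_random,
        "se_random": math.sqrt(1.0 / total_re),
    }
```

Calling `meta_analyze([0.0, 1.0], [0.1, 0.1])` on two made-up tissues with very different effects gives a large Q and a positive `tau2`, which is the situation the heterogeneity-aware methods are designed for.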


Over the past few years, our group has published several papers on methods for eQTL analysis.


How much does part of a genome contribute to a trait?

Both genetic and environmental factors contribute to a trait.  The genetic factors that contribute to a trait are typically spread over the genome.  Emrah Kostem in our group recently published a paper on estimating how much a specific genomic region (such as a single chromosome) contributes to a trait (10.1016/j.ajhg.2013.03.010) and released software for performing this analysis, called HEIDI, which is available at http://genetics.cs.ucla.edu/heritability/.  This type of analysis is referred to as “partitioning heritability into the contributions of genomic regions.”

Estimating the heritability of a trait, that is, measuring the relative influence of nature vs. nurture, has long been a fundamental question in genetics. Traditionally, heritability was estimated using related individuals with known pedigrees, such as twins or family cohorts. With the availability of high-throughput genomic technologies, it has been shown that heritability estimates similar to the traditional ones can be obtained from genome-wide association study (GWAS) datasets of unrelated individuals (10.1038/ng.608). In these approaches, the genetic similarities, or kinships, among the individuals are computed from the observed SNPs rather than inferred from a given pedigree.
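As an illustration of computing kinship from SNPs, the sketch below builds a genetic relationship matrix from standardized genotypes, which is one common way this is done. It is not taken from our software; the function name `kinship_from_genotypes` and the 0/1/2 allele-count encoding are assumptions for the example.

```python
import math


def kinship_from_genotypes(genotypes):
    """Genetic relationship matrix from allele counts (hypothetical helper).

    genotypes: list of rows, one per individual, each a list of 0/1/2
               minor-allele counts, one per SNP.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    standardized = [[] for _ in range(n)]

    for j in range(m):
        column = [row[j] for row in genotypes]
        p = sum(column) / (2.0 * n)  # sample allele frequency
        if p <= 0.0 or p >= 1.0:
            continue  # a monomorphic SNP carries no similarity information
        scale = math.sqrt(2.0 * p * (1.0 - p))
        for i in range(n):
            # Center by the mean count 2p and scale by the binomial std. dev.
            standardized[i].append((column[i] - 2.0 * p) / scale)

    used = len(standardized[0])
    if used == 0:
        raise ValueError("no polymorphic SNPs to compute kinship from")

    # Kinship between two individuals is the average product of their
    # standardized genotypes over the SNPs that were used.
    return [
        [sum(a * b for a, b in zip(standardized[i], standardized[k])) / used
         for k in range(n)]
        for i in range(n)
    ]
```

Restricting the SNP columns to those inside one genomic region gives a "local" kinship matrix of the kind used when partitioning heritability by region.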

High-throughput SNP data also make it possible to estimate local genetic similarities, which have recently been used to partition the heritability of a trait into the contributions of genomic regions (10.1038/ng.823). A naive approach estimates the heritability contributions using a linear mixed model (LMM) in which each region is modeled with a separate variance component.

We present a method called HEIDI (Heritability Estimations Distributed) to improve the accuracy and computational efficiency of partitioning the heritability of a trait into the contributions of genomic regions. We show that the naive approach is not accurate for a large number of regions and does not scale beyond several partitions per chromosome in a study with 5000 individuals. We propose an alternative approach, in which the heritability contribution of a region is obtained using a model that includes only the region and its genetic complement, that is, the rest of the genome. The advantage of a two-component model is that it is computationally efficient and fast to fit. It also makes it possible to parallelize the heritability estimation, since the computation for each region can be performed separately across computers.
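For readers who want the two-component model written out, one way to express it in standard variance component notation is given below. The symbols are our choice for this post and are assumptions rather than the paper's exact notation: $K_r$ is the kinship matrix computed from SNPs in region $r$, $K_{-r}$ is the kinship matrix from the rest of the genome, and both are assumed to be scaled so that the variance components are comparable.

```latex
\begin{align}
  y &= X\beta + g_r + g_{-r} + e, \\
  g_r &\sim \mathcal{N}\!\left(0,\ \sigma_r^2 K_r\right), \qquad
  g_{-r} \sim \mathcal{N}\!\left(0,\ \sigma_{-r}^2 K_{-r}\right), \qquad
  e \sim \mathcal{N}\!\left(0,\ \sigma_e^2 I\right), \\
  h_r^2 &= \frac{\sigma_r^2}{\sigma_r^2 + \sigma_{-r}^2 + \sigma_e^2}.
\end{align}
```

Under this formulation each region requires fitting only two genetic variance components, no matter how many regions the genome is divided into, which is why the fits can be run independently.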

We show that the estimates of heritability contributions are inflated when the region and its genetic complement contain SNPs that are in linkage disequilibrium (LD), and we introduce a normalization procedure to mitigate the effect of LD. We normalize the contributions of the chromosomes so that their sum equals the genome-wide heritability estimate, and within each chromosome we normalize the regions’ contributions so that they sum to the chromosome’s contribution.
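The two-level normalization can be sketched as follows. This assumes simple proportional rescaling at each level, which is one natural reading of the description above and may differ in detail from the procedure in the paper; the function name `normalize_contributions` and the dictionary layout are hypothetical, and all raw estimates are assumed to be positive.

```python
def normalize_contributions(chromosome_estimates, region_estimates, genome_wide_h2):
    """Rescale raw heritability contributions (hypothetical helper).

    chromosome_estimates: dict mapping chromosome name -> raw estimate
    region_estimates:     dict mapping chromosome name -> list of raw
                          estimates for the regions on that chromosome
    genome_wide_h2:       genome-wide heritability estimate
    """
    chrom_total = sum(chromosome_estimates.values())
    if chrom_total <= 0:
        raise ValueError("chromosome estimates must have a positive sum")

    # Level 1: chromosomes are rescaled to sum to the genome-wide estimate.
    chrom_norm = {
        chrom: genome_wide_h2 * value / chrom_total
        for chrom, value in chromosome_estimates.items()
    }

    # Level 2: regions are rescaled to sum to their chromosome's share.
    region_norm = {}
    for chrom, regions in region_estimates.items():
        region_total = sum(regions)
        if region_total <= 0:
            raise ValueError("region estimates must have a positive sum")
        region_norm[chrom] = [
            chrom_norm[chrom] * value / region_total for value in regions
        ]

    return chrom_norm, region_norm
```

Because the rescaling is proportional, the relative ranking of regions within a chromosome is unchanged; only the inflated totals are brought back in line with the genome-wide estimate.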



Correcting Population Structure using Mixed Models Webcast

Yesterday, Golden Helix hosted an excellent webcast on correcting population structure in association studies using mixed models, and they highlighted our EMMA (10.1534/genetics.107.080101) and EMMAX (10.1038/ng.548) algorithms.  The webcast was given by Greta Peterson and is available at http://www.goldenhelix.com/Events/recordings/mlm/index.html.  The webcast is a great overview of mixed models applied to population structure in general, as well as of how to use the Golden Helix software to apply mixed models in association studies.

An interesting aspect of the story is that we found out about the webcast from an email advertising that they would cover the EMMAX algorithm.  It turns out that 863 people registered for the webcast, which surpassed their previous record (for a webcast on NGS) by almost 100!  It is exciting to see how much interest there is in mixed models and in our EMMA paper, which we published in 2008.

On our website, we have a bunch of resources for mixed models, including the EMMA, EMMAX and ICE software packages.  We recently posted an overview of mixed models here.
