Both genetic and environmental factors contribute to a trait. The genetic factors which contribute to a trait are typically spread over the genome. Emrah Kostem in our group recently published a paper on estimating how much a specific genomic region (such as a single chromosome) contributes to a trait(10.1016/j.ajhg.2013.03.010) and released a software for performing this analysis called HEIDI which is available at http://genetics.cs.ucla.edu/heritability/. This type of analysis is referred to as “partitioning heritability into the contributions of genomic regions.”
Estimating the heritability of a trait, e.g., measuring the influence of nature vs. nurture, has been a fundamental question in genetics. Traditionally, heritabilities were estimated using related individuals with known pedigrees such as twins or family cohorts. With the availability of high-throughput genomic technologies, it has been shown that heritabilities to those similar to the traditionally estimated can be obtained from genome-wide association study (GWAS) datasets utilizing unrelated individuals(10.1038/ng.608). In these approaches, the genetic similarities, or kinships, among the individuals are computed from the observed spectrum of the SNPs rather than inferring them from a given pedigree data.
Additionally, high-throughput SNP data makes it also possible to estimate local genetic similarities, which has recently been used to partition the heritability of a trait into the contributions of genomic regions(10.1038/ng.823). A naive approach estimates the heritability contributions using a linear mixed model (LMM) approach, where each region is modeled using a separate variance component.
We presented a method called HEIDI (Heritability Estimations Distributed) to improve the accuracy and computational efficiency of partitioning the heritability of a trait into the contributions of genomic regions. We show that the naive approach is not accurate for large number of regions and also does not scale for more than several partitions per chromosome in a study with 5000 individuals. We proposed an alternative approach, where the heritability contribution of a region is obtained using a model that includes the region and its genetic complement, or the rest of the genome. The advantage of using a two-component model is that it is computationally efficient and fast to fit. Additionally, it also makes it possible to parallelize the heritability estimations, where the computation of each region can be performed separately across computers.
We show the estimates of heritability contributions is inflated when the region and its genetic complement have SNPs that are in linkage disequilibrium (LD) and introduce a normalization procedure to mitigate the effect of LD. We normalize the contributions of the chromosomes such that their sum equals to the genome-wide heritability estimate and in each chromosome the regions’ contributions are normalized that sum up to the chromosome contribution.
The full citation to the paper is:
Am J Hum Genet, 92 (4), pp. 558-64, 2013, ISSN: 1537-6605.