Review Article: GWAS and Missing Heritability

A couple of years ago I was asked to write a review article on the progress of my field (computational genetics) targeted toward computer scientists. My article “Discovering Genes Involved in Disease and the Mystery of Missing Heritability” was just published on the cover of the Communications of the ACM. The article is intended both as an introduction to the field and as a description of the rapid progress over the past decade in discovering a large number of variants involved in common human diseases. It assumes no background in biology and is designed to be accessible to researchers and students outside the field. I hope that it will encourage other computational researchers to get involved in genetics. The journal also made a video highlighting this article, which is available here:

Discovering Genes Involved in Disease and the Mystery of Missing Heritability from CACM on Vimeo.

The full citation to the article is:

Eskin, Eleazar. Discovering Genes Involved in Disease and the Mystery of Missing Heritability. In: Commun. ACM, 58 (10), pp. 80-87, 2015, ISSN: 0001-0782.

Review Article: Mixed Models and Population Structure

Mixed models are now widely used in association studies to correct for population structure. A simple, intuitive description of how and why they work is provided in Box 1 on page 812 of our mouse GWAS review (10.1038/nrg3335), published in Nature Reviews Genetics:

A challenge in mouse genome-wide association studies (GWASs) is the complex genetic relationships between strains included in the study. Some of these differences stem from the distinct ancestral origins of the mice, such as the differences between wild-derived strains and classical inbred strains, which are primarily descended from domesticated mice(10.1038/nature06067),(10.1038/ng2087),(10.1038/ng.847). Additionally, among strains, there is variability in the degree to which particular genomic regions are shared owing to the complex breeding history. Traditional association statistical tests make the assumption that the phenotypes of individuals in an association are independent. However, owing to the complex genetic relationships, this assumption is violated for mouse GWASs. Closely related strains will have more similar phenotype values than more distant strains. This phenomenon, which is termed population structure, causes spurious associations in GWASs. Recently, statistical methods have been developed to address this problem, including efficient mixed-model association (EMMA)(18385116) and resample model averaging (RMA)(10.1534/genetics.109.100727), which are widely used in mouse GWASs, and EIGENSTRAT(10.1038/ng1847) and EMMAX(20208533), which are widely used in human studies. The figure demonstrates this problem for mouse GWASs. Panel a shows body-weight data for 38 inbred strains from the Mouse Phenome Database as analysed in Kang et al., (2008) (18385116). A phylogeny of the strains is shown, demonstrating a clear genetic distinction between the wild-derived strains and the classical inbred strains. Note that all wild-derived strains have a lower body weight than classical inbred strains. Panel b shows a Manhattan plot with the association results for 140,000 SNPs(20439770) and body weight. 
Almost every locus appears to be associated with body weight as each of the many SNPs that differentiate the wild-derived and classical inbred strains appears to be associated with body weight. A visualization of the cause of the spurious associations is shown in panel c. Many SNPs and the phenotype are both correlated with the genetic relatedness or population structure among the strains. Statistical techniques can take into account the genetic relationships between the strains to correct for population structure, thus minimizing spurious associations. In this example, EMMA was applied to the data (panel d). The highest peak, although not genome-wide significant, occurs on chromosome 8 and is near the logarithm of the odds (lod) peak of a previously known body weight quantitative trait locus Bwq3(11515095). Panels b and d are reproduced, with permission, from Kang et al., (2008) (18385116) © (2008) Genetics Society of America.
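
The effect described in the box can be reproduced on simulated data. The following is a minimal sketch, not the EMMA implementation: it assumes two hypothetical groups of inbred strains with different allele frequencies, a phenotype that depends only on group membership (so no SNP is truly causal), and a kinship matrix computed from standardized genotypes. The function names (simulate, estimate_h2, wls_chi2), the grid search over the variance-component ratio, and all simulation settings are illustrative assumptions. It compares a plain per-SNP regression with a regression that models the phenotype covariance as proportional to h2*K + (1 - h2)*I, using the genomic inflation factor (median test statistic divided by the median of a chi-square distribution with 1 degree of freedom) as a summary.

```python
import numpy as np

CHI2_1DF_MEDIAN = 0.4549  # median of a chi-square distribution with 1 degree of freedom


def simulate(n_per_group=20, n_snps=1000, group_effect=3.0, seed=0):
    """Simulated inbred strains (genotypes 0/1) in two groups; no SNP is causal."""
    rng = np.random.default_rng(seed)
    group = np.repeat([0, 1], n_per_group)            # e.g. wild-derived vs classical (assumed)
    freqs = rng.uniform(0.1, 0.9, size=(2, n_snps))   # group-specific allele frequencies
    geno = (rng.random((group.size, n_snps)) < freqs[group]).astype(float)
    geno = geno[:, geno.std(axis=0) > 0]              # drop monomorphic SNPs
    pheno = group_effect * group + rng.normal(size=group.size)
    return geno, pheno


def estimate_h2(y_r, ones_r, eigvals, grid=np.linspace(0.0, 0.99, 100)):
    """Grid-search maximum likelihood for h2 under the null (intercept-only) model,
    working in the eigenbasis of K where the covariance is diagonal."""
    n = y_r.size
    best_h2, best_ll = 0.0, -np.inf
    for h2 in grid:
        var = h2 * eigvals + (1.0 - h2)
        w = 1.0 / var
        mu = np.sum(w * ones_r * y_r) / np.sum(w * ones_r**2)
        resid = y_r - mu * ones_r
        sigma2 = np.sum(w * resid**2) / n
        ll = -0.5 * (n * np.log(sigma2) + np.sum(np.log(var)))  # profile log-likelihood
        if ll > best_ll:
            best_h2, best_ll = h2, ll
    return best_h2


def wls_chi2(y, x, ones, w):
    """Squared Wald statistic for the slope in a weighted regression of y on [ones, x]."""
    X = np.column_stack([ones, x])
    XtW = X.T * w
    A = XtW @ X
    beta = np.linalg.solve(A, XtW @ y)
    resid = y - X @ beta
    sigma2 = np.sum(w * resid**2) / (y.size - 2)
    slope_var = sigma2 * np.linalg.inv(A)[1, 1]
    return beta[1] ** 2 / slope_var


geno, pheno = simulate()
n, m = geno.shape
ones = np.ones(n)

# Kinship from standardized genotypes, then its eigendecomposition.
z = (geno - geno.mean(axis=0)) / geno.std(axis=0)
K = z @ z.T / m
eigvals, U = np.linalg.eigh(K)
eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off

# Rotate the data so that the mixed-model covariance becomes diagonal.
y_r, ones_r, geno_r = U.T @ pheno, U.T @ ones, U.T @ geno
h2 = estimate_h2(y_r, ones_r, eigvals)
w = 1.0 / (h2 * eigvals + (1.0 - h2))

# Plain regression (unit weights, no rotation) versus mixed-model regression.
chi2_ols = np.array([wls_chi2(pheno, geno[:, j], ones, ones) for j in range(m)])
chi2_mm = np.array([wls_chi2(y_r, geno_r[:, j], ones_r, w) for j in range(m)])

lambda_ols = np.median(chi2_ols) / CHI2_1DF_MEDIAN
lambda_mm = np.median(chi2_mm) / CHI2_1DF_MEDIAN
print(f"lambda (plain regression) = {lambda_ols:.2f}")
print(f"lambda (mixed model)      = {lambda_mm:.2f}, estimated h2 = {h2:.2f}")
```

In this sketch the inflation factor for the plain regression should come out well above 1, because every SNP that differs in frequency between the two groups is correlated with the phenotype, while the mixed-model value should be considerably lower. The exact numbers depend on the seed and the simulation settings, and real methods such as EMMA estimate the variance components more carefully than the coarse grid used here.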


Review Article: Mouse Genetics

We recently published two reviews on mouse genetics, which are a great place to start for anyone interested in learning about recent developments in the area. While the genetic cross has been the main mouse genetic study design for decades, over the past several years, several novel mouse study designs have been demonstrated to have advantages over the cross.

The first review, written by Jonathan Flint and Eleazar Eskin, covers all the major novel strategies (10.1038/nrg3335). The Nature Reviews Genetics journal cover image, ‘Chromatic’ by Patrick Morgan, was inspired by the Review.

The second review covers the Hybrid Mouse Diversity Panel (HMDP), a study design developed at UCLA jointly by the Lusis and Eskin groups (10.1007/s00335-012-9411-5).

Full citations:

Flint, J. & Eskin, E., 2012, Genome-wide association studies in mice, Nature Reviews Genetics.


Genome-wide association studies (GWASs) have transformed the field of human genetics and have led to the discovery of hundreds of genes that are implicated in human disease. The technological advances that drove this revolution are now poised to transform genetic studies in model organisms, including mice. However, the design of GWASs in mouse strains is fundamentally different from the design of human GWASs, creating new challenges and opportunities. This Review gives an overview of the novel study designs for mouse GWASs, which dramatically improve both the statistical power and resolution compared to classical gene-mapping approaches.

Ghazalpour, A., Rau, C.D., Farber, C.R., Bennett, B.J., Orozco, L.D., van Nas, A., Pan, C., Allayee, H., Beaven, S.W., Civelek, M., Davis, R.C., Drake, T.A., Friedman, R.A., Furlotte, N., Hui, S.T., Jentsch, J.D., Kostem, E., Kang, H.M., Kang, E.Y., Joo, J.W., Korshunov, V.A., Laughlin, R.E., Martin, L.J., Ohmen, J.D., Parks, B.W., Pellegrini, M., Reue, K., Smith, D.J., Tetradis, S., Wang, J., Wang, Y., Weiss, J.N., Kirchgessner, T., Gargalovic, P.S., Eskin, E., Lusis, A.J. & Leboeuf, R.C., 2012, Hybrid mouse diversity panel: a panel of inbred mouse strains suitable for analysis of complex genetic traits, Mammalian genome : official journal of the International Mammalian Genome Society.


We have developed an association-based approach using classical inbred strains of mice in which we correct for population structure, which is very extensive in mice, using an efficient mixed-model algorithm. Our approach includes inbred parental strains as well as recombinant inbred strains in order to capture loci with effect sizes typical of complex traits in mice (in the range of 5 % of total trait variance). Over the last few years, we have typed the hybrid mouse diversity panel (HMDP) strains for a variety of clinical traits as well as intermediate phenotypes and have shown that the HMDP has sufficient power to map genes for highly complex traits with resolution that is in most cases less than a megabase. In this essay, we review our experience with the HMDP, describe various ongoing projects, and discuss how the HMDP may fit into the larger picture of common diseases and different approaches.