Review Article: GWAS and Missing Heritability

A couple of years ago I was asked to write a review article on the progress of my field (computational genetics) targeted toward computer scientists. My article “Discovering Genes Involved in Disease and the Mystery of Missing Heritability” was just published on the cover of the Communications of the ACM. The article is written as an introduction to the field and describes the rapid progress over the past decade in discovering a large number of variants involved in common human diseases. It assumes no background in biology and is designed to be accessible to researchers and students outside the field. I hope that it will encourage other computational researchers to get involved in genetics. The journal also made a video highlighting this article, which is available here:

Discovering Genes Involved in Disease and the Mystery of Missing Heritability from CACM on Vimeo.

The full citation to the article is:
 

Eskin, Eleazar

Discovering Genes Involved in Disease and the Mystery of Missing Heritability Journal Article

In: Commun. ACM, 58 (10), pp. 80-87, 2015, ISSN: 0001-0782.


Video Tutorial: An Introduction to Read Mapping and Next Generation Sequencing

We teach a course called “Computational Genetics” each year at UCLA. The course is taken by graduate and undergraduate students from the Computer Science department and from the many biology and medical school programs. It covers topics related to genome-wide association studies (GWAS) as well as topics related to next-generation sequencing studies. One lecture given each year is an introduction to sequencing and read mapping. The video of this lecture is available here. Please excuse the poor cinematography; the lecture was recorded from the back of the classroom.

Thesis Defense: Dr. Eun Yong Kang

Eun Yong Kang in our group defended his thesis on Monday, Nov 25th, 2013, from 2:30pm to 4:30pm in 4760 Boelter Hall.

The title of his defense was “Computational Genetic Approaches for Understanding the Genetic Architecture of Complex Traits”. The video of this defense is now available here. Fortunately for the lab, Eun is now a post-doc in the group.

The abstract of his thesis defense was:
Recent advances in genotyping and sequencing technology have enabled researchers to collect an enormous amount of high-dimensional genotype data. These large-scale genomic data provide an unprecedented opportunity for researchers to study and analyze the genetic factors of human complex traits. One of the major challenges in analyzing these high-dimensional genomic data is the need for effective and efficient computational methodologies. In this talk, I will focus on three problems that I have worked on. First, I will introduce a method for inferring biological networks from high-throughput data containing both genetic variation and gene expression profiles from genetically distinct strains of an organism. For this problem, I use causal inference techniques to infer the presence or absence of causal relationships between yeast gene expression levels in the framework of graphical causal models. Second, I introduce an efficient pairwise identity by descent (IBD) association mapping method, which utilizes importance sampling to improve efficiency and enable approximation of extremely small p-values. Using the WTCCC type 1 diabetes data, I show that Fast-Pairwise can successfully pinpoint a gene known to be associated with the disease within the MHC region. Finally, I introduce a novel meta-analytic approach (Meta-GxE) to identify gene-by-environment interactions by aggregating multiple studies with varying environmental conditions. The Meta-GxE approach jointly analyzes these studies using a random effects model to identify loci involved in gene-by-environment interactions. This approach is motivated by the observation that methods for discovering gene-by-environment interactions are closely related to random effects models for meta-analysis.
We show that interactions can be interpreted as heterogeneity and can be detected without utilizing the traditional uni- or multi-variate approaches for discovery of gene-by-environment interactions. Application of this approach to 17 mouse studies identifies 26 significant loci involved in high-density lipoprotein (HDL) cholesterol, many of which show significant evidence of involvement in gene-by-environment interactions.
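
The idea that an interaction can show up as heterogeneity across studies can be illustrated with the standard meta-analysis heterogeneity statistics. The sketch below is not the Meta-GxE implementation. It computes Cochran's Q and the DerSimonian-Laird estimate of between-study variance for a single locus, given per-study effect sizes and standard errors. The function name and the example numbers are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def heterogeneity_stats(betas: Sequence[float],
                        std_errs: Sequence[float]) -> Tuple[float, int, float]:
    """Return (Q, df, tau2) for one locus measured in several studies.

    Q    : Cochran's Q heterogeneity statistic
    df   : degrees of freedom (number of studies minus one)
    tau2 : DerSimonian-Laird estimate of the between-study variance
    """
    if len(betas) != len(std_errs) or len(betas) < 2:
        raise ValueError("need matching effect sizes and standard errors "
                         "from at least two studies")

    # Inverse-variance weights, as in a fixed effects meta-analysis.
    weights = [1.0 / (se * se) for se in std_errs]
    w_sum = sum(weights)

    # Fixed effects (inverse-variance weighted) mean effect size.
    fixed = sum(w * b for w, b in zip(weights, betas)) / w_sum

    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1

    # DerSimonian-Laird moment estimator, truncated at zero.
    c = w_sum - sum(w * w for w in weights) / w_sum
    tau2 = max(0.0, (q - df) / c)
    return q, df, tau2


# Hypothetical effect sizes for one locus in three studies that were
# run under different environmental conditions (illustrative values only).
example_betas = [0.10, 0.45, -0.20]
example_std_errs = [0.08, 0.10, 0.09]
q_stat, dof, tau_sq = heterogeneity_stats(example_betas, example_std_errs)
print(f"Q = {q_stat:.2f} on {dof} df, tau^2 = {tau_sq:.4f}")
```

Under the null hypothesis of no heterogeneity, Q is approximately chi-square distributed with df degrees of freedom, so a Q that is large relative to that distribution suggests the effect of the locus differs between studies.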

Eun’s talk covered the following papers:

Kang, Eun Yong; Han, Buhm; Furlotte, Nicholas; Joo, Jong Wha J; Shih, Diana; Davis, Richard C; Lusis, Aldons J; Eskin, Eleazar

Meta-Analysis Identifies Gene-by-Environment Interactions as Demonstrated in a Study of 4,965 Mice Journal Article

In: PLoS Genet, 10 (1), pp. e1004022, 2014, ISSN: 1553-7404.


Han, Buhm; Kang, Eun Yong; Raychaudhuri, Soumya; de Bakker, Paul I W; Eskin, Eleazar

Fast Pairwise IBD Association Testing in Genome-wide Association Studies. Journal Article

In: Bioinformatics, 2013, ISSN: 1367-4811.


Kang, Eun Yong; Ye, Chun; Shpitser, Ilya; Eskin, Eleazar

Detecting the presence and absence of causal relationships between expression of yeast genes with very few samples. Journal Article

In: J Comput Biol, 17 (3), pp. 533-46, 2010, ISSN: 1557-8666.
