B.I.G. Summer in ZarLab

This summer, six young adults took part in a unique eight-week program with ZarLab, learning practical skills in genomics and bioinformatics while conducting research on large-scale human genetic datasets. Four of them were undergraduate students in the Bruins-In-Genomics (B.I.G.) Summer Program, an intensive laboratory and seminar program that provides real-world experience for students interested in pursuing interdisciplinary graduate education in the quantitative and biological sciences. In addition, two Los Angeles-area high school students participated in laboratory activities as volunteer researchers.

Eleazar Eskin, co-organizer of the summer program, and Serghei Mangul, post-doctoral scholar, hosted the young scholars in ZarLab, a UCLA computational genetics group affiliated with both the Computer Science Department and the Human Genetics Department. Mangul supervised a group of students who collaborated on a project to develop computational methods for studying the human immune system and microbiome. Working with data from one of the largest sequencing projects in the world, the Genotype-Tissue Expression (GTEx) study, the students analyzed more than 8,000 samples obtained from 544 individuals and representing 53 different tissue types. In doing so, they became familiar with current approaches to studying how changes in our genes contribute to common human diseases.

During a poster session on August 12, 2016, the B.I.G. participants presented the results of their work on GTEx:

  • Jeremy Rotman: “Studying the microbiome by analyzing the coverage of sequencing reads mapped to viruses, eukaryotes, and bacteria”
  • Benjamin Statz: “An improved method for analysis of variable domain of B and T cell receptors”
  • William Van Der Wey: “Functional profiling of microbial communities across multiple human tissues”
  • Kevin Wesel: “Profiling repeat elements across multiple human tissues”

In addition to mentoring B.I.G. Program students in ZarLab, Mangul developed and presented a three-part series of workshops introducing students to UNIX earlier in the program.

Eskin and Mangul also hosted a B.I.G. Program student, Samantha Jenson, who collaborated with Jonathan Flint, a world-renowned authority on the genetics of depression and co-director of UCLA’s Depression Grand Challenge. This year, Eskin facilitated a Neurogenetics working group and weekly neurogenetics seminar series for the B.I.G. Program. Participants in this group gained first-hand experience in the process of developing methods for mapping the underlying genetic causes of Major Depression Disorder. Jenson presented her work on “Structural variant discovery in Major Depression Disorder” during the August 12th poster session.

The annual B.I.G. Program is a collaboration between multiple labs and includes next-generation sequencing analysis workshops, weekly science talks by researchers, a weekly student journal club, professional development seminars, social activities, concluding poster sessions, and an optional GRE test prep course. Participants also benefited from relevant workshops and research talks presented during the UCLA Computational Genomics Summer Institute (CGSI).

Congratulations to Benjamin, Jeremy, Kevin, Samantha, and William on their acceptance to and success in the B.I.G. Summer Program!

We thank the following generous institutions that made this year’s B.I.G. Summer Program a big success:

  • National Institutes of Health grant MH109172
  • UCOP for a UC-HBCU partnership Program in Genomics and Systems
  • NIH NIBIB for NGS Data Analysis Skills for the Biosciences Pipeline R25EB022364
  • NIH NIMH for Undergraduate Research Experience in Neuropsychiatric Genomics R25MH109172-01

Learn more about the B.I.G. Program:
UCLA Newsroom: UCLA hosts summer program for future biosciences leaders
http://newsroom.ucla.edu/releases/ucla-hosts-summer-program-for-future-biosciences-leaders

Review Article: Mixed Models and Population Structure

Mixed models are now widely used in association studies to correct for population structure. A simple, intuitive description of how and why they work is provided in our mouse GWAS review (10.1038/nrg3335), published in Nature Reviews Genetics, as Box 1 on page 812:

A challenge in mouse genome-wide association studies (GWASs) is the complex genetic relationships between strains included in the study. Some of these differences stem from the distinct ancestral origins of the mice, such as the differences between wild-derived strains and classical inbred strains, which are primarily descended from domesticated mice (10.1038/nature06067), (10.1038/ng2087), (10.1038/ng.847). Additionally, among strains, there is variability in the degree to which particular genomic regions are shared owing to the complex breeding history. Traditional association statistical tests make the assumption that the phenotypes of individuals in an association are independent. However, owing to the complex genetic relationships, this assumption is violated for mouse GWASs. Closely related strains will have more similar phenotype values than more distant strains. This phenomenon, which is termed population structure, causes spurious associations in GWASs. Recently, statistical methods have been developed to address this problem, including efficient mixed-model association (EMMA) (18385116) and resample model averaging (RMA) (10.1534/genetics.109.100727), which are widely used in mouse GWASs, and EIGENSTRAT (10.1038/ng1847) and EMMAX (20208533), which are widely used in human studies.

The figure demonstrates this problem for mouse GWASs. Panel a shows body-weight data for 38 inbred strains from the Mouse Phenome Database as analysed in Kang et al. (2008) (18385116). A phylogeny of the strains is shown, demonstrating a clear genetic distinction between the wild-derived strains and the classical inbred strains. Note that all wild-derived strains have a lower body weight than classical inbred strains. Panel b shows a Manhattan plot with the association results for 140,000 SNPs (20439770) and body weight. Almost every locus appears to be associated with body weight as each of the many SNPs that differentiate the wild-derived and classical inbred strains appears to be associated with body weight. A visualization of the cause of the spurious associations is shown in panel c. Many SNPs and the phenotype are both correlated with the genetic relatedness or population structure among the strains. Statistical techniques can take into account the genetic relationships between the strains to correct for population structure, thus minimizing spurious associations. In this example, EMMA was applied to the data (panel d). The highest peak, although not genome-wide significant, occurs on chromosome 8 and is near the logarithm of the odds (lod) peak of a previously known body weight quantitative trait locus Bwq3 (11515095). Panels b and d are reproduced, with permission, from Kang et al. (2008) (18385116) © (2008) Genetics Society of America.
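
The confounding described in the box can be reproduced in a small simulation. The Python sketch below is illustrative only: the group sizes, body-weight means, and allele frequencies are made-up values rather than the Mouse Phenome Database data, and the correction simply regresses out a known group label instead of fitting a mixed model such as EMMA. It is meant only to show why ignoring ancestry makes SNPs with no effect look associated.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx < 1e-12 or syy < 1e-12:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, dof):
    """Convert a correlation into a t statistic with `dof` degrees of freedom."""
    r = max(min(r, 0.999999), -0.999999)  # guard against division by zero
    return r * math.sqrt(dof / (1.0 - r * r))


def center_within_groups(values, groups):
    """Subtract each group's mean, i.e. regress out the group label."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def simulate(n_wild=12, n_classical=26, n_snps=500, t_cutoff=2.03, seed=0):
    """Return the fraction of null SNPs called significant: (naive, adjusted).

    All parameter values are hypothetical; t_cutoff is approximately the
    two-sided 5% threshold for the degrees of freedom used here.
    """
    rng = random.Random(seed)
    groups = ["wild"] * n_wild + ["classical"] * n_classical
    n = len(groups)

    # Synthetic body weights: group means differ, but no SNP has any effect.
    mean_weight = {"wild": 15.0, "classical": 25.0}
    phenotype = [rng.gauss(mean_weight[g], 3.0) for g in groups]
    phenotype_adj = center_within_groups(phenotype, groups)

    # Allele frequencies differ by group, so every SNP tracks ancestry.
    allele_freq = {"wild": 0.9, "classical": 0.1}

    naive_hits = adjusted_hits = 0
    for _ in range(n_snps):
        snp = [1.0 if rng.random() < allele_freq[g] else 0.0 for g in groups]

        # Naive test: treats all strains as independent observations.
        t_naive = t_from_r(pearson_r(snp, phenotype), n - 2)
        naive_hits += abs(t_naive) > t_cutoff

        # Adjusted test: compares SNP and phenotype only within groups.
        snp_adj = center_within_groups(snp, groups)
        t_adj = t_from_r(pearson_r(snp_adj, phenotype_adj), n - 3)
        adjusted_hits += abs(t_adj) > t_cutoff

    return naive_hits / n_snps, adjusted_hits / n_snps


if __name__ == "__main__":
    naive, adjusted = simulate()
    print(f"naive:    {naive:.2f} of null SNPs called significant")
    print(f"adjusted: {adjusted:.2f} of null SNPs called significant")
```

In a real study the groups are not known in advance and relatedness between strains is a matter of degree, which is why methods such as EMMA model the pairwise genetic relatedness among all strains rather than a single label.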

Review Article: Mouse Genetics

We recently published two reviews on mouse genetics, which are a great place to start for anyone interested in learning about recent developments in the area. While the genetic cross has been the main mouse genetic study design for decades, several novel mouse study designs introduced in recent years have been shown to have advantages over the cross.

The first review, written by Jonathan Flint and Eleazar Eskin, covers all the major novel strategies (10.1038/nrg3335). The Nature Reviews Genetics cover image, ‘Chromatic’ by Patrick Morgan, was inspired by the Review.

The second review covers the Hybrid Mouse Diversity Panel (HMDP), a study design developed at UCLA jointly by the Lusis and Eskin groups (10.1007/s00335-012-9411-5).

Full citations:

Flint, J. & Eskin, E., 2012, Genome-wide association studies in mice, Nature reviews. Genetics.

Abstract:

Genome-wide association studies (GWASs) have transformed the field of human genetics and have led to the discovery of hundreds of genes that are implicated in human disease. The technological advances that drove this revolution are now poised to transform genetic studies in model organisms, including mice. However, the design of GWASs in mouse strains is fundamentally different from the design of human GWASs, creating new challenges and opportunities. This Review gives an overview of the novel study designs for mouse GWASs, which dramatically improve both the statistical power and resolution compared to classical gene-mapping approaches.

Ghazalpour, A., Rau, C.D., Farber, C.R., Bennett, B.J., Orozco, L.D., van Nas, A., Pan, C., Allayee, H., Beaven, S.W., Civelek, M., Davis, R.C., Drake, T.A., Friedman, R.A., Furlotte, N., Hui, S.T., Jentsch, J.D., Kostem, E., Kang, H.M., Kang, E.Y., Joo, J.W., Korshunov, V.A., Laughlin, R.E., Martin, L.J., Ohmen, J.D., Parks, B.W., Pellegrini, M., Reue, K., Smith, D.J., Tetradis, S., Wang, J., Wang, Y., Weiss, J.N., Kirchgessner, T., Gargalovic, P.S., Eskin, E., Lusis, A.J. & Leboeuf, R.C., 2012, Hybrid mouse diversity panel: a panel of inbred mouse strains suitable for analysis of complex genetic traits, Mammalian genome : official journal of the International Mammalian Genome Society.

Abstract:

We have developed an association-based approach using classical inbred strains of mice in which we correct for population structure, which is very extensive in mice, using an efficient mixed-model algorithm. Our approach includes inbred parental strains as well as recombinant inbred strains in order to capture loci with effect sizes typical of complex traits in mice (in the range of 5 % of total trait variance). Over the last few years, we have typed the hybrid mouse diversity panel (HMDP) strains for a variety of clinical traits as well as intermediate phenotypes and have shown that the HMDP has sufficient power to map genes for highly complex traits with resolution that is in most cases less than a megabase. In this essay, we review our experience with the HMDP, describe various ongoing projects, and discuss how the HMDP may fit into the larger picture of common diseases and different approaches.
