Jae Hoon Sul successfully defended his thesis on Wednesday, September 19th. His talk is posted on our YouTube channel, ZarlabUCLA. Jae Hoon’s talk covers several projects, including using mixed models to correct for population structure, rare variant association studies, and a meta-analysis approach for detecting multi-tissue eQTLs. Fortunately for the lab, Jae Hoon is staying at UCLA for another year as a post-doc.
More details on the work covered in the talk are available in the papers he discusses:
Sul, Jae Hoon; Han, Buhm; Ye, Chun; Choi, Ted; Eskin, Eleazar. Effectively Identifying eQTLs from Multiple Tissues by Combining Mixed Model and Meta-analytic Approaches. In: PLoS Genet, 9 (6), pp. e1003491, 2013, ISSN: 1553-7404.
@article{10.1371/journal.pgen.1003491,
  title = {Effectively Identifying eQTLs from Multiple Tissues by Combining Mixed Model and Meta-analytic Approaches},
  author = {Jae Hoon Sul and Buhm Han and Chun Ye and Ted Choi and Eleazar Eskin},
  url = {http://dx.doi.org/10.1371%2Fjournal.pgen.1003491},
  issn = {1553-7404},
  year = {2013},
  date = {2013-01-01},
  journal = {PLoS Genet},
  volume = {9},
  number = {6},
  pages = {e1003491},
  publisher = {Public Library of Science},
  address = {United States},
  abstract = {Author Summary The combination of gene expression and genetic variation data has enabled the identification of genetic variants that affect gene expression levels. It has been shown that some variants influence gene expression in only one tissue while others influence gene expression in multiple tissues. However, an analysis of multiple tissue data using traditional statistical methods typically fails to identify those variants that affect multiple tissues because each tissue is treated independently and due to low statistical power, the effect in a given tissue may be missed. Building on recent advances in statistical methods for meta-analysis and mixed models, we present a novel method that combines information from multiple tissues to identify genetic variation that affects multiple tissues. We show that our method detects more genetic variation that influences multiple tissues than traditional statistical methods both on simulated and real data.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
Sul, Jae Hoon; Han, Buhm; He, Dan; Eskin, Eleazar. An Optimal Weighted Aggregated Association Test for Identification of Rare Variants Involved in Common Diseases. In: Genetics, 188 (1), pp. 181-188, 2011, ISSN: 1943-2631.
@article{Sul:Genetics:2011,
  title = {An Optimal Weighted Aggregated Association Test for Identification of Rare Variants Involved in Common Diseases.},
  author = {Jae Hoon Sul and Buhm Han and Dan He and Eleazar Eskin},
  url = {http://dx.doi.org/10.1534/genetics.110.125070},
  issn = {1943-2631},
  year = {2011},
  date = {2011-01-01},
  journal = {Genetics},
  volume = {188},
  number = {1},
  pages = {181-188},
  organization = {University of California, Los Angeles.},
  abstract = {The advent of next generation sequencing technologies allows one to discover nearly all rare variants in a genomic region of interest. This technological development increases the need for an effective statistical method for testing the aggregated effect of rare variants in a gene on disease susceptibility. The idea behind this approach is that if a certain gene is involved in a disease, many rare variants within the gene will disrupt the function of the gene and are associated with the disease. In this paper, we present Rare variant Weighted Aggregate Statistic (RWAS), a method that groups rare variants and computes a weighted sum of differences between case and control mutation counts. We show that our method outperforms the groupwise association test of Madsen and Browning in the disease-risk model which assumes that each variant makes an equally small contribution to disease-risk. In addition, we can incorporate prior information into our method of which variants are likely causal. By using simulated data and real mutation screening data of the susceptibility gene for ataxia telangiectasia, we demonstrate that prior information has a substantial influence on the statistical power of association studies. Our method is publicly available at http://genetics.cs.ucla.edu/rarevariants.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
Kang, Hyun Min; Sul, Jae Hoon; Service, Susan K; Zaitlen, Noah A; Kong, Sit-Yee Y; Freimer, Nelson B; Sabatti, Chiara; Eskin, Eleazar. Variance component model to account for sample structure in genome-wide association studies. In: Nat Genet, 42 (4), pp. 348-54, 2010, ISSN: 1546-1718.
@article{Kang:NatGenet:2010,
  title = {Variance component model to account for sample structure in genome-wide association studies.},
  author = {Hyun Min Kang and Jae Hoon Sul and Susan K. Service and Noah A. Zaitlen and Sit-Yee Y. Kong and Nelson B. Freimer and Chiara Sabatti and Eleazar Eskin},
  url = {http://dx.doi.org/10.1038/ng.548},
  issn = {1546-1718},
  year = {2010},
  date = {2010-01-01},
  journal = {Nat Genet},
  volume = {42},
  number = {4},
  pages = {348-54},
  address = {United States},
  organization = {Center for Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA.},
  abstract = {Although genome-wide association studies (GWASs) have identified numerous loci associated with complex traits, imprecise modeling of the genetic relatedness within study samples may cause substantial inflation of test statistics and possibly spurious associations. Variance component approaches, such as efficient mixed-model association (EMMA), can correct for a wide range of sample structures by explicitly accounting for pairwise relatedness between individuals, using high-density markers to model the phenotype distribution; but such approaches are computationally impractical. We report here a variance component approach implemented in publicly available software, EMMA eXpedited (EMMAX), that reduces the computational time for analyzing large GWAS data sets from years to hours. We apply this method to two human GWAS data sets, performing association analysis for ten quantitative traits from the Northern Finland Birth Cohort and seven common diseases from the Wellcome Trust Case Control Consortium. We find that EMMAX outperforms both principal component analysis and genomic control in correcting for sample structure.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}