Our group publishes papers presenting new methodologies, describing the results of studies that use our software, and reviewing current topics in the field of bioinformatics. A complete list of papers produced by our lab appears under Publications below. Since 2013, we have written blog posts summarizing new research papers and review articles:
GWAS
- Fine Mapping Causal Variants and Allelic Heterogeneity
- Widespread Allelic Heterogeneity in Complex Traits
- Selection in Europeans on Fatty Acid Desaturases Associated with Dietary Changes
- Incorporating prior information into association studies
- Characterization of Expression Quantitative Trait Loci in Pedigrees from Colombia and Costa Rica Ascertained for Bipolar Disorder
- Simultaneous modeling of disease status and clinical phenotypes to increase power in GWAS
- Efficient and accurate multiple-phenotype regression method for high dimensional data considering population structure
- Review Article: Population Structure in Genetic Studies: Confounding Factors and Mixed Models
- Colocalization of GWAS and eQTL Signals Detects Target Genes
- Chromosome conformation elucidates regulatory relationships in developing human brain
Mouse Genetics
- Review Article: The Hybrid Mouse Diversity Panel
- Genes, Environments and Meta-Analysis
- Review Article: Mixed Models and Population Structure
- Identifying Genes Involved in Blood Cell Traits
- Genes, Diet, and Body Weight (in Mice)
- Review Article: Mouse Genetics
Population Structure
- Efficient and accurate multiple-phenotype regression method for high dimensional data considering population structure
- Review Article: Population Structure in Genetic Studies: Confounding Factors and Mixed Models
- Accounting for Population Structure in Gene-by-Environment Interactions in Genome-Wide Association Studies Using Mixed Models
- Multiple testing correction in linear mixed models
- Identification of causal genes for complex traits (CAVIAR-gene)
- Accurate viral population assembly from ultra-deep sequencing data
- GRAT: Speeding up Expression Quantitative Trait Loci (eQTL) Studies
- Correcting Population Structure using Mixed Models Webcast
- Mixed models can correct for population structure for genomic regions under selection
Review Articles
- Review Article: Population Structure in Genetic Studies: Confounding Factors and Mixed Models
- Review Article: The Hybrid Mouse Diversity Panel
- Review Article: GWAS and Missing Heritability
- Review Article: Mixed Models and Population Structure
- Review Article: Mouse Genetics
Publications
2018
Kang, Eun Yong; Lee, Cue Hyunkyu; Furlotte, Nicholas A; Joo, Jong Wha J; Kostem, Emrah; Zaitlen, Noah; Eskin, Eleazar; Han, Buhm. An Association Mapping Framework To Account for Potential Sex Difference in Genetic Architectures. Journal Article. Genetics, 2018, ISSN: 1943-2631.
Tags: Association Study Methods, Meta-Analysis
@article{Kang:Genetics:2018,
  title = {An Association Mapping Framework To Account for Potential Sex Difference in Genetic Architectures.},
  author = {Eun Yong Kang and Cue Hyunkyu Lee and Nicholas A. Furlotte and Jong Wha J. Joo and Emrah Kostem and Noah Zaitlen and Eleazar Eskin and Buhm Han},
  url = {http://dx.doi.org/10.1534/genetics.117.300501},
  issn = {1943-2631},
  year = {2018},
  date = {2018-01-01},
  journal = {Genetics},
  address = {United States},
  organization = {University of California, Los Angeles.},
  keywords = {Association Study Methods, Meta-Analysis},
  pubstate = {published},
  tppubtype = {article}
}
Over the past few years, genome-wide association studies have identified many trait-associated loci that have different effects on females and males, which has increased attention to differences in genetic architecture between the sexes. The between-sex differences in genetic architectures can cause a variety of phenomena such as differences in the effect sizes at trait-associated loci, differences in the magnitudes of polygenic background effects, and differences in the phenotypic variances. However, current association testing approaches for dealing with sex, such as including sex as a covariate, cannot fully account for these phenomena and can be suboptimal in statistical power. We present a novel association mapping framework, MetaSex, that can comprehensively account for the genetic architecture differences between the sexes. Through simulations and applications to real data, we show that our framework outperforms previous approaches in association mapping.
2017
Duong, Dat; Gai, Lisa; Snir, Sagi; Kang, Eun Yong; Han, Buhm; Sul, Jae Hoon; Eskin, Eleazar. Applying meta-analysis to genotype-tissue expression data from multiple tissues to identify eQTLs and increase the number of eGenes. Journal Article. Bioinformatics, 33 (14), pp. i67-i74, 2017, ISSN: 1367-4811.
Tags: Expression QTLs, Meta-Analysis
@article{Duong:Bioinformatics:2017,
  title = {Applying meta-analysis to genotype-tissue expression data from multiple tissues to identify eQTLs and increase the number of eGenes.},
  author = {Dat Duong and Lisa Gai and Sagi Snir and Eun Yong Kang and Buhm Han and Jae Hoon Sul and Eleazar Eskin},
  url = {http://dx.doi.org/10.1093/bioinformatics/btx227},
  issn = {1367-4811},
  year = {2017},
  date = {2017-01-01},
  journal = {Bioinformatics},
  volume = {33},
  number = {14},
  pages = {i67-i74},
  address = {England},
  organization = {Department of Computer Science, University of California, Los Angeles, CA 90095, USA.},
  keywords = {Expression QTLs, Meta-Analysis},
  pubstate = {published},
  tppubtype = {article}
}
Motivation: There is recent interest in using gene expression data to contextualize findings from traditional genome-wide association studies (GWAS). Conditioned on a tissue, expression quantitative trait loci (eQTLs) are genetic variants associated with gene expression, and eGenes are genes whose expression levels are associated with genetic variants. eQTLs and eGenes provide great supporting evidence for GWAS hits and important insights into the regulatory pathways involved in many diseases. When a significant variant or a candidate gene identified by GWAS is also an eQTL or eGene, there is strong evidence to further study this variant or gene. Multi-tissue gene expression datasets like the Genotype-Tissue Expression (GTEx) data are used to find eQTLs and eGenes. Unfortunately, these datasets often have small sample sizes in some tissues. For this reason, there have been many meta-analysis methods designed to combine gene expression data across many tissues to increase power for finding eQTLs and eGenes. However, these existing techniques are not scalable to datasets containing many tissues, like the GTEx data. Furthermore, these methods ignore a biological insight that the same variant may be associated with the same gene across similar tissues.
Results: We introduce a meta-analysis model that addresses these problems in existing methods. We focus on the problem of finding eGenes in gene expression data from many tissues, and show that our model is better than other types of meta-analyses.
Availability and Implementation: Source code is at https://github.com/datduong/RECOV.
Contact: eeskin@cs.ucla.edu or datdb@cs.ucla.edu.
Supplementary information: Supplementary data are available at Bioinformatics online.
Lee, C H; Eskin, E; Han, B. Increasing the power of meta-analysis of genome-wide association studies to detect heterogeneous effects. Journal Article. Bioinformatics, 33 (14), pp. i379-i388, 2017, ISSN: 1367-4811.
Tags: Meta-Analysis
@article{Lee:Bioinformatics:2017,
  title = {Increasing the power of meta-analysis of genome-wide association studies to detect heterogeneous effects.},
  author = {C. H. Lee and E. Eskin and B. Han},
  url = {http://dx.doi.org/10.1093/bioinformatics/btx242},
  issn = {1367-4811},
  year = {2017},
  date = {2017-01-01},
  journal = {Bioinformatics},
  volume = {33},
  number = {14},
  pages = {i379-i388},
  address = {England},
  organization = {Department of Convergence Medicine, University of Ulsan College of Medicine & Asan Institute for Life Sciences, Asan Medical Center, Songpa-gu, Seoul 138-736, Korea.},
  keywords = {Meta-Analysis},
  pubstate = {published},
  tppubtype = {article}
}
Motivation: Meta-analysis is essential to combine the results of genome-wide association studies (GWASs). Recent large-scale meta-analyses have combined studies of different ethnicities, environments and even studies of different related phenotypes. These differences between studies can manifest as effect size heterogeneity. We previously developed a modified random effects model (RE2) that can achieve higher power to detect heterogeneous effects than the commonly used fixed effects model (FE). However, RE2 cannot perform meta-analysis of correlated statistics, which are found in recent research designs, and the identified variants often overlap with those found by FE.
Results: Here, we propose RE2C, which increases the power of RE2 in two ways. First, we generalized the likelihood model to account for correlations of statistics to achieve optimal power, using an optimization technique based on spectral decomposition for efficient parameter estimation. Second, we designed a novel statistic to focus on the heterogeneous effects that FE cannot detect, thereby increasing the power to identify new associations. We developed an efficient and accurate p-value approximation procedure using analytical decomposition of the statistic. In simulations, RE2C achieved a dramatic increase in power compared with the decoupling approach (71% vs. 21%) when the statistics were correlated. Even when the statistics are uncorrelated, RE2C achieves a modest increase in power. Applications to real genetic data supported the utility of RE2C. RE2C is highly efficient and can meta-analyze one hundred GWASs in one day.
Availability and implementation: The software is freely available at http://software.buhmhan.com/RE2C.
Contact: buhm.han@amc.seoul.kr.
Supplementary information: Supplementary data are available at Bioinformatics online.
2016
Han, Buhm; Duong, Dat; Sul, Jae Hoon; de Bakker, Paul I W; Eskin, Eleazar; Raychaudhuri, Soumya. A general framework for meta-analyzing dependent studies with overlapping subjects in association mapping. Journal Article. Hum Mol Genet, 2016, ISSN: 1460-2083.
Tags: eQTL, genome-wide association studies, Meta-Analysis
@article{Han:HumMolGenet:2016,
  title = {A general framework for meta-analyzing dependent studies with overlapping subjects in association mapping.},
  author = {Buhm Han and Dat Duong and Jae Hoon Sul and Paul I. W. de Bakker and Eleazar Eskin and Soumya Raychaudhuri},
  url = {http://dx.doi.org/10.1093/hmg/ddw049},
  issn = {1460-2083},
  year = {2016},
  date = {2016-01-01},
  journal = {Hum Mol Genet},
  keywords = {eQTL, genome-wide association studies, Meta-Analysis},
  pubstate = {published},
  tppubtype = {article}
}
Meta-analysis strategies have become critical to augment power of genome-wide association studies (GWAS). To reduce genotyping or sequencing cost, many studies today utilize shared controls, and these individuals can inadvertently overlap among multiple studies. If these overlapping individuals are not taken into account in meta-analysis, they can induce spurious associations. In this paper, we propose a general framework for adjusting association statistics to account for overlapping subjects within a meta-analysis. The key idea of our method is to transform the covariance structure of the data so it can be used in downstream analyses. As a result, the strategy is very flexible, and allows a wide range of meta-analysis methods, such as the random effects model, to account for overlapping subjects. Using simulations and real datasets, we demonstrate that our method has utility in meta-analyses of GWAS, as well as in a multi-tissue mouse eQTL study where our method increases the number of discovered eQTLs by up to 19% compared to existing methods.
2014
Kang, Eun Yong; Han, Buhm; Furlotte, Nicholas; Joo, Jong Wha J; Shih, Diana; Davis, Richard C; Lusis, Aldons J; Eskin, Eleazar. Meta-Analysis Identifies Gene-by-Environment Interactions as Demonstrated in a Study of 4,965 Mice. Journal Article. PLoS Genet, 10 (1), pp. e1004022, 2014, ISSN: 1553-7404.
Tags: Genes By Environment, Meta-Analysis, Mouse Genetics
@article{10.1371/journal.pgen.1004022,
  title = {Meta-Analysis Identifies Gene-by-Environment Interactions as Demonstrated in a Study of 4,965 Mice},
  author = {Eun Yong Kang and Buhm Han and Nicholas Furlotte and Jong Wha J. Joo and Diana Shih and Richard C. Davis and Aldons J. Lusis and Eleazar Eskin},
  url = {http://dx.doi.org/10.1371%2Fjournal.pgen.1004022},
  issn = {1553-7404},
  year = {2014},
  date = {2014-01-01},
  journal = {PLoS Genet},
  volume = {10},
  number = {1},
  pages = {e1004022},
  publisher = {Public Library of Science},
  keywords = {Genes By Environment, Meta-Analysis, Mouse Genetics},
  pubstate = {published},
  tppubtype = {article}
}
Author Summary: Identifying gene-by-environment interactions is important for understanding the architecture of a complex trait. Discovering gene-by-environment interaction requires the observation of the same phenotype in individuals under different environments. Model organism studies are often conducted under different environments. These studies provide an unprecedented opportunity for researchers to identify the gene-by-environment interactions. A difference in the effect size of a genetic variant between two studies conducted in different environments may suggest the presence of a gene-by-environment interaction. In this paper, we propose to employ a random-effect-based meta-analysis approach to identify gene-by-environment interaction, which assumes different or heterogeneous effect sizes between studies. Our approach is motivated by the observation that methods for discovering gene-by-environment interactions are closely related to random effects models for meta-analysis. We show that interactions can be interpreted as heterogeneity and can be detected without utilizing the traditional approaches for discovery of gene-by-environment interactions, which treat the gene-by-environment interactions as covariates in the analysis. We provide an intuitive way to visualize the results of the meta-analysis at a locus, which allows us to obtain biological insights into gene-by-environment interactions. We demonstrate our method by searching for gene-by-environment interactions by combining 17 mouse genetic studies totaling 4,965 distinct animals.
Ohmen, Jeffrey; Kang, Eun Yong; Li, Xin; Joo, Jong Wha; Hormozdiari, Farhad; Zheng, Qing Yin; Davis, Richard C; Lusis, Aldons J; Eskin, Eleazar; Friedman, Rick A. Genome-Wide Association Study for Age-Related Hearing Loss (AHL) in the Mouse: A Meta-Analysis. Journal Article. J Assoc Res Otolaryngol, 15 (3), pp. 335-52, 2014, ISSN: 1438-7573.
Tags: HMDP, Meta-Analysis, Mouse Genetics
@article{Ohmen:JAssocResOtolaryngol:2014,
  title = {Genome-Wide Association Study for Age-Related Hearing Loss (AHL) in the Mouse: A Meta-Analysis.},
  author = {Jeffrey Ohmen and Eun Yong Kang and Xin Li and Jong Wha Joo and Farhad Hormozdiari and Qing Yin Zheng and Richard C. Davis and Aldons J. Lusis and Eleazar Eskin and Rick A. Friedman},
  url = {http://dx.doi.org/10.1007/s10162-014-0443-2},
  issn = {1438-7573},
  year = {2014},
  date = {2014-01-01},
  journal = {J Assoc Res Otolaryngol},
  volume = {15},
  number = {3},
  pages = {335-52},
  address = {United States},
  keywords = {HMDP, Meta-Analysis, Mouse Genetics},
  pubstate = {published},
  tppubtype = {article}
}
Age-related hearing loss (AHL) is characterized by a symmetric sensorineural hearing loss primarily in high frequencies, and individuals have different levels of susceptibility to AHL. Heritability studies have shown that the sources of this variance are both genetic and environmental, with approximately half of the variance attributable to hereditary factors as reported by Huang and Tang (Eur Arch Otorhinolaryngol 267(8):1179-1191, 2010). Only a limited number of large-scale association studies for AHL have been undertaken in humans to date. An alternate and complementary approach to these human studies is through the use of mouse models. Advantages of mouse models include that the environment can be more carefully controlled, measurements can be replicated in genetically identical animals, and the proportion of the variability explained by genetic variation is increased. Complex traits in mouse strains have been shown to have higher heritability, and genetic loci often have stronger effects on the trait compared to humans. Motivated by these advantages, we have performed the first genome-wide association study of its kind in the mouse by combining several data sets in a meta-analysis to identify loci associated with age-related hearing loss. We identified five genome-wide significant loci (<10^-6). One of these loci confirmed a previously identified locus (ahl8) on distal chromosome 11 and greatly narrowed the candidate region. Specifically, the most significant associated SNP is located 450 kb upstream of Fscn2. These data confirm the utility of this approach and provide new high-resolution mapping information about variation within the mouse genome associated with hearing loss.
2013
Sul, Jae Hoon; Han, Buhm; Ye, Chun; Choi, Ted; Eskin, Eleazar. Effectively Identifying eQTLs from Multiple Tissues by Combining Mixed Model and Meta-analytic Approaches. Journal Article. PLoS Genet, 9 (6), pp. e1003491, 2013, ISSN: 1553-7404.
Tags: Expression QTLs, Meta-Analysis, Mixed Models, Multiple Phenotypes
@article{10.1371/journal.pgen.1003491,
  title = {Effectively Identifying eQTLs from Multiple Tissues by Combining Mixed Model and Meta-analytic Approaches},
  author = {Jae Hoon Sul and Buhm Han and Chun Ye and Ted Choi and Eleazar Eskin},
  url = {http://dx.doi.org/10.1371%2Fjournal.pgen.1003491},
  issn = {1553-7404},
  year = {2013},
  date = {2013-01-01},
  journal = {PLoS Genet},
  volume = {9},
  number = {6},
  pages = {e1003491},
  publisher = {Public Library of Science},
  address = {United States},
  keywords = {Expression QTLs, Meta-Analysis, Mixed Models, Multiple Phenotypes},
  pubstate = {published},
  tppubtype = {article}
}
Author Summary: The combination of gene expression and genetic variation data has enabled the identification of genetic variants that affect gene expression levels. It has been shown that some variants influence gene expression in only one tissue while others influence gene expression in multiple tissues. However, an analysis of multiple tissue data using traditional statistical methods typically fails to identify those variants that affect multiple tissues: each tissue is treated independently and, due to low statistical power, the effect in a given tissue may be missed. Building on recent advances in statistical methods for meta-analysis and mixed models, we present a novel method that combines information from multiple tissues to identify genetic variation that affects multiple tissues. We show that our method detects more genetic variation that influences multiple tissues than traditional statistical methods on both simulated and real data.
2012
Han, Buhm; Eskin, Eleazar. Interpreting meta-analyses of genome-wide association studies. Journal Article. PLoS Genet, 8 (3), pp. e1002555, 2012, ISSN: 1553-7404.
Tags: Meta-Analysis
@article{Han:PlosGenet:2012,
  title = {Interpreting meta-analyses of genome-wide association studies.},
  author = {Buhm Han and Eleazar Eskin},
  url = {http://dx.doi.org/10.1371/journal.pgen.1002555},
  issn = {1553-7404},
  year = {2012},
  date = {2012-01-01},
  journal = {PLoS Genet},
  volume = {8},
  number = {3},
  pages = {e1002555},
  address = {United States},
  organization = {Department of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America.},
  keywords = {Meta-Analysis},
  pubstate = {published},
  tppubtype = {article}
}
Meta-analysis is an increasingly popular tool for combining multiple genome-wide association studies in a single analysis to identify associations with small effect sizes. The effect sizes between studies in a meta-analysis may differ and these differences, or heterogeneity, can be caused by many factors. If heterogeneity is observed in the results of a meta-analysis, interpreting the cause of heterogeneity is important because the correct interpretation can lead to a better understanding of the disease and a more effective design of a replication study. However, interpreting heterogeneous results is difficult. The standard approach of examining the association p-values of the studies does not effectively predict if the effect exists in each study. In this paper, we propose a framework facilitating the interpretation of the results of a meta-analysis. Our framework is based on a new statistic representing the posterior probability that the effect exists in each study, which is estimated utilizing cross-study information. Simulations and application to the real data show that our framework can effectively segregate the studies predicted to have an effect, the studies predicted to not have an effect, and the ambiguous studies that are underpowered. In addition to helping interpretation, the new framework also allows us to develop a new association testing procedure taking into account the existence of effect.
Furlotte, Nicholas A; Kang, Eun Yong; Nas, Atila Van; Farber, Charles R; Lusis, Aldons J; Eskin, Eleazar. Increasing Association Mapping Power and Resolution in Mouse Genetic Studies Through the Use of Meta-analysis for Structured Populations. Journal Article. Genetics, 191 (3), pp. 959-67, 2012, ISSN: 1943-2631.
Link: http://dx.doi.org/10.1534/genetics.112.140277
Tags: Meta-Analysis, Meta-Analysis Grant, Mouse Genetics
Abstract: Genetic studies in mouse models have played an integral role in the discovery of the mechanisms underlying many human diseases. The primary mode of discovery has been the application of linkage analysis to mouse crosses. This approach results in high power to identify regions that affect traits, but in low resolution, making it difficult to identify the precise genomic location harboring the causal variant. Recently, a panel of mice referred to as the hybrid mouse diversity panel (HMDP) has been developed to overcome this problem. However, power in this panel is limited by the availability of inbred strains. Previous studies have suggested combining results across multiple panels as a means to increase power, but the methods employed may not be well suited for structured populations, such as the HMDP. In this paper, we introduce a meta-analysis based method that may be used to combine HMDP studies with F2 cross studies to gain power, while increasing resolution. Due to the drastically different genetic structure of F2s and the HMDP, the best way to combine two studies for a given SNP depends on the strain distribution pattern in each study. We show that combining results, while accounting for these patterns, leads to increased power and resolution. Using our method to map bone mineral density, we find that two previously implicated loci are replicated with increased significance and that the size of the associated region is decreased. We also map HDL cholesterol and show a dramatic increase in the significance of a previously identified result.
2011
Han, Buhm; Eskin, Eleazar. Random-Effects Model Aimed at Discovering Associations in Meta-Analysis of Genome-wide Association Studies. Journal Article. Am J Hum Genet, 88 (5), pp. 586-98, 2011, ISSN: 1537-6605.
Link: http://dx.doi.org/10.1016/j.ajhg.2011.04.014
Tags: Meta-Analysis
Abstract: Meta-analysis is an increasingly popular tool for combining multiple different genome-wide association studies (GWASs) in a single aggregate analysis in order to identify associations with very small effect sizes. Because the data of a meta-analysis can be heterogeneous, referring to the differences in effect sizes between the collected studies, what is often done in the literature is to apply both the fixed-effects model (FE) under an assumption of the same effect size between studies and the random-effects model (RE) under an assumption of varying effect size between studies. However, surprisingly, RE gives less significant p values than FE at variants that actually show varying effect sizes between studies. This is ironic because RE is designed specifically for the case in which there is heterogeneity. As a result, usually, RE does not discover any associations that FE did not discover. In this paper, we show that the underlying reason for this phenomenon is that RE implicitly assumes a markedly conservative null-hypothesis model, and we present a new random-effects model that relaxes the conservative assumption. Unlike the traditional RE, the new method is shown to achieve higher statistical power than FE when there is heterogeneity, indicating that the new method has practical utility for discovering associations in the meta-analysis of GWASs.
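The contrast between FE and the traditional RE described in the abstract above can be illustrated with a small sketch. It assumes the textbook inverse-variance FE estimator and, for RE, the DerSimonian-Laird estimate of between-study variance, which is one common form of the traditional model. It does not implement the new random-effects model proposed in the paper. The function names and the effect sizes below are made up for illustration.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def fixed_effects(betas, ses):
    """Inverse-variance weighted fixed-effects (FE) meta-analysis."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se, two_sided_p(beta / se)


def random_effects_dl(betas, ses):
    """Traditional random-effects (RE) meta-analysis, assuming the
    DerSimonian-Laird moment estimate of between-study variance tau^2."""
    weights = [1.0 / se ** 2 for se in ses]
    beta_fe, _, _ = fixed_effects(betas, ses)
    # Cochran's Q measures the observed heterogeneity around the FE estimate.
    q = sum(w * (b - beta_fe) ** 2 for w, b in zip(weights, betas))
    k = len(betas)
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # tau^2 is added to every study's variance, which widens the standard error.
    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    beta = sum(w * b for w, b in zip(re_weights, betas)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return beta, se, two_sided_p(beta / se), tau2


# Hypothetical per-study effect sizes with visible heterogeneity.
example_betas = [0.30, 0.05, 0.25, 0.02, 0.35]
example_ses = [0.05, 0.05, 0.05, 0.05, 0.05]

beta_fe, se_fe, p_fe = fixed_effects(example_betas, example_ses)
beta_re, se_re, p_re, tau2 = random_effects_dl(example_betas, example_ses)
print(f"FE: beta={beta_fe:.3f} se={se_fe:.4f} p={p_fe:.2e}")
print(f"RE: beta={beta_re:.3f} se={se_re:.4f} p={p_re:.2e} tau2={tau2:.4f}")
```

On these made-up inputs the two models give the same pooled effect, but RE reports a larger standard error and therefore a less significant p value than FE, which is the behavior the abstract describes.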
2010
Zaitlen, Noah; Eskin, Eleazar. Imputation aware meta-analysis of genome-wide association studies. Journal Article. Genet Epidemiol, 34 (6), pp. 537-42, 2010, ISSN: 1098-2272.
Link: http://dx.doi.org/10.1002/gepi.20507
Tags: Meta-Analysis
Abstract: Genome-wide association studies have recently identified many new loci associated with human complex diseases. These newly discovered variants typically have weak effects requiring studies with large numbers of individuals to achieve the statistical power necessary to identify them. Likely, there exist even more associated variants, which remain to be found if even larger association studies can be assembled. Meta-analysis provides a straightforward means of increasing study sample sizes without collecting new samples by combining existing data sets. One obstacle to combining studies is that they are often performed on platforms with different marker sets. Current studies overcome this issue by imputing genotypes missing from each of the studies and then performing standard meta-analysis techniques. We show that this approach may result in a loss of power since errors in imputation are not accounted for. We present a new method for performing meta-analysis over imputed single nucleotide polymorphisms, show that it is optimal with respect to power, and discuss practical implementation issues. Through simulation experiments, we show that our imputation aware meta-analysis approach outperforms or matches standard meta-analysis approaches. Genet. Epidemiol. 34: 537-542, 2010. (c) 2010 Wiley-Liss, Inc.