Our group publishes papers presenting new methodologies, describing the results of studies that use our software, and reviewing current topics in the field of bioinformatics. Scroll down for a complete list of papers produced by our lab. Since 2013, we have written blog posts summarizing new research papers and review articles:
GWAS
- Fine Mapping Causal Variants and Allelic Heterogeneity
- Widespread Allelic Heterogeneity in Complex Traits
- Selection in Europeans on Fatty Acid Desaturases Associated with Dietary Changes
- Incorporating prior information into association studies
- Characterization of Expression Quantitative Trait Loci in Pedigrees from Colombia and Costa Rica Ascertained for Bipolar Disorder
- Simultaneous modeling of disease status and clinical phenotypes to increase power in GWAS
- Efficient and accurate multiple-phenotype regression method for high dimensional data considering population structure
- Review Article: Population Structure in Genetic Studies: Confounding Factors and Mixed Models
- Colocalization of GWAS and eQTL Signals Detects Target Genes
- Chromosome conformation elucidates regulatory relationships in developing human brain
Mouse Genetics
- Review Article: The Hybrid Mouse Diversity Panel
- Genes, Environments and Meta-Analysis
- Review Article: Mixed Models and Population Structure
- Identifying Genes Involved in Blood Cell Traits
- Genes, Diet, and Body Weight (in Mice)
- Review Article: Mouse Genetics
Population Structure
- Efficient and accurate multiple-phenotype regression method for high dimensional data considering population structure
- Review Article: Population Structure in Genetic Studies: Confounding Factors and Mixed Models
- Accounting for Population Structure in Gene-by-Environment Interactions in Genome-Wide Association Studies Using Mixed Models
- Multiple testing correction in linear mixed models
- Identification of causal genes for complex traits (CAVIAR-gene)
- Accurate viral population assembly from ultra-deep sequencing data
- GRAT: Speeding up Expression Quantitative Trait Loci (eQTL) Studies
- Correcting Population Structure using Mixed Models Webcast
- Mixed models can correct for population structure for genomic regions under selection
Review Articles
- Review Article: Population Structure in Genetic Studies: Confounding Factors and Mixed Models
- Review Article: The Hybrid Mouse Diversity Panel
- Review Article: GWAS and Missing Heritability
- Review Article: Mixed Models and Population Structure
- Review Article: Mouse Genetics
Publications
2018
Gamazon, Eric R; Segrè, Ayellet V; van de Bunt, Martijn; Wen, Xiaoquan; Xi, Hualin S; Hormozdiari, Farhad; Ongen, Halit; Konkashbaev, Anuar; Derks, Eske M; Aguet, François; Quan, Jie; Nicolae, Dan L; Eskin, Eleazar; Kellis, Manolis; Getz, Gad; McCarthy, Mark I; Dermitzakis, Emmanouil T; Cox, Nancy J; Ardlie, Kristin G. Using an atlas of gene regulation across 44 human tissues to inform complex disease- and trait-associated variation. Journal Article. Nat Genet, 50 (7), pp. 956-967, 2018, ISSN: 1546-1718.
Tags: Co-Localization, Expression QTLs, Fine Mapping, GWAS+eQTL
@article{Gamazon:NatGenet:2018,
  title = {Using an atlas of gene regulation across 44 human tissues to inform complex disease- and trait-associated variation.},
  author = {Eric R. Gamazon and Ayellet V. Segrè and Martijn van de Bunt and Xiaoquan Wen and Hualin S. Xi and Farhad Hormozdiari and Halit Ongen and Anuar Konkashbaev and Eske M. Derks and François Aguet and Jie Quan and Dan L. Nicolae and Eleazar Eskin and Manolis Kellis and Gad Getz and Mark I. McCarthy and Emmanouil T. Dermitzakis and Nancy J. Cox and Kristin G. Ardlie},
  url = {http://dx.doi.org/10.1038/s41588-018-0154-4},
  issn = {1546-1718},
  year = {2018},
  date = {2018-01-01},
  journal = {Nat Genet},
  volume = {50},
  number = {7},
  pages = {956-967},
  address = {United States},
  organization = {Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA. egamazon@uchicago.edu.},
  abstract = {We apply integrative approaches to expression quantitative trait loci (eQTLs) from 44 tissues from the Genotype-Tissue Expression project and genome-wide association study data. About 60% of known trait-associated loci are in linkage disequilibrium with a cis-eQTL, over half of which were not found in previous large-scale whole blood studies. Applying polygenic analyses to metabolic, cardiovascular, anthropometric, autoimmune, and neurodegenerative traits, we find that eQTLs are significantly enriched for trait associations in relevant pathogenic tissues and explain a substantial proportion of the heritability (40-80%). For most traits, tissue-shared eQTLs underlie a greater proportion of trait associations, although tissue-specific eQTLs have a greater contribution to some traits, such as blood pressure. By integrating information from biological pathways with eQTL target genes and applying a gene-based approach, we validate previously implicated causal genes and pathways, and propose new variant and gene associations for several complex traits, which we replicate in the UK BioBank and BioVU.},
  keywords = {Co-Localization, Expression QTLs, Fine Mapping, GWAS+eQTL},
  pubstate = {published},
  tppubtype = {article}
}
Hormozdiari, Farhad; Gazal, Steven; van de Geijn, Bryce; Finucane, Hilary K; Ju, Chelsea J-T; Loh, Po-Ru R; Schoech, Armin; Reshef, Yakir; Liu, Xuanyao; O'Connor, Luke; Gusev, Alexander; Eskin, Eleazar; Price, Alkes L. Leveraging molecular quantitative trait loci to understand the genetic architecture of diseases and complex traits. Journal Article. Nat Genet, 50 (7), pp. 1041-1047, 2018, ISSN: 1546-1718.
Tags: Fine Mapping, Functional Genomics
@article{Hormozdiari:NatGenet:2018,
  title = {Leveraging molecular quantitative trait loci to understand the genetic architecture of diseases and complex traits.},
  author = {Farhad Hormozdiari and Steven Gazal and Bryce van de Geijn and Hilary K. Finucane and Chelsea J-T Ju and Po-Ru R. Loh and Armin Schoech and Yakir Reshef and Xuanyao Liu and Luke O'Connor and Alexander Gusev and Eleazar Eskin and Alkes L. Price},
  url = {http://dx.doi.org/10.1038/s41588-018-0148-2},
  issn = {1546-1718},
  year = {2018},
  date = {2018-01-01},
  journal = {Nat Genet},
  volume = {50},
  number = {7},
  pages = {1041-1047},
  address = {United States},
  organization = {Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA. Hormozdiari@hsph.harvard.edu.},
  abstract = {There is increasing evidence that many risk loci found using genome-wide association studies are molecular quantitative trait loci (QTLs). Here we introduce a new set of functional annotations based on causal posterior probabilities of fine-mapped molecular cis-QTLs, using data from the Genotype-Tissue Expression (GTEx) and BLUEPRINT consortia. We show that these annotations are more strongly enriched for heritability (5.84× for eQTLs; P = 1.19 × 10^-31) across 41 diseases and complex traits than annotations containing all significant molecular QTLs (1.80× for expression (e)QTLs). eQTL annotations obtained by meta-analyzing all GTEx tissues generally performed best, whereas tissue-specific eQTL annotations produced stronger enrichments for blood- and brain-related diseases and traits. eQTL annotations restricted to loss-of-function intolerant genes were even more enriched for heritability (17.06×; P = 1.20 × 10^-35). All molecular QTLs except splicing QTLs remained significantly enriched in joint analysis, indicating that each of these annotations is uniquely informative for disease and complex trait architectures.},
  keywords = {Fine Mapping, Functional Genomics},
  pubstate = {published},
  tppubtype = {article}
}
2017
Hormozdiari, Farhad; Zhu, Anthony; Kichaev, Gleb; Ju, Chelsea J-T; Segrè, Ayellet V; Joo, Jong Wha J; Won, Hyejung; Sankararaman, Sriram; Pasaniuc, Bogdan; Shifman, Sagiv; Eskin, Eleazar. Widespread Allelic Heterogeneity in Complex Traits. Journal Article. Am J Hum Genet, 100 (5), pp. 789-802, 2017, ISSN: 1537-6605.
Tags: Fine Mapping
@article{Hormozdiari:AmJHumGenet:2017,
  title = {Widespread Allelic Heterogeneity in Complex Traits.},
  author = {Farhad Hormozdiari and Anthony Zhu and Gleb Kichaev and Chelsea J-T Ju and Ayellet V. Segrè and Jong Wha J. Joo and Hyejung Won and Sriram Sankararaman and Bogdan Pasaniuc and Sagiv Shifman and Eleazar Eskin},
  url = {http://dx.doi.org/10.1016/j.ajhg.2017.04.005},
  issn = {1537-6605},
  year = {2017},
  date = {2017-01-01},
  journal = {Am J Hum Genet},
  volume = {100},
  number = {5},
  pages = {789-802},
  address = {United States},
  organization = {Department of Computer Science, University of California, Los Angeles, CA 90095, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.},
  abstract = {Recent successes in genome-wide association studies (GWASs) make it possible to address important questions about the genetic architecture of complex traits, such as allele frequency and effect size. One lesser-known aspect of complex traits is the extent of allelic heterogeneity (AH) arising from multiple causal variants at a locus. We developed a computational method to infer the probability of AH and applied it to three GWASs and four expression quantitative trait loci (eQTL) datasets. We identified a total of 4,152 loci with strong evidence of AH. The proportion of all loci with identified AH is 4%-23% in eQTLs, 35% in GWASs of high-density lipoprotein (HDL), and 23% in GWASs of schizophrenia. For eQTLs, we observed a strong correlation between sample size and the proportion of loci with AH (R^2 = 0.85},
  keywords = {Fine Mapping},
  pubstate = {published},
  tppubtype = {article}
}
Lozano, Jose A; Hormozdiari, Farhad; Joo, Jong Wha; Han, Buhm; Eskin, Eleazar. The Multivariate Normal Distribution Framework for Analyzing Association Studies. Journal Article. bioRxiv, pp. 208199, 2017.
Tags: Fine Mapping, Multi-SNP Association, Multiple Testing
@article{Lozano:Biorxiv:2017,
  title = {The Multivariate Normal Distribution Framework for Analyzing Association Studies},
  author = {Jose A. Lozano and Farhad Hormozdiari and Jong Wha Joo and Buhm Han and Eleazar Eskin},
  url = {http://dx.doi.org/10.1101/208199},
  year = {2017},
  date = {2017-01-01},
  journal = {bioRxiv},
  pages = {208199},
  publisher = {Cold Spring Harbor Laboratory},
  organization = {UCLA},
  abstract = {Genome-wide association studies (GWAS) have discovered thousands of variants involved in common human diseases. In these studies, frequencies of genetic variants are compared between a cohort of individuals with a disease (cases) and a cohort of healthy individuals (controls). Any variant that has a significantly different frequency between the two cohorts is considered an associated variant. A challenge in the analysis of GWAS studies is the fact that human population history causes nearby genetic variants in the genome to be correlated with each other. In this review, we demonstrate how to utilize the multivariate normal (MVN) distribution to explicitly take into account the correlation between genetic variants in a comprehensive framework for analysis of GWAS. We show how the MVN framework can be applied to perform association testing, correct for multiple hypothesis testing, estimate statistical power, and perform fine mapping and imputation.},
  keywords = {Fine Mapping, Multi-SNP Association, Multiple Testing},
  pubstate = {published},
  tppubtype = {article}
}
2016
Won, Hyejung; de la Torre-Ubieta, Luis; Stein, Jason L; Parikshak, Neelroop N; Huang, Jerry; Opland, Carli K; Gandal, Michael J; Sutton, Gavin J; Hormozdiari, Farhad; Lu, Daning; Lee, Changhoon; Eskin, Eleazar; Voineagu, Irina; Ernst, Jason; Geschwind, Daniel H. Chromosome conformation elucidates regulatory relationships in developing human brain. Journal Article. Nature, 538 (7626), pp. 523-527, 2016, ISSN: 1476-4687.
Tags: Fine Mapping
@article{Won:Nature:2016b,
  title = {Chromosome conformation elucidates regulatory relationships in developing human brain.},
  author = {Hyejung Won and Luis de la Torre-Ubieta and Jason L. Stein and Neelroop N. Parikshak and Jerry Huang and Carli K. Opland and Michael J. Gandal and Gavin J. Sutton and Farhad Hormozdiari and Daning Lu and Changhoon Lee and Eleazar Eskin and Irina Voineagu and Jason Ernst and Daniel H. Geschwind},
  url = {http://dx.doi.org/10.1038/nature19847},
  issn = {1476-4687},
  year = {2016},
  date = {2016-01-01},
  journal = {Nature},
  volume = {538},
  number = {7626},
  pages = {523-527},
  address = {England},
  abstract = {Three-dimensional physical interactions within chromosomes dynamically regulate gene expression in a tissue-specific manner. However, the 3D organization of chromosomes during human brain development and its role in regulating gene networks dysregulated in neurodevelopmental disorders, such as autism or schizophrenia, are unknown. Here we generate high-resolution 3D maps of chromatin contacts during human corticogenesis, permitting large-scale annotation of previously uncharacterized regulatory relationships relevant to the evolution of human cognition and disease. Our analyses identify hundreds of genes that physically interact with enhancers gained on the human lineage, many of which are under purifying selection and associated with human cognitive function. We integrate chromatin contacts with non-coding variants identified in schizophrenia genome-wide association studies (GWAS), highlighting multiple candidate schizophrenia risk genes and pathways, including transcription factors involved in neurogenesis, and cholinergic signalling molecules, several of which are supported by independent expression quantitative trait loci and gene expression analyses. Genome editing in human neural progenitors suggests that one of these distal schizophrenia GWAS loci regulates FOXG1 expression, supporting its potential role as a schizophrenia risk gene. This work provides a framework for understanding the effect of non-coding regulatory elements on human brain development and the evolution of cognition, and highlights novel mechanisms underlying neuropsychiatric disorders.},
  keywords = {Fine Mapping},
  pubstate = {published},
  tppubtype = {article}
}
Hormozdiari, Farhad; van de Bunt, Martijn; Segrè, Ayellet V; Li, Xiao; Joo, Jong Wha J; Bilow, Michael; Sul, Jae Hoon; Sankararaman, Sriram; Pasaniuc, Bogdan; Eskin, Eleazar. Colocalization of GWAS and eQTL Signals Detects Target Genes. Journal Article. Am J Hum Genet, 2016, ISSN: 1537-6605.
Tags: Expression QTLs, Fine Mapping
@article{Hormozdiari:AmJHumGenet:2016b,
  title = {Colocalization of GWAS and eQTL Signals Detects Target Genes.},
  author = {Farhad Hormozdiari and Martijn van de Bunt and Ayellet V. Segrè and Xiao Li and Jong Wha J. Joo and Michael Bilow and Jae Hoon Sul and Sriram Sankararaman and Bogdan Pasaniuc and Eleazar Eskin},
  url = {http://dx.doi.org/10.1016/j.ajhg.2016.10.003},
  issn = {1537-6605},
  year = {2016},
  date = {2016-01-01},
  journal = {Am J Hum Genet},
  address = {United States},
  organization = {Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, USA.},
  abstract = {The vast majority of genome-wide association study (GWAS) risk loci fall in non-coding regions of the genome. One possible hypothesis is that these GWAS risk loci alter the individual's disease risk through their effect on gene expression in different tissues. In order to understand the mechanisms driving a GWAS risk locus, it is helpful to determine which gene is affected in specific tissue types. For example, the relevant gene and tissue could play a role in the disease mechanism if the same variant responsible for a GWAS locus also affects gene expression. Identifying whether or not the same variant is causal in both GWASs and expression quantitative trait locus (eQTL) studies is challenging because of the uncertainty induced by linkage disequilibrium and the fact that some loci harbor multiple causal variants. However, current methods that address this problem assume that each locus contains a single causal variant. In this paper, we present eCAVIAR, a probabilistic method that has several key advantages over existing methods. First, our method can account for more than one causal variant in any given locus. Second, it can leverage summary statistics without accessing the individual genotype data. We use both simulated and real datasets to demonstrate the utility of our method. Using publicly available eQTL data on 45 different tissues, we demonstrate that eCAVIAR can prioritize likely relevant tissues and target genes for a set of glucose- and insulin-related trait loci.},
  keywords = {Expression QTLs, Fine Mapping},
  pubstate = {published},
  tppubtype = {article}
}
Kichaev, Gleb; Roytman, Megan; Johnson, Ruth; Eskin, Eleazar; Lindström, Sara; Kraft, Peter; Pasaniuc, Bogdan
Improved methods for multi-trait fine mapping of pleiotropic risk loci. Journal Article. Bioinformatics, 2016, ISSN: 1367-4811.
Link: http://dx.doi.org/10.1093/bioinformatics/btw615
Tags: Fine Mapping
Abstract: MOTIVATION: Genome-wide association studies (GWAS) have identified thousands of regions in the genome that contain genetic variants that increase risk for complex traits and diseases. However, the variants uncovered in GWAS are typically not biologically causal, but rather, correlated to the true causal variant through linkage disequilibrium (LD). To discern the true causal variant(s), a variety of statistical fine-mapping methods have been proposed to prioritize variants for functional validation. RESULTS: In this work we introduce a new approach, fastPAINTOR, that leverages evidence across correlated traits, as well as functional annotation data, to improve fine-mapping accuracy at pleiotropic risk loci. To improve computational efficiency, we describe a new importance sampling scheme to perform model inference. First, we demonstrate in simulations that by leveraging functional annotation data, fastPAINTOR increases fine-mapping resolution relative to existing methods. Next, we show that jointly modeling pleiotropic risk regions improves fine-mapping resolution compared to standard single trait and pleiotropic fine mapping strategies. We report a reduction in the number of SNPs required for follow-up in order to capture 90% of the causal variants from 23 SNPs per locus using a single trait to 12 SNPs when fine-mapping two traits simultaneously. Finally, we analyze summary association data from a large-scale GWAS of lipids and show that these improvements are largely sustained in real data. AVAILABILITY AND IMPLEMENTATION: The fastPAINTOR framework is implemented in the PAINTOR v3.0 package, which is publicly available to the research community at http://bogdan.bioinformatics.ucla.edu/software/paintor. CONTACT: gkichaev@ucla.edu |
2015 |
Hormozdiari, Farhad; Kichaev, Gleb; Yang, Wen-Yun Y; Pasaniuc, Bogdan; Eskin, Eleazar
Identification of causal genes for complex traits. Journal Article. Bioinformatics, 31 (12), pp. i206-i213, 2015, ISSN: 1367-4811.
Link: http://dx.doi.org/10.1093/bioinformatics/btv240
Tags: Fine Mapping
Abstract: MOTIVATION: Although genome-wide association studies (GWAS) have identified thousands of variants associated with common diseases and complex traits, only a handful of these variants are validated to be causal. We consider 'causal variants' as variants which are responsible for the association signal at a locus. As opposed to association studies that benefit from linkage disequilibrium (LD), the main challenge in identifying causal variants at associated loci lies in distinguishing among the many closely correlated variants due to LD. This is particularly important for model organisms such as inbred mice, where LD extends much further than in human populations, resulting in large stretches of the genome with significantly associated variants. Furthermore, these model organisms are highly structured and require correction for population structure to remove potential spurious associations. RESULTS: In this work, we propose CAVIAR-Gene (CAusal Variants Identification in Associated Regions), a novel method that is able to operate across large LD regions of the genome while also correcting for population structure. A key feature of our approach is that it provides as output a minimally sized set of genes that captures the genes which harbor causal variants with probability $\rho$. Through extensive simulations, we demonstrate that our method not only speeds up computation, but also has an average of 10% higher recall rate compared with the existing approaches. We validate our method using real mouse high-density lipoprotein (HDL) data and show that CAVIAR-Gene is able to identify Apoa2 (a gene known to harbor causal variants for HDL), while reducing the number of genes that need to be tested for functionality by a factor of 2. AVAILABILITY AND IMPLEMENTATION: Software is freely available for download at genetics.cs.ucla.edu/caviar. CONTACT: eeskin@cs.ucla.edu |
2014 |
Hormozdiari, Farhad; Kostem, Emrah; Kang, Eun Yong; Pasaniuc, Bogdan; Eskin, Eleazar
Identifying causal variants at Loci with multiple signals of association. Journal Article. Genetics, 198 (2), pp. 497-508, 2014, ISSN: 1943-2631.
Link: http://dx.doi.org/10.1534/genetics.114.167908
Tags: Fine Mapping
Abstract: Although genome-wide association studies have successfully identified thousands of risk loci for complex traits, only a handful of the biologically causal variants, responsible for association at these loci, have been successfully identified. Current statistical methods for identifying causal variants at risk loci either use the strength of the association signal in an iterative conditioning framework or estimate probabilities for variants to be causal. A main drawback of existing methods is that they rely on the simplifying assumption of a single causal variant at each risk locus, which is typically invalid at many risk loci. In this work, we propose a new statistical framework that allows for the possibility of an arbitrary number of causal variants when estimating the posterior probability of a variant being causal. A direct benefit of our approach is that we predict a set of variants for each locus that under reasonable assumptions will contain all of the true causal variants with a high confidence level (e.g., 95%) even when the locus contains multiple causal variants. We use simulations to show that our approach provides 20-50% improvement in our ability to identify the causal variants compared to the existing methods at loci harboring multiple causal variants. We validate our approach using empirical data from an expression QTL study of CHI3L2 to identify new causal variants that affect gene expression at this locus. CAVIAR is publicly available online at http://genetics.cs.ucla.edu/caviar/ |
Kichaev, Gleb; Yang, Wen-Yun Y; Lindstrom, Sara; Hormozdiari, Farhad; Eskin, Eleazar; Price, Alkes L; Kraft, Peter; Pasaniuc, Bogdan
Integrating functional data to prioritize causal variants in statistical fine-mapping studies. Journal Article. PLoS Genet, 10 (10), pp. e1004722, 2014, ISSN: 1553-7404.
Link: http://dx.doi.org/10.1371/journal.pgen.1004722
Tags: Fine Mapping
Abstract: Standard statistical approaches for prioritization of variants for functional testing in fine-mapping studies either use marginal association statistics or estimate posterior probabilities for variants to be causal under simplifying assumptions. Here, we present a probabilistic framework that integrates association strength with functional genomic annotation data to improve accuracy in selecting plausible causal variants for functional validation. A key feature of our approach is that it empirically estimates the contribution of each functional annotation to the trait of interest directly from summary association statistics while allowing for multiple causal variants at any risk locus. We devise efficient algorithms that estimate the parameters of our model across all risk loci to further increase performance. Using simulations starting from the 1000 Genomes data, we find that our framework consistently outperforms the current state-of-the-art fine-mapping methods, reducing the number of variants that need to be selected to capture 90% of the causal variants from an average of 13.3 to 10.4 SNPs per locus (as compared to the next-best performing strategy). Furthermore, we introduce a cost-to-benefit optimization framework for determining the number of variants to be followed up in functional assays and assess its performance using real and simulation data. We validate our findings using a large scale meta-analysis of four blood lipids traits and find that the relative probability for causality is increased for variants in exons and transcription start sites and decreased in repressed genomic regions at the risk loci of these traits. Using these highly predictive, trait-specific functional annotations, we estimate causality probabilities across all traits and variants, reducing the size of the 90% confidence set from an average of 17.5 to 13.5 variants per locus in this data. |