Fine-Mapping Genome-Wide Associated Loci to Identify Proximate Causal Mechanisms
Alexander Arguello, Ph.D.
Division of Neuroscience and Basic Behavioral Science
The objective of this concept is to develop and apply resources and tools for the large-scale and systematic fine-mapping of serious mental disorders and related traits. Such traits include disease status as well as molecular features in neural tissues and cell types. Proposed studies may generate new, or use existing, population-scale human genomic and phenotypic data from well-characterized cohorts of diverse ancestries. They would also be expected to leverage robust causal inference methods to map causal variants onto causal genes, isoforms, and pathways that may be prioritized for experimental studies relevant to serious mental illness. The ultimate goal is to provide the research community with a high-confidence set of causal variants, regulatory elements, genes, and isoforms that contribute to disease risk, allowing deeper insights into proximate disease mechanisms.
Genome-wide association studies (GWAS) identify statistical relationships between common single nucleotide variants (SNVs) across the genome and a trait of interest. Due to the correlated nature of nearby SNVs (i.e., linkage disequilibrium), GWAS implicate regions of the genome (loci) and do not necessarily pinpoint the causal variant(s), gene(s), isoform(s), or proximate molecular mechanism(s) underlying the trait association. More than 90% of genome-wide significant variants associated with traits fall within non-coding regions of the genome. A minority of these variants will be the actual causal variants, and a majority will not affect the nearest genes. Moreover, functional annotation of the non-coding genome is still incomplete; the target genes of many genomic regulatory elements, such as enhancers, remain unknown. This incompleteness presents a major challenge to mapping variants onto genes and genes onto traits. Thus, an immediate barrier to translating genetic associations into causal disease mechanisms is the uncertain relationship between statistically identified genetic variants and the resultant molecular changes that influence a trait, directly or indirectly.
Fine-mapping procedures aim to overcome this barrier by determining which variants in a genomic region of interest are most likely causally related to a trait, given the known patterns of variant correlations and their functional impact. These statistical approaches leverage information such as gene expression, functional annotations, deep resequencing, or patterns of correlations across diverse populations, and provide a “credible set” of causal variants for each associated region. Resolving genetic signals in this manner is critically important not only for diagnostic associations but also for molecular quantitative trait loci (QTLs) and for the transcriptome-wide association studies that aim to connect the two. For example, only 4-12% of the lead variant expression QTLs in the Genotype-Tissue Expression (GTEx) project are the actual causal variants affecting gene expression. The mismatch between causal variants of molecular traits and disease traits, and the fact that they are often derived from different populations, impedes direct comparisons across traits and functional understanding of genetic associations. Moreover, the lack of identified causal variants and their corresponding causal genes, or gene isoforms, indicates that current gene sets mapped onto disease risk are inaccurate and that the disease-relevant tissues, pathways, and cell types derived from them are provisional. More reliable biological insights into disease thus await establishing a more complete and accurate set of causal variants, genomic elements, genes, and isoforms.
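To make the notion of a “credible set” concrete, the following is a minimal sketch of one simple way such a set can be computed from GWAS summary statistics. It is an illustration, not a method prescribed by this concept. It assumes exactly one causal variant per locus, equal prior probability for every variant, and a Wakefield-style approximate Bayes factor with an assumed prior effect-size standard deviation of 0.2. The variant identifiers, effect sizes, and standard errors are hypothetical. Methods used in practice typically also model linkage disequilibrium, allow multiple causal variants, and incorporate functional annotations.

```python
import math


def log_abf(beta, se, prior_sd=0.2):
    """Log approximate Bayes factor (association vs. null) for one variant.

    prior_sd is an assumed prior standard deviation of the true effect size.
    """
    v = se ** 2              # sampling variance of the effect estimate
    w = prior_sd ** 2        # prior variance of the true effect
    r = w / (v + w)          # shrinkage factor
    z = beta / se
    return 0.5 * math.log(1.0 - r) + 0.5 * z * z * r


def credible_set(variants, coverage=0.95, prior_sd=0.2):
    """Return (posterior probabilities, credible set) for one locus.

    variants: list of (variant_id, beta, se) tuples.
    Assumes a single causal variant and equal priors across variants.
    """
    log_bfs = [log_abf(b, s, prior_sd) for _, b, s in variants]
    m = max(log_bfs)  # subtract the maximum for numerical stability
    weights = [math.exp(x - m) for x in log_bfs]
    total = sum(weights)
    pips = {vid: w / total for (vid, _, _), w in zip(variants, weights)}

    # Add variants in decreasing order of posterior probability
    # until the cumulative probability reaches the requested coverage.
    ranked = sorted(pips.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for vid, pip in ranked:
        chosen.append(vid)
        cumulative += pip
        if cumulative >= coverage:
            break
    return pips, chosen


# Hypothetical summary statistics for four variants at one locus.
locus = [
    ("rs_a", 0.10, 0.02),
    ("rs_b", 0.09, 0.02),
    ("rs_c", 0.02, 0.02),
    ("rs_d", 0.00, 0.02),
]
pips, cs = credible_set(locus, coverage=0.95)
for vid in cs:
    print(vid, round(pips[vid], 3))
```

In this toy locus the two strongly associated variants cannot be separated at 95% coverage, so both enter the credible set. This is the situation in which additional information, such as functional annotations or cross-ancestry correlation patterns, would be needed to narrow the set further.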