Leveraging Electronic Medical Records for Psychiatric Genetic Research


Anjené Addington, Ph.D., M.P.H.
Division of Neuroscience and Basic Behavioral Science


The purpose of this initiative is to support projects that implement creative and robust molecular epidemiologic approaches that leverage existing electronic medical records (EMRs) from large, population-based cohorts. Such projects would incorporate individual polygenic risk scores and other genetic markers of risk to conduct analyses that advance our understanding of the complex etiology of severe mental disorders.
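
For readers unfamiliar with the term, a polygenic risk score is commonly computed as a weighted sum of an individual's risk-allele counts, with the weights taken from prior association studies. The Python sketch below illustrates only that arithmetic. The function name, effect sizes, and allele counts are hypothetical and are not part of this initiative.

```python
def polygenic_risk_score(dosages, weights):
    """Return the weighted sum of risk-allele dosages.

    dosages: per-variant risk-allele counts for one person (0, 1, or 2).
    weights: per-variant effect sizes, in the same order as dosages.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical values for three variants (illustrative only, not real data).
example_weights = [0.5, -0.25, 1.0]
example_dosages = [2, 1, 0]

score = polygenic_risk_score(example_dosages, example_weights)
print(score)
```

In practice, scores are built from many thousands of variants and are usually standardized within a cohort before analysis; this sketch omits those steps.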


Psychiatric disorders are highly heritable and account for a large proportion of the global disease burden. These disorders are clinical syndromes of largely unknown etiology, and their classification has been based on observable and self-reported symptoms and on course of illness. However, there is now evidence that common genetic variation contributes strongly to disease risk, and that this genetic risk overlaps across disorders. Nonetheless, we are only beginning to understand how the multifaceted interplay of genetic and environmental risk factors can substantially increase disease risk. Rigorous epidemiological studies are uniquely positioned to identify genetic and non-genetic causes of mental illness, while allowing precise estimates of the population risk attributable to each source. In addition to clarifying the etiology of mental disorders, molecular epidemiologic approaches have important translational implications for nosology, treatment, and ultimately prevention.

EMRs from healthcare systems can provide a continuously growing repository of longitudinal clinical and phenotypic data that enables low-cost, population-based studies on a large scale. In particular, linking EMR data with biorepositories provides a new platform for psychiatric genetic research. In addition to structured, codified data (e.g., demographics, diagnostic codes, medications, laboratory and procedure codes), text mining by natural language processing allows the accrual and analysis of detailed, longitudinal clinical data for research purposes. The merging of EMR and genomic data in biobanks and population-based registries offers unique opportunities to address the challenge of discerning genotype-phenotype relationships in large, well-characterized samples.

This initiative would encourage research in the following areas:

  • Identification of more homogeneous genetic risk profiles using comorbidities, including somatic conditions, and cross-disorder analyses
  • Exploration of how genetic risk and environmental factors contribute to profiles for psychiatric diagnoses and related phenotypes throughout the lifespan
  • Development and application of analytic tools for harmonizing across EMR datasets and extracting core phenotypic clusters for validation and replication
  • Phenome-wide association studies (PheWAS) of known risk loci for neuropsychiatric disorders and related phenotypes