National Advisory Mental Health Council Workgroup on High-Dimensional Datasets


High-dimensional phenotypes research

Basic and applied research on high-dimensional phenotypes has increased dramatically in recent years. This research includes studies targeting the genome, epigenome, transcriptome, proteome, metabolome, and microbiome, as well as neuroimaging studies and combinations of these approaches.

Research on high-dimensional phenotypes typically seeks to identify markers of diagnostic, prognostic, and therapeutic value or examine the pathophysiological processes underlying mental illnesses.

For example, NIMH has funded several studies examining the links between epigenetic changes that result from stress and other environmental factors and downstream behavioral effects.

NIMH Council Workgroup on Genomics

Progress in the field of genomics provides an example of how NIMH has successfully navigated a similar issue by creating the NIMH Council Workgroup on Genomics. The workgroup collaborated with leading extramural investigators and sought help and advice from the National Advisory Mental Health Council (NAMHC).

This effort culminated in a series of recommendations meant to define priorities and rigorous approaches in acquiring, analyzing, and interpreting genetic findings and their application to subsequent biological and clinical inquiries.

Based on the recommendations from that report, NIMH provided guidance to the scientific community clarifying issues such as:

  • The strength of required evidence for association
  • Priorities for future studies of genetic association
  • Acceptable designs and approaches for biological and clinical studies that seek to build on these findings

The challenges of generating reproducible findings for high-dimensional datasets

By contrast, there has been limited success in generating reproducible findings from studies using other “omics” or similarly complex, high-dimensional datasets. The complexity and size of these datasets and the relative novelty of statistical and machine learning approaches to analyze them have led to concerns regarding rigor and reproducibility.

One challenge is that newer high-dimensional phenotypes such as gene expression, microbiome composition, or whole-brain neural activity are dynamically regulated in a context-dependent manner. Therefore, a single snapshot in time may not capture a reliable picture of a disease state or its underlying pathobiology.

A second issue is that the field has not identified a standardized framework for experimental design and statistical inference, leading to the persistence of confounding effects and difficulties in interpreting the significance of findings.

Finally, many functional studies of high-dimensional phenotypes lack a coherent conceptual framework for understanding the relationship between molecular, neural, and behavioral variation.

There are considerable costs associated with acquiring and maintaining these data. NIMH needs to define priorities and approaches to ensure rigor and reproducibility while enabling fiscal responsibility.

High-Dimensional Datasets Workgroup Charge

NIMH convened an ad hoc workgroup to seek advisory input from the NAMHC. The group is charged with developing recommendations regarding appropriate conceptual frameworks and experimental and analytic designs for experiments using high-dimensional datasets.

Potential areas for consideration by the workgroup include:

  • Ensuring statistical rigor and enhancing reproducibility
    • Are there proven approaches or experimental designs that can be used to maximize or verify the statistical rigor of findings from high-dimensional datasets?
    • Can we define a hierarchy of approaches for ensuring reproducibility? For example, how should “repeated leave-one-out” be ranked against “reproduce in an independent dataset” for verifying machine learning approaches?
    • What are the safeguards necessary when using dimension reduction, clustering, and other data-driven approaches to reducing complexity?
  • Understanding the role of hypotheses and conceptual frameworks
    • What is the role of hypothesis-based vs. data-driven studies?
    • What are the different design elements that would apply to these two different approaches?
    • What is an acceptable level of evidence for using hypotheses to reduce the dimensionality/complexity of datasets?
  • Considering studies involving peripheral biomarkers
    • Does the availability of “omics” and other high-dimensional approaches, alone or in combination, increase the potential to discover and validate peripheral biomarkers? Or should we continue to discourage such studies?
    • What approaches or designs might be used to justify or enhance the potential relevance of experiments that utilize peripheral biomarker data?
    • How might peripheral and central high-dimensional measures be rigorously evaluated for predictive or mechanistic relevance?
  • Considering studies of potential clinical utility
    • To what extent can high-dimensional measures be of clinical value in psychiatry (predictive, diagnostic, therapeutic, and prognostic value)?
    • Are there design differences that would be appropriate for studies aimed at clinical utility as opposed to those aimed at understanding disease mechanisms?
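
As an illustration of the two verification strategies named in the reproducibility questions above, the following is a minimal sketch, not a workgroup recommendation. It assumes synthetic two-class data and a simple nearest-centroid classifier, both chosen only for brevity; every function and variable name is hypothetical. It estimates accuracy once by leave-one-out within a "discovery" dataset and once by applying the fitted model to a separately generated "replication" dataset.

```python
import random
import statistics

def make_dataset(n_per_class, n_features, shift, rng):
    """Synthetic data (an assumption): class 1 is shifted on feature 0 only."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            x[0] += shift * label
            data.append((x, label))
    return data

def fit_centroids(train):
    """Nearest-centroid 'model': the per-class mean of every feature."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the label whose centroid is closest in squared distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, data):
    """Fraction of samples in `data` that the fitted centroids classify correctly."""
    return sum(predict(centroids, x) == y for x, y in data) / len(data)

def leave_one_out_accuracy(data):
    """Refit on all samples but one, test on the held-out sample, repeat for each."""
    hits = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += predict(fit_centroids(train), x) == y
    return hits / len(data)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
discovery = make_dataset(n_per_class=20, n_features=50, shift=1.0, rng=rng)
replication = make_dataset(n_per_class=20, n_features=50, shift=1.0, rng=rng)

loo = leave_one_out_accuracy(discovery)
independent = accuracy(fit_centroids(discovery), replication)
print(f"leave-one-out accuracy (discovery only): {loo:.2f}")
print(f"accuracy in independent dataset:         {independent:.2f}")
```

In this sketch both datasets are drawn from the same simulated distribution, so the two estimates answer similar questions. With real data, an independent dataset additionally tests whether a finding survives differences in site, cohort, and acquisition, which leave-one-out within a single dataset cannot assess.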

NAMHC Workgroup on High-Dimensional Datasets - Roster


Laura Almasy, Ph.D. 
Professor, Genetics at the Perelman School of Medicine 
Department of Biomedical and Health Informatics 
University of Pennsylvania 
Philadelphia, PA

Damien Fair, PA-C, Ph.D. 
Redleaf Endowed Director of the Masonic Institute for the Developing Brain
University of Minnesota 
Minneapolis, MN


Edwin G. Abel, III, Ph.D. 
Chair and Departmental Executive Officer
Department of Neuroscience and Pharmacology
Carver College of Medicine
University of Iowa
Iowa City, IA

Laura Jean Bierut, M.D. 
Alumni Endowed Professor 
Department of Psychiatry 
Washington University School of Medicine 
St. Louis, MO

Kristen Brennand, Ph.D. 
Elizabeth Mears and House Jameson Professor of Psychiatry
Professor of Genetics
Department of Psychiatry
Yale University School of Medicine 
New Haven, CT

Luca Foschini, Ph.D. 
SAGE Bionetworks 
San Mateo, CA

Neda Jahanshad, Ph.D. 
Associate Professor of Neurology
Keck School of Medicine
University of Southern California 
Los Angeles, CA

Erich D. Jarvis, Ph.D. 
Professor, Head of Laboratory 
Department of Neurogenetics of Language 
Rockefeller University 
New York, NY

Robert E. Kass, Ph.D. 
Maurice Falk Professor of Statistics and Computational Neuroscience
Department of Statistics and Data Science
The Machine Learning Department and the Neuroscience Institute
Carnegie Mellon University
Pittsburgh, PA

Tuuli Lappalainen, Ph.D. 
Director of the Genomics Platform and the National Genomics Infrastructure of SciLifeLab
KTH Royal Institute of Technology
Stockholm, Sweden 
Senior Associate Member 
New York Genome Center 
New York, NY

Jeff T. Leek, Ph.D. 
Vice President, Chief Data Officer 
Professor, Biostatistics Program 
Public Health Sciences Division, 
Fred Hutchinson Cancer Center 
Seattle, WA

Cathryn M. Lewis, Ph.D. 
Professor of Genetic Epidemiology & Statistics 
Head of Department, Social, Genetic & Developmental Psychiatry Centre
King’s College London 
London, UK

Shannon K. McWeeney, Ph.D. 
Professor of Medicine 
Division Head, Bioinformatics and Computational Biology, 
Medical Informatics and Clinical Epidemiology, School of Medicine 
Oregon Health & Science University 
Portland, OR

Lisa D. Nickerson, Ph.D. 
Assistant Professor 
Harvard Medical School 
Director, Applied Neuroimaging Statistics Lab 
McLean Hospital

Laura Scott, M.P.H., Ph.D. 
Research Professor 
Department of Biostatistics
University of Michigan 
Ann Arbor, MI

Masako Suzuki, D.V.M., Ph.D. 
Assistant Professor, Department of Nutrition 
Texas A&M University 
College Station, TX

Brenden Tervo-Clemmens, Ph.D.
Research Fellow 
Massachusetts General Hospital 
Harvard Medical School 
Boston, MA

Joshua T. Vogelstein, Ph.D. 
Assistant Professor 
Department of Biomedical Engineering 
Johns Hopkins University 
Baltimore, MD

NIMH Staff

Jonathan Pevsner, Ph.D. (co-lead) 
Genomics Research Branch 
Division of Neuroscience and Basic Behavioral Science 
National Institute of Mental Health 
Neuroscience Building 
Rockville, MD

Laura Rowland, Ph.D. (co-lead) 
Geriatrics and Aging Processes Research Branch
Division of Translational Research 
National Institute of Mental Health 
Neuroscience Building 
Rockville, MD

Jasenka Borzan, Ph.D. 
Scientific Review Officer 
Division of Extramural Activities 
National Institute of Mental Health
Neuroscience Building 
Rockville, MD

Jeymohan Joseph, Ph.D. 
Neuropathogenesis, Genetics and Therapeutics Branch 
Division of AIDS Research 
National Institute of Mental Health 
Rockville, MD

Susan Koester, Ph.D. 
Deputy Director 
Division of Neuroscience and Basic Behavioral Science 
National Institute of Mental Health 
Neuroscience Building 
Rockville, MD

David Panchision, Ph.D. 
Developmental and Genomic Neuroscience Research 
Division of Neuroscience and Basic Behavioral Science 
National Institute of Mental Health 
Neuroscience Building 
Rockville, MD

Lori Scott-Sheldon, Ph.D. 
Data Science and Emerging Methodologies in HIV Program 
Division of AIDS Research 
National Institute of Mental Health 
Rockville, MD

Tracy Waldeck, Ph.D. 
Division of Extramural Activities 
National Institute of Mental Health 
Neuroscience Building 
Rockville, MD

Andrea Wijtenburg, Ph.D. 
Brain Circuitry and Dynamics Program 
Division of Translational Research 
National Institute of Mental Health 
Neuroscience Building 
Rockville, MD