National Advisory Mental Health Council Workgroup on High-Dimensional Datasets
Rationale
High-dimensional phenotypes research
Basic and applied research on high-dimensional phenotypes has dramatically increased in recent years. This research includes studies targeting the genome, epigenome, transcriptome, proteome, metabolome, and microbiome, as well as neuroimaging studies and studies that combine these approaches.
Research on high-dimensional phenotypes typically seeks to identify markers of diagnostic, prognostic, and therapeutic value or examine the pathophysiological processes underlying mental illnesses.
For example, NIMH has funded several studies examining the links between epigenetic changes that result from stress and other environmental factors and downstream behavioral effects.
NIMH Council Workgroup on Genomics
Progress in the field of genomics provides an example of how NIMH has successfully navigated a similar issue by creating the NIMH Council Workgroup on Genomics. The workgroup collaborated with leading extramural investigators and sought help and advice from the National Advisory Mental Health Council (NAMHC).
This effort culminated in a series of recommendations meant to define priorities and rigorous approaches in acquiring, analyzing, and interpreting genetic findings and their application to subsequent biological and clinical inquiries.
Based on the recommendations from that report, NIMH provided guidance to the scientific community clarifying issues such as:
- The strength of required evidence for association
- Priorities for future studies of genetic association
- Acceptable designs and approaches for biological and clinical studies that seek to build on these findings
The challenges of generating reproducible findings for high-dimensional datasets
By contrast, there has been limited success in generating reproducible findings from studies using other “omics” or similarly complex, high-dimensional datasets. The complexity and size of these datasets and the relative novelty of statistical and machine learning approaches to analyze them have led to concerns regarding rigor and reproducibility.
One challenge is that newer high-dimensional phenotypes such as gene expression, microbiome composition, or whole-brain neural activity are dynamically regulated in a context-dependent manner. Therefore, a single snapshot in time may not capture a reliable picture of a disease state or its underlying pathobiology.
A second issue is that the field has not identified a standardized framework for experimental design and statistical inference, leading to the persistence of confounding effects and difficulties in interpreting the significance of findings.
Finally, many functional studies of high-dimensional phenotypes lack a coherent conceptual framework for understanding the relationship between molecular, neural, and behavioral variation.
There are considerable costs associated with acquiring and maintaining these data. NIMH needs to define priorities and approaches to ensure rigor and reproducibility while enabling fiscal responsibility.
High-Dimensional Datasets Workgroup Charge
NIMH convened an ad hoc workgroup to seek advisory input from the NAMHC. The group is charged with developing recommendations regarding appropriate conceptual frameworks and experimental and analytic designs for experiments using high-dimensional datasets.
Potential areas for consideration by the workgroup include:
- Ensuring statistical rigor and enhancing reproducibility
  - Are there proven approaches or experimental designs that can be used to maximize or verify the statistical rigor of findings from high-dimensional datasets?
  - Can we define a hierarchy of approaches for ensuring reproducibility? For example, how should “repeated leave-one-out” be ranked against “reproduce in an independent dataset” for verifying machine learning approaches?
  - What are the safeguards necessary when using dimension reduction, clustering, and other data-driven approaches to reducing complexity?
- Understanding the role of hypotheses and conceptual frameworks
  - What is the role of hypothesis-based vs. data-driven studies?
  - What are the different design elements that would apply to these two different approaches?
  - What is an acceptable level of evidence for using hypotheses to reduce the dimensionality/complexity of datasets?
- Considering studies involving peripheral biomarkers
  - Does the availability of “omics” and other high-dimensional approaches, alone or in combination, increase the potential to discover and validate peripheral biomarkers? Or should we continue to discourage such studies?
  - What approaches or designs might be used to justify or enhance the potential relevance of experiments that utilize peripheral biomarker data?
  - How might peripheral and central high-dimensional measures be rigorously evaluated for predictive or mechanistic relevance?
- Considering studies of potential clinical utility
  - To what extent can high-dimensional measures be of clinical value in psychiatry (predictive, diagnostic, therapeutic, and prognostic value)?
  - Are there design differences that would be appropriate for studies aimed at clinical utility as opposed to those aimed at understanding disease mechanisms?
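The contrast raised in the reproducibility questions, between a repeated leave-one-out check and reproduction in an independent dataset, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not part of the workgroup charge: the nearest-centroid classifier, the simulated Gaussian data, and all function names (`fit`, `predict`, `leave_one_out_accuracy`, `independent_accuracy`, `simulate`) are hypothetical.

```python
import random


def centroid(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class label."""
    return {label: centroid([x for x, t in zip(X, y) if t == label])
            for label in sorted(set(y))}


def predict(model, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(model,
               key=lambda label: sum((a - b) ** 2 for a, b in zip(x, model[label])))


def leave_one_out_accuracy(X, y):
    """Internal check: refit with each sample held out, then predict that sample."""
    hits = 0
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += predict(model, X[i]) == y[i]
    return hits / len(X)


def independent_accuracy(X_train, y_train, X_new, y_new):
    """External check: fit once on the discovery data, score on a separate dataset."""
    model = fit(X_train, y_train)
    return sum(predict(model, x) == t for x, t in zip(X_new, y_new)) / len(y_new)


def simulate(n_per_class, n_features, shift, rng):
    """Hypothetical data: class 0 centered at 0, class 1 centered at `shift`."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    X_disc, y_disc = simulate(20, 50, 0.5, rng)  # discovery cohort
    X_rep, y_rep = simulate(20, 50, 0.5, rng)    # separately drawn cohort
    print("leave-one-out accuracy:", leave_one_out_accuracy(X_disc, y_disc))
    print("independent accuracy:  ", independent_accuracy(X_disc, y_disc, X_rep, y_rep))
```

In this sketch both cohorts are drawn from the same simulated distribution, so the two estimates should be similar. In real studies an independent dataset can also differ in site, scanner, batch, or population, which a leave-one-out check within a single cohort cannot reveal.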
NAMHC Workgroup on High-Dimensional Datasets - Roster
Co-Chairs
Laura Almasy, Ph.D.
Professor, Genetics at the Perelman School of Medicine
Department of Biomedical and Health Informatics
University of Pennsylvania
Philadelphia, PA
Damien Fair, PA-C, Ph.D.
Professor
Redleaf Endowed Director of the Masonic Institute for the Developing Brain
University of Minnesota
Minneapolis, MN
Members
Edwin G. Abel, III, Ph.D.
Chair and Departmental Executive Officer
Department of Neuroscience and Pharmacology
Carver College of Medicine
University of Iowa
Iowa City, IA
Laura Jean Bierut, M.D.
Alumni Endowed Professor
Department of Psychiatry
Washington University School of Medicine
St. Louis, MO
Kristen Brennand, Ph.D.
Elizabeth Mears and House Jameson Professor of Psychiatry
Professor of Genetics
Department of Psychiatry
Yale University School of Medicine
New Haven, CT
Luca Foschini, Ph.D.
President
Sage Bionetworks
San Mateo, CA
Neda Jahanshad, Ph.D.
Associate Professor of Neurology
Keck School of Medicine
University of Southern California
Los Angeles, CA
Erich D. Jarvis, Ph.D.
Professor, Head of Laboratory
Department of Neurogenetics of Language
Rockefeller University
New York, NY
Robert E. Kass, Ph.D.
Maurice Falk Professor of Statistics and Computational Neuroscience
Department of Statistics and Data Science
The Machine Learning Department and the Neuroscience Institute
Carnegie Mellon University
Pittsburgh, PA
Tuuli Lappalainen, Ph.D.
Director of the Genomics Platform and
the National Genomics Infrastructure of SciLifeLab
KTH-Royal Institute of Technology
Stockholm, Sweden
Senior Associate Member
New York Genome Center
New York, NY
Jeff T. Leek, Ph.D.
Vice President, Chief Data Officer
Professor, Biostatistics Program
Public Health Sciences Division,
Fred Hutchinson Cancer Center
Seattle, WA
Cathryn M. Lewis, Ph.D.
Professor of Genetic Epidemiology & Statistics
Head of Department, Social, Genetic & Developmental Psychiatry Centre
King’s College London
London, UK
Shannon K. McWeeney, Ph.D.
Professor of Medicine
Division Head, Bioinformatics and Computational Biology,
Medical Informatics and Clinical Epidemiology, School of Medicine
Oregon Health & Science University
Portland, OR
Lisa D. Nickerson, Ph.D.
Assistant Professor
Harvard Medical School
Director, Applied Neuroimaging Statistics Lab
McLean Hospital
Laura Scott, M.P.H., Ph.D.
Research Professor
Department of Biostatistics
University of Michigan
Ann Arbor, MI
Masako Suzuki, D.V.M., Ph.D.
Assistant Professor, Department of Nutrition
Texas A&M University
College Station, TX
Brenden Tervo-Clemmens, Ph.D.
Research Fellow
Massachusetts General Hospital
Harvard Medical School
Boston, MA
Joshua T. Vogelstein, Ph.D.
Assistant Professor
Department of Biomedical Engineering
Johns Hopkins University
Baltimore, MD
NIMH Staff
Jonathan Pevsner, Ph.D. (co-lead)
Chief
Genomics Research Branch
Division of Neuroscience and Basic Behavioral Science
National Institute of Mental Health
Neuroscience Building
Rockville, MD
Laura Rowland, Ph.D. (co-lead)
Chief
Geriatrics and Aging Processes Research Branch
Division of Translational Research
National Institute of Mental Health
Neuroscience Building
Rockville, MD
Jasenka Borzan, Ph.D.
Scientific Review Officer
Division of Extramural Activities
National Institute of Mental Health
Neuroscience Building
Rockville, MD
Jeymohan Joseph, Ph.D.
Chief
Neuropathogenesis, Genetics and Therapeutics Branch
Division of AIDS Research
National Institute of Mental Health
Rockville, MD
Susan Koester, Ph.D.
Deputy Director
Division of Neuroscience and Basic Behavioral Science
National Institute of Mental Health
Neuroscience Building
Rockville, MD
David Panchision, Ph.D.
Chief
Developmental and Genomic Neuroscience Research Branch
Division of Neuroscience and Basic Behavioral Science
National Institute of Mental Health
Neuroscience Building
Rockville, MD
Lori Scott-Sheldon, Ph.D.
Chief
Data Science and Emerging Methodologies in HIV Program
Division of AIDS Research
National Institute of Mental Health
Rockville, MD
Tracy Waldeck, Ph.D.
Director
Division of Extramural Activities
National Institute of Mental Health
Neuroscience Building
Rockville, MD
Andrea Wijtenburg, Ph.D.
Chief
Brain Circuitry and Dynamics Program
Division of Translational Research
National Institute of Mental Health
Neuroscience Building
Rockville, MD