Imaging Data and Biomarkers
Gregory Farber, Ph.D.
Director, Office of Technology Development and Coordination
The key goal of this initiative is to find imaging biomarkers related to mental illness or to the heterogeneity within diagnostic groups. Two approaches are proposed: adding data from subjects with mental illness to the Human Connectome Project, and exploring the most effective ways to use existing images and image repositories.
NIMH and other Institutes support numerous studies each year that acquire magnetic resonance imaging (MRI) brain scans of various types, through large awards to single laboratories as well as smaller awards for studies that collect far fewer images. Unlike imaging in other areas of biomedical science, such as oncology or musculoskeletal disease, brain imaging in mental disorders has not produced any widely used imaging biomarkers. It remains unclear whether the root cause is our limited understanding of how the brain works, the heterogeneity of patient groups, the current resolution of imaging experiments, or the research community’s inability to access a large collection of images with diverse phenotypic information. The goal of this initiative is to understand where the barriers to finding imaging biomarkers truly lie, to enable more strategic investment decisions.
The Human Connectome Project (HCP) is creating a significant database from 1200 healthy individuals between the ages of 22 and 35, along with pilot work involving individuals younger and older than this age range. The data set includes structural, functional, and diffusion images as well as significant phenotype information, acquired with imaging protocols that can be run on most reasonably modern scanners. The HCP imaging protocols and data analysis platform produce very high resolution data, especially data related to brain connectivity. Adding data from individuals with a broad spectrum of atypical behavior (along with appropriate phenotype information) could help identify biomarkers associated with psychopathology. Such a project would also speed progress toward common imaging data collection protocols, and that standardization could enable comparisons of data from different laboratories and the development of robust image processing tools.
Two different approaches could be supported. The first would be to employ HCP protocols in the study of a particular clinically defined disorder, in alignment with the NIMH Research Domain Criteria project. The second would be to strongly encourage current awardees doing imaging studies with NIMH support to add HCP imaging experiments to their protocols, or to substitute them for existing ones, potentially via supplemental support. Existing studies that might be appropriate include projects with cohorts that were narrowly selected for particular disorders, as well as projects with cohorts that were selected on the basis of involvement of a particular functional domain or neural circuit.
For years, it was considered impossible to aggregate or compare brain images collected using different pulse sequences on scanners from different manufacturers in different laboratories. This belief was a key factor in the imaging community’s reluctance to establish or use common data repositories. The success of the 1000 Functional Connectomes project has demonstrated that this problem is not as large as it originally appeared to be.¹ The project allows interested researchers to deposit their imaging data, which are then made available to other researchers for analysis. While a number of experienced imaging investigators have been able to use the data in this archive, the images are not all of the same quality, and the secondary researcher must have considerable expertise to discern which images will and will not be useful for their research goals.
The open data sharing approach exemplified by the 1000 Functional Connectomes project and by the HCP makes close-to-raw data available to the research community. Such emerging data repositories provide the needed data to make current image data analysis pipelines robust, and to allow the community to develop quality control measures for the images in the database. Making data widely available also facilitates the development of novel data analysis approaches, especially by those outside the imaging community.
The ENIGMA project provides a fundamentally different approach to sharing imaging data. In this approach, groups of laboratories with images relevant to a particular question agree on the appropriate analytic approach to answer it. These laboratories apply the agreed-upon data processing parameters to their images, and they pool the derived data, such as volumes, surface areas, or connectivity data. Group members have access to the pooled data. The original data are not available beyond the laboratory that collected the images. This approach ensures that the experts on a particular dataset do all of the agreed-upon calculations on their own data. This expert analysis and the use of standardized processing pipelines may maximize the likelihood that meaningful derived data will be produced, as well as minimize inappropriate data analysis. This approach also avoids the need to establish a centralized data repository.
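As a rough illustration of how pooled derived data might be combined without moving any images, the sketch below applies a standard fixed-effect, inverse-variance weighted meta-analysis to per-site summary statistics. This is an assumption made for illustration and is not drawn from this document or presented as ENIGMA's actual pipeline; the `SiteSummary` type, the site labels, and the numbers are all hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class SiteSummary:
    """Derived statistics one laboratory shares; raw images stay at the site."""
    site: str         # hypothetical site label
    effect: float     # e.g. standardized case-control difference in a regional volume
    std_error: float  # standard error of that effect estimate


def pool_fixed_effect(summaries):
    """Combine site-level effects with inverse-variance (fixed-effect) weights."""
    if not summaries:
        raise ValueError("need at least one site summary")
    # More precise sites (smaller standard error) receive larger weights.
    weights = [1.0 / (s.std_error ** 2) for s in summaries]
    total_weight = sum(weights)
    pooled = sum(w * s.effect for w, s in zip(weights, summaries)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Made-up values for three hypothetical sites, for illustration only.
sites = [
    SiteSummary("site_a", -0.20, 0.10),
    SiteSummary("site_b", -0.10, 0.10),
    SiteSummary("site_c", -0.30, 0.20),
]
pooled_effect, pooled_se = pool_fixed_effect(sites)
print(f"pooled effect = {pooled_effect:.3f}, SE = {pooled_se:.3f}")
```

Only the three numbers per site cross laboratory boundaries in this sketch, which mirrors the trade-off described above: no central image repository is needed, but any analysis that was not agreed on in advance cannot be run on the pooled data.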
Sharing only derived data solves some problems with the open data sharing approach, but this more restricted approach is mostly limited to large imaging laboratories. Agreement on data analysis approaches is reached by consensus, so innovative data analysis techniques that require large numbers of images cannot be attempted. Although a centralized data repository is not necessary for this approach, it might be necessary to create an image registry so that a non-imaging researcher interested in a particular topic could find the imaging laboratories with useful images.
It is unclear whether the average biomedical researcher who would like to use brain imaging data to derive biomarkers for a particular disorder is better served by the open sharing approach, by sharing derived data, or by some other approach. This initiative aims to fund data infrastructures that will be evaluated, in part, by the number of imaging biomarkers discovered.