September 15, 2006
Large Scale Sequence Data Analysis for Complex Disorders
NAMHC Concept Clearance
Thomas Lehner, Ph.D., M.P.H.
Acting Director, Office of Human Genetics & Genomic Resources
Chief, Genetic Basis of Mental Disorders Program, Division of Neuroscience and Basic Behavioral Science
This initiative will encourage the development of novel methods in computational biology and population genetics that take into account multiple testing, as applied to large-scale mutation detection. Supported activities will include projects that develop state-of-the-art, population-based methods for the identification of functional variants using Bayesian or similar approaches. Another key area of interest will be the extension of statistical theory to hyper-dimensional sampling spaces, which might radically advance sequence-based statistical tests in the analysis of complex disorders.
Revolutionary new technologies, capable of transforming the economics and speed of sequencing, are providing an unparalleled opportunity to analyze human genetic variation comprehensively across ever larger portions of the genome. When applied over a large enough genomic region, these new approaches to resequencing will enable the simultaneous detection and typing of both known and unknown genomic variants, and will offer information about patterns of linkage disequilibrium. In essence, every discovered variant becomes a genetic marker. When a novel variant is found through resequencing in a patient and not in a group of controls, it becomes a candidate for the disease-causing mutation in that patient. Nevertheless, the inherent statistical challenges are daunting. False positive and false negative results are prevalent. These challenges arise from multiple testing, the non-independence of sequence variants in linkage disequilibrium, and the lack of a sampling theory for assessing the probability that a novel single nucleotide polymorphism (SNP) might actually be a neutral variant. In response, the NIMH is promoting the development of new methods for the analysis of large-scale sequence data in case-control designs for complex disorders.
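The multiple-testing problem described above can be illustrated with a minimal sketch. It is not a method proposed by this initiative; it simply applies a one-sided Fisher exact test to a single variant and compares the result to a Bonferroni-corrected threshold. The counts (5 carriers among 500 cases, 0 among 500 controls) and the number of variants tested (1,000) are hypothetical values chosen for illustration, and the function names are assumptions of this sketch.

```python
from math import comb


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher exact p-value for carrier enrichment in cases.

    Returns the hypergeometric probability that at least `case_carriers`
    of the observed carriers fall among the cases, assuming the variant
    is unrelated to disease status.
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    upper = min(carriers, n_cases)
    numerator = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / comb(total, carriers)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical data: a novel variant seen in 5 of 500 cases, 0 of 500 controls.
p = fisher_one_sided(5, 500, 0, 500)

# Hypothetical study size: 1,000 variants tested in the resequenced region.
threshold = bonferroni_threshold(0.05, 1000)

print(f"p-value: {p:.4f}")
print(f"Bonferroni threshold: {threshold:.0e}")
print(f"nominally significant (p < 0.05): {p < 0.05}")
print(f"significant after correction: {p < threshold}")
```

Under these assumed counts the variant looks nominally significant on its own but does not survive the correction. Bonferroni also treats every test as independent, so it tends to be conservative for variants in linkage disequilibrium, which is one reason the text above calls for new methods.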